From: 任明明 (0110018_at_mail.nankai.edu.cn)
Date: Wed Mar 23 2005 - 06:35:35 PST
Thank you very much! I will wait for the new version. And thank you all.

In your message you wrote:
>From: Jeff Squyres <[email protected]>
>Reply-To:
>To: "任明明" <[email protected]>
>Subject: Re: lam/mpi blcr problem
>Date: Wed, 23 Mar 2005 09:27:56 -0500
>
>I'm sorry -- I neglected to mention in my previous e-mail that we had
>some problems with the logic for checkpoint/restart initialization in
>LAM/MPI v7.1.1. Can you try the soon-to-be-released 7.1.2 beta?
>
> http://www.lam-mpi.org/beta/
>
>That should solve your problems.
>
>
>On Mar 23, 2005, at 9:27 AM, 任明明 wrote:
>
>>
>> Thank you for your help!
>> I can use blcr to checkpoint a non-MPI program, such as the examples
>> included in the blcr software, and all the nodes can checkpoint a
>> non-MPI program.
>> But when I use cr_checkpoint to checkpoint an MPI program, it doesn't
>> generate a context file for each process; it only generates a context
>> file for the mpirun command.
>>
>> All I do is the following:
>>
>> In one window:
>> ****************************************************
>> [rmingming@node01 lam]$ mpicc cpi.c -o cpi
>> [rmingming@node01 lam]$ lamboot -v nodes
>>
>> LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
>>
>> n-1<8238> ssi:boot:base:linear: booting n0 (node01)
>> n-1<8238> ssi:boot:base:linear: booting n1 (node02)
>> n-1<8238> ssi:boot:base:linear: booting n2 (node03)
>> n-1<8238> ssi:boot:base:linear: booting n3 (node04)
>> n-1<8238> ssi:boot:base:linear: finished
>> [rmingming@node01 lam]$ mpirun C -ssi rpi crtcp -ssi cr blcr ./cpi
>> Process 0 on node01
>> Process 1 on node02
>> Process 3 on node04
>> Process 2 on node03
>> Enter the number of intervals: (0 quits) 0 (--- during this I use
>> cr_checkpoint)
>> [rmingming@node01 lam]$
>>
>> ******************************************************
>>
>> In another window:
>>
>> ******************************************************
>>
>> [rmingming@node01 lam]$ cr_checkpoint 8248
>> [rmingming@node01 lam]$ ls
>> context.8248 cpi cpi.c hello.c nodes ring
>>
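
The symptom described above (only one context file, for mpirun, instead of one per process) can be checked with a small script rather than by reading the ls output. The following is only a sketch: the function names count_context_files and check_checkpoint are hypothetical, and it assumes that every context file is written to one directory with a name starting with "context." as in the listing above. On a cluster without a shared filesystem the per-process files may be on the individual nodes instead.

```shell
#!/bin/sh
# Hypothetical helpers, not part of BLCR or LAM/MPI.

# Count files whose names start with "context." in a directory (default: .).
count_context_files() {
    dir=${1:-.}
    count=0
    for f in "$dir"/context.*; do
        # If nothing matches, the glob stays literal; -e filters that out.
        [ -e "$f" ] && count=$((count + 1))
    done
    echo "$count"
}

# Compare the count with an expected number chosen by the caller
# (an assumption, e.g. 5 for mpirun plus 4 MPI processes).
check_checkpoint() {
    dir=$1
    expected=$2
    found=$(count_context_files "$dir")
    if [ "$found" -ge "$expected" ]; then
        echo "ok: found $found context file(s)"
    else
        echo "incomplete: found $found of $expected expected context file(s)"
        return 1
    fi
}
```

For the directory listed above, which holds only context.8248, running `check_checkpoint . 5` would print "incomplete: found 1 of 5 expected context file(s)" and return a non-zero status.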