From: Jeff Squyres (jsquyres_at_lam-mpi.org)
Date: Wed Mar 23 2005 - 07:17:38 PST
If you wouldn't mind, could you try the beta and ensure that it works for you?

On Mar 23, 2005, at 9:35 AM, 任明明 wrote:

>
> Thank you very much! I will wait for the new version.
> And Thank you all.
>
> In your message, you wrote:
>> From: Jeff Squyres <[email protected]>
>> Reply-To:
>> To: "任明明" <[email protected]>
>> Subject: Re: lam/mpi blcr problem
>> Date: Wed, 23 Mar 2005 09:27:56 -0500
>>
>> I'm sorry -- I neglected to mention in my previous e-mail that we had
>> some problems with the logic for checkpoint/restart initialization in
>> LAM/MPI v7.1.1. Can you try the soon-to-be-released 7.1.2 beta?
>>
>> http://www.lam-mpi.org/beta/
>>
>> That should solve your problems.
>>
>>
>> On Mar 23, 2005, at 9:27 AM, 任明明 wrote:
>>
>>>
>>> thank you for your help!
>>> I can use blcr to checkpoint the non-MPI program, such as the examples
>>> included in the blcr software. And all the nodes are ok to checkpoint a
>>> non-MPI program.
>>> but when i use cr_checkpoint to checkpoint a MPI program, it doesn't
>>> generate context file for each process, only generate a context file
>>> for mpirun command.
>>>
>>> all i do is the following:
>>>
>>> In one window:
>>> ****************************************************
>>> [rmingming@node01 lam]$ mpicc cpi.c -o cpi
>>> [rmingming@node01 lam]$ lamboot -v nodes
>>>
>>> LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
>>>
>>> n-1<8238> ssi:boot:base:linear: booting n0 (node01)
>>> n-1<8238> ssi:boot:base:linear: booting n1 (node02)
>>> n-1<8238> ssi:boot:base:linear: booting n2 (node03)
>>> n-1<8238> ssi:boot:base:linear: booting n3 (node04)
>>> n-1<8238> ssi:boot:base:linear: finished
>>> [rmingming@node01 lam]$ mpirun C -ssi rpi crtcp -ssi cr blcr ./cpi
>>> Process 0 on node01
>>> Process 1 on node02
>>> Process 3 on node04
>>> Process 2 on node03
>>> Enter the number of intervals: (0 quits) 0 (---during this i use cr_checkpoint)
>>> [rmingming@node01 lam]$
>>>
>>> ******************************************************
>>>
>>> in another window:
>>>
>>> ******************************************************
>>>
>>> [rmingming@node01 lam]$ cr_checkpoint 8248
>>> [rmingming@node01 lam]$ ls
>>> context.8248 cpi cpi.c hello.c nodes ring
>>>
>
>

--
{+} Jeff Squyres
{+} [email protected]
{+} http://www.lam-mpi.org/