Re: lam/mpi blcr problem

From: Jeff Squyres (jsquyres_at_lam-mpi.org)
Date: Wed Mar 23 2005 - 09:29:08 PST

  • Next message: Paul H. Hargrove: "Re: lam/mpi blcr problem"
    It could well be that stdin is not being checkpointed.
    
    Paul?
    
    
    On Mar 23, 2005, at 12:18 PM, 任明明 wrote:
    
    >
    > I switched to another program that just does matrix multiplication;
    > this time checkpoint and restart of the MPI program worked very well.
    >
    >
    > In your message, you wrote:
    >> From: "任明明" <0110018@mail.nankai.edu.cn>
    >> Reply-To: "任明明" <0110018@mail.nankai.edu.cn>
    >> To: checkpoint_at_lbl_dot_gov
    >> Subject: Re: lam/mpi blcr problem
    >> Date:Thu, 24 Mar 2005 00:33:10 +0800
    >>
    >>
    >> It seems OK now; at least I can see the context files for each
    >> process.
    >> But my cpi program needs input from the first process, and I
    >> checkpointed it while it was waiting for keyboard input.
    >> When I use cr_restart, the program quits quickly.
    >> By the way, when I run cr_checkpoint PID-of-mpirun (without --term)
    >> on this cpi example program, it quits running. I don't know what the
    >> problem is, and I hope I have expressed it clearly. :-)
    >>
    >> Thank you for your valuable information.
    >>
    >>
    >> In your message, you wrote:
    >>> From: "任明明" <0110018@mail.nankai.edu.cn>
    >>> Reply-To: "任明明" <0110018@mail.nankai.edu.cn>
    >>> To: checkpoint_at_lbl_dot_gov
    >>> Subject: Re: lam/mpi blcr problem
    >>> Date:Wed, 23 Mar 2005 23:34:11 +0800
    >>>
    >>>
    >>> I will. I will use this version:
    >>>
    >>> http://www.lam-mpi.org/download/files/lam-7.1.2b18.tar.bz2
    >>>
    >>> In your message, you wrote:
    >>>> From: Jeff Squyres <jsquyres@lam-mpi.org>
    >>>> Reply-To:
    >>>> To: "任明明" <0110018@mail.nankai.edu.cn>
    >>>> Subject: Re: lam/mpi blcr problem
    >>>> Date:Wed, 23 Mar 2005 10:17:38 -0500
    >>>>
    >>>> If you wouldn't mind, could you try the beta and ensure that it
    >>>> works for you?
    >>>>
    >>>>
    >>>> On Mar 23, 2005, at 9:35 AM, 任明明 wrote:
    >>>>
    >>>>>
    >>>>> Thank you very much! I will wait for the new version.
    >>>>> And thank you all.
    >>>>>
    >>>>
    >>>
    >>>
    >>>
    >>>
    >>
    >>
    >>
    >
    >
    
    -- 
    {+} Jeff Squyres
    {+} jsquyres@lam-mpi.org
    {+} http://www.lam-mpi.org/
    
