Re: lam/mpi blcr problem

From: Jeff Squyres (jsquyres_at_lam-mpi.org)
Date: Tue Mar 22 2005 - 12:23:46 PST

  • Next message: 任明明: "Re: lam/mpi blcr problem"
    On Mar 22, 2005, at 12:05 PM, Paul H. Hargrove wrote:
    
    > I am sorry to hear that you are having problems.  Let's see if we can 
    > help.
    >
    > As far as I can tell your LAM configuration is OK, but I am cc:ing 
    > this to one of the LAM developers who may be able to spot something I 
    > could not.
    
    No need -- I'm actually on the checkpoint_at_lbl_dot_gov list.  :-)
    
    > Have you tried 'make check' in the blcr build directory or 
    > checkpointing/restarting some of the non-mpi examples in blcr's 
    > examples directory?  It would be good to know that the blcr build was 
    > OK before bringing LAM into the mix.
    >
    > When LAM ran the mpi application, was blcr installed (and the kernel 
    > modules loaded) on all the compute nodes running the mpi job?
    
    Additionally, were you using the crtcp RPI?  I.e., what was the 
    specific command that you used to mpirun your application?  And how did 
    you try to checkpoint it?
    
    -- 
    {+} Jeff Squyres
    {+} jsquyres@lam-mpi.org
    {+} http://www.lam-mpi.org/
    
