inquring about checkpoint

From: luyang dong (dongluyang2006_at_yahoo.com.cn)
Date: Thu Mar 11 2010 - 18:44:25 PST

  • Next message: Paul H. Hargrove: "Re: inquring about checkpoint"
Dear teachers,

I am a graduate student in the Department of Computer Science and Technology at Shandong University. Recently I have been confused about how to use LAM/MPI together with BLCR. I start an MPI program like this (hello_world is the name of the MPI program):

    mpirun -np 4 -ssi cr_base_dir /home/cu0605/blcr hello_world

Then I use the following command to find its PID:

    ps -ef|grep hello_world

After that I run another command (assuming the PID of mpirun is 24224):

    lamcheckpoint -ssi cr blcr -pid 24224

Then I press Ctrl-C to kill the mpirun program, and run:

    lamrestart -ssi cr blcr -ssi cr_blcr_context_file context.mpirun.24224

But this command always outputs:

    mpirun: Bad file descriptor

I do not know how to deal with it, and I would like to know how to solve this problem.
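For reference, the checkpoint and restart steps above can be collected into a small shell sketch. It only prints the two command lines for a given mpirun PID and does not run them. The helper names checkpoint_cmd and restart_cmd are hypothetical, the options are copied unchanged from the commands in this message, and the context file name is assumed to follow the context.mpirun.<pid> pattern shown above; none of this has been checked against the LAM/MPI documentation.

```shell
#!/bin/sh
# Dry-run sketch: print the checkpoint/restart commands from this message.
# Options are copied as written in the message, not verified.

# Hypothetical helper: checkpoint command for the given mpirun PID.
checkpoint_cmd() {
    echo "lamcheckpoint -ssi cr blcr -pid $1"
}

# Hypothetical helper: restart command for the same PID, assuming the
# context file is named context.mpirun.<pid> as in the message.
restart_cmd() {
    echo "lamrestart -ssi cr blcr -ssi cr_blcr_context_file context.mpirun.$1"
}

MPIRUN_PID=24224   # example PID used in the message
checkpoint_cmd "$MPIRUN_PID"
restart_cmd "$MPIRUN_PID"
```

Printing the commands before running them makes it easy to confirm that the PID passed to lamcheckpoint and the PID in the context file name are the same.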
     
Thanks a lot.
Best wishes,
Luyang Dong
3.12 2010
    
    
          
    
