From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Thu Mar 11 2010 - 20:34:46 PST
Luyang Dong,

I am afraid that we in the BLCR team did not write the LAM/MPI code for integration with BLCR. So, I may have no better idea than you about what that error means. The only thing that I can think of is that I recall there was some problem with open log files if LAM/MPI was built with both --enable-debug and BLCR support. I doubt that is your problem, but thought I'd mention it just in case it was useful.

There is no longer any LAM/MPI development community that might be able to help with your question. All of the developers from LAM/MPI who are still in that line of work are now part of the Open MPI project:
    http://www.open-mpi.org
Open MPI is integrated with BLCR at least as well as LAM/MPI ever was. So, unless you have some specific need for LAM/MPI, I would suggest you switch to Open MPI.

-Paul

luyang dong wrote:
> dear teachers:
> I am a graduate student from the department of computer science and
> technology of shandong university. Recently, I was confused with the
> use of LAM/MPI integrating with blcr. I run a mpi program like
> this: *mpirun -np 4 -ssi cr_base_dir /home/cu0605/blcr
> hello_world* (hello_world is the name of the mpi program), and then I
> use ps -ef|grep hello_world to find its pid. After that I run another
> command, *lamcheckpoint -ssi cr blcr -pid 24224* (assuming the pid of
> mpirun is 24224). Then I press ctrl-c to kill the mpirun program, and
> run *lamrestart -ssi cr blcr -ssi cr_blcr_context_file
> context.mpirun.24224*. But the result of this command always outputs
> *mpirun: Bad file descriptor.* I do not know how to deal with it, and I
> want to know how to solve this problem.
>
>
> thanks a lot
>
> best wishes
>
> Luyang Dong
>
> 3.12 2010
>
>

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group                 Tel: +1-510-495-2352
HPC Research Department                   Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory