From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Wed Apr 22 2009 - 21:48:03 PDT
Wei Zhongwei,

What you describe is a known problem: the JRE removes its hsperfdata files when the job terminates, including termination by a fatal signal, so the file that cr_restart tries to open and mmap is no longer there. You can read more about the problem, and at least one solution, in the BLCR FAQ:
(http://upc-bugs.lbl.gov/blcr/doc/html/FAQ.html#hsperfdata)

Alternatively, you can try adding --save-shared to the checkpoint command. Note that switching to a different JRE would also resolve the problem.

-Paul

Weizhongwei wrote:
> Dear Professor:
> When I checkpoint a Java program I encounter some problems. Now I
> list the steps:
> Program code:
> public class Hello {
>     public static void main(String[] args) {
>         // TODO Auto-generated method stub
>         for(int i=0;i<=220000;i++){
>             System.out.println(i);
>         }
>     }
> }
>
> Step1
> #cr_run java Hello
> The program is running correctly...
> Step2
> [root@localhost ~]# ps -a
>   PID TTY          TIME CMD
> 31746 pts/4    00:00:01 java
> 31756 pts/5    00:00:00 ps
> Step3:
> [root@localhost ~]# cr_checkpoint 31746
> [root@localhost ~]# ls
> -a anaconda-ks.cfg context.31746 Desktop install.log
> install.log.syslog test.c
> Step4: (some errors, restart failed)
> [root@localhost ~]# cr_restart context.31746
> - open('/tmp/hsperfdata_root/31746', 0x2) failed: -2
> - mmap failed: /tmp/hsperfdata_root/31746
> - thaw_threads returned error, aborting. -2
> - thaw_threads returned error, aborting. -2
> - thaw_threads returned error, aborting. -2
> - thaw_threads returned error, aborting. -2
> - thaw_threads returned error, aborting. -2
> - thaw_threads returned error, aborting. -2
> - thaw_threads returned error, aborting. -2
> - thaw_threads returned error, aborting. -2
> - thaw_threads returned error, aborting. -2
> Restart failed: No such file or directory
> But when I checkpoint a C program it can be correctly restarted.
> Can you help me resolve this problem?
> Thank you very much!
>
> blcr version 8.0
> linux kernel version 2.2.18

-- 
Paul H. Hargrove                 PHHargrove_at_lbl_dot_gov
Future Technologies Group        Tel: +1-510-495-2352
HPC Research Department          Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory
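
The failure and the workarounds discussed in this message can be sketched as a short shell fragment. This is an illustration under stated assumptions, not a tested recipe: the helper names hsperfdata_path and hsperfdata_present are hypothetical, the path pattern is taken from the cr_restart error output quoted above, and the -XX:-UsePerfData option is an assumption on my part (a HotSpot flag that, to my understanding, stops the JVM from creating the perf-data file at all); it is not mentioned anywhere in this thread.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of BLCR or the JRE.

# Print the path of the HotSpot perf-data file for a user and PID,
# following the pattern seen in the cr_restart error output.
hsperfdata_path() {
    printf '/tmp/hsperfdata_%s/%s\n' "$1" "$2"
}

# Succeed (exit status 0) only if that file currently exists.
hsperfdata_present() {
    [ -e "$(hsperfdata_path "$1" "$2")" ]
}

# Workarounds, left as comments because they need BLCR and a JRE installed:
#   cr_checkpoint --save-shared 31746     # suggested in the reply above
#   cr_run java -XX:-UsePerfData Hello    # assumption: HotSpot flag, not from this thread
```

For example, running `hsperfdata_present root 31746 || echo "perf-data file is gone"` before cr_restart would show whether the file named in the error message still exists.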