From: Locus Jackson (locus_jackson_at_yahoo_dot_com)
Date: Mon Jan 28 2008 - 22:40:47 PST
Hi,

After I close the file, I no longer get the error, but I still cannot restart my program.

    ......
    signal(SIGSEGV,myhandler);
    char buf[5];
    set_checkpoint();    //a
    scanf("%s",buf);     //b
    ......

I catch the signal SIGSEGV with signal(SIGSEGV,myhandler). In the myhandler function, I restart my program from place a, where I set a checkpoint earlier. If I input more than 5 characters into buf, it will overflow, so I will catch SIGSEGV and call myhandler, which rolls back to place a and asks me to re-input buf.

But when it rolls back to place a, it does not allow me to input anything; it just shows me one unprintable character and runs to the end. If I input fewer than 5 characters, the program works correctly.

In /var/log, it tells me: "cr_rstrt_procs: No rstrt_req attached to filp!"

By the way, I can restart the program successfully from the checkpoint file on the command line, whether I input fewer than 5 chars or more than 5.

Thank you for your help.

Regards
Locus.

----- Original Message ----
From: Paul H. Hargrove <PHHargrove_at_lbl_dot_gov>
To: Locus Jackson <locus_jackson_at_yahoo_dot_com>
Cc: checkpoint_at_lbl_dot_gov
Sent: Tuesday, January 29, 2008 3:14:34 AM
Subject: Re: Why do I get the error "cri_syscall(CR_OP_RSTRT_PROCS):Invalid arguments"

Locus,

Most of the time, EINVAL from the restart operation means that the checkpoint context file appears invalid, either because it contains unexpected data or because it is truncated. Based on what you describe, you are sending the checkpoint output to a file descriptor via "cr_checkpoint -F" (as opposed to "-f"). So one possibility is that you may simply need to close the file before the parent exits (immediately after cr_poll_checkpoint() is a good place). It is also possible that the file was open in the original program and some data other than the checkpoint was written to it.

You should check your system logs (/var/log/messages or dmesg) to see what information the kernel printed when the CR_RSTRT_PROCS call failed.
That may help narrow down the source of the problem.

-Paul

Locus Jackson wrote:
> Hi,
> I come across this problem when I restart my program.
> I use system("cr_checkpoint -F fildes") in my program to set a
> checkpoint, but when I use execlp("cr_restart filename")
> to restart my program in its child process (wait until parent is
> exited), it always tells me that "cri_syscall(CR_RSTRT_PROCS): Invalid
> arguments".
> What does this error mean, and what should I do to solve this error in
> order to use execlp("cr_restart filename") in child process to
> rollback to the checkpoint set by system("cr_checkpoint -F fildes")
> (fildes: file descriptor of filename).
> Thank you for your help.
> Regards
> Locus.

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
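
A note on the test program in the newest message: writing past the end of buf[5] is undefined behavior in C, so SIGSEGV is not guaranteed to be raised and the stack may already be corrupted by the time a handler runs. The sketch below is not a BLCR technique and does not address the restart failure itself. It only shows one way to get the "ask again on over-long input" behavior with a bounded read, so the retry does not depend on a signal. The names read_short_line and BUF_SIZE are hypothetical, and the function reads whole lines, which differs from scanf("%s"), which reads whitespace-delimited tokens.

```c
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 5  /* hypothetical name; same size as buf in the message */

/*
 * Read one line from `in` into buf (BUF_SIZE bytes including the NUL,
 * so at most BUF_SIZE - 1 characters of input).
 * Returns  1 if the line fit (trailing newline removed),
 *          0 if the line was too long (the rest of it is discarded so
 *            the caller can simply prompt again),
 *         -1 on end of input or read error.
 */
static int read_short_line(FILE *in, char buf[BUF_SIZE])
{
    if (fgets(buf, BUF_SIZE, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';        /* complete line: strip the newline */
        return 1;
    }

    /* No newline was stored: look at what follows in the stream. */
    int c = fgetc(in);
    if (c == EOF || c == '\n')
        return 1;                   /* line fit exactly, or input ended */

    /* Line was too long: discard the remainder up to the newline. */
    while (c != '\n' && c != EOF)
        c = fgetc(in);
    return 0;
}
```

A caller would loop on read_short_line(stdin, buf), printing the prompt again whenever it returns 0 and stopping when it returns -1.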