From: Andrea Autiero S143785 (andrea.autiero_at_studenti.polito.it)
Date: Tue Mar 03 2009 - 01:32:28 PST
Well, I've resolved it by using files instead of shared memory.
Thanks for the support!!

Andrea Autiero

On Wed, 25 Feb 2009 12:22:39 -0800, "Paul H. Hargrove" <PHHargrove_at_lbl_dot_gov> wrote:
> Andrea Autiero S143785 wrote:
>> I'm using shared memory in my program.
>> Removing every line referring to it lets blcr checkpoint my
>> application.
>> Could this be the problem?
>>
> Yes, that is almost certainly the problem. In the dmesg output you sent
> I found
>     blcr: vfs_read returned -22
>     blcr: write returned -22 on copy-out of mmap()ed data
>     blcr: vfs_read returned -22
>     blcr: write returned -22 on copy-out of mmap()ed data
> which is consistent with use of SysV or POSIX shared memory.
>
> Unfortunately, BLCR does not yet have support for SysV or POSIX shared
> memory. However, if you can change your program to instead use an
> anonymous mmap() to obtain shared memory, that *is* supported by BLCR.
>
> Additionally, it is possible to construct a program with BLCR callbacks
> that would disconnect from the shared memory when a checkpoint request
> is received, allowing the checkpoint to be taken, and then reconnect
> afterwards. However, that opens up the messy issue of adding a
> mechanism for preserving the shared memory values.
>
> -Paul
>
>> On Mon, 23 Feb 2009 13:50:39 -0800, "Paul H. Hargrove"
>> <PHHargrove_at_lbl_dot_gov> wrote:
>>
>>> Andrea,
>>>
>>> I cannot tell from the information you have provided what the problem
>>> might be. If I construct a simple example program that behaves as you
>>> describe, and I compile it as you describe, then I am able to checkpoint
>>> it and restart it just fine.
>>> Could you please check the output of the "dmesg" command and/or your
>>> system logs to see if there are any kernel messages that might help
>>> explain the failure.
>>>
>>> -Paul
>>>
>>> Andrea Autiero S143785 wrote:
>>>
>>>> Hi!
>>>> It's me again.
>>>> After making a statically linked file with blcr I've got another problem.
>>>> I'm trying to checkpoint a program after it forks twice.
>>>> Then from another shell (but in the future it will be done by the
>>>> program itself) I try to checkpoint it, and the answer is:
>>>> >ps -a
>>>>   PID TTY          TIME CMD
>>>>  5878 pts/0    00:00:00 controller
>>>>  5879 pts/0    00:00:02 controller
>>>>  5880 pts/0    00:00:02 controller
>>>>  5881 pts/1    00:00:00 ps
>>>> >cr_checkpoint 5878
>>>> Checkpoint failed: Invalid argument
>>>>
>>>> 5878 is the parent process.
>>>> I've compiled it with:
>>>> >gcc -o controller controller.c -L/usr/local/lib/ -lcr_run -u cr_run_link_me -ldl -lpthread
>>>> >nm controller | grep _link_me
>>>>          U cr_run_link_me
>>>>
>>>> (It is not statically linked now because I'm trying it on a PC and not
>>>> on an embedded system, but it is on the latter that it must work.)
>>>> Why does it do this? Could you help me make it work?
>>>> Thanks.
>>>> Have a good day
>>>> Andrea Autiero
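
For anyone reading this thread later: the anonymous mmap() alternative that Paul mentions can be sketched roughly as follows. This is only a minimal illustration, assuming Linux and assuming the processes that share the memory are related by fork() (as with the forked "controller" processes in this thread). The helper names alloc_shared_int() and share_value_with_child() are made up for the example and are not part of BLCR; the sketch has not been tested under BLCR itself.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map one int that stays shared across fork().
 * MAP_SHARED | MAP_ANONYMOUS gives memory with no backing file and no
 * SysV/POSIX shm object behind it. Returns NULL on failure. */
static int *alloc_shared_int(void)
{
    void *p = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : (int *)p;
}

/* Hypothetical demo: the child writes `value` into the shared int and the
 * parent reads it back after the child exits. Returns the value seen by
 * the parent, or -1 on error. */
static int share_value_with_child(int value)
{
    int *shared = alloc_shared_int();
    if (shared == NULL)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof(int));
        return -1;
    }
    if (pid == 0) {
        /* Child: the mapping was inherited, so this store is visible
         * to the parent. */
        *shared = value;
        _exit(0);
    }

    /* Parent: wait for the child, then read what it wrote. */
    int status = 0;
    int seen = -1;
    if (waitpid(pid, &status, 0) == pid)
        seen = *shared;

    munmap(shared, sizeof(int));
    return seen;
}
```

Calling share_value_with_child(42) from main() should return 42 in the parent. The mapping has to be created before fork() so that the children inherit it; unrelated processes cannot attach to it later, which is the main trade-off compared with shmget() or shm_open().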