Re: using blcr on program with fork

From: Andrea Autiero S143785 (andrea.autiero_at_studenti.polito.it)
Date: Tue Mar 03 2009 - 01:32:28 PST

    well.. I've resolved it by using files instead of shared memory..
    thanks for the support!!
    Andrea Autiero
    
    On Wed, 25 Feb 2009 12:22:39 -0800, "Paul H. Hargrove" <PHHargrove_at_lbl_dot_gov>
    wrote:
    > Andrea Autiero S143785 wrote:
    >> I'm using shared memory in my program
    >> removing every line referring to it lets blcr checkpoint my
    >> application..
    >> could this be the problem?
    >>   
    > Yes, that is almost certainly the problem.  In the dmesg output you sent 
    > I found
    >     blcr: vfs_read returned -22
    >     blcr: write returned -22 on copy-out of mmap()ed data
    >     blcr: vfs_read returned -22
    >     blcr: write returned -22 on copy-out of mmap()ed data
    > which is consistent with use of SysV or POSIX shared memory.
    > 
    > Unfortunately, BLCR does not yet have support for SysV or POSIX shared 
    > memory.  However, if you can change your program to instead use an 
    > anonymous mmap() to obtain shared memory, that *is* supported by BLCR.
    > 
    > Additionally, it is possible to construct a program with BLCR callbacks 
    > that would disconnect from the shared memory when a checkpoint request 
    > is received, allowing the checkpoint to be taken, and then reconnect 
    > afterwards.  However, that opens up the messy issue of adding a 
    > mechanism for preserving the shared memory values.
    > 
    > -Paul
    > 
    > 
    >> On Mon, 23 Feb 2009 13:50:39 -0800, "Paul H. Hargrove"
    >> <PHHargrove_at_lbl_dot_gov>
    >> wrote:
    >>   
    >>> Andrea,
    >>>
    >>>   I cannot tell from the information you have provided what the problem
    >>> might be.  If I construct a simple example program that behaves as you 
    >>> describe, and I compile it as you describe, then I am able to checkpoint
    >>> it and restart it just fine.
    >>>   Could you please check the output of the "dmesg" command and/or your 
    >>> system logs to see if there are any kernel messages that might help 
    >>> explain the failure.
    >>>
    >>> -Paul
    >>>
    >>> Andrea Autiero S143785 wrote:
    >>>     
    >>>> hi!
    >>>> it's me again..
    >>>> after making a statically linked file with blcr I've got another problem..
    >>>> I'm trying to checkpoint a program after it forks twice
    >>>> then from another shell (but in the future it will be done by the program
    >>>> itself)
    >>>> I try to checkpoint it and the answer is:
    >>>>  >ps -a
    >>>>    PID TTY          TIME CMD
    >>>>    5878 pts/0    00:00:00 controller
    >>>>    5879 pts/0    00:00:02 controller
    >>>>    5880 pts/0    00:00:02 controller
    >>>>    5881 pts/1    00:00:00 ps
    >>>>  >cr_checkpoint 5878
    >>>> Checkpoint failed: Invalid argument
    >>>>
    >>>> 5878 is the parent..
    >>>> I've compiled it with
    >>>>     >gcc -o controller controller.c -L/usr/local/lib/ -lcr_run -u cr_run_link_me -ldl -lpthread
    >>>>     >nm controller | grep _link_me
    >>>>          U cr_run_link_me
    >>>>
    >>>> (it is not statically linked now because I'm testing on a PC and not on an
    >>>> embedded system, but it must ultimately work on the embedded system)
    >>>> why does it do this? could you help me make it work?
    >>>> thanks..
    >>>> have a good day
    >>>> Andrea Autiero
    >>>>
    >>>>
    
