Re: Can checkpoint at will ?

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Wed Dec 12 2007 - 12:20:48 PST

    王磊 wrote:
    > Dear Sir,
    >
    > Is there a mechanism by which I can take checkpoints at will, and
    > restart my program from a checkpoint that I choose?
    >
    > There are few examples, so I still have many questions about the source
    > in libcr.h. Would you please offer us more examples?
    >
    > Thank you!
    
    The purpose of BLCR is to let a user take checkpoints at any time and
    to restart from any of the checkpoints taken. The BLCR User's Guide
    (http://upc-bugs.lbl.gov/blcr/doc/html/BLCR_Users_Guide.html) includes a
    brief description of BLCR's command-line tools. The following
    transcript shows me checkpointing and restarting the "counting" example
    program from the BLCR source tree. This program is linked to libcr.so, but
    has no other changes to make it checkpointable. In this example I ask
    cr_checkpoint to terminate the program (with SIGTERM, via --term) once the
    checkpoint has been taken, but that is not required.
    
    $ cd examples/counting/
    $ make
    [...make output removed...]
    $ ./counting &
    [1] 21040
    Counting demo starting with pid 21040
    Count = 0
    Count = 1
    Count = 2
    [...some output removed...]
    Count = 22
    Count = 23
    Count = 24
    $ cr_checkpoint 21040 --term
    [1]+  Terminated              ./counting
    $ ls
    context.21040  counting  counting.o  Makefile
    $ cr_restart context.21040
    Count = 25
    Count = 26
    Count = 27
    [...some output removed...]
    Count = 118
    Count = 119
    $
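
The transcript above takes a single checkpoint into the default context.PID file. To keep several checkpoints and later pick one to restart from, each checkpoint can be written to its own file. What follows is a minimal sketch, assuming cr_checkpoint accepts -f FILE to name the context file. The helper names (checkpoint_name, take_checkpoint, restart_from), the context.PID.LABEL naming scheme, and the CR_CHECKPOINT / CR_RESTART override variables are hypothetical and are not part of BLCR.

```shell
#!/bin/sh
# Hypothetical helpers, not part of BLCR. The two variables below let the
# commands be replaced (for example by stubs on a machine without BLCR).
CR_CHECKPOINT=${CR_CHECKPOINT:-cr_checkpoint}
CR_RESTART=${CR_RESTART:-cr_restart}

# checkpoint_name PID LABEL
# Prints the context file name used for this PID and label.
# The context.PID.LABEL scheme is an assumption made for this sketch.
checkpoint_name() {
    printf 'context.%s.%s\n' "$1" "$2"
}

# take_checkpoint PID LABEL
# Checkpoints PID into a labelled file without --term, so the process
# keeps running. Assumes "cr_checkpoint -f FILE PID" writes to FILE.
# Prints the file name on success; returns non-zero on failure.
take_checkpoint() {
    _ctx=$(checkpoint_name "$1" "$2")
    "$CR_CHECKPOINT" -f "$_ctx" "$1" && printf '%s\n' "$_ctx"
}

# restart_from PID LABEL
# Restarts from the checkpoint previously saved under LABEL for PID.
restart_from() {
    "$CR_RESTART" "$(checkpoint_name "$1" "$2")"
}
```

With these functions loaded into the shell, a session might run `./counting &`, call `take_checkpoint $! early` and some time later `take_checkpoint $! late`, and, once the original process has exited, call `restart_from PID early` (with PID being the original process ID) to resume from the first of the two checkpoints.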
    
    
    As for more programming examples, I am afraid there is not much I can
    offer you. There is a lot of code in the tests directory that uses BLCR's
    libcr in various ways, but it is not a good place to learn from. If
    you have specific questions, you can ask this list for help.
    
    It can be very hard for both the person asking and those who answer when
    English is not the first language for both. If I have not answered what
    you wanted to ask, please try to ask again and I'll try again to answer.
    
    -Paul
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group
    HPC Research Department                   Tel: +1-510-495-2352
    Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
    
