Announcing the release of BLCR 0.6.2

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Mon Jan 14 2008 - 14:49:27 PST

    To fix problems found in 0.6.1, I am releasing BLCR 0.6.2.  It can be 
    found at the BLCR Downloads page:
    http://ftg.lbl.gov/CheckpointRestart/CheckpointDownloads.shtml
    
    From the NEWS file:
    
    0.6.2
    --------
    January 14, 2008
    Bug-fix and expanded-support release.
      - This release adds support for 2.6.23 kernels.
      - This release adds support for SuSE's 2.6.22.x kernels.
      - This release fixes a file descriptor leak that occurred on restart from
        a checkpoint-of-self requested via cr_request_checkpoint().
      - This release fixes a deadlock (and unkillable process(es)) when a
        multi-threaded process aborts (or omits itself from) a checkpoint
        under certain conditions.
      - This release fixes a restart-time failure when a checkpoint includes a
        pipe with one end outside the checkpoint scope, and data is buffered
        in the pipe.
      - This release fixes a bug with the cr_request{,_file}() calls in which
        a failed checkpoint would cause failure of the next one if it had the
        same destination file name.
      - This release fixes a race condition with the cr_enter_cs() and
        checkpoints in multi-threaded processes.
      - This release fixes post-checkpoint signal delivery (--stop and friends)
        to occur after the checkpoint is fully completed.  See bug 2201 for
        a full description of the problems addressed by these changes.
      - This release documents (and fully implements) signal-delivery options
        to cr_restart (see bug 2200).
      - This release fixes two kernel Oopses (bugs 2222 and 2223) due to races
        against processes/threads that are exiting.
      - Adds test cases for most of the bugs fixed in this release.
      - Minor improvements/changes to documentation
      - Other minor bug fixes
    
    
    -Paul
    
    PS
    You are receiving this either because you are on the checkpoint_at_lbl_dot_gov
    list, or because you've recently sent email to the list (or me directly)
    asking about BLCR status.
    
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group
    HPC Research Department                   Tel: +1-510-495-2352
    Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
    
