From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Mon Dec 17 2007 - 12:23:30 PST
The first beta of BLCR 0.6.2 is now available at
  http://mantis.lbl.gov/blcr-dist/
Both source tarball and SRPM are available.

The filenames and MD5 checksums are:
  93249f20abd4eeec7a07db2f2a6cd2b2  blcr-0.6.2_b1.tar.gz
  e8ecba22c98de143ced20f83db76d8a1  blcr-0.6.2_b1-1.src.rpm

This is a beta of a 0.6.2 patch release. The intent of 0.6.2 is to fix a
small number of significant bugs found in 0.6.0 and 0.6.1 and to add
support for 2.6.23 kernels and some vendor-patched 2.6.22 kernels. A
NEWS entry summarizing these changes appears below.

You are receiving this e-mail either because you are subscribed to the
checkpoint_at_lbl_dot_gov mailing list or because you have reported one
of the bugs or previously unsupported kernel versions addressed by this
release. I apologize if you receive multiple copies.

I would greatly appreciate any feedback (positive or negative)
indicating if this beta fixes any problems you have reported with BLCR
0.6.0 and/or 0.6.1. Only after I have sufficient positive feedback will
I make 0.6.2 available for download from the main BLCR web pages.

-Paul

0.6.2_b1
--------
December 17, 2007
Bug-fix and expanded-support release.
- This release adds support for 2.6.23 kernels.
- This release adds support for SuSE's 2.6.22.x kernels.
- This release fixes a file descriptor leak that occurred on restart
  from a checkpoint-of-self requested via cr_request_checkpoint().
- This release fixes a deadlock (and unkillable process(es)) when a
  multi-threaded process aborts (or omits itself from) a checkpoint
  under certain conditions.
- This release fixes a restart-time failure when a checkpoint includes
  a pipe with one end outside the checkpoint scope, and data is
  buffered in the pipe.
- This release fixes a bug with the cr_request{,_file}() calls in which
  a failed checkpoint would cause failure of the next one if it had the
  same destination file name.
- This release fixes a race condition with the cr_enter_cs() and
  checkpoints in multi-threaded processes.
- This release fixes post-checkpoint signal delivery (--stop and
  friends) to occur after the checkpoint is fully completed. See bug
  2201 for a full description of the problems addressed by these
  changes.
- This release documents (and fully implements) signal-delivery options
  to cr_restart (see bug 2200).
- Adds test cases for most of the bugs fixed in this release.

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900