From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Thu Feb 22 2007 - 16:50:41 PST
A fifth beta release of BLCR 0.5.0 is now available at
    http://mantis.lbl.gov/blcr-dist

Here are the changes since last week's beta4:
- Added support for open()s of /dev/{null,zero,full,random,urandom}
- Fixed additional bugs found during testing:
  + Bug 1933: crash restoring dup of ignored fd (socket or chrdev)
  + MAP_SHARED mmap()ed regions would become MAP_PRIVATE upon restart
  + Checkpointing a process tree failed for multi-threaded processes
  + Certain failed restart cases would leave unkillable processes

Below is the full NEWS entry, relative to the Nov. 2005 0.4.2 release.
This will become 0.5.0 in about 7 days, unless significant new bugs are
reported.

-Paul

PS You are receiving this either because you are on the
checkpoint_at_lbl_dot_gov list, or because you've recently sent email to
the list (or me directly) asking about BLCR status.

0.5.0_b5
--------
February 22, 2007
Functionality and expanded-support release.
- Expanded kernel coverage
  + 2.6.0 through 2.6.19 for x86 and x86_64
  + 2.4.0 through 2.4.34 for x86 only
- Multi-process support (related processes and associated pipes)
  + See BLCR_Users_Guide.html and the cr_checkpoint man page
- Support for 32-bit apps on 64-bit kernels
  + See "--enable-multilib" in BLCR_Admin_Guide.html
- Support for directories opened with opendir()
- Support for open()s of /dev/{null,zero,full,random,urandom}
- Support for checkpoints on Lustre file systems
  + Contributed by Dean Luick <luick_at_cray_dot_com>
- Support for building static libcr
  + Contributed by Dean Luick <luick_at_cray_dot_com>
- Fixes to many distclean problems
  + Issues identified by Dean Luick <luick_at_cray_dot_com>
- I/O aggregation for improved performance
  + Contributed by Qi Gao <[email protected]>
- Additional examples and test cases
- API addition: cr_get_restart_info()
- "Retool" of configure code for ease of addition/maintenance
- Numerous bug fixes, including:
  + Bug 1396: SIGPIPE when restarting w/ stdin/out from/to a pipe
  + Bug 1640: context files > 2GB require O_LARGEFILE
  + Bug 1662: context files open R/W leads to restart failure
  + Bug 1669: checkpoint to a socket fails
  + Bug 1807: unrecognized warning suppression flag passed to gcc
  + Bug 1854: libcr link failure w/ stack-protection-enabled gcc
  + Bug 1925: link failure w/ pthread_atfork() on some glibc versions
  + Bug 1933: crash restoring dup of ignored fd (socket or chrdev)
  + Incorrect treatment of certain anonymous mmap() cases
  + MAP_SHARED mmap()ed regions would become MAP_PRIVATE upon restart
    * NOTE: We still fail to restore any sharing among processes when
      using MAP_ANONYMOUS or when mapping an unlinked file.  However,
      children fork()ed after a restart will now correctly share with
      their parent.
      FIXING THE LOST SHARING IS A HIGH-PRIORITY ITEM FOR 0.6.0
  + Wrong parent for restored orphans (children of init)
  + dup()ed file descriptors always restored together

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
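
For readers unfamiliar with why a MAP_SHARED region silently becoming MAP_PRIVATE matters, the difference can be observed without BLCR at all. The following is a minimal Python sketch, not BLCR code: the helper name child_write_visible is hypothetical, and it assumes a Unix-like system where os.fork() and the mmap flags are available. It maps one anonymous page, lets a fork()ed child write to it, and reports whether the parent sees that write.

```python
import mmap
import os


def child_write_visible(flags: int) -> bool:
    """Return True if a write made by a fork()ed child is seen by the parent.

    `flags` is mmap.MAP_SHARED or mmap.MAP_PRIVATE. Passing fileno -1 asks
    Python's mmap module for anonymous memory (no backing file).
    """
    region = mmap.mmap(-1, mmap.PAGESIZE, flags=flags)
    region[0:1] = b"P"          # parent writes before forking

    pid = os.fork()
    if pid == 0:
        # Child: overwrite the first byte, then exit without running
        # the parent's cleanup handlers.
        region[0:1] = b"C"
        os._exit(0)

    os.waitpid(pid, 0)          # parent waits until the child has written
    seen = region[0:1] == b"C"
    region.close()
    return seen


if __name__ == "__main__":
    print("MAP_SHARED :", child_write_visible(mmap.MAP_SHARED))
    print("MAP_PRIVATE:", child_write_visible(mmap.MAP_PRIVATE))
```

With MAP_SHARED the parent and child operate on the same pages, so the child's write is visible to the parent. With MAP_PRIVATE the child's write goes to its own copy-on-write page. An application that relies on the former for inter-process communication would therefore misbehave if a restart handed it the latter.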