From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Mon Sep 10 2007 - 15:16:53 PDT
After several weeks of betas, I've finally released BLCR 0.6.0.
It can be found at the BLCR Downloads page:
  http://ftg.lbl.gov/CheckpointRestart/CheckpointDownloads.shtml

This version
 + adds support for checkpoint/restart of
   - memory shared via mmap(MAP_SHARED)
   - open unlinked files
   - pending signals
 + extends the range of supported kernels
 + greatly expands the test suite
 + fixes numerous bugs
 + New /experimental/ features include support for
   - PPC64 and ARM platforms
   - cross-compilation

At the end of this message, I've included the full NEWS entry, relative
to July's 0.5.6 release.

Before reporting bugs, please read the (updated) FAQ to see if you have
a known problem.

Many thanks to the dedicated beta testers who identified many bugs I did
not or could not reproduce on my own test platforms. Their testing
efforts have ensured a much more stable/usable 0.6.0 release than would
otherwise have been possible.

-Paul

PS You are receiving this either because you are on the
checkpoint_at_lbl_dot_gov list, or because you've recently sent email to
the list (or me directly) asking about BLCR status.

NEWS excerpts:

0.6.0
--------
September 10, 2007
Functionality and expanded-support release.
- This release adds support for 2.6.22 kernels.
- This release includes experimental support for PPC64 platforms
  + PPC64 supports both 32- and 64-bit applications.
  + No support for 32-bit kernels. Contact us if you would like to
    help w/ a PPC32 port.
  + No support for 2.4.x kernels
  + Tested with NPTL and kernels 2.6.12 (Gentoo) and 2.6.18 (FC6)
  + There are known problems with BLCR with LinuxThreads on PPC64
- This release includes experimental support for ARM platforms.
  + Tested only for 2.6.12 and newer kernels
  + Thanks to Anton V. Uzunov <anton.uzunov_at_dsto_dot_defence_dot_gov.au>
    of the Australian Government Department of Defence, Defence Science
    and Technology Organisation for contributing this port.
  + ARM-specific questions should be directed to
    blcr-arm_at_hpcrd_dot_lbl_dot_gov
- This release includes experimental support for cross-compilation.
  + See config/cross_helper.c for information on cross-compilation.
  + This has been tested only in the context of the ARM port
  + Thanks to Anton V. Uzunov for motivating and testing this work
- This release includes a new API for issuing a checkpoint request.
  + Allows a program to request a checkpoint without the need to
    invoke system("cr_checkpoint ...").
  + See comments in include/libcr.h for information on the following:
      cr_initialize_checkpoint_args_t()
      cr_request_checkpoint()
      cr_poll_checkpoint()
- This release adds a mechanism (CR_CHECKPOINT_OMIT) for processes to
  exclude themselves from a checkpoint (useful for batch-system helper
  or shepherd processes).
- This release makes cr_checkpoint and cr_restart utilities checkpointable
- This release adds full support for mmap()-based shared memory
  + Repairs the loss of sharing that existed in 0.5.x releases
  + Supports hugetlbfs
- This release adds full support for save/restore of pending signals.
- Default scope of cr_checkpoint is now --tree, rather than --pid.
- Now checkpoint/restart unlinked open files.
- Revised handling of certain file-descriptors at restart:
  + No longer override "normal" files with correspondingly-numbered fds
    from cr_restart as that consistently breaks shell "here documents".
  + Restore pipe endpoints that lie outside the checkpoint scope by
    attaching them to stdin or stdout of cr_restart, rather than to its
    correspondingly-numbered fds.
  + Opens of a process's controlling tty are attached to "/dev/tty" at
    restart, even if they were open by their "exact" name at checkpoint
    time (e.g. "/dev/pts/0").
- Experimental support for relocatable kernels on x86 and x86-64
- Expanded test-suite
- Option to install the testsuite (--enable-testsuite)
- Support "install-strip", "install-exec" and "install-data" make targets
- Tested against many scripting and programming language environments:
  + shells: ash, bash, (t)csh, (pd)ksh and zsh
  + scripting-type languages: perl, python, tcl/expect, ruby and guile
  + java runtime environments: Sun, IBM and GNU
  + misc. language runtimes: php, rep, clisp, emacslisp, gst, ocaml
    and sml
  + Run "make bonus-tests" to run these tests on your own machine, but
    be warned that the tests themselves are fragile (contain races) and
    may experience random failures. However, please do report any tests
    that fail consistently.
- Many minor bug fixes and code cleanups

July, 2007 - DEPRECATED support for LinuxThreads and for Linux 2.4.X kernels
- Starting with the 0.6.0 release, new bug reports that one cannot
  reproduce under NPTL + Linux 2.6.x will receive little or none of our
  attention. However, we will try to distribute user-contributed fixes
  for such bugs. Note that the 0.6.0 release *does* pass the BLCR
  test-suite under LinuxThreads and/or 2.4.x kernels on the developers'
  x86 systems. However, we have seen test failures on PPC64 when running
  LinuxThreads with a 2.6.12 kernel (Gentoo distro).
- Beginning with the next "full" release (0.7.0) we will begin to remove
  code in BLCR that exists only to support LinuxThreads and/or Linux
  2.4.x.
- We have not yet decided the fate of support for those 2.4.x kernels
  which include Red Hat's backport of NPTL support (RHL9.0, RHEL, RHAS,
  etc.).
- If anybody cares enough about 2.4.x and/or LinuxThreads to volunteer
  to take over testing and maintenance of BLCR on such platforms, let
  us know.

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900