Checkpoint/Restart for Linux (update)

From: Eric Roman
Date: Tue Mar 26 2002 - 11:46:56 PST

All of you expressed some interest in checkpoint/restart on Linux.  Here's a
quick summary of what's going on.

Checkpoint/Restart web page is up
The project now has a web page.

Requirements for Linux Checkpoint/Restart
We've placed our requirements document online.  We'd like to get some feedback
from users, library developers, and kernel developers.  Please have a look!

Checkpoint/Restart for MPI
We've started working with Professor Andrew Lumsdaine and the LAM crew to
add a checkpoint/restart capability to LAM.  (LAM is a popular implementation
of MPI.)  This work will take place during summer 2002.

Checkpoint/Restart mailing list is now available
We've established a mailing list for checkpoint/restart development.
An archive of the list is available on our web page.  To subscribe, send
a message to majordomo_at_lbl_dot_gov, with the words 
  subscribe checkpoint your-email-address
somewhere in the message body.

Current Work
We are evaluating the CRAK implementation of checkpoint/restart and bproc's
vmadump (meant for process migration, but also usable for checkpoint/restart).
This evaluation will lead to a technical report describing the
checkpoint/restart work done for Linux to date.

Our kernel work is making good progress.  We're establishing entry points for
checkpoint/restart in the kernel and user processes, designing a format for
context files, and looking at our testing environment.  In a month or two,
we expect to be able to checkpoint simple processes.

Eric Roman  <eroman_at_lbl_dot_gov>     Future Technologies Group
510-486-6420                     Lawrence Berkeley National Laboratory