From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Jul 03 2007 - 10:54:50 PDT
Jerry Mersel wrote:
> Hi:
>
> I have a few questions about Berkeley Checkpoint/Restart that weren't
> clear to me in the documentation.
>
>
> Can a process be checkpointed and then restarted on a different node
> in a grid? Do the kernels have to be identical across all the
> different nodes?
>
> I am considering using Berkeley checkpointing with Grid Engine 6.
>
>
> Thank you,
> Jerry
>

Jerry,

If the nodes are sufficiently identical then restarting on a different
node *is* possible. The kernels *do* need to be identical, as do any
shared libraries used by the application(s) to be checkpointed. See the
BLCR FAQ entry on prelinking
(http://upc-bugs.lbl.gov/blcr/doc/html/FAQ.html#prelink) for the most
common reason that moving to a different node might fail.

-Paul

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
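
The requirement described in the reply above (identical kernels and identical shared libraries on every node) can be checked roughly before attempting a cross-node restart. The following Python sketch is not part of BLCR: the function names `node_fingerprint` and `fingerprint_mismatches` are hypothetical, and treating "identical" as "same kernel release/version strings and same library file checksums" is an assumption made for illustration. A checksum mismatch on a library that is nominally the same version may hint at prelinking, among other possible causes.

```python
import hashlib
import platform


def file_sha256(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large libraries are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def node_fingerprint(library_paths):
    """Summarize this node's kernel and the given shared library files.

    library_paths is a caller-supplied list of the shared libraries the
    application uses (assumption: the caller has collected them, for
    example from the output of ldd).
    """
    uname = platform.uname()
    return {
        "kernel_release": uname.release,
        "kernel_version": uname.version,
        "machine": uname.machine,
        "libraries": {path: file_sha256(path) for path in library_paths},
    }


def fingerprint_mismatches(a, b):
    """List the kernel fields and library paths that differ between nodes."""
    mismatches = [
        key
        for key in ("kernel_release", "kernel_version", "machine")
        if a[key] != b[key]
    ]
    for path in sorted(set(a["libraries"]) | set(b["libraries"])):
        # A library missing on one node counts as a mismatch too.
        if a["libraries"].get(path) != b["libraries"].get(path):
            mismatches.append(path)
    return mismatches
```

One way to use this would be to run `node_fingerprint` on each candidate node, collect the resulting dictionaries in one place (for example as JSON), and only consider restarting on a node for which `fingerprint_mismatches` returns an empty list. An empty list does not guarantee that a restart will succeed; it only rules out the differences this sketch looks for.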