From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Wed Feb 23 2005 - 10:29:47 PST
If *any* caller of cr_checkpoint() indicates a failure (CR_CHECKPOINT_TEMP_FAILURE or CR_CHECKPOINT_PERM_FAILURE), then no checkpoint will be taken. The assumption is that each callback manages some important resource without which the process could not restart. Thus we assume that they must all succeed or the checkpoint is unusable. In the TEMP case, the process is returned to a running state. In the PERM case, the process is aborted.

-Paul

Michael Klemm wrote:
> Hi,
>
> Paul H. Hargrove wrote:
> | Michael,
> | Ulisses is correct, our design uses a single callback for
> | "checkpoint", "continue after taking checkpoint" and "restart from
> | checkpoint". The latter two cases are to be distinguished by the value
> | returned from cr_checkpoint().
>
> Ah... OK. After re-reading the headers and getting an example to work I
> understand that fact. After all, I can assume that each callback routine
> has to call cr_checkpoint to allow BLCR to proceed to the next phase
> during checkpoint. If just one callback denies, the whole process is
> terminated. Right?
>
> -michael
>
> --
> Computer Science Department 2, University of Erlangen-Nuremberg
> Martensstrasse 3, D-91058 Erlangen, Germany
> phone: ++49 (0)9131 85-28995, fax: ++49 (0)9131 85-28809
> web: http://www2.informatik.uni-erlangen.de/~klemm

--
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
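
The all-or-nothing rule described in the reply can be modeled in a few lines of plain C. This is only an illustrative sketch and does not use the BLCR library: `vote_t`, `outcome_t`, and `resolve_checkpoint` are hypothetical names that stand in for the value each callback passes to cr_checkpoint() and for what then happens to the process. One point is an assumption not stated in the thread: if one callback reports a TEMP failure and another a PERM failure, the sketch lets PERM win.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for what each callback reports via cr_checkpoint(). */
typedef enum {
    VOTE_READY,         /* callback is ready to be checkpointed            */
    VOTE_TEMP_FAILURE,  /* models CR_CHECKPOINT_TEMP_FAILURE               */
    VOTE_PERM_FAILURE   /* models CR_CHECKPOINT_PERM_FAILURE               */
} vote_t;

/* Hypothetical outcomes for the process as a whole. */
typedef enum {
    OUTCOME_CHECKPOINT_TAKEN,       /* every callback succeeded            */
    OUTCOME_RESUMED_NO_CHECKPOINT,  /* TEMP failure: keep running          */
    OUTCOME_ABORTED                 /* PERM failure: process is aborted    */
} outcome_t;

/*
 * Combine the votes of all callbacks. A single failure of either kind
 * means no checkpoint is taken. ASSUMPTION: a PERM failure takes
 * precedence over a TEMP failure when both occur.
 */
outcome_t resolve_checkpoint(const vote_t *votes, size_t n)
{
    outcome_t result = OUTCOME_CHECKPOINT_TAKEN;

    assert(votes != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (votes[i] == VOTE_PERM_FAILURE) {
            return OUTCOME_ABORTED;
        }
        if (votes[i] == VOTE_TEMP_FAILURE) {
            result = OUTCOME_RESUMED_NO_CHECKPOINT;
        }
    }
    return result;
}
```

In this model, Michael's reading ("if just one callback denies, the whole process is terminated") corresponds only to the PERM case. A TEMP denial cancels the checkpoint but leaves the process running.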