From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Feb 05 2008 - 14:50:00 PST
Manish Kumar wrote:
> hi,
>
> why do we save the contents of unlinked file in
> cr_module/cr_save_open_file() and do corresponding reconstruction by
> cr_mkunlinked() in its restart counterpart in cr_rstrt_req.c ?
>
> I think, its there for process in which a file is open() and unlink()
> outright...so that when process exits, file is automatically closed, if
> its reference count reaches zero. So if this process is checkpointed,
> after the unlink() call in its code, there will be no entry in file system
> corresponding to that file, but the process still needs the in-memory data
> of file, when reconstructed from the checkpoint file.
>
> Please correct me, if I am wrong...or is there any other subtle issue ???
>
> -----------------------------------------
> Manish Kumar
> Dept. of Computer Science & Engineering
> Indian Institute of Technology Guwahati
> Guwahati - 39, INDIA
> -----------------------------------------

Manish,

Your reasoning is correct, except for the phrase "in-memory data".
Since we cannot know what read or write operations the process may
conduct after restart, we must restore the entire "in-file" data
regardless of whether it exists anywhere in memory. When the unlinked
file's reference count reached zero, the on-disk copy of the file data
was removed. So, BLCR must create its own copy before that happens.

-Paul

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
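
The open()-then-unlink() situation discussed in this message can be reproduced from user space. The following is only a rough illustration in Python of the POSIX behaviour, not BLCR's kernel-side code: the helper names (snapshot_unlinked_file, restore_unlinked_file, demo) are hypothetical and only loosely mirror the roles of cr_save_open_file() and cr_mkunlinked(). It shows that the data stays reachable through the descriptor after the name is gone, and that a checkpoint therefore has to copy the whole file contents, plus the file offset, to recreate an equivalent unlinked file later.

```python
import os
import tempfile


def snapshot_unlinked_file(fd):
    """Copy the entire file behind fd without moving its file offset."""
    size = os.fstat(fd).st_size
    chunks = []
    offset = 0
    while offset < size:
        # pread() reads at an explicit offset and leaves the fd position alone.
        chunk = os.pread(fd, min(65536, size - offset), offset)
        if not chunk:
            break
        chunks.append(chunk)
        offset += len(chunk)
    return b"".join(chunks)


def restore_unlinked_file(data, position):
    """Recreate an anonymous (already unlinked) file holding data."""
    fd, path = tempfile.mkstemp()
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]
    os.unlink(path)                      # no directory entry remains
    os.lseek(fd, position, os.SEEK_SET)  # put the offset back where it was
    return fd


def demo():
    """Open a scratch file, unlink it, then snapshot it via the descriptor."""
    fd, path = tempfile.mkstemp()
    os.write(fd, b"scratch data")
    os.unlink(path)                      # the name is gone, the data is not
    link_count = os.fstat(fd).st_nlink   # typically 0 on local filesystems
    saved = snapshot_unlinked_file(fd)
    position = os.lseek(fd, 0, os.SEEK_CUR)
    os.close(fd)                         # last reference: data is now freed
    return {
        "path_exists": os.path.exists(path),
        "link_count": link_count,
        "saved": saved,
        "position": position,
    }


if __name__ == "__main__":
    state = demo()
    print(state)
    new_fd = restore_unlinked_file(state["saved"], state["position"])
    print(os.pread(new_fd, len(state["saved"]), 0))
    os.close(new_fd)
```

Note that the snapshot must be taken while the descriptor is still open; once the last reference is closed there is nothing left on disk to copy, which is why the copy cannot be deferred to restart time.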