From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Apr 01 2008 - 09:02:41 PST
I believe that when you say "there is an entry for that file in the proc filesystem", you mean that "ls -l /proc/<pid>/fd" shows something for the file. If you mean something else, let me know.

As to what is going wrong after you remove the vfs_unlink() from the cr_mkunlinked() logic, I have only one guess based on your description. My guess is that the file is getting created with a name like ".blcr_0123.456789ab". Because this name starts with '.', the "ls" command won't show it by default; try "ls -a".

If this *is* what is happening, then the issue is that the cr_mkunlinked() code is renaming the file before creating it. The rename is done only when the last argument ("unlinked_id") is non-zero. Both the rename and the unlink are triggered by the non-zero "unlinked_id" that cr_mkunlinked() passes to cr_filp_mknod(). It should be sufficient to call cr_mkunlinked() with a zero value for the unlinked_id argument, though you may need to remove the debugging check for a non-zero value at lines 933-936 of cr_io.c (assuming BLCR version 0.6.2 or newer).

Let me know if you need anything else.

-Paul

[email protected] wrote:
> Dear sir,
>
> While trying to implement the BACKUP_RESTORE policy in file checkpointing,
> we came across a problem. Specifically, while restarting the process from
> its context file, the file opened by the process ( outfile : opened by
> Examples/file_counting/file_counting ) does not get created on the disk
> filesystem.
>
> To implement the BACKUP_RESTORE policy, we have used the logic of your
> function cr_mkunlinked() (in file cr_io.c), with the modification that we
> are not doing vfs_unlink() in our version of the function. We think that
> doing this should create a normal file that is not unlinked ( since we
> are not performing the vfs_unlink() in our version ), even if we delete
> the file ( outfile ) with the rm command after we have taken the checkpoint.
>
> However, this does not happen. NO file gets created on the disk
> filesystem, but there is an entry for that file in the proc filesystem,
> which was getting updated after we ran cr_restart. What we wanted was for
> this file to be created on the disk (if removed) and to be updated the way
> it is being updated right now in the proc filesystem.
>
> Could you please tell us what is wrong with our modification ( removal of
> vfs_unlink() from your cr_mkunlinked() function ) ?
>
> Thanks,
>
> Manish Kumar & Abhinav Jha
>
> IIT Guwahati - India
>
> -----------------------------------------------------------------------------------
> This email was sent from IIT Guwahati Webmail. If you are not the intended
> recipient, please contact the sender by email and delete all copies; your
> cooperation in this regard is appreciated.
> http://www.iitg.ac.in

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
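
The hidden-file guess in the reply can be checked from user space. The following is only a minimal Python sketch, not BLCR code: it creates a file whose name follows the ".blcr_0123.456789ab" pattern quoted in the message (the name BLCR actually generates will differ) in a temporary directory, then compares a listing that skips dot-files, as "ls" does by default, with one that includes them, roughly as "ls -A" would. The helper names list_like_ls() and demo() are hypothetical and exist only for this illustration.

```python
import os
import tempfile

# Name pattern copied from the example in the message; this is an assumption,
# the name BLCR really generates at restart will differ.
HIDDEN_NAME = ".blcr_0123.456789ab"


def list_like_ls(directory, show_all=False):
    """Return sorted entry names, skipping dot-files unless show_all is set.

    os.listdir() never returns "." or "..", so show_all=True behaves
    roughly like `ls -A` rather than `ls -a`.
    """
    names = sorted(os.listdir(directory))
    if show_all:
        return names
    return [name for name in names if not name.startswith(".")]


def demo():
    """Create a dot-file while holding it open and list the directory both ways."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, HIDDEN_NAME)
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
        try:
            os.write(fd, b"counter output\n")
            default_view = list_like_ls(directory)
            full_view = list_like_ls(directory, show_all=True)
        finally:
            os.close(fd)
        return default_view, full_view


if __name__ == "__main__":
    default_view, full_view = demo()
    print("default listing :", default_view)
    print("with dot-files  :", full_view)
```

If the restarted process behaves the same way, the file exists on disk and is being written to, but only a listing that includes dot-files will show it.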