From: Michael Brown (michael_brown_3_at_yahoo.com)
Date: Tue Jan 03 2006 - 13:31:02 PST
Thanks for the explanation. Now that I understand what's happening,
it's easy enough for me to copy the contents of the hsperfdata
directory at this point in time. There doesn't seem to be any
machine-specific information in this directory.

Mike

--- "Paul H. Hargrove" <PHHargrove_at_lbl_dot_gov> wrote:

> Mike,
> Whatever /tmp/hsperfdata_user is, it is not something internal to
> BLCR. So, I can only assume it is created/used by the Matlab code,
> perhaps indirectly through some library linked into the application.
> Not knowing specifically what the files are I can't guarantee that
> they can safely be copied between hosts. You could see problems, for
> instance, if the files contain information like the IP address of the
> original host, or license keys tied to the MAC address of the
> original host.
> In the present version of BLCR, open files are dealt with only "by
> reference", and we must blindly assume that the files and containing
> directories still exist, with unmodified contents, at restart. In the
> future we will have the option to capture the content of files as
> well. We are looking at having some configuration or heuristic to
> distinguish file systems that are local (such as /tmp) from ones that
> are shared (such as an NFS-mounted /home) to decide when to capture
> the file content. I have no estimated date for such a feature.
>
> -Paul
>
> Michael Brown wrote:
> >I'm testing checkpoints for the 32bit linux 2.4 kernel. I'm using
> >two hosts with identical hardware and images. I'm trying to make
> >sure that I can restore checkpoints between hosts.
> >
> >I noticed that although the basic counting example can be
> >checkpointed on host0 and started on host1, my custom Matlab code
> >cannot.
> >
> >System messages suggested the restart failed because the
> >/tmp/hsperfdata_user directory didn't exist on host1. After copying
> >this directory between hosts, the restart worked properly.
> >
> >I'm wondering if it is safe to do this before I put it in
> >widespread use. Are there any other system files that should be
> >copied also? What is the hsperfdata directory? Can this information
> >be stored in the checkpoint itself?
> >
> >Thanks,
> >
> >Mike
>
> --
> Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
> Future Technologies Group
> HPC Research Department                   Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
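
The thread's concern that copied files might embed the original host's name or IP address can be checked roughly before copying a directory to the restart host. The following is a minimal sketch, not part of BLCR or Matlab: the function names `find_host_markers` and `local_markers` are hypothetical, and the default path assumes the directory is named `/tmp/hsperfdata_` followed by the login name. It only finds markers stored as plain text, so a clean result does not rule out binary-encoded addresses or license keys tied to a MAC address.

```python
import getpass
import os
import socket
import sys


def find_host_markers(directory, markers):
    """Return {file path: [markers found]} for files under `directory`.

    Each marker is searched for as a plain byte string, so values stored
    in a binary encoding will not be detected.
    """
    hits = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as fh:
                    data = fh.read()
            except OSError:
                continue  # unreadable file: skip it rather than guess
            # Ignore empty markers, which would match every file.
            found = [m for m in markers if m and m.encode() in data]
            if found:
                hits[path] = found
    return hits


def local_markers():
    """Hostname of this machine and, if it resolves, its IP address."""
    host = socket.gethostname()
    markers = [host]
    try:
        markers.append(socket.gethostbyname(host))
    except OSError:
        pass  # hostname does not resolve; check the name only
    return markers


if __name__ == "__main__":
    # Assumed default location; pass another directory as the first argument.
    default_dir = "/tmp/hsperfdata_" + getpass.getuser()
    target = sys.argv[1] if len(sys.argv) > 1 else default_dir
    for path, found in find_host_markers(target, local_markers()).items():
        print(path, "contains", ", ".join(found))
```

Run on the checkpointing host before copying; any file it lists deserves a closer look before being reused on another machine.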