From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Jan 03 2006 - 10:58:44 PST
Mike,

Whatever /tmp/hsperfdata_user is, it is not something internal to BLCR. So, I can only assume it is created/used by the Matlab code, perhaps indirectly through some library linked into the application. Not knowing specifically what the files are, I can't guarantee that they can safely be copied between hosts. You could see problems, for instance, if the files contain information like the IP address of the original host, or license keys tied to the MAC address of the original host.

In the present version of BLCR, open files are dealt with only "by reference", and we must blindly assume that the files and containing directories still exist, with unmodified contents, at restart. In the future we will have the option to capture the content of files as well. We are looking at having some configuration or heuristic to distinguish file systems that are local (such as /tmp) from ones that are shared (such as an NFS-mounted /home) to decide when to capture the file content. I have no estimated date for such a feature.

-Paul

Michael Brown wrote:
>I'm testing checkpoints for the 32bit linux 2.4
>kernel. I'm using two hosts with identical hardware
>and images. I'm trying to make sure that I can
>restore checkpoints between hosts.
>
>I noticed that although the basic counting example can
>be checkpointed on host0 and started on host1, my
>custom Matlab code cannot.
>
>System messages suggested the restart failed because
>the /tmp/hsperfdata_user directory didn't exist on
>host1. After copying this directory between hosts,
>the restart worked properly.
>
>I'm wondering if it is safe to do this before I put it
>in widespread use. Are there any other system files
>that should be copied also? What is the hsperfdata
>directory? Can this information be stored in the
>checkpoint itself?
>
>Thanks,
>
>Mike

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
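
Because open files are handled only "by reference", anything the process had open has to be present and unchanged on the restart host. As a rough illustration of that requirement (not part of BLCR, and not something the thread describes), the sketch below records a manifest of directories and file digests on the checkpoint host and checks it on the restart host. The function names snapshot and verify, and the idea of carrying a manifest between hosts, are assumptions made for this example; it uses only the Python standard library.

```python
import hashlib
import os


def _digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(paths):
    """Build a manifest on the checkpoint host (hypothetical helper).

    Directories map to None; regular files map to their digest.
    Paths that do not exist are skipped.
    """
    manifest = {}
    for root in paths:
        if os.path.isdir(root):
            for dirpath, _dirnames, filenames in os.walk(root):
                manifest[dirpath] = None
                for name in filenames:
                    full = os.path.join(dirpath, name)
                    manifest[full] = _digest(full)
        elif os.path.isfile(root):
            manifest[root] = _digest(root)
    return manifest


def verify(manifest):
    """Check a manifest on the restart host (hypothetical helper).

    Returns (missing, modified): paths that are absent, and files
    whose contents differ from the recorded digest.
    """
    missing, modified = [], []
    for path in sorted(manifest):
        digest = manifest[path]
        if digest is None:
            if not os.path.isdir(path):
                missing.append(path)
        elif not os.path.isfile(path):
            missing.append(path)
        elif _digest(path) != digest:
            modified.append(path)
    return missing, modified
```

One might call snapshot(["/tmp/hsperfdata_user"]) on host0, copy the directory and the manifest to host1, and call verify there before restarting. A clean result only shows that the copy is byte-identical; it says nothing about host-specific content such as IP addresses or MAC-bound license keys, which is the concern raised above.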