From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Thu Jul 05 2007 - 05:25:08 PDT
Jerry,

I am sorry to hear it is not working as you had hoped. Unfortunately there is little I can do to help debug the problem myself, since I have no machines for which I have both a Matlab license and root access to install BLCR. However, I can make a few suggestions for things you might try:

1) Run cr_checkpoint with the --tree option to ensure all of the children of the main matlab process are checkpointed too. By default cr_checkpoint saves only a single process, though it is likely that --tree will become the default in a future release.

2) Run matlab with the -nodisplay option. The connection to the X server is one resource I can guarantee won't restore correctly.

3) Try the most recent BLCR snapshot available at http://mantis.lbl.gov/blc-dist/snapshots, which adds/improves support for various shared memory and unlinked-tempfile tricks that matlab might be using.

4) Check your syslog and/or dmesg output to see if there is some message from BLCR that may indicate what resource is unavailable.

5) Finally, if you configure BLCR with --enable-debug then it will generate additional debug messages from the kernel (to syslog and/or dmesg) that may indicate the origin of the "resource unavailable" if you didn't find anything there in #4.

If #4 or #5 turn up any log messages that look related, please send them to me and I'll try to make sense of them for you.

I am afraid, however, that matlab may have an open socket to a license server. Since BLCR doesn't restore sockets, it is possible that this could be the problem and there would be no easy way to resolve it.

-Paul

Jerry Mersel wrote:
> Hi Paul:
>
> I tried matlab with BLCR and I got the error resources not available
> when I wanted to restart matlab. I thought it had something to do with
> the PIDs but I checked and all the PIDs that matlab used were free.
> So I'm not sure what was causing the problem.
>
> Regards,
> Jerry
>
> Paul H. Hargrove wrote:
>
>> Jerry Mersel wrote:
>>
>>> Does berkeley checkpoint/restart work with matlab?
>>
>> Jerry,
>>
>> I am not aware of any reports (positive or negative) of BLCR used with
>> Matlab, and am not in a position to make the tests myself.
>> If you are able to try checkpoint/restart of Matlab, I'd appreciate
>> hearing about your results, either success or failure.
>>
>> -Paul
>>

--
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
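
Suggestions 1, 2 and 4 above can be sketched as a small Python helper. This is only an illustration, not part of BLCR: the function names are hypothetical, and the log pattern assumes that BLCR-related kernel messages mention "blcr" or a "cr_"-prefixed name, which may not match the real message format. Only the cr_checkpoint --tree and matlab -nodisplay options come from the message itself.

```python
import re
import subprocess

# Assumption: BLCR-related kernel messages contain "blcr" or a "cr_" prefixed
# identifier. The real message format may differ; adjust the pattern to match.
BLCR_PATTERN = re.compile(r"blcr|\bcr_\w+", re.IGNORECASE)


def checkpoint_command(pid):
    """Build the cr_checkpoint call from suggestion 1 (whole process tree)."""
    return ["cr_checkpoint", "--tree", str(pid)]


def matlab_command():
    """Build the matlab call from suggestion 2 (no X server connection)."""
    return ["matlab", "-nodisplay"]


def blcr_messages(log_text):
    """Return the log lines that look BLCR-related (suggestion 4)."""
    return [line for line in log_text.splitlines() if BLCR_PATTERN.search(line)]


def read_dmesg():
    """Capture the kernel ring buffer; this may need privileges on some systems."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout
```

A typical session would start matlab with matlab_command(), pass its PID to checkpoint_command(), and after a failed restart inspect blcr_messages(read_dmesg()) for lines worth forwarding to the list.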