Re: berkeley checkpoint and matlab

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Thu Jul 05 2007 - 05:25:08 PDT

    Jerry,
      I am sorry to hear it is not working as you had hoped.  Unfortunately
    there is little I can do to help debug the problem myself, since I have
    no machines for which I have both a Matlab license and root access to
    install BLCR.  However, I can make a few suggestions for things you
    might try:
    
    1) Run cr_checkpoint with the --tree option to ensure all of the
    children of the main matlab process are checkpointed too.  By default
    cr_checkpoint saves only a single process, though it is likely that
    --tree will become the default in a future release.
    2) Run matlab with the -nodisplay option.  The connection to the X
    server is one resource I can guarantee won't restore correctly.
    3) Try the most recent BLCR snapshot available at
    http://mantis.lbl.gov/blc-dist/snapshots, which adds/improves support
    for various shared memory and unlinked-tempfile tricks that matlab might
    be using.
    4) Check your syslog and/or dmesg output to see if there is some message
    from BLCR that may indicate what resource is unavailable.
    5) Finally, if you configure BLCR with --enable-debug then it will
    generate additional debug messages from the kernel (to syslog and/or
    dmesg) that may indicate the origin of the "resource unavailable" if you
    didn't find anything there in #4.
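
    As a rough illustration of suggestions 1 and 2, the two command lines
    can be assembled as below.  This is only a sketch: the helper names are
    made up for the example, and launching matlab under cr_run (BLCR's
    wrapper for starting a checkpointable program) is an assumption about
    the setup, not something confirmed to work with Matlab.

```python
import shlex
from typing import List


def checkpoint_command(pid: int) -> List[str]:
    """Suggestion 1: checkpoint the process and all of its children."""
    return ["cr_checkpoint", "--tree", str(pid)]


def matlab_command() -> List[str]:
    """Suggestion 2: start matlab without a connection to the X server.

    Assumption: matlab is launched through cr_run so that BLCR can
    checkpoint it later. Adjust this if your site starts it differently.
    """
    return ["cr_run", "matlab", "-nodisplay"]


def as_shell(argv: List[str]) -> str:
    """Render an argument list as a copy-and-paste shell command."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

    For example, as_shell(checkpoint_command(1234)) gives the string
    "cr_checkpoint --tree 1234", where 1234 stands in for the PID of the
    main matlab process.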
    
    If #4 or #5 turn up any log messages that look related, please send them
    to me and I'll try to make sense of them for you.
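
    To narrow down the log output from #4 and #5 before sending it, a
    small filter along these lines may help.  The search pattern is a
    guess: it assumes BLCR's kernel messages mention "blcr" or a "cr_"
    prefix, which should be checked against what your logs actually
    contain.

```python
import re
from typing import Iterable, List

# Assumption: BLCR messages contain "blcr" or a word starting with "cr_".
# This default is a guess; pass a different pattern if your logs differ.
DEFAULT_PATTERN = r"blcr|\bcr_"


def blcr_lines(lines: Iterable[str], pattern: str = DEFAULT_PATTERN) -> List[str]:
    """Return the log lines that match the pattern, ignoring case."""
    matcher = re.compile(pattern, re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if matcher.search(line)]


# Possible use, after saving the kernel log with `dmesg > dmesg.txt`:
#
#     with open("dmesg.txt") as log:
#         for line in blcr_lines(log):
#             print(line)
```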
    
    I am afraid, however, that matlab may have an open socket to a license
    server.  Since BLCR doesn't restore sockets, it is possible that this
    could be the problem and there would be no easy way to resolve it.
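
    One way to check for open sockets before taking the checkpoint is to
    look at the file descriptors Linux exposes under /proc/<pid>/fd, where
    a socket appears as a symlink of the form "socket:[inode]".  The
    sketch below lists them; the function names are made up for the
    example, and it needs permission to read the target process's /proc
    entries.

```python
import os
from typing import List, Tuple


def is_socket_link(target: str) -> bool:
    """True if a /proc/<pid>/fd symlink target refers to a socket."""
    return target.startswith("socket:[")


def open_sockets(pid: int) -> List[Tuple[int, str]]:
    """List (fd, target) pairs for every socket the process has open."""
    fd_dir = "/proc/{}/fd".format(pid)
    found = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if is_socket_link(target):
            found.append((int(name), target))
    return sorted(found)
```

    An empty result from open_sockets() for the matlab PID and each of
    its children would make the license-server socket a less likely
    cause.  A non-empty one only shows that some socket is open, not
    what it is connected to.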
    
    
    -Paul
    
    
    Jerry Mersel wrote:
    > Hi Paul:
    > 
    >   I tried matlab with BLCR and I got the error resources not available
    > when I wanted to restart matlab. I thought it had something to do with
    > the PIDs, but I checked and all the PIDs that matlab used were free. So
    > I'm not sure what was causing the problem.
    > 
    > Regards,
    >      Jerry
    > 
    > Paul H. Hargrove wrote:
    > 
    >> Jerry Mersel wrote:
    >>
    >>> Does berkeley checkpoint/restart work with matlab?
    >>
    >> Jerry,
    >>
    >> I am not aware of any reports (positive or negative) of BLCR used with
    >> Matlab, and am not in a position to make the tests myself.
    >> If you are able to try checkpoint/restart of Matlab, I'd appreciate
    >> hearing about your results, either success or failure.
    >>
    >> -Paul
    >>
    
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group
    HPC Research Department                   Tel: +1-510-495-2352
    Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
    
