Re: berkeley checkpoint and matlab

From: Jerry Mersel (jerry.mersel_at_weizmann.ac.il)
Date: Sun Jul 08 2007 - 00:05:16 PDT

  • Next message: Mallikarjuna Shastry: "Re: bugs in blcr"
    Hi:
    
      Thanks for your advice, but it didn't seem to help.
    
       I ran matlab like this:
    
          cr_run env LD_PRELOAD=libcr.so.0:libpthread.so.0 matlab -nodisplay -nosplash -nojvm &
          then:
    
          cr_checkpoint --tree --kill <process-id>
          cr_restart ./context.<process-id>
    
       I still got the "resource not available" error.
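
For anyone repeating this test, the checkpoint and restart steps above can be scripted. This is only a sketch: the helper names `blcr_commands` and `checkpoint_then_restart` are hypothetical, and it assumes the context file lands in the current directory under the name `context.<process-id>`, as in the commands shown in this message.

```python
import subprocess


def blcr_commands(pid, context_dir="."):
    """Build the checkpoint and restart command lines used in this message.

    Assumes the context file is written as context.<pid> in context_dir.
    """
    context_file = f"{context_dir}/context.{pid}"
    checkpoint = ["cr_checkpoint", "--tree", "--kill", str(pid)]
    restart = ["cr_restart", context_file]
    return checkpoint, restart


def checkpoint_then_restart(pid):
    """Checkpoint the process tree rooted at pid, then try to restart it.

    Returns the exit status of cr_restart so a failed restart can be detected.
    """
    checkpoint, restart = blcr_commands(pid)
    subprocess.run(checkpoint, check=True)  # raises if the checkpoint fails
    result = subprocess.run(restart)
    return result.returncode
```

A nonzero return value from `checkpoint_then_restart` would be the cue to look at the kernel log for the reason.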
    
    
                                   Regards,
                                     Jerry
    
    
    Paul H. Hargrove wrote:
    
    >Jerry,
    >  I am sorry to hear it is not working as you had hoped.  Unfortunately
    >there is little I can do to help debug the problem myself, since I have
    >no machines for which I have both a Matlab license and root access to
    >install BLCR.  However, I can make a few suggestions for things you
    >might try:
    >
    >1) Run cr_checkpoint with the --tree option to ensure all of the
    >children of the main matlab process are checkpointed too.  By default
    >cr_checkpoint saves only a single process, though it is likely that
    >--tree will become the default in a future release.
    >2) Run matlab with the -nodisplay option.  The connection to the X
    >server is one resource I can guarantee won't restore correctly.
    >3) Try the most recent BLCR snapshot available at
    >http://mantis.lbl.gov/blc-dist/snapshots, which adds/improves support
    >for various shared memory and unlinked-tempfile tricks that matlab might
    >be using.
    >4) Check your syslog and/or dmesg output to see if there is some message
    >from BLCR that may indicate what resource is unavailable.
    >5) Finally, if you configure BLCR with --enable-debug then it will
    >generate additional debug messages from the kernel (to syslog and/or
    >dmesg) that may indicate the origin of the "resource unavailable" if you
    >didn't find anything there in #4.
    >
    >If #4 or #5 turn up any log messages that look related, please send them
    >to me and I'll try to make sense of them for you.
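
Suggestions #4 and #5 come down to searching the kernel log for BLCR messages after a failed restart. A minimal sketch of that search follows. It assumes the relevant messages contain the string "blcr", which is a guess rather than something stated in this thread, so adjust `needles` to whatever your log actually shows. The function names are hypothetical.

```python
import subprocess


def blcr_lines(log_text, needles=("blcr",)):
    """Return the log lines that mention any needle, ignoring case.

    The default needle "blcr" is an assumption about how the messages
    are tagged; it is not taken from BLCR documentation.
    """
    hits = []
    for line in log_text.splitlines():
        lowered = line.lower()
        if any(needle in lowered for needle in needles):
            hits.append(line)
    return hits


def blcr_lines_from_dmesg():
    """Run dmesg and keep only the lines that look BLCR-related."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return blcr_lines(result.stdout)
```

Running `blcr_lines_from_dmesg()` right after the failed `cr_restart` keeps the output short enough to paste into a reply. Reading dmesg may require root on some systems.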
    >
    >I am afraid, however, that matlab may have an open socket to a license
    >server.  Since BLCR doesn't restore sockets, it is possible that this
    >could be the problem and there would be no easy way to resolve it.
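
One way to test the open-socket theory before taking a checkpoint is to list the socket file descriptors held by the matlab process. The sketch below reads the Linux `/proc/<pid>/fd` directory. The function name `socket_fds` is hypothetical, and it needs permission to read the target process's entries (same user or root).

```python
import os


def socket_fds(pid):
    """Map each open socket descriptor of pid to its /proc link target.

    On Linux, a socket descriptor appears as a symlink whose target
    looks like "socket:[inode]".
    """
    fd_dir = f"/proc/{pid}/fd"
    found = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir and readlink.
            continue
        if target.startswith("socket:"):
            found[int(name)] = target
    return found
```

An empty result for every process in the matlab tree would point away from the license-server socket as the cause. A non-empty result shows that a socket is open but not what it is connected to.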
    >
    >
    >-Paul
    >
    >
    >Jerry Mersel wrote:
    >  
    >
    >>Hi Paul:
    >>
    >>  I tried matlab with BLCR and got the error "resources not available"
    >>when I wanted to restart matlab. I thought it had something to do with
    >>the PIDs, but I checked and all the PIDs that matlab used were free.
    >>So I'm not sure what was causing the problem.
    >>
    >>Regards,
    >>     Jerry
    >>
    >>Paul H. Hargrove wrote:
    >>
    >>    
    >>
    >>>Jerry Mersel wrote:
    >>>
    >>>      
    >>>
    >>>>Does berkeley checkpoint/restart work with matlab?
    >>>>        
    >>>>
    >>>Jerry,
    >>>
    >>>I am not aware of any reports (positive or negative) of BLCR used with
    >>>Matlab, and am not in a position to make the tests myself.
    >>>If you are able to try checkpoint/restart of Matlab, I'd appreciate
    >>>hearing about your results, either success or failure.
    >>>
    >>>-Paul
    >>>
    
