Re: passing parameters on restart

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Wed Jul 22 2009 - 11:13:05 PDT

  • Next message: Darius Buntinas: "Re: passing parameters on restart"
    Darius,
    
      I am pretty busy, so I'll only give the "outline" of a solution.
    
      What I believe to be the "current best" solution is an "ephemeral 
    file".  In a BLCR callback, one can call cr_get_restart_info() to get a 
    structure that includes various information.  One field of the struct is 
    the pid of the restart requester (the cr_restart process).  Since that 
    value is unique per-instance it can be used to create a unique file 
    name.  The only trick is that if one uses the cr_restart utility instead 
    of the cr_request_restart() API, one needs to fork(), create the file in 
    the child (whose pid cr_restart will keep across the exec), and then 
    exec*(), rather than just call system("cr_restart ....").
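
A minimal sketch of that outline in C, using only POSIX calls and offered as an
illustration rather than as BLCR's own code. The file name pattern
/tmp/restart_params.<pid> and the helpers param_path(), launch_restart() and
read_params() are hypothetical. In a real restart callback, the pid passed to
read_params() would come from the structure returned by cr_get_restart_info(),
which is not called here.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical naming scheme: one file per restart requester pid. */
static int param_path(char *buf, size_t len, pid_t requester)
{
    int n = snprintf(buf, len, "/tmp/restart_params.%ld", (long)requester);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Requester side: fork(), write the parameters to a file named after the
 * child's pid, then exec the restart command.  The pid is unchanged by
 * exec, so the exec'ed program (e.g. cr_restart) runs under the pid that
 * names the file.  Returns the child's pid, or -1 if fork() failed.
 */
static pid_t launch_restart(const char *params, char *const cmd[])
{
    pid_t child = fork();
    if (child == 0) {
        char path[64];
        FILE *f;

        if (param_path(path, sizeof path, getpid()) != 0)
            _exit(126);
        f = fopen(path, "w");
        if (f == NULL)
            _exit(126);
        fputs(params, f);
        if (fclose(f) != 0)
            _exit(126);
        execvp(cmd[0], cmd);
        _exit(127);            /* reached only if exec failed */
    }
    return child;
}

/*
 * Restarted side: given the requester's pid, read the first line of the
 * parameter file (without its newline) and remove the file.
 * Returns 0 on success, -1 if the file is missing or empty.
 */
static int read_params(pid_t requester, char *out, size_t len)
{
    char path[64];
    FILE *f;

    if (param_path(path, sizeof path, requester) != 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(out, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    out[strcspn(out, "\n")] = '\0';
    unlink(path);              /* ephemeral: remove once consumed */
    return 0;
}
```

The requester would call launch_restart() with the new parameters and an
argument vector for cr_restart, then waitpid() on the returned pid. Unlinking
the file after reading it is what keeps it "ephemeral".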
    
    NOTE: in the future we plan to provide an "aux environment" to allow 
    callbacks to use a getenv-style mechanism to query key-value pairs set 
    by the restart requester.  That will become the recommended mechanism 
    when available (by SC09 would be great, but I can't promise that).
    
    -Paul
    
    Darius Buntinas wrote:
    > I'm trying to figure out the best way to provide a process new
    > parameters on a restart.
    >
    > Take the example of a process which is connected to a server over an
    > ephemeral port.  The process is checkpointed, and later restarted and
    > needs to reconnect to a new instance of the server possibly listening on
    > a different port.
    >
    > When the process was originally started the port information was given
    > as a command line parameter or an environment variable.  But when it's
    > restarted from a checkpoint it gets the original values (as they're part
    > of the process image).  So I need to find a way to get this new
    > information to the restarted process.
    >
    > One solution would be to use a well-known port rather than an ephemeral
    > port.  This is not the best solution since there may be more than one
    > instance of the server running, and the process needs to connect to the
    > correct one.  Also, the server may be running on a different node
    > altogether.
    >
    > Another solution is to write this info to a file and have the process
    > read the file on restart.  Again we'd need a well-known filename which
    > would have trouble if there are multiple instances being restarted.
    >
    > I'm wondering if there is a way to have the restarter process open an fd
    > to the server before it forks the new restarted process (I'm not sure
    > how exactly this is done) so the restarted process would inherit the
    > opened fd.
    >
    > Any ideas would be greatly appreciated.
    >
    > Thanks,
    > -d
    >   
    
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group                 Tel: +1-510-495-2352
    HPC Research Department                   Fax: +1-510-486-6900
    Lawrence Berkeley National Laboratory     
    
