From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Wed Jul 22 2009 - 11:13:05 PDT
Darius,

I am pretty busy, so I'll only give the "outline" of a solution.

What I believe to be the "current best" solution is an "ephemeral file". In a BLCR callback, one can call cr_get_restart_info() to get a structure that includes various information. One field of the struct is the pid of the restart requester (the cr_restart process). Since that value is unique per-instance, it can be used to create a unique file name. The only trick is that if one uses the cr_restart utility instead of the cr_request_restart() API, one needs to fork(), create the file, then exec*() rather than just call system("cr_restart ....").

NOTE: in the future we plan to provide an "aux environment" to allow callbacks to use a getenv-style mechanism to query key-value pairs set by the restart requester. That will become the recommended mechanism when available (by SC09 would be great, but I can't promise that).

-Paul

Darius Buntinas wrote:
> I'm trying to figure out the best way to provide a process new
> parameters on a restart.
>
> Take the example of a process which is connected to a server over an
> ephemeral port. The process is checkpointed, and later restarted and
> needs to reconnect to a new instance of the server possibly listening on
> a different port.
>
> When the process was originally started the port information was given
> as a command line parameter or an environment variable. But when it's
> restarted from a checkpoint it gets the original values (as they're part
> of the process image). So I need to find a way to get this new
> information to the restarted process.
>
> One solution would be to use a well-known port rather than an ephemeral
> port. This is not the best solution since there may be more than one
> instance of the server running, and the process needs to connect to the
> correct one. Also, the server may be running on a different node
> altogether.
>
> Another solution is to write this info to a file and have the process
> read the file on restart. Again we'd need a well-known filename which
> would have trouble if there are multiple instances being restarted.
>
> I'm wondering if there is a way to have the restarter process open an fd
> to the server before it forks the new restarted process (I'm not sure
> how exactly this is done) so the restarted process would inherit the
> opened fd.
>
> Any ideas would be greatly appreciated.
>
> Thanks,
> -d
>

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group                 Tel: +1-510-495-2352
HPC Research Department                   Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory
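
A rough sketch of the "ephemeral file" outline from the reply, using only POSIX calls so it compiles without BLCR. The reason for fork(), then create, then exec*() is that the child learns its own pid before the file is written, and exec*() keeps that pid, so the file name matches the pid that cr_restart will run under. The path prefix /tmp/restart_params. and the helper names param_path(), launch_with_params() and read_params() are hypothetical and do not come from BLCR. On the restarted side, the pid passed to read_params() would come from the structure returned by cr_get_restart_info() inside a BLCR callback; the exact field name is not given in the message, so it is left out here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical naming scheme: one parameter file per restart requester pid. */
static int param_path(char *buf, size_t len, pid_t requester)
{
    int n = snprintf(buf, len, "/tmp/restart_params.%ld", (long)requester);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Requester side: fork(), write the file named after the child's own pid,
 * then exec the restart command in that same child so the pid is preserved.
 * argv would normally be something like {"cr_restart", <context file>, NULL}.
 * Returns the child's pid in the parent, or -1 if fork() failed.
 */
static pid_t launch_with_params(const char *params, char *const argv[])
{
    pid_t child = fork();
    if (child != 0)
        return child;               /* parent (or fork failure) */

    /* Child: getpid() here is the pid the exec'd program will have. */
    char path[64];
    FILE *f;
    if (param_path(path, sizeof path, getpid()) != 0 ||
        (f = fopen(path, "w")) == NULL)
        _exit(127);
    fputs(params, f);
    if (fclose(f) != 0)
        _exit(127);

    execvp(argv[0], argv);
    _exit(127);                     /* reached only if exec failed */
}

/*
 * Restarted side: given the requester's pid (in a BLCR callback this would
 * be taken from cr_get_restart_info()), read the first line of the file.
 * Returns 0 on success, -1 on any failure.
 */
static int read_params(pid_t requester, char *out, size_t len)
{
    char path[64];
    FILE *f;
    int ok;

    if (param_path(path, sizeof path, requester) != 0)
        return -1;
    if ((f = fopen(path, "r")) == NULL)
        return -1;
    ok = fgets(out, (int)len, f) != NULL;
    fclose(f);
    return ok ? 0 : -1;
}
```

The parent would typically waitpid() on the returned pid and unlink() the file afterwards, which is what keeps the file "ephemeral". Cleanup is left to the requester in this sketch because more than one restarted process may need to read the same file.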