From: Darius Buntinas (buntinas_at_mcs_dot_anl_dot_gov)
Date: Wed Jul 22 2009 - 11:50:16 PDT
OK, that makes sense. We'll use the "ephemeral file" solution for now,
but look forward to the aux environment. That sounds exactly like what
is needed!

Thanks,
-d

On 07/22/2009 01:13 PM, Paul H. Hargrove wrote:
> Darius,
>
> I am pretty busy, so I'll only give the "outline" of a solution.
>
> What I believe to be the "current best" solution is an "ephemeral
> file". In a BLCR callback, one can call cr_get_restart_info() to get a
> structure that includes various information. One field of the struct is
> the pid of the restart requester (the cr_restart process). Since that
> value is unique per-instance it can be used to create a unique file
> name. The only trick is that if one uses the cr_restart utility instead
> of the cr_request_restart() API, one needs to fork(), create the file,
> then exec*() rather than just call system("cr_restart ....").
>
> NOTE: in the future we plan to provide an "aux environment" to allow
> callbacks to use a getenv-style mechanism to query key-value pairs set
> by the restart requester. That will become the recommended mechanism
> when available (by SC09 would be great, but I can't promise that).
>
> -Paul
>
> Darius Buntinas wrote:
>> I'm trying to figure out the best way to provide a process new
>> parameters on a restart.
>>
>> Take the example of a process which is connected to a server over an
>> ephemeral port. The process is checkpointed, and later restarted and
>> needs to reconnect to a new instance of the server possibly listening on
>> a different port.
>>
>> When the process was originally started the port information was given
>> as a command line parameter or an environment variable. But when it's
>> restarted from a checkpoint it gets the original values (as they're part
>> of the process image). So I need to find a way to get this new
>> information to the restarted process.
>>
>> One solution would be to use a well-known port rather than an ephemeral
>> port. This is not the best solution since there may be more than one
>> instance of the server running, and the process needs to connect to the
>> correct one. Also, the server may be running on a different node
>> altogether.
>>
>> Another solution is to write this info to a file and have the process
>> read the file on restart. Again we'd need a well-known filename which
>> would have trouble if there are multiple instances being restarted.
>>
>> I'm wondering if there is a way to have the restarter process open an fd
>> to the server before it forks the new restarted process (I'm not sure
>> how exactly this is done) so the restarted process would inherit the
>> opened fd.
>>
>> Any ideas would be greatly appreciated.
>>
>> Thanks,
>> -d
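
For reference, the requester side of the "ephemeral file" outline above might look roughly like the sketch below. It is only an illustration under several assumptions: the file name pattern (/tmp/restart-params.<pid>), the helper names (param_file_name, write_params, read_params, restart_with_params), and the one-integer file format are all hypothetical and not part of BLCR. The sketch uses only POSIX calls. It relies on the fact that exec*() preserves the pid, so the pid seen after fork() is the pid the cr_restart process will have. On the restarted side, the callback would obtain that pid from the structure returned by cr_get_restart_info() and pass it to something like read_params(); the exact field name is not given in this thread, so that part is left out.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical naming convention; requester and callback must agree on it. */
int param_file_name(char *buf, size_t len, pid_t requester)
{
    int n = snprintf(buf, len, "/tmp/restart-params.%ld", (long)requester);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Write the new server port to the file keyed by the requester's pid. */
int write_params(pid_t requester, int port)
{
    char path[64];
    FILE *f;

    if (param_file_name(path, sizeof path, requester) != 0)
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fprintf(f, "%d\n", port);
    return fclose(f) == 0 ? 0 : -1;
}

/* Read the port back. A restart callback would call this with the
 * requester pid it got from cr_get_restart_info(). */
int read_params(pid_t requester, int *port)
{
    char path[64];
    FILE *f;
    int ok;

    if (param_file_name(path, sizeof path, requester) != 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%d", port) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* fork(), create the file in the child, then exec cr_restart, so that the
 * file is named after the pid cr_restart will run under. Returns the exit
 * status of cr_restart, or -1 on error. */
int restart_with_params(const char *context_file, int port)
{
    char path[64];
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: the pid is unchanged by exec, so it identifies cr_restart. */
        if (write_params(getpid(), port) != 0)
            _exit(127);
        execlp("cr_restart", "cr_restart", context_file, (char *)NULL);
        _exit(127); /* exec failed */
    }
    /* Parent: wait for cr_restart to finish, then remove the file. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (param_file_name(path, sizeof path, pid) == 0)
        unlink(path);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would then use restart_with_params("context.1234", new_port) in place of system("cr_restart ...."), where "context.1234" stands in for whatever checkpoint file is being restarted. A real implementation would likely want a safer location than /tmp and a richer file format.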