From: Anton V. Uzunov (anton.uzunov_at_dsto_dot_defence_dot_gov.au)
Date: Thu Nov 02 2006 - 21:19:29 PST
Hi,

I am currently testing BLCR in the hope of using it as our checkpoint/restore library, and I have encountered a problem with checkpointing multi-process applications. For example, BLCR has trouble (or perhaps I am doing something incorrectly?) checkpointing a simple C program that runs a slightly modified version of the "filecounting" example provided with BLCR:

    ...
    pid_t p = fork();
    if (p == 0)
        execlp( "filecounting", ... );
    waitpid( p, ... );
    ...

(The slight modification to "filecounting" consists of making it multi-threaded, as in the other BLCR example, "pthread_counting".)

In this case two PIDs are created, one each for the parent and child processes. Both processes can be checkpointed using cr_checkpoint PID, but neither of them can be restored via cr_restart.

Perhaps this is because BLCR has not yet implemented checkpointing of process groups? If so, do you know (approximately) when this functionality will be implemented? Is there perhaps a newer, not entirely stable CVS snapshot that has some of this functionality? Or, if it has not been implemented at all, should I use the library hooks to implement multi-process checkpointing myself?

I would appreciate any information on this.

Best regards,
Anton V. Uzunov

--
Anton V. Uzunov
Information Networks Division, ACC Group,
Defence Science and Technology Organization,
Edinburgh SA, Australia
ph: (+061) (08) 8259-7598
e-mail: Anton.Uzunov_at_dsto_dot_defence_dot_gov.au

IMPORTANT: This e-mail remains the property of the Australian Defence Organisation and is subject to the jurisdiction of section 70 of the CRIMES ACT 1914. If you have received this e-mail in error, you are requested to contact the sender and delete the e-mail.
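For anyone trying to reproduce the setup, the parent program described in the message can be sketched roughly as below. This is only an illustration under assumptions: the helper name run_child is hypothetical, execvp is used in place of the execlp call in the message so that the arguments can be passed as an array, and the error handling is added here. The arguments elided in the original snippet are not known and are left to the caller.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec `file` in the child, and wait for it
 * in the parent. Returns the child's exit status, 128 + the signal
 * number if the child was killed by a signal, or -1 if fork() or
 * waitpid() failed. */
int run_child(const char *file, char *const argv[])
{
    pid_t p = fork();
    if (p < 0) {
        perror("fork");
        return -1;
    }
    if (p == 0) {
        /* Child: replace the process image; execvp returns only on error. */
        execvp(file, argv);
        perror("execvp");
        _exit(127);
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(p, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

Called with a NULL-terminated argument array whose first element is "filecounting", this gives the two PIDs mentioned in the message: the parent blocked in waitpid and the exec'd child.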