From: Adolfo J. Banchio (banchio_at_famaf_dot_unc_dot_edu.ar)
Date: Tue Sep 02 2008 - 13:58:44 PDT
Hi,

I have upgraded the cluster to Rocks 5.0 (CentOS 5.0) and blcr 0.7.3 (from blcr 0.5), and now I have the following problem.

When I checkpoint running programs directly from the command line, it works fine. But the same checkpoint command, when it is issued by the SGE (batch queueing system) checkpointing script, ends in a core dump. What I can see is that blcr starts to create the checkpoint file (.context...) and it then writes a core.PID file (I presume the PID there is that of the cr_checkpoint process).

I cannot figure out where the difference might lie, since the script is run as the same user I use when it does work.

Any help will be welcome.

Thanks in advance,
adolfo

--
Adolfo J. Banchio <banchio_at_famaf_dot_unc_dot_edu.ar>