Re: Segmentation fault running cr_restart

From: Jerome (jerome_at_ibt.unam.mx)
Date: Fri Feb 08 2008 - 16:30:11 PST

  • Next message: David Kesler: "BLCR full system lockup during cr_checkpoint"
    Hi Paul
    
    Paul H. Hargrove wrote:
    > Jerome,
    >  I am fairly certain you are seeing the ill-effects of "prelinking".  We 
    > have an FAQ on this issue:
    >       http://mantis.lbl.gov/blcr/doc/html/FAQ.html#prelink
    >  Let us know if that proves not to be the source of your problem and 
    > we'll see how we can help
    
    You're right! I've just changed the prelink option on my master node, 
    and after that the checkpoint works well, with no segmentation fault.
    I only wanted to understand the problem, since normally the master node 
    won't run any programs.
    
    Thanks a lot.
    
    Best regards
    
    > Jerome wrote:
    >> Hi all
    >>
    >> I'm just beginning to use the BLCR library on my own cluster, for 
    >> packages that don't include checkpointing capability.
    >> To understand how BLCR works, I wrote a dummy "hello world" program 
    >> with a sleep, to have time to take a checkpoint ;-)
    >>
    >> I run this program on my cluster's master node and on the nodes.
    >> But I've noticed that when I take a checkpoint on the master node, I get 
    >> a horrible "Segmentation Fault" when restarting it on a node. The 
    >> master and the nodes have the same kernel version and the same libraries.
    >> What do I have to do to detect where the problem comes from?
    >>
    >> Best regards.
    >>
    >>
    > 
    > 
    
    
    -- 
    -- Jérôme
    - At our house, for Christmas Eve dinner, I told him, there will be
    Granny, my aunt Dorothée, and Uncle Eugène.
    - At our house, Alceste told me, there will be white sausage, and turkey.
      	(Histoires inédites du Petit Nicolas, Goscinny & Sempé)
    
