Re: Program received signal SIGSEGV after restarted on different hosts.

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Wed Apr 23 2008 - 12:18:44 PDT

    I believe your problem is probably "prelinking", as described in the
    following FAQ entry:
       http://mantis.lbl.gov/blcr/doc/html/FAQ.html#prelink
    Let us know if after disabling prelinking you still have problems moving
    checkpoints between nodes.
    
    -Paul
    
    
    The original poster wrote:
    > My cluster has 64 nodes running Linux 2.6.20 on x86-64.
    > MPI programs are checkpointed with BLCR-0.5.5 and restart successfully
    > on the same node, e.g. cn01. But the programs receive signal SIGSEGV in
    > libc when restarted on nodes other than cn01.
    > I find that /lib64/tls/libc.so.6 has a different entry point on different nodes.
    > What should I do?
    > 
    
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group
    HPC Research Department                   Tel: +1-510-495-2352
    Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
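
To check whether a shared library really differs between nodes, as the quoted message reports for /lib64/tls/libc.so.6, something like the following could be run on each node and the output compared. This is only a minimal sketch: the helper names `elf_entry_point` and `describe` are made up for illustration and are not part of BLCR or prelink. It reads the `e_entry` field from the ELF header and hashes the file. The assumption here is that a prelinked library is rewritten on disk, so both values may differ from node to node.

```python
import hashlib
import struct


def elf_entry_point(data: bytes) -> int:
    """Return the e_entry field from the ELF header contained in `data`."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = data[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = data[5]   # 1 = little-endian, 2 = big-endian
    if ei_data == 1:
        endian = "<"
    elif ei_data == 2:
        endian = ">"
    else:
        raise ValueError("unknown ELF byte order")
    # e_entry follows e_ident (16 bytes), e_type (2), e_machine (2)
    # and e_version (4), so it starts at offset 24 in both ELF classes.
    if ei_class == 2:
        (entry,) = struct.unpack_from(endian + "Q", data, 24)
    elif ei_class == 1:
        (entry,) = struct.unpack_from(endian + "I", data, 24)
    else:
        raise ValueError("unknown ELF class")
    return entry


def describe(path: str) -> str:
    """Summarize one library: path, entry point, and SHA-256 of the file."""
    with open(path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    return f"{path} entry={elf_entry_point(data):#x} sha256={digest}"
```

Calling `print(describe("/lib64/tls/libc.so.6"))` on cn01 and on one of the failing nodes should show whether the two copies match. If they differ only while prelinking is enabled, that is consistent with the FAQ entry above.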
    
