Re: /proc/PID/exe not restored on restart in 0.8.2

From: Josh Hursey (jjhursey_at_open-mpi.org)
Date: Wed Aug 26 2009 - 13:02:09 PDT

  • Next message: Paul H. Hargrove: "Re: /proc/PID/exe not restored on restart in 0.8.2"
    I filed this as Bug #2620:
       http://upc-bugs.lbl.gov/bugzilla/show_bug.cgi?id=2620
    
    As another data point, I have access to another machine running BLCR  
    0.8.1 and Linux kernel 2.6.18-53, and it is running correctly in this  
    regard.
    
    I will try to downgrade the kernel and see if that helps.
    
    Thanks,
    Josh
    
    On Aug 26, 2009, at 3:32 PM, Paul H. Hargrove wrote:
    
    > Josh,
    >
    > The error you see building 0.8.1 is because it does not support a  
    > 2.6.29.6 kernel (supported 2.6.29, but the put_fs_struct change  
    > took place somewhere in the 2.6.26.X series).
    >
    > I have tried 0.8.1 and 0.8.2 on 2.6.29 and 2.6.30 kernels and all  
    > three valid combinations show the invalid /proc/pid/exe link.
    >
    > The problem does not appear to be dependent on BLCR version, and a  
    > quick look at kernel sources suggests the problem may originate in a  
    > kernel change between 2.6.25 and 2.6.26.  So, I suspect that if you  
    > use a kernel 2.6.25 or older the symlink will be correct.
    >
    > If you have a moment, please enter a bug report for this.  Based on  
    > the kernel change I noticed, this is probably less than a 1-day job  
    > to fix and test, but I don't know when I'll be able to start.
    >
    > -Paul
    >
    > Josh Hursey wrote:
    >> I have a Linux box running Fedora 11, and BLCR 0.8.2
    >> ----------
    >> shell$ uname -a
    >> Linux cloud9 2.6.29.6-217.2.8.fc11.i586 #1 SMP Sat Aug 15 00:44:39  
    >> EDT 2009 i686 i686 i386 GNU/Linux
    >> ----------
    >>
    >> I am finding that the /proc/PID/exe link is not restored on  
    >> cr_restart (I used the counting example in the distribution). It  
    >> is valid when running normally, but after restart the link is  
    >> invalid (not pointing to anything).
    >>
    >> I believe that 0.8.1 was working correctly in this regard, but I  
    >> cannot verify on this machine at the moment (build error  
    >> cr_dest_file.c:189: error: implicit declaration of function  
    >> 'put_fs_struct').
    >>
    >> Have others seen this problem? I can send along more info if that  
    >> might help.
    >>
    >> -- Josh
    >>
    >>
    >
    >
    > -- 
    > Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    > Future Technologies Group                 Tel: +1-510-495-2352
    > HPC Research Department                   Fax: +1-510-486-6900
    > Lawrence Berkeley National Laboratory
    
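For anyone who wants to check other kernel and BLCR combinations for the symptom described in this thread, the following is a minimal sketch of one way to test whether /proc/PID/exe still points at something. It is not part of BLCR; the helper name exe_link_status is hypothetical, and the sketch assumes a Linux /proc filesystem and permission to read the target process's entry.

```python
import os


def exe_link_status(pid):
    """Return (target, is_valid) for /proc/<pid>/exe.

    target is the raw symlink text, or None if the link cannot be read.
    is_valid is True only when the link resolves to an existing file.
    (Hypothetical helper for illustration; assumes Linux procfs.)
    """
    link = f"/proc/{pid}/exe"
    try:
        target = os.readlink(link)
    except OSError:
        # No such process, no permission, or the link cannot be read.
        return None, False
    # os.path.exists follows the symlink, so a dangling link yields False.
    return target, os.path.exists(link)


if __name__ == "__main__":
    # Checks the current process; pass the pid of a restarted process instead.
    target, ok = exe_link_status(os.getpid())
    print(f"exe -> {target!r} valid={ok}")
```

Running the check once against the original process and once against the process created by cr_restart should show whether the link survives the restart on a given kernel.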
