Re: Problems with BLCR?

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Wed Jul 27 2005 - 10:21:00 PDT

  • Next message: Jeff Squyres: "Re: Problems with BLCR?"
    Jeff,
    
      I am not sure this explains why a simple hello world program should 
    fail to restart.  Even if romio runs some initialization code at 
    MPI_Init time, I can't see how any actual async I/O would be started.
    
    -Paul
    
    Jeff Squyres wrote:
    
    > On Jul 26, 2005, at 5:01 PM, Paul H. Hargrove wrote:
    >
    >>   There is no support in current BLCR versions for either POSIX or 
    >> Linux-native async I/O.  While this has nothing to do with 
    >> whatever linker problems Jeff mentioned, it could be the cause of the 
    >> problems you've been seeing.
    >
    >
    > I'm inferring from Pradeep's mail that there was an RPM that was 
    > removed, but has now been replaced (LAM won't use libaio unless it 
    > finds it during configure -- so it must have been there at some point 
    > and then was later removed).
    >
    >>   How/when is async I/O used in LAM?  Is there a simple way to 
    >> disable it via ssi params?
    >
    >
    > It's used in ROMIO.  There are currently no SSI params to remove its 
    > use -- part of the problem is that the wrapper compilers add "-laio".
    > So it's not just a run-time switch to change ROMIO's behavior, it's a 
    > compile-time decision (ROMIO makes a bunch of decisions and sets 
    > #define's based on whether AIO is present or not) for both LAM and ROMIO.
    >
    > But this also explains why we rarely (never?) saw this problem in our 
    > own testing -- the vast majority of our manual testing builds disable 
    > ROMIO because it takes so long to compile.  Urgh.  This also explains 
    > why my LAM build on Pradeep's system worked -- I configured and built 
    > LAM after the libaio-devel RPM was removed, so my build did not add 
    > -laio.
    >
    > The quick and easy solution is to disable ROMIO ("--without-romio").  
    > Not really an optimal solution, but it'll work.
    >
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group                 
    HPC Research Department                   Tel: +1-510-495-2352
    Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
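
Since the discussion above turns on whether a given build was linked with
-laio, one way to check an existing binary is to look for libaio in its
ldd output.  What follows is only a rough sketch: the helper name
links_libaio and the sample listing are made up for illustration, and it
assumes the usual "name => path (address)" line layout that ldd prints on
Linux.

```python
import re


def links_libaio(ldd_output: str) -> bool:
    """Return True if any line of ldd-style output names libaio.

    Hypothetical helper for illustration; not part of LAM, ROMIO, or BLCR.
    """
    for line in ldd_output.splitlines():
        # ldd lines look like: "libaio.so.1 => /usr/lib/libaio.so.1 (0x...)"
        if re.match(r"\s*libaio\.so(\.\d+)*\b", line):
            return True
    return False


# Invented sample listing (paths and addresses are placeholders).
sample = (
    "\tlibpthread.so.0 => /lib/libpthread.so.0 (0x00002aaa)\n"
    "\tlibaio.so.1 => /usr/lib/libaio.so.1 (0x00002bbb)\n"
    "\tlibc.so.6 => /lib/libc.so.6 (0x00002ccc)\n"
)

print(links_libaio(sample))
```

On a real system the text could come from something like
subprocess.run(["ldd", path], capture_output=True, text=True).stdout.
A result of False would be consistent with a build configured after the
libaio-devel RPM was removed, but it does not by itself show that
restart will work.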
    
