From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Wed Jul 27 2005 - 10:21:00 PDT
Jeff,

I am not sure this explains why a simple hello world program should fail to restart. Even if ROMIO runs some initialization code at MPI_Init time, I can't see how any actual async I/O would be started.

-Paul

Jeff Squyres wrote:
> On Jul 26, 2005, at 5:01 PM, Paul H. Hargrove wrote:
>
>> There is no support in current BLCR versions for either POSIX or
>> Linux-native async I/O support. While this has nothing to do with
>> whatever linker problems Jeff mentioned, it could be the cause of the
>> problems you've been seeing.
>
>
> I'm inferring from Pradeep's mail that there was an RPM that was
> removed, but has now been replaced (LAM won't use libaio unless it
> finds it during configure -- so it must have been there at some point
> and then was later removed).
>
>> How/when is async I/O used in LAM? Is there a simple way to
>> disable it via ssi params?
>
>
> It's used in ROMIO. There are currently no SSI params to remove its
> use -- part of the problem is that the wrapper compilers add "-laio".
> So it's not just a run-time switch to change ROMIO's behavior, it's a
> compile-time decision (ROMIO makes a bunch of decisions and sets
> #define's based on whether AIO is present or not) for both LAM and ROMIO.
>
> But this also explains why we rarely (never?) saw this problem in our
> own testing -- the vast majority of our manual testing builds disable
> ROMIO because it takes so long to compile. Urgh. This also explains
> why my LAM build on Pradeep's system worked -- I configured and built
> LAM after the libaio-devel RPM was removed, so my build did not add
> -laio.
>
> The quick and easy solution is to disable ROMIO ("--without-romio").
> Not really an optimal solution, but it'll work.
>

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900