From: Jeff Squyres (jsquyres_at_lam-mpi.org)
Date: Wed Nov 26 2003 - 06:08:25 PST
On Wed, 26 Nov 2003, [name garbled] wrote:

> I have successfully installed BLCR in my home folder ~/blcr, and
> cr.o and vmadump.o in /usr/local/lib/blcr/2.4.20-8/.

I'm not sure what you mean here -- did you install BLCR with a prefix
of $HOME/blcr and then do the proper insmod's for cr.o and vmadump.o
from that directory? Or do you mean that you expanded the BLCR tarball
in $HOME/blcr and installed it somewhere else (and ran the insmod's
from that installation directory)?

What did you supply as the --prefix argument for BLCR's ./configure?
Where did "make install" go?

> Then I type the following instruction(from blcr
> webpage) in lam-7.0.3 :
> [xue@localhost ~] ./configure --with-blcr=/usr/local/lib/blcr
> --with-rpi=crtcp --prefix=$HOME/lam
> [xue@localhost ~] make
> [xue@localhost ~] make install
> No error!

Good! Does the "laminfo" command report that the blcr module was
included? (It will probably be listed at the bottom.)

> Then, configure .bashrc for the env variables and activate them

What environment variables did you put in your .bashrc, specifically,
and what values did you assign to them?

> Then, I try:
> [xue@localhost ~] lamboot
> [xue@localhost ~] error while loading shared libraries: libcr.so.0:
> cannot open shared object file: No such file or directory
>
> I found libcr.so.0 in ~/blcr/libcr/.libs/. So, I told myself perhaps
> --with-blcr should be ~/blcr. I tried again, but failed with the
> same problem.

I'm still not clear on where BLCR was installed -- was it $HOME/blcr or
/usr/local/lib/blcr? See below for your "not finding libcr.so"
problem.

> So frustrated, I referred to the maillist page, and found so many
> new args, such as --with_cr, --with_file_dir. I am really confused.
> :(

I'm not sure what you're referring to, specifically. Page 30 of the
LAM/MPI installation guide discusses the --with-blcr and
--with-cr-file-dir options to LAM's configure script; is this what you
mean? Is there something in the instructions that is not clear?
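
One way to answer the "where did make install go" question is simply to
search the candidate prefixes for libcr.so. The following is only a
minimal sketch: the function name find_libcr is made up for this
illustration (it is not part of BLCR or LAM/MPI), and the two
directories are the ones mentioned in this thread, not known install
locations.

```shell
#!/bin/sh
# find_libcr: hypothetical helper, not part of BLCR or LAM/MPI.
# Prints every libcr.so* file found under the directories given as arguments.
find_libcr() {
    for dir in "$@"; do
        # Skip prefixes that do not exist instead of printing find errors.
        [ -d "$dir" ] || continue
        find "$dir" -name 'libcr.so*' -print
    done
}

# Check the two candidate locations mentioned in this thread.
find_libcr "$HOME/blcr" /usr/local/lib/blcr
```

A hit under a .libs/ directory most likely points into the build tree
rather than an installed copy; the directory to use is the one that
"make install" populated.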

> I think most of the problems are coming from improper configurations.
> BLCR group should post a newest, detailed, tested configuration,
> which would be a great help to all of us.

I think you're having more of a general unix environment issue and
misunderstanding of parallel computing environments than a specific
problem with LAM/MPI or BLCR.

The problem is that the LAM executables are not able to find the
libcr.so library. This is a common unix problem (finding shared
libraries) and is solved by putting the directory location of that
file in the LD_LIBRARY_PATH environment variable (perhaps adding it to
the end of the existing $LD_LIBRARY_PATH value, if it already exists).
For example, if you are using a Bourne-type shell and do not already
have an LD_LIBRARY_PATH set, you can use:

    $ LD_LIBRARY_PATH=/usr/local/lib/blcr/lib
    $ export LD_LIBRARY_PATH
    $ lamboot

(In this example, I'm assuming that libcr.so is in
/usr/local/lib/blcr/lib -- but per my statements above, I'm not
entirely sure where you installed it. You can change the value of
LD_LIBRARY_PATH to match wherever libcr.so was installed. I would
*not* recommend using the tree where the BLCR tarball was expanded and
built.)

And that should work fine. However, be aware that libcr.so will need
to be found on *every* node. Hence, you will probably want to ensure
that BLCR is installed in the same place on every node, and you will
need to edit your "dot" files (.bashrc or .cshrc or whatever) to add
/usr/local/lib/blcr/lib to LD_LIBRARY_PATH. In this way, all future
sessions will have that value set automatically. (There are other ways
to do this, but this is among the simplest.)

Additionally, you could forgo all this and have your system
administrator set up /etc/ld.so.conf (not recommended for home users --
if you mess up the /etc/ld.so.conf file, you'll have a totally
unusable system).

Hope this helps.

--
{+} Jeff Squyres
{+} jsquyres_at_lam-mpi.org
{+} http://www.lam-mpi.org/
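
P.S. For the case where LD_LIBRARY_PATH may already be set, a slightly
more defensive variant of the example above could go into a
Bourne-type "dot" file such as .bashrc. This is a sketch only: the
helper name append_ld_path is invented for illustration, and
/usr/local/lib/blcr/lib is the same assumed location as in the example
above, not a confirmed install path.

```shell
# append_ld_path: hypothetical helper name, not a standard command.
# Appends a directory to LD_LIBRARY_PATH unless it is already listed.
append_ld_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*)
            # Already present, nothing to do.
            ;;
        *)
            # Add a ":" separator only when the variable is non-empty.
            LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$1"
            ;;
    esac
    export LD_LIBRARY_PATH
}

# Assumed install location from the example above; adjust to the real prefix.
append_ld_path /usr/local/lib/blcr/lib
```

Because the function checks for an existing entry first, sourcing the
dot file repeatedly does not keep growing the variable. The same lines
would need to be present on every node.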