Re: lam-blcr start error

From: Jeff Squyres (jsquyres_at_lam-mpi.org)
Date: Wed Nov 26 2003 - 06:08:25 PST


On Wed, 26 Nov 2003, Ѧ wrote:

> 	I have successfully installed BLCR in my home folder ~/blcr, and
> 	cr.o and vmadump.o in /usr/local/lib/blcr/2.4.20-8/.

I'm not sure what you mean here -- you installed BLCR with a prefix of
$HOME/blcr and then did the proper insmod's for cr.o and vmadump.o
from that directory?  Or do you mean that you expanded the BLCR
tarball in $HOME/blcr and installed it to somewhere else (and ran the
insmod's from that installation directory)?

What did you supply as the --prefix argument for BLCR's ./configure?
Where did "make install" go?

> 	Then I typed the following commands (from the BLCR
> 	webpage) in lam-7.0.3:
> 	[xue@localhost ~] ./configure --with-blcr=/usr/local/lib/blcr
> 	--with-rpi=crtcp --prefix=$HOME/lam
> 	[xue@localhost ~] make
> 	[xue@localhost ~] make install
> 	No error!

Good!

Does the "laminfo" command report that the blcr module was included?
(it will probably be listed at the bottom)
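
One quick way to check is sketched below.  It assumes that laminfo is
on your PATH and that the module shows up under the name "blcr" in its
output; check_blcr_module is just a helper name I made up for this
example, so adjust the pattern if your output looks different.

```shell
# Sketch: look for a blcr entry in laminfo's output.
# Assumption: the module is listed under the name "blcr".
# check_blcr_module is a made-up helper name, not part of LAM.
check_blcr_module() {
    if ! command -v laminfo >/dev/null 2>&1; then
        echo "laminfo not found on PATH"
        return 2
    fi
    if laminfo | grep -i blcr; then
        return 0
    fi
    echo "no blcr module listed by laminfo"
    return 1
}
```

Paste the function into your shell and run check_blcr_module; any
matching lines from laminfo are printed as-is.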

> 	Then, I configured .bashrc for the env variables and activated them

What environment variables did you put in your .bashrc, specifically,
and what values did you assign to them?

> 	Then, I try:
> 	[xue@localhost ~] lamboot
> 	[xue@localhost ~] error while loading shared libraries: libcr.so.0:
> 	cannot open shared object file: No such file or directory
>
> 	I found libcr.so.0 in ~/blcr/libcr/.libs/. So, I told myself perhaps
> 	--with-blcr should be ~/blcr. I tried again, but failed with the
> 	same problem.

I'm still not clear on where BLCR was installed -- was it $HOME/blcr
or /usr/local/lib/blcr?

See below for your "not finding libcr.so" problem.

> 	So frustrated, I referred to the mailing list page, and found so many
> 	new args, such as --with_cr, --with_file_dir. I am really confused.
> 	:(

I'm not sure what you're referring to, specifically.

Page 30 of the LAM/MPI installation guide discusses the --with-blcr
and --with-cr-file-dir options to LAM's configure script; is this what
you mean?  Is there something in the instructions that is not clear?

> 	I think most of the problems are coming from improper configurations.
> 	BLCR group should post an up-to-date, detailed, tested configuration,
> 	which would be a great help to all of us.

I think you're having more of a general unix environment issue and
misunderstanding of parallel computing environments than a specific
problem with LAM/MPI or BLCR.

The problem is that the LAM executables are not able to find the
libcr.so library.  This is a common unix problem (finding shared
libraries) and is solved by putting the directory location of that
file in the LD_LIBRARY_PATH environment variable (perhaps adding it to
the end of the existing $LD_LIBRARY_PATH value, if it already exists).
For example, if you are using a Bourne-type shell and do not already
have an LD_LIBRARY_PATH set, you can use:

$ LD_LIBRARY_PATH=/usr/local/lib/blcr/lib
$ export LD_LIBRARY_PATH
$ lamboot

(in this example, I'm assuming that libcr.so is in
/usr/local/lib/blcr/lib -- but per my statements above, I'm not
entirely sure where you installed it.  You can change the value of
LD_LIBRARY_PATH to match wherever libcr.so was installed.  I would
*not* recommend using the tree where the BLCR tarball was expanded and
built)
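
If you already have an LD_LIBRARY_PATH set, a minimal sketch of
appending to it rather than overwriting it (Bourne-type shells only;
BLCR_LIBDIR is a variable name I picked for the example, and its value
is the same assumed location as above):

```shell
# Sketch: append a directory to LD_LIBRARY_PATH without clobbering it.
# BLCR_LIBDIR is an assumed location; use wherever libcr.so was installed.
BLCR_LIBDIR=/usr/local/lib/blcr/lib

# ${VAR:+text} expands to "text" only when VAR is set and non-empty,
# so no stray leading ":" appears if LD_LIBRARY_PATH was empty.
LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$BLCR_LIBDIR"
export LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"
```

The same three assignment/export lines can go in your .bashrc.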

And that should work fine.  However, be aware that libcr.so will need
to be found on *every* node.  Hence, you will probably want to ensure
that BLCR is installed in the same place on every node, and edit your
"dot" files (.bashrc or .cshrc or whatever) to add
/usr/local/lib/blcr/lib to LD_LIBRARY_PATH.  In this way, all future
sessions will have that value set automatically.  (There are other
ways to do this, but this is among the simplest.)
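
To confirm on a given node that the library is actually being found,
something like the following sketch may help.  It assumes a Linux
system with ldd available; check_libcr is a helper name made up for
this example.  ldd prints "not found" next to any shared library that
the dynamic linker cannot resolve.

```shell
# Sketch: report how the dynamic linker resolves libcr for a given binary.
# check_libcr is a made-up helper name.  Usage: check_libcr /path/to/binary
check_libcr() {
    ldd "$1" 2>/dev/null | grep libcr \
        || echo "no libcr dependency listed for $1"
}
```

For example, check_libcr "$(command -v lamd)" (or whichever LAM binary
reported the error).  A line ending in "not found" means that
LD_LIBRARY_PATH still does not point at the right directory in that
session.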

Alternatively, you could forgo all this and have your system
administrator add the library directory to /etc/ld.so.conf and re-run
ldconfig (not recommended for home users -- if you mess up the
/etc/ld.so.conf file, you'll have a totally unusable system).

Hope this helps.

-- 
{+} Jeff Squyres
{+} jsquyres@lam-mpi.org
{+} http://www.lam-mpi.org/