From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Fri Sep 23 2005 - 16:21:43 PDT
Adolfo,

It is as simple as "blcr does not support static linked binaries". Not
only does cr_run not perform its magic with a statically linked binary,
but we've never really tested the case of libcr statically linked into
an application. Since we don't generate a libcr.a by default, however, I
think linking explicitly with -static will either fail or result in a
binary that is dynamically linked to blcr - not sure.

-Paul

Adolfo J. Banchio wrote:
> Paul,
>
> thanks for your prompt reply.
>
> The program was run using cr_run, and then I checked
> at /proc/<PID>/maps and there is no line for
> the blcr libraries. So this is the reason.
>
> The only difference with other codes (compiled with
> same compiler) is that this one (and another one
> I recompiled for testing) is compiled with -static flag.
>
> Is as simple as that no program compiled "statically" will
> accept cr_run for checkpointing? In other words, for
> statically linked codes you have to include the libraries
> at linking time. Is this true?
>
> thanks for your help
>
> best regards,
>
> adolfo
>
> On Fri, 2005-09-23 at 14:08, Paul H. Hargrove wrote:
>
>> Checkpointing with BLCR requires that a small stub library be linked
>> into an application. The message you are seeing is the one generated
>> when a checkpoint request is issued for an application that does not
>> include this support.
>>
>> A LAM/MPI built with BLCR support will automatically link in this
>> library into applications it compiles. Other applications may do so
>> explicitly when they are built, or more typically via an LD_PRELOAD done
>> by the "cr_run" utility we provide. For instance, "cr_run ./a.out"
>> would run a.out with the BLCR library loaded.
>>
>> It is also possible that the application is correctly linked with the
>> library, but is somehow disabling the BLCR hook. One can look for
>> "libcr.so" in /proc/<pid>/maps to determine if the process with the
>> given pid has the BLCR library loaded. If it is loaded and you still
>> get the "support missing from application" messages, then we can discuss
>> how to determine the cause of the interference.
>>
>> -Paul
>>
>> Adolfo J. Banchio wrote:
>>
>>> Hello,
>>>
>>> first of all my excuses if this question was already answered
>>> (in this case just point me to that answer), since I can not
>>> get access to the search page of the archive.
>>>
>>> Now, the problem,
>>>
>>> I have a process running (started with cr_run)
>>>
>>> which gives this error message when checkpointed:
>>>
>>>   "Checkpoint failed: support missing from application"
>>>
>>> and the exit status of cr_checkpoint is 52.
>>>
>>> What could be the reason for this?
>>>
>>> By the way, I have BLCR working with SGE, and besides for this
>>> user, it is working Very good for process migration.
>>>
>>> best regards,
>>>
>>> adolfo

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
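
The /proc/<pid>/maps check described in the thread can be scripted. What
follows is a minimal Python sketch, not part of BLCR: the function names
library_mapped and process_has_blcr are hypothetical, and the default
search string "libcr.so" is simply the one quoted in the message above.
It assumes a Linux system where /proc/<pid>/maps is readable by the
caller. If your installation names the library differently, pass another
substring as the needle.

```python
import os


def library_mapped(maps_text, needle="libcr.so"):
    """Return True if a mapped file in /proc/<pid>/maps text matches needle.

    maps_text is the full contents of a maps file. Each line has five
    whitespace-separated fields, optionally followed by a pathname.
    """
    for line in maps_text.splitlines():
        # Split into at most six fields so a pathname with spaces stays whole.
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in os.path.basename(fields[5]):
            return True
    return False


def process_has_blcr(pid, needle="libcr.so"):
    """Read /proc/<pid>/maps for a live process and look for needle."""
    with open("/proc/%d/maps" % pid) as maps_file:
        return library_mapped(maps_file.read(), needle)
```

Calling process_has_blcr(pid) on a process started with cr_run should
return True for a dynamically linked binary. For a binary built with
-static it would be expected to return False, which matches the symptom
reported in this thread.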