From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Fri Sep 23 2005 - 10:08:38 PDT
Checkpointing with BLCR requires that a small stub library be linked into an application. The message you are seeing is the one generated when a checkpoint request is issued for an application that does not include this support.

A LAM/MPI built with BLCR support will automatically link this library into applications it compiles. Other applications may do so explicitly when they are built, or more typically via an LD_PRELOAD done by the "cr_run" utility we provide. For instance, "cr_run ./a.out" would run a.out with the BLCR library loaded.

It is also possible that the application is correctly linked with the library, but is somehow disabling the BLCR hook. One can look for "libcr.so" in /proc/<pid>/maps to determine if the process with the given pid has the BLCR library loaded. If it is loaded and you still get the "support missing from application" messages, then we can discuss how to determine the cause of the interference.

-Paul

Adolfo J. Banchio wrote:

>Hello,
>
>first of all my excuses if this question was already answered
>(in this case just point me to that answer), since I can not
>get access to the search page of the archive.
>
>Now, the problem,
>
>I have a process running (started with cr_run)
>
>which gives this error message when checkpointed:
>
> "Checkpoint failed: support missing from application"
>
>and the exit status of cr_checkpoint is 52.
>
>What could be the reason for this?
>
>By the way, I have BLCR working with SGE, and besides for this
>user, it is working Very good for process migration.
>
>best regards,
>
>adolfo
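
For anyone who wants to script the /proc/<pid>/maps check suggested in the reply, here is a minimal sketch in Python. It is only an illustration and not part of BLCR: the function names are made up for this example, and it assumes the usual Linux maps layout (address, perms, offset, dev, inode, optional pathname) and that the library's file name contains "libcr.so", as stated in the reply.

```python
import os


def maps_mention_library(maps_text, needle="libcr.so"):
    """Return the distinct mapped paths whose file name contains `needle`.

    `maps_text` is the content of a /proc/<pid>/maps file. Hypothetical
    helper for illustration; "libcr.so" is the name given in the reply.
    """
    hits = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5].strip()
        if needle in os.path.basename(path) and path not in hits:
            hits.append(path)
    return hits


def blcr_library_loaded(pid):
    """Check a live process. Needs permission to read /proc/<pid>/maps."""
    with open(f"/proc/{pid}/maps") as f:
        return bool(maps_mention_library(f.read()))
```

Calling blcr_library_loaded(pid) with the pid of the job that fails to checkpoint returns True if a matching library is mapped. A False result points at the "library not loaded" case, while True points at the second case from the reply, where something is interfering with the BLCR hook.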