From: Jerry Mersel (jerry.mersel_at_weizmann.ac.il)
Date: Wed Jan 02 2008 - 04:32:26 PST
I manage to checkpoint matlab processes from the command line. But when I want to use SGE I get the error:

    /lib64/libc.so.6: relocation error: /lib64/tls/libpthread.so.0: symbol errno, version GLIBC_PRIVATE not defined in file libc.so.6 with link time reference
    Restart failed: No such device or address

The relocation error I get on the start using cr_run. The Restart failed I get when trying to restart.

I start matlab thus:

    ${BLCR_HOME}/bin/cr_run env LD_PRELOAD=libcr.so.0:libpthread.so.0 matlab -nojvm -nodisplay -nosplash < $H/test.m

and try to restart thus:

    ${BLCR_HOME}/bin/cr_restart $ckptfile

My log file says this:

    Jan 2 14:24:36 kam02 kernel: Skipping a socket.
    Jan 2 14:24:36 kam02 kernel: Skipping a socket.
    Jan 2 14:26:03 kam02 kernel: Failed to open chrdev major=5 minor=0 path='/dev/tty')
    Jan 2 14:26:03 kam02 kernel: cr_restore_all_files [28703]: Unable to restore fd 3 (type=6,err=-6)
    Jan 2 14:26:03 kam02 kernel: cr_rstrt_child [28703]: Unable to restore files! (err=-6)

Perhaps something to do with the socket. What do you think?

Regards,
Jerry

P.S. I have prelinking turned off.

Paul H. Hargrove wrote:
> Jerry Mersel wrote:
>
>> Hi:
>>
>> I am trying to migrate jobs on a grid after checkpointing.
>> Does the "prelinking" fix as mentioned in the faq must it be done
>> on the checkpointed node and the migrated to node?
>>
>> Regards,
>> Jerry
>
> Yes, the prelinking of libraries should be disabled on both the
> "checkpointed on" and "migrated to" nodes.
> I will clarify this in the next FAQ version.
>
> -Paul
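
Since prelinking has to be off on both the "checkpointed on" and "migrated to" nodes, a quick check that can be run on each node may help rule it out. The following is only a sketch: it assumes a Red Hat style configuration file at /etc/sysconfig/prelink containing a PRELINKING=yes|no line, and the function name prelink_status is made up for illustration. Other distributions may keep this setting elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: report the PRELINKING setting found in a prelink config file.
# Assumption: Red Hat style file (/etc/sysconfig/prelink) with a PRELINKING=yes|no line.
prelink_status() {
    conf="${1:-/etc/sysconfig/prelink}"

    # Without a readable config file we cannot tell either way.
    if [ ! -r "$conf" ]; then
        echo unknown
        return
    fi

    # Use the last uncommented PRELINKING= assignment; strip quotes and spaces.
    value=$(sed -n 's/^[[:space:]]*PRELINKING=//p' "$conf" | tail -n 1 | tr -d "\"' ")

    case "$value" in
        no)  echo disabled ;;
        yes) echo enabled ;;
        *)   echo unknown ;;
    esac
}
```

Calling prelink_status with no argument on both nodes should print "disabled" on each if the setting is as expected. Note that this only reads the configuration; libraries that were prelinked earlier stay prelinked until they are undone (for example with prelink --undo --all, run as root), so a "disabled" result alone does not prove the libraries on disk are clean.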