From: Jeff Squyres (jsquyres_at_open-mpi.org)
Date: Mon Jul 25 2005 - 08:57:49 PDT
FWIW, the original processes were not still running when I tried to
restart (I tried several times, all of which were after the original
processes had completed normally).

In my original mail, I neglected to mention that lsmod shows that the
proper BLCR modules were loaded:

[jeff@linf1 ~]$ /sbin/lsmod
Module                  Size  Used by
blcr                   52104  0
vmadump_blcr           26516  1 blcr
nfsd                  211393  9
exportfs               10177  1 nfsd
i915                   22977  1
drm                    68821  2 i915
vmnet                  38820  12
vmmon                 172652  4
parport_pc             31621  1
lp                     16585  0
parport                39049  2 parport_pc,lp
nfs                   197769  2
lockd                  63721  3 nfsd,nfs
sunrpc                139781  20 nfsd,nfs,lockd
dm_mod                 59749  0
video                  19909  0
button                 10577  0
battery                13381  0
ac                      8773  0
md5                     8001  1
ipv6                  265601  18
uhci_hcd               35409  0
ehci_hcd               38093  0
hw_random               9557  0
i2c_i801               12621  0
i2c_core               25409  1 i2c_i801
snd_intel8x0           35969  1
snd_ac97_codec         78393  1 snd_intel8x0
snd_seq_dummy           7621  0
snd_seq_oss            35777  0
snd_seq_midi_event     11585  1 snd_seq_oss
snd_seq                54097  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
snd_seq_device         12621  3 snd_seq_dummy,snd_seq_oss,snd_seq
snd_pcm_oss            54257  0
snd_mixer_oss          21953  1 snd_pcm_oss
snd_pcm                91973  3 snd_intel8x0,snd_ac97_codec,snd_pcm_oss
snd_timer              28357  2 snd_seq,snd_pcm
snd                    58149  11 snd_intel8x0,snd_ac97_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
soundcore              13345  1 snd
snd_page_alloc         13765  2 snd_intel8x0,snd_pcm
epic100                23877  0
e100                   41153  0
mii                     9409  2 epic100,e100
floppy                 62421  1
ext3                  133193  4
jbd                    61913  1 ext3
ata_piix               13381  0
libata                 49349  1 ata_piix
sd_mod                 22977  0
scsi_mod              136329  2 libata,sd_mod

On Jul 25, 2005, at 11:27 AM, Pradeep Padala wrote:

> Hi Jeff,
>
>> [jeff@linf1 ~]$ cr_restart context.4037
>> cri_syscall(CR_OP_RSTRT_REQ, &req): Device or resource busy
>> cri_syscall(CR_OP_RSTRT_REQ, &req): Device or resource busy
>> cri_syscall(CR_OP_RSTRT_REQ, &req): Device or resource busy
>> cri_syscall(CR_OP_RSTRT_REQ, &req): Device or resource busy
>
> As far as I know, these errors mean that the process id for the
> restarted process is already taken up. This usually means that the
> original task is still running (explained in Troubleshooting FAQ at
> http://mantis.lbl.gov/blcr/doc/html/BLCR_Users_Guide.html)
>
> --
> Pradeep Padala
> http://ppadala.blogspot.com

--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/