From: José M. Martín (jmartin_at_onsager.ugr.es)
Date: Thu Feb 21 2008 - 00:51:53 PST
I have done some additional tests. It only fails on a volume mounted with GlusterFS, a distributed FS. On a local drive, it works. So it must be an issue with this FS.

There are no entries in /var/log/messages or dmesg about the error.

Thanks,
José

On Wednesday 20 February 2008 18:07:30 Paul H. Hargrove wrote:
> José,
>
> Sorry the error reporting isn't very clear. That is one of the weaker
> parts of BLCR right now.
> Since the testsuite passes, the most likely reason for the message you
> see is an actual I/O failure when trying to write out the checkpoint
> context file for your application. The BLCR code will map (nearly) all
> failed write() calls to EIO, even if the actual cause was an
> out-of-space or over-quota error.
> You might find some useful information in /var/log/messages, or via
> dmesg, about what BLCR was doing at the time of the error. If you can
> send us those messages, we may be able to narrow down what the problem is.
>
> -Paul
>
> P.S.
> I will ensure the next release of BLCR produces a less confusing error
> message, such as "cr_checkpoint: checkpoint failed: Input/output
> error". There really should be no reference to the internal ioctl() call.
>
> José M. Martín wrote:
> > Hello,
> >
> > first, thanks for this project.
> >
> > I tried to set up blcr, but I have a problem. When I launch a program and
> > I do the checkpoint, I get the following error:
> > ioctl(/proc/checkpoint/ctrl, CR_OP_CHKPT_REAP): Input/output error
> >
> > I have tried with kernels 2.6.20 (vanilla) and 2.6.18.8-0.8 (opensuse
> > 10.2 default) on a node. On both, I get the same error.
> > Nevertheless, on another node with opensuse 10.2 and kernel 2.6.23.1, it
> > runs without problem.
> >
> > I have passed the testsuite:
> > ======================
> > All 34 tests passed
> > (1 tests were not run)
> > ======================
> >
> > No hugetlbfs mount point found (test skipped)
> > SKIP: hugetlbfs.ct
> >
> > I can load the blcr modules without problem, execute binaries, link
> > libraries,...
> >
> > I'm using version 0.6.4
> > Nodes are x86 (Pentium 4)
> >
> > Any help will be appreciated.
> >
> > Thanks in advance
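
Since BLCR reports (nearly) every failed write() as EIO, one way to look for the underlying cause is to try a plain write in the checkpoint directory from user space and print the errno the filesystem actually returns. The following is only a minimal Python sketch and is not part of BLCR. The function name `probe_write` and the 1 MiB default size are made up for illustration. BLCR writes the context file from kernel code, so a successful user-space write does not prove that a checkpoint on the same mount will succeed.

```python
import errno
import os
import tempfile


def probe_write(directory, size=1 << 20):
    """Write and fsync `size` bytes to a temporary file in `directory`.

    Returns None on success, or the symbolic errno name (for example
    'ENOSPC' or 'EDQUOT') if any step fails. Hypothetical helper, not a
    BLCR API.
    """
    path = None
    try:
        fd, path = tempfile.mkstemp(dir=directory, prefix="write_probe_")
        try:
            chunk = b"\0" * 65536
            remaining = size
            while remaining > 0:
                # os.write may write fewer bytes than requested.
                remaining -= os.write(fd, chunk[:remaining])
            # Force the data out so delayed write errors surface here.
            os.fsync(fd)
        finally:
            os.close(fd)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        # Remove the probe file if it was created.
        if path is not None:
            try:
                os.unlink(path)
            except OSError:
                pass
    return None
```

Calling `probe_write()` once on the GlusterFS mount point and once on a local directory, then comparing the two results, would show whether ordinary writes already fail on the distributed volume or whether the problem is specific to how BLCR writes its context file.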