Re: BLCR full system lockup during cr_checkpoint

From: David Kesler (keslerdr_at_gmail_dot_com)
Date: Wed Feb 13 2008 - 13:38:57 PST

  • Next message: José M. Martín: "problems with cr_checkpoint: ioctl(/proc/checkpoint/ctrl, CR_OP_CHKPT_REAP): Input/output error"
    On Feb 13, 2008 12:48 PM, Paul H. Hargrove <PHHargrove_at_lbl_dot_gov> wrote:
    
    > David Kesler wrote:
    > > Hello,
    > >
    > > I've been trying to get BLCR to work on two different systems, one
    > > Ubuntu 7.1, one Fedora Core 8, both running on an x86 machine as Dom0
    > > of a Xen setup.  In both cases, I can compile, install, and
    > > successfully run BLCR IF I do not load Xen and instead boot into the
    > > generic kernel.  If I am in the Xen version of the kernel though I
    > > have two problems.  One, the installation process fails due to
    > > missing #defines, include files, or other variables in the Linux
    > > source directory's include folder.  I can, however, finagle BLCR into
    > > compiling by selectively modifying certain headers.  (Yes, I am well
    > > aware that this is unsafe and may be a leading cause of my problems
    > > and I'm also wondering if you know why this would be happening.)
    > >
    > > In both systems however, if I attempt to call cr_checkpoint on a
    > > running process while booted into the Xen kernel, I get a full system
    > > hang where it responds to absolutely nothing, requiring a hard
    > > reboot.  I know that messing around with the headers probably doesn't
    > > help the situation, but because both systems fail in the exact same
    > > way I was wondering whether, assuming that BLCR compiled correctly,
    > > there's some problem with running BLCR from within a kernel loaded by
    > > Xen.
    > >
    > > Thank you,
    > > David Kesler
    >
    > David,
    >
    >  We are able to run within Xen in our current development version of
    > BLCR, having adjusted our autoconf magic to locate the proper headers.
    > We have not tested the released version with a Xen paravirtualized
    > kernel, mainly because of the same header problems you have encountered.
    >  I suspect that the lockup occurs as a result of using a
    > non-paravirtualized instruction to access one of the CPU's special
    > registers (a possible result of getting "generic" headers).
    >  If you are willing to play guinea pig, I can create a snapshot of the
    > current development and send you a URL.  Let me know.
    >
    > -Paul
    >
    > --
    > Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    > Future Technologies Group
    > HPC Research Department                   Tel: +1-510-495-2352
    > Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
    >
    >
    >
    
    Sure, I'll give it a shot and let you know how it turns out.
    
    Thanks,
    David
    
