MPI support for BLCR

From: Greg Bronevetsky (greg_at_bronevetsky_dot_com)
Date: Tue Feb 28 2006 - 12:09:11 PST

  • Next message: jcduell_at_lbl_dot_gov: "Re: MPI support for BLCR"
    I am a grad student at Cornell, working on checkpointing of MPI 
    applications. Our checkpointer works with any implementation of MPI and 
    (in principle) with any single-process checkpointer. In practice, 
    however, integration with single-process checkpointers is more complex 
    because, by default, such a checkpointer saves the state of the entire 
    process, including MPI state. This is generally incorrect, since MPI 
    state contains hardware information that will not be valid on restart.
    
    I know that you've integrated BLCR with LAM, presumably in a way that 
    doesn't save LAM's state but instead lets LAM save its own state. How 
    did you do this? Was it via a special API (the callbacks referred to in 
    your FAQ) or did you use a more general technique?
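    
    To make the question concrete, here is a minimal sketch of the behavior 
    I have in mind: the checkpoint image holds only the communication 
    layer's logical state, and the hardware-dependent part is rebuilt on 
    restart. Every name in it (CommLayer, checkpoint, restart, the endpoint 
    ids) is hypothetical and is not BLCR's or LAM's actual API.

```python
import itertools
import json

# Hypothetical stand-in for hardware: each "open" yields a fresh endpoint id,
# so an id saved in an old image would not match anything after a restart.
_endpoint_ids = itertools.count(101)


class CommLayer:
    """Toy model of an MPI library's per-process state (an assumption)."""

    def __init__(self, rank):
        self.rank = rank                     # logical state: safe to save
        self.endpoint = next(_endpoint_ids)  # hardware state: valid only now

    def save(self):
        # The library decides what goes into the image: logical state only.
        return {"rank": self.rank}

    @classmethod
    def restore(cls, saved):
        # Rebuild from logical state; a fresh endpoint is acquired here.
        return cls(saved["rank"])


def checkpoint(app_state, comm):
    """Save application state, but let the comm layer save its own state."""
    return json.dumps({"app": app_state, "comm": comm.save()})


def restart(image):
    """Reload application state and let the comm layer re-create itself."""
    data = json.loads(image)
    return data["app"], CommLayer.restore(data["comm"])
```

    A single-process checkpointer that dumped the whole process would 
    instead carry the old endpoint value into the image, which is the 
    situation described above.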
    
    -- 
                                 Greg Bronevetsky
    
