From: Rusty Lusk (lusk_at_mcs.anl.gov)
Date: Fri May 10 2002 - 15:18:22 PDT
I am of course interested in how this work can be made relevant to multiple MPI implementations. This doesn't mean that it should not take advantage of features found only in LAM, but it should also focus on defining what checkpointing needs from the MPI implementation in order to function well. We would then be interested in adding such functionality to MPICH. The point is not to have it then work on two implementations instead of one, but to think from the beginning in a more abstract way about checkpointing requirements.

Rusty

| So, I want to know where Sriram is with respect to LAM/MPI and
| checkpoint/restart. Is there specific work in LAM that Sriram is
| already doing and should continue? Should we dive right into discussing
| how we expect to trigger LAM in the event of a checkpoint? Is Sriram in
| a position to teach us Berkeley folks about how LAM applications
| interact with the lamd and how the lamds interact with each other?
|
| A second issue is whether we should plan to have a conference call
| sometime during this first week, or wait for the AG time the following
| week?