Re: Sriram's first week

From: Rusty Lusk (lusk_at_mcs.anl.gov)
Date: Fri May 10 2002 - 15:18:22 PDT


I am of course interested in how this work can be made relevant to
multiple MPI implementations.  This doesn't mean that it should not
take advantage of features found only in LAM, but it should also
focus on defining what checkpointing needs from the MPI implementation in
order to function well.  We would then be interested in adding such
functionality to MPICH.  The point is not to have it then work on two
implementations instead of one, but to think from the beginning in a
more abstract way about checkpointing requirements.

Rusty

| So, I want to know where Sriram is with respect to LAM/MPI and
| checkpoint/restart.  Is there specific work in LAM that Sriram is
| already doing and should continue?  Should we dive right into discussing
| how we expect to trigger LAM in the event of a checkpoint?  Is Sriram in
| a position to teach us Berkeley folks about how LAM applications
| interact with the lamd and how the lamds interact with each other?
| 
| A second issue is whether we should plan to have a conference call
| sometime during this first week, or wait for the AG time the following
| week?