Re: Passing parameters to the CR Callback function.

From: Karthik Gopalakrishnan (gopalakk_at_cse.ohio-state.edu)
Date: Thu Mar 05 2009 - 17:24:04 PST

  • Next message: Andrea Autiero S143785: "Re: using blcr on program with fork"
    Hi Paul.
    
    Thanks for the detailed explanation. :-) I will surely look into the
    socket / FIFO implementation. I do plan to create such an interface to
    pass arguments. I will surely share it with you once it is complete.
    And in the case of MPI, we could pass the arguments to all the ranks
    and let the individual ranks figure out what each needs. ;-)
    
    Thanks & Regards,
    Karthik
    
    On Thu, Mar 5, 2009 at 6:51 PM, Paul H. Hargrove <PHHargrove_at_lbl_dot_gov> wrote:
    > Karthik,
    >
    >  The way we have envisioned passing arguments to a checkpoint callback is
    > through what is often called a "client data pointer" or "context pointer".
    >  The basic idea is that when you register the callback you pass the
    > registration function a (void*) that points to a callback-dependent data
    > structure.  Your application can use that data structure in any way you
    > please to pass arguments.  This is, of course, designed for passing of
    > arguments within the address space of the application, which is not what you
    > are asking about.
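
A minimal sketch of the client-data-pointer pattern described above, in plain C. It does not use libcr: register_callback, run_callbacks, struct app_state and app_callback are hypothetical names for illustration only, and the small table merely stands in for whatever a real registration function keeps internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same shape as a callback that takes a single void * argument. */
typedef int (*callback_t)(void *);

/* Hypothetical stand-in for a registration table; NOT the libcr API. */
#define MAX_CALLBACKS 8
struct registration {
    callback_t func;
    void *context;   /* the "client data pointer" saved at registration */
};
static struct registration table[MAX_CALLBACKS];
static size_t n_registered = 0;

/* Remember func together with its context pointer. Returns 0 on success. */
static int register_callback(callback_t func, void *context)
{
    if (func == NULL || n_registered == MAX_CALLBACKS)
        return -1;
    table[n_registered].func = func;
    table[n_registered].context = context;
    n_registered++;
    return 0;
}

/* Invoke every callback with the context it was registered with.
 * Returns the number of callbacks that reported failure. */
static int run_callbacks(void)
{
    int failures = 0;
    for (size_t i = 0; i < n_registered; i++) {
        assert(table[i].func != NULL);
        if (table[i].func(table[i].context) != 0)
            failures++;
    }
    return failures;
}

/* Callback-dependent data: each client defines its own structure. */
struct app_state {
    const char *label;
    int checkpoints_seen;
};

/* Example callback: recovers its own arguments from the void *. */
static int app_callback(void *arg)
{
    struct app_state *st = arg;
    st->checkpoints_seen++;
    printf("%s: checkpoint %d\n", st->label, st->checkpoints_seen);
    return 0;
}
```

Registering app_callback twice with two different struct app_state pointers gives each registration its own arguments. Note that everything here stays inside one address space, which is exactly the limitation pointed out above.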
    >
    >  What you are asking for is something we have not considered: the passing of
    > arguments from the checkpoint *requester* to the checkpoint *target*.  While
    > the mechanism you describe does sound useful for some cases, I am concerned
    > that there are complications that arise in the general case.  In particular,
    > I am thinking of any situation in which there are multiple "clients" of
    > libcr in the application's address space, such as the application code
    > itself plus one or more BLCR-enabled libraries (MPI comes to mind).  In such
    > a case the question arises of which arguments are for which callbacks.
    >
    >  I think that your use of a file is a reasonable solution, even if it
    > doesn't seem too elegant.  You might consider something slightly fancier
    > like opening a FIFO or socket to a "server" process that provides the
    > arguments.  To be honest, if I were to implement something like what you
    > suggest in cr_checkpoint, it would likely be implemented in that manner:
    > using a socket or FIFO connection between the cr_checkpoint program and the
    > libcr code linked into the target application.  You can, of course, do
    > exactly that on your own by creating a "my_checkpoint" wrapper around
    > cr_checkpoint to handle the argument parsing and the connection, and your
    > callback would contain (or call) the code implementing the other end of the
    > connection.  A potential key to making this work in the presence of multiple
    > checkpoints is the fact that the requester and target know each other's IDs
    > (target can call cr_get_checkpoint_info() to find the requester's pid).
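
A rough sketch of the FIFO hand-off described above, in plain POSIX C with no libcr calls. Everything named here is an assumption made for illustration: the /tmp/ckpt_args.<pid> naming scheme, the one-line space-separated wire format, and the helper names args_fifo_path, send_args and recv_args. It also assumes the FIFO has already been created with mkfifo(). In a real setup, send_args would run in the wrapper around cr_checkpoint and recv_args in the callback, with the requester's pid taken from cr_get_checkpoint_info() as mentioned above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical naming scheme: one FIFO per requester pid, so that
 * concurrent checkpoint requests do not share a channel. */
static int args_fifo_path(char *buf, size_t len, pid_t requester)
{
    int n = snprintf(buf, len, "/tmp/ckpt_args.%ld", (long)requester);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Requester side: send the arguments as one space-separated line.
 * open() blocks until the target opens the FIFO for reading. */
static int send_args(const char *path, int argc, char *const argv[])
{
    char line[512];
    size_t used = 0;

    assert(path != NULL);
    for (int i = 0; i < argc; i++) {
        size_t room = sizeof line - used;
        int n = snprintf(line + used, room, "%s%s", i ? " " : "", argv[i]);
        if (n < 0 || (size_t)n >= room)
            return -1;               /* arguments do not fit in one line */
        used += (size_t)n;
    }
    line[used++] = '\n';             /* overwrites the NUL; still in bounds */

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t w = write(fd, line, used);
    close(fd);
    return (w == (ssize_t)used) ? 0 : -1;
}

/* Target side (what a callback would call): read one line into buf
 * and strip the trailing newline. Returns 0 on success. */
static int recv_args(const char *path, char *buf, size_t len)
{
    if (len == 0)
        return -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t used = 0;
    ssize_t r;
    while (used < len - 1 &&
           (r = read(fd, buf + used, len - 1 - used)) > 0)
        used += (size_t)r;
    close(fd);

    buf[used] = '\0';
    char *nl = strchr(buf, '\n');
    if (nl == NULL)
        return -1;                   /* incomplete or oversized message */
    *nl = '\0';
    return 0;
}
```

A space-separated line cannot carry arguments that themselves contain spaces; a length-prefixed or NUL-separated format would be the obvious next step if that matters.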
    >
    >  While your suggestion is potentially useful, I know I don't have time to
    > implement something like this any time soon.  If you like, you could create
    > an entry in our Bugzilla database (http://mantis.lbl.gov/bugzilla) to
    > request this feature.  Also, if you do implement something that you'd be
    > willing to share, I might add it to the examples directory in the BLCR
    > distribution for others to use if it is useful to them.
    >
    > -Paul
    >
    > Karthik Gopalakrishnan wrote:
    >>
    >> Hi Paul.
    >>
    >> I see that the CR Callback function can accept one void * argument.
    >> However, I don't see a 'proper' way to pass data to my application's
    >> callback function when I do a 'cr_checkpoint'. It will be nice if I
    >> could do a 'cr_checkpoint [options] ID arg1 arg2 ... argN' with
    >> arg[1..N] being passed to the registered callbacks, maybe as 'char
    >> **args' & args[N] = NULL. The definition of the callback could be
    >> changed to "typedef int (*cr_callback_t)(char **, void *)". I admit I
    >> have not thought this through, but I feel something like this will be
    >> pretty useful.
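
To make the proposed 'char **args' convention concrete, here is a small sketch in plain C of how a line of text could be turned into a NULL-terminated argument vector and handed to a two-argument callback. This is not part of libcr: split_args, arg_callback_t and count_args_cb are hypothetical names, and the (char **, void *) signature is only the shape suggested above, not an existing interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split line in place on blanks into args[0..n-1] and set args[n] = NULL.
 * max_args is the number of slots in args, including the one needed for
 * the NULL terminator. Returns n, or -1 if the arguments do not fit. */
static int split_args(char *line, char **args, size_t max_args)
{
    static const char delims[] = " \t\n";
    size_t n = 0;

    assert(line != NULL && args != NULL);
    if (max_args == 0)
        return -1;

    char *p = line + strspn(line, delims);   /* skip leading blanks */
    while (*p != '\0') {
        if (n + 1 >= max_args)
            return -1;                       /* no room left for the NULL */
        args[n++] = p;
        p += strcspn(p, delims);             /* advance to end of token */
        if (*p != '\0') {
            *p++ = '\0';                     /* terminate the token */
            p += strspn(p, delims);          /* skip blanks before the next */
        }
    }
    args[n] = NULL;
    return (int)n;
}

/* Hypothetical callback shape from the proposal: (char **, void *). */
typedef int (*arg_callback_t)(char **, void *);

/* Example callback: counts its arguments and stores the count in ctx. */
static int count_args_cb(char **args, void *ctx)
{
    int *count = ctx;
    *count = 0;
    while (args[*count] != NULL)
        (*count)++;
    return 0;
}
```

The splitter avoids strtok() so that it keeps no hidden state between calls, which seems prudent for code that might run from a callback.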
    >>
    >> Currently, I have a wrapper that writes the parameters to some tmp
    >> file before calling cr_checkpoint and I get my callback function to
    >> read the arguments off that file. I'll be grateful if you could
    >> suggest a better way for me to achieve this, in the current scenario.
    >>
    >> Thanks & Regards,
    >> Karthik
    >>
    >
    >
    > --
    > Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    > Future Technologies Group                 Tel: +1-510-495-2352
    > HPC Research Department                   Fax: +1-510-486-6900
    > Lawrence Berkeley National Laboratory
    >
    >
    
