From: Eric Roman (eroman_at_lbl.gov)
Date: Tue Sep 17 2002 - 18:24:10 PDT
I fixed the restart (and checkpoint) code so that they return the "correct" values to user space. (The return codes were swapped before, so RESTORE was being passed to continuing processes, and CONTINUE was being passed to restoring processes.)

I've tested this out with the pthread hack, and it looks like things are working sort of OK. The right number of threads comes back, and they do (to first order) the right things.

I also tested thread creation after a restart. This is where life gets funny. Since we're not yet restoring the process tree, it looks like the behavior of my test code depends on the way the threads were created. I think that since my threads exit, the signal is going to the wrong thread... The end result is that sometimes my test code executes correctly and sometimes not. It either works perfectly, repeatedly, or fails repeatedly. Weird. This needs some more time put into it.

All in all, this is mostly good news.

- E

--
Eric Roman
Future Technologies Group
510-486-6420
Lawrence Berkeley National Laboratory