Re: Thoughts on juggling the stack.

From: Paul H. Hargrove (PHHargrove_at_lbl.gov)
Date: Wed Aug 07 2002 - 11:42:24 PDT


Eric,
A couple of comments (not in the order of your questions).

(question 2) Your guess about the maps is almost correct:
40105000-40109000 Is libc's file-backed data segment (note it is rw-p,
                  not text)
40109000-4010d000 Is libc's anonymous data (bss), such as errno and
                  malloc's data structs
4010d000-40115000 Is the stack you allocated
bfffc000-c0000000 Is the original stack
So it is the libc data that you are going to scribble on if you
overrun your stack.  In practice you should probably create a guard
page when setting up the stack for clone(), e.g. with
mprotect(stackaddr, PAGESIZE, PROT_NONE).
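
Here is a minimal sketch of what I mean (the function name and stack
size are made up, not taken from your code):

  #include <sys/mman.h>
  #include <unistd.h>

  #define STACK_SIZE (64 * 1024)

  /* Hypothetical stack setup for a clone()d thread: map the stack,
   * then revoke access to its lowest page so an overrun faults
   * immediately instead of scribbling on whatever sits below it. */
  void *alloc_thread_stack(void)
  {
      size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
      void *stackaddr = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (stackaddr == MAP_FAILED)
          return NULL;
      /* guard page at the low end (the stack grows down on x86) */
      if (mprotect(stackaddr, pagesize, PROT_NONE) != 0) {
          munmap(stackaddr, STACK_SIZE);
          return NULL;
      }
      /* pass stackaddr + STACK_SIZE to clone() as the stack top */
      return stackaddr;
  }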

(question 3) As far as alloca() failing to return NULL goes, I'd guess
that the alloca() implementation is lazy.  It either assumes that
because the page it tried to use was mapped it is OK, or else it is not
checking at all.  I couldn't guess how one would identify the end of
the stack in the alloca() implementation.  My evidence suggests that
gcc implements alloca() as a builtin function which just blindly
advances the stack pointer, possibly without any bounds checking.
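
For what it is worth, here is a tiny demonstration of that behavior
(hypothetical, not your test code): alloca() has no failure path at
all, so you never see NULL; the loop just runs until a memset()
touches a page it should not.

  #include <alloca.h>
  #include <stdio.h>
  #include <string.h>

  /* gcc expands alloca() to a builtin that simply bumps the stack
   * pointer; there is no bounds check, so the fault only shows up
   * when the new space is actually touched. */
  int main(void)
  {
      for (;;) {
          char *p = alloca(8 * 1024);
          printf("alloca returned %p\n", (void *)p);  /* never NULL */
          memset(p, 0, 8 * 1024);                     /* eventually SEGVs */
      }
  }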

(question 1) I have three possible guesses for why you see the SEGV in
the "wrong" process:  
1) The SEGV is actually going to 3523 because of how the kernel is
treating clone()d threads.  In particular, check whether you are
passing the CLONE_THREAD flag (or CLONE_SIGNAL, which includes it).  I
believe that if you use this flag then the SEGV will get sent to BOTH
threads.
2) The second possibility is that your alloca() in 3524 is scribbling
on the libc data, and that is causing whatever 3523 is doing to SEGV.
3) The last possibility is that the alloca() thread DOES get the SEGV
and die.  It is then possible that its parent (3523) is exiting and
propagating the original exit code to your shell (I doubt this is
happening unless you are explicitly doing it yourself).
Alternatively, if you are using CLONE_PARENT, then the shell is the
parent of both 3523 and 3524, but the shell is only _aware_ of 3523.
Thus when 3524 dies with the SEGV, a wait4() or equivalent returns in
your shell with a pid which is unknown to the shell.  If this is the
case you may (or may not) see the larger pid printed if you use a
different shell.  (A sketch of this CLONE_PARENT scenario follows
below.)
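
To illustrate that last (CLONE_PARENT) case, here is a rough sketch.
It is not your test code; the function names and stack sizes are
invented:

  #define _GNU_SOURCE
  #include <sched.h>
  #include <signal.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/resource.h>
  #include <sys/wait.h>
  #include <unistd.h>

  #define STK (64 * 1024)

  /* the "grandchild": dies right away with a SEGV */
  static int grandchild(void *arg)
  {
      (void)arg;
      raise(SIGSEGV);
      return 0;
  }

  /* the "child": clones a task with CLONE_PARENT, so the new task's
   * parent is main(), not us */
  static int child(void *arg)
  {
      char *stk = malloc(STK);
      (void)arg;
      clone(grandchild, stk + STK, CLONE_PARENT | SIGCHLD, NULL);
      pause();   /* stay alive so main() reaps the grandchild first */
      return 0;
  }

  int main(void)
  {
      char *stk = malloc(STK);
      pid_t known = clone(child, stk + STK, SIGCHLD, NULL);
      int status;
      /* this can return a pid main() never clone()d directly */
      pid_t reaped = wait4(-1, &status, 0, NULL);
      printf("clone()d %d, but reaped %d\n", (int)known, (int)reaped);
      kill(known, SIGKILL);   /* clean up the pause()ing child */
      return 0;
  }

How a shell reports reaping a pid it never spawned is up to the shell,
which is why a different shell might or might not show the larger pid.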

I suggest installing (before the checkpoint if possible) a signal
handler for SEGV containing printf("SEGV in %d\n", (int)getpid()).
That should help you distinguish the possibilities.  The signal
handler should print one line for each thread in the first case, just
a single line for the older process in case #2, and just a single line
for the younger process in case #3.  This could also serve as a test
that the signal handlers are being restored.
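
Something along these lines should do (a sketch; I print to stderr so
the message is not lost in a stdio buffer when the process dies):

  #include <signal.h>
  #include <stdio.h>
  #include <unistd.h>

  /* report which task actually took the SIGSEGV, then re-raise with
   * the default action so the process still dies as before */
  static void segv_handler(int sig)
  {
      fprintf(stderr, "SEGV in %d\n", (int)getpid());
      signal(sig, SIG_DFL);
      raise(sig);
  }

  /* install before the checkpoint (in each thread if necessary) */
  static void install_segv_handler(void)
  {
      struct sigaction sa;
      sigemptyset(&sa.sa_mask);
      sa.sa_flags = 0;
      sa.sa_handler = segv_handler;
      sigaction(SIGSEGV, &sa, NULL);
  }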

Eric Roman wrote:
> 
> Ok,
> 
> The guts of multithreaded (clone'd) process restart are now in place.
> (pthreads is coming later).  Right now, I can dump clone'd processes, and
> restore them in a clone'd state.
> 
> The test code I'm using for this basically calls clone a few times, dumps its
> state out and exits.  On restart, it starts allocating memory.  It looks like
> all the good process state is being set up correctly.  I still need to run a
> few more tests.  (Ideas are welcome.  What I have now doesn't test much.)
> 
> The problem I'm running into now is with growing the stack through alloca().
> Seems that one of the threads throws a segmentation violation when alloca()
> is called.  Ick.  And it gets worse.
> 
> So I jammed a few thousand printf's into the test code.  Here's what I'm getting
> while extending the stack.  First, here's some of our maps (from /proc/self/maps)
> 
> 40105000-40109000 rw-p 000ec000 00:08 1276283    /lib/libc-2.1.3.so
> 40109000-4010d000 rw-p 00000000 00:00 0
> 4010d000-40115000 rwxp 00000000 00:00 0
> bfffc000-c0000000 rwxp 00000000 00:00 0
> 
> So we see that there are 2 adjacent regions with different protections.
> 
> 3523 is the main thread.  3524 is the newly recreated thread that's dinking
> around with alloca().  The code basically loops, doing:
>   while (1) { foo=alloca(GROW_CHUNK); }
> 
> PID 3524's stack is reported to be at:
> Thread 1: PID 3524      [0x4010d000-0x40115000]
> 
> The value shown on the "About to alloca() ?= 0x4010ffcc" lines is
> foo - GROW_CHUNK.  That should tell us where the new end of the stack
> is.
> 
> Here's the output of the test code:
> 3523: Wating for children
> 3524: About to alloca() ?= 0x4010ffcc
> 3524: foo=0x4010ffcc &i=0x40114fd8
> 3524: memset(0x4010ffcc, 0) successful
> 3524: About to alloca() ?= 0x4010d7cc
> 3524: foo=0x4010d7cc &i=0x40114fd8
> 3524: memset(0x4010d7cc, 0) successful
> 3524: About to alloca() ?= 0x4010afcc
> zsh: 3523 segmentation fault  ./vmadthread -u
> 
> So looking at this, it seems that we get the segmentation fault right when
> the stack crosses the boundary of those two memory regions.
> 
> More or less the right thing is occurring.  We're making the stack slightly
> too big with alloca(), and then we're getting a segmentation violation.
> 
> BUT: We're getting a segmentation violation in a completely different process.
> Sigh.
> 
> So a few questions?
> 1/  Why is thread 3523 segv'ing when this occurs?
> It's _not_ the thread that did the alloca!
> 
> 2/ What allocated this memory in the first place?
> This memory is placed just after the mappings for the C libraries, but has
> no backing.  It's also in place at program launch, before any mmap()'s have
> taken place in main().  We might accidentally be overwriting C library text;
> I'm not certain.
> 
> Any explanations?  It seems clear that we're somehow touching something that
> would rather not be touched.  If you have any idea what the exact mechanism
> is, I'm all ears.
> 
> 3/ Why didn't alloca return NULL?
> Might be my fault.  alloca() might just be confused about how much stack space
> is available to it, since I created a new (smaller?) stack for the thread without
> telling it.
> 
> The root cause of all of this would seem to be my fault anyways.  It looks
> like I just haven't allocated enough stack space for recovering threads.
> So the "solution" for the test code is just to track how much stack space is
> allocated, and make sure that we don't go out of bounds.  (If pthreads were here,
> it would have taken care of all of this, so we shouldn't have to worry about
> this problem for any of the later on cases)
> 
> Thoughts anyone?
> 
> --
> Eric Roman                       Future Technologies Group
> 510-486-6420                     Lawrence Berkeley National Laboratory

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
NERSC Future Technologies Group           Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-495-2998