Date:      Mon, 13 Feb 2012 14:50:45 -0800
From:      Dmitry Mikulin <dmitrym@juniper.net>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-current Current <freebsd-current@freebsd.org>, Marcel Moolenaar <marcelm@juniper.net>
Subject:   Re: [ptrace] please review follow fork/exec changes
Message-ID:  <4F3993C5.5020703@juniper.net>
In-Reply-To: <20120213222521.GK3283@deviant.kiev.zoral.com.ua>
References:  <20120207121022.GC3283@deviant.kiev.zoral.com.ua> <4F318D74.9030506@juniper.net> <4F31C89C.7010705@juniper.net> <4F3318AD.6000607@juniper.net> <20120209122908.GD3283@deviant.kiev.zoral.com.ua> <4F34311A.9050702@juniper.net> <20120210001725.GJ3283@deviant.kiev.zoral.com.ua> <4F3478B3.9040809@juniper.net> <20120213152825.GH3283@deviant.kiev.zoral.com.ua> <4F3988E8.2040705@juniper.net> <20120213222521.GK3283@deviant.kiev.zoral.com.ua>


>> The problem I'm trying to solve is to allow a parent to collect its child's
>> exit status while we're following its child. Gdb detaches from the parent
>> upon successful switch-over from parent to child. At this point due to
>> re-parenting the parent loses the child to gdb and if it's in a wait()
>> it'll get a return status that it has no children to wait for.
> This text should be put somewhere in the comment. It took me some time
> to re-create the reason for the patch during the read.

I'll find a place in the code to add this comment.
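
For reference, the "no children to wait for" status described above is what
wait() reports as ECHILD. A minimal sketch of the symptom, using only the
standard wait() call; the helper name wait_errno_without_children() is made
up for illustration and is not part of the patch:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper, not from the patch: call wait() in a process that
 * currently has no children and return the errno it fails with
 * (0 if the call did not fail).
 */
static int
wait_errno_without_children(void)
{
	int status;

	errno = 0;
	if (wait(&status) == -1)
		return (errno);
	return (0);
}
```

A parent whose only child has been re-parented to the debugger is in the same
position as the caller here: it has nothing left to wait for.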

>
> I will take a look at the example tomorrow, thanks.
>>> The new LIST_FOREACH(&q->p_orphans) body is copy/pasted, together
>>> with the comments, from the LIST_FOREACH(&q->p_children). Can the
>>> common code be moved into some function ?
>> Moved the common code into a function. Didn't have time to test though.
> Ok. Do not put the space between function name and '('.
> Both calls to proc_to_reap() have the space.

Habit of a different coding convention... fixed

>
>>> Shouldn't there be some assertion in proc_reparent() for the case when
>>> we remove child from the orphans list, that the child is no longer
>>> debugged ?
>> Hmm... Not sure I understand...
> proc_reparent() can move the child both to and from the orphan list.
> If child is traced, you insert it into the orphan list.
> When removing the child from the orphan list, it means that
> debugger finished with the process. My suggestion is to assert this
> in proc_reparent (but I am not completely sure that this can be done
> easily).

Need to think about this one.

>
>>> Why in proc_reparent(), in the case of P_TRACED child, you do
>>> PROC_UNLOCK/PROC_LOCK ?
>> No idea how it ended up like that... I'll clean it up.
>>
>>> It seems that now wait4(2) can be called from the real (non-debugger)
>>> parent first and result in the call to proc_reap(), isn't it ? We would
>>> then just reparent the child back to the caller, still leaving the
>>> zombie and confusing debugger.
>> When either gdb or the real parent gets to proc_reap() the process wouldn't
>> get destroyed, it'll get caught by the following clause:
>>      if (p->p_oppid && (t = pfind(p->p_oppid)) != NULL) {
>>
>> and the real parent will get the child back into the children's list while
>> gdb will get it into the orphan list. The second time around when
>> proc_reap() is entered, p->p_oppid will be 0 and the process will get
>> really reaped. Does it make sense? And proc_reparent() attempts to keep the
>> orphan list clean and not have the same entries as the list of siblings.
> Right, this is what I figured. But I asked about some further implication
> of this change:
>
> if real parent spuriously calls wait4(2) on the child pid after the child
> exited, but before the debugger called the wait4(), then exactly the
> code you noted above will be run. This results in the child being fully
> returned to the original parent.
>
> Next, the wait4() call from debugger gets an error, and zombie will be
> kept around until parent calls wait4() for this pid once more.
>
> Did I miss something ?

In this case the process will move from gdb's child list to gdb's orphan
list when the real parent does a wait4(). Next time around the wait loop in
gdb it'll be caught by the orphan's proc_reap().
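
The two-pass behaviour discussed in this thread can be sketched as a toy
userland model. This is only an illustration under stated assumptions: the
struct, its fields and the functions toy_wait() and toy_demo() are all
hypothetical, and the model assumes that while the child is traced it sits on
the debugger's children list and on the real parent's orphan list. It is not
the kernel's struct proc or proc_reap().

```c
#include <assert.h>

/*
 * Toy userland model of the behaviour described in this thread. Every name
 * here is hypothetical; this is not the kernel's struct proc or proc_reap().
 */
struct toy_proc {
	int pid;
	int parent;	/* pid whose children list holds this process */
	int orphan_of;	/* pid whose orphan list holds it, 0 if none */
	int oppid;	/* real parent's pid while traced, 0 otherwise */
	int zombie;	/* exited, not yet destroyed */
	int reaped;	/* really destroyed */
};

/*
 * One wait pass by "waiter" over a single candidate. Returns the pid whose
 * status was reported, or -1 if the waiter has nothing to wait for.
 */
static int
toy_wait(struct toy_proc *p, int waiter)
{
	int old_parent;

	if (p->reaped || !p->zombie)
		return (-1);
	if (p->parent != waiter && p->orphan_of != waiter)
		return (-1);
	if (p->oppid != 0) {
		/*
		 * First pass: give the child back to its real parent and
		 * leave it on the debugger's orphan list.
		 */
		old_parent = p->parent;
		p->parent = p->oppid;
		p->orphan_of = old_parent;
		p->oppid = 0;
		return (p->pid);
	}
	/* Second pass: oppid is 0, so the process is really reaped. */
	p->reaped = 1;
	return (p->pid);
}

/* Real parent (pid 10) waits before the debugger (pid 20). */
static int
toy_demo(void)
{
	struct toy_proc child = { 30, 20, 10, 10, 1, 0 };

	assert(toy_wait(&child, 10) == 30);	/* parent collects status */
	assert(child.parent == 10 && child.orphan_of == 20 && !child.reaped);
	assert(toy_wait(&child, 20) == 30);	/* debugger finds its orphan */
	assert(child.reaped);
	assert(toy_wait(&child, 10) == -1);	/* nothing left to wait for */
	return (0);
}
```

Calling toy_demo() walks through the ordering raised in the review: the real
parent waits first, the child stays visible to the debugger through the orphan
list, and the debugger's next wait pass is the one that really reaps it.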



