From: Dmitry Mikulin
Date: Mon, 13 Feb 2012 14:50:45 -0800
To: Konstantin Belousov
Cc: freebsd-current Current, Marcel Moolenaar
Subject: Re: [ptrace] please review follow fork/exec changes

>> The problem I'm trying to solve is to allow a parent to collect its child's
>> exit status while we're following its child. Gdb detaches from the parent
>> upon successful switch-over from parent to child. At this point due to
>> re-parenting the parent loses the child to gdb and if it's in a wait()
>> it'll get a return status that it has no children to wait for.
> This text should be put somewhere in the comment. It took me some time
> to re-create the reason for the patch during the read.

I'll find a place in the code to add this comment.

>
> I will take a look at the example tomorrow, thanks.
>>> The new LIST_FOREACH(&q->p_orphans) body is copy/pasted, together
>>> with the comments, from the LIST_FOREACH(&q->p_children). Can the
>>> common code be moved into some function ?
>> Moved the common code into a function. Didn't have time to test though.
> Ok. Do not put the space between function name and '('.
> Both calls to proc_to_reap() have the space.

Habit of a different coding convention... fixed.

>
>>> Shouldn't there be some assertion in proc_reparent() for the case when
>>> we remove child from the orphans list, that the child is no longer
>>> debugged ?
>> Hmm... Not sure I understand...
> proc_reparent() can move the child both to and from the orphan list.
> If child is traced, you insert it into the orphan list.
> When removing the child from the orphan list, it means that
> debugger finished with the process. My suggestion is to assert this
> in proc_reparent (but I am not completely sure that this can be done
> easily).

Need to think about this one.

>
>>> Why in proc_reparent(), in the case of P_TRACED child, you do
>>> PROC_UNLOCK/PROC_LOCK ?
>> No idea how it ended up like that... I'll clean it up.
>>
>>> It seems that now wait4(2) can be called from the real (non-debugger)
>>> parent first and result in the call to proc_reap(), isn't it ? We would
>>> then just reparent the child back to the caller, still leaving the
>>> zombie and confusing debugger.
>> When either gdb or the real parent gets to proc_reap() the process wouldn't
>> get destroyed, it'll get caught by the following clause:
>> if (p->p_oppid && (t = pfind(p->p_oppid)) != NULL) {
>>
>> and the real parent will get the child back into the children's list while
>> gdb will get it into the orphan list. The second time around when
>> proc_reap() is entered, p->p_oppid will be 0 and the process will get
>> really reaped. Does it make sense? And proc_reparent() attempts to keep the
>> orphan list clean and not have the same entries as the list of siblings.
> Right, this is what I figured. But I asked about some further implication
> of this change:
>
> if real parent spuriously calls wait4(2) on the child pid after the child
> exited, but before the debugger called the wait4(), then exactly the
> code you noted above will be run. This results in the child being fully
> returned to the original parent.
>
> Next, the wait4() call from debugger gets an error, and zombie will be
> kept around until parent calls wait4() for this pid once more.
>
> Am I missing something ?

In this case the process will move from gdb's child list to gdb's orphan
list when the real parent does a wait4(). Next time around the wait loop
in gdb it'll be caught by the orphan's proc_reap().
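
To make the two-pass sequence concrete, here is a rough user-space sketch
in C. It is a toy model and not the kernel patch: struct toy_proc,
toy_attach() and toy_wait() are hypothetical names, and the fields only
mimic the parent pointer, p_oppid and the orphan list as they are described
in this thread. It assumes the traced child sits on the real parent's
orphan list while the debugger is its parent.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, not kernel code. */
struct toy_proc {
	int pid;
	int parent;	/* pid of the current parent */
	int oppid;	/* real parent's pid while traced, else 0 (like p_oppid) */
	int orphan_of;	/* pid whose orphan list holds this process, else 0 */
	bool zombie;	/* process has exited */
	bool reaped;	/* process has been destroyed */
};

/* Debugger attaches: it becomes the parent, the real parent keeps an orphan entry. */
static void
toy_attach(struct toy_proc *p, int debugger_pid)
{
	assert(p->oppid == 0);	/* the model assumes the process is not traced yet */
	p->oppid = p->parent;
	p->orphan_of = p->parent;
	p->parent = debugger_pid;
}

/* A caller may wait on p if p is its child or is on its orphan list. */
static bool
toy_visible(const struct toy_proc *p, int caller)
{
	return (!p->reaped && (p->parent == caller || p->orphan_of == caller));
}

/* Returns 1 if an exit status was collected, 0 if there was nothing to collect. */
static int
toy_wait(struct toy_proc *p, int caller)
{
	if (!toy_visible(p, caller) || !p->zombie)
		return (0);
	if (p->oppid != 0) {
		/* First pass: give the child back to the real parent. */
		int debugger = p->parent;

		p->parent = p->oppid;
		p->orphan_of = debugger;	/* debugger keeps it as an orphan */
		p->oppid = 0;
		return (1);
	}
	/* Second pass: oppid is 0, so the process is really reaped. */
	p->orphan_of = 0;
	p->reaped = true;
	return (1);
}
```

With a child of pid 10 attached by debugger pid 20 and then marked as a
zombie, toy_wait() from 10 collects the status without destroying the
process and leaves it on 20's orphan list; the following toy_wait() from 20
finds it there and reaps it.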