Date: Wed, 15 Feb 2012 20:02:10 +0200
From: Konstantin Belousov
To: Dmitry Mikulin
Message-ID: <20120215180210.GC3283@deviant.kiev.zoral.com.ua>
Cc: freebsd-current Current, Marcel Moolenaar
Subject: Re: [ptrace] please review follow fork/exec changes

On Wed, Feb 15, 2012 at 09:54:44AM -0800, Dmitry Mikulin wrote:
> 
> 
> On 02/15/2012 09:40 AM, Konstantin Belousov wrote:
> >On Wed, Feb 15, 2012 at 09:22:10AM -0800, Dmitry Mikulin wrote:
> >>
> >>On 02/15/2012 08:32 AM, Konstantin Belousov wrote:
> >>>On Mon, Feb 13, 2012 at 02:50:45PM -0800, Dmitry Mikulin wrote:
> >>>>>>>It seems that now wait4(2) can be called from the real (non-debugger)
> >>>>>>>parent first and result in the call to proc_reap(), isn't it? We would
> >>>>>>>then just reparent the child back to the caller, still leaving the
> >>>>>>>zombie and confusing the debugger.
> >>>>>>When either gdb or the real parent gets to proc_reap() the process
> >>>>>>wouldn't get destroyed, it'll get caught by the following clause:
> >>>>>>    if (p->p_oppid && (t = pfind(p->p_oppid)) != NULL) {
> >>>>>>
> >>>>>>and the real parent will get the child back into the children's list
> >>>>>>while gdb will get it into the orphan list. The second time around when
> >>>>>>proc_reap() is entered, p->p_oppid will be 0 and the process will get
> >>>>>>really reaped. Does it make sense?
> >>>>>>And proc_reparent() attempts to keep the orphan list clean so that
> >>>>>>it does not share entries with the list of siblings.
> >>>>>Right, this is what I figured. But I asked about a further implication
> >>>>>of this change:
> >>>>>
> >>>>>if the real parent spuriously calls wait4(2) on the child pid after the
> >>>>>child exited, but before the debugger called wait4(), then exactly the
> >>>>>code you noted above will be run. This results in the child being fully
> >>>>>returned to the original parent.
> >>>>>
> >>>>>Next, the wait4() call from the debugger gets an error, and the zombie
> >>>>>will be kept around until the parent calls wait4() for this pid once
> >>>>>more.
> >>>>>
> >>>>>Am I missing something?
> >>>>In this case the process will move from gdb's child list to gdb's orphan
> >>>>list when the real parent does a wait4(). Next time around the wait loop
> >>>>in gdb it'll be caught by the orphan's proc_reap().
> >>>I do not see how the next debugger loop could find this process at all,
> >>>since the first wait4() call reparented it to the original parent.
> >>Not the debugger loop, the kern_wait() loop. The child gets re-parented to
> >>the original parent but moves to the orphan list of the debugger process.
> >Either the debugger loop which calls wait4/waitpid, or the kern_wait loop
> >resulting from the debugger calling wait*.
> >
> >Could you please describe how the patched kernel moves the wait'ed
> >zombie to the orphan list of the debugger?
> >For me, it seems that there is another bug: the child appears both on
> >the children list and on the orphan list of the real parent.
> 
> 
> The first attempt to reap the child will get into the
>     if (p->p_oppid && (t = pfind(p->p_oppid)) != NULL) {
> clause, which will re-parent it to the real parent. The child will not be
> destroyed at this point.
> 
> The following loop in proc_reparent() will make sure that the child does
> not stay in both lists:
>     LIST_FOREACH(p, &parent->p_orphans, p_orphan) {
>         if (p == child) {
>             LIST_REMOVE(child, p_orphan);
>             break;
>         }
>     }
> 
> Since the child parent is gdb and it's still being traced, the following
> will move it to gdb's orphan list:
> 
>     if (child->p_flag & P_TRACED)
>         LIST_INSERT_HEAD(&child->p_pptr->p_orphans, child, p_orphan);
No, the child's parent at this point is no longer gdb, it is the original
parent. And since P_TRACED is set, the process is also inserted into the
orphan list of the original parent.

This all happens during the first execution of wait4/waitpid from the real
parent, in proc_reparent().
> 
> After this the real parent will get the exit status.
> 
> The next pass through the kern_wait() loop called from gdb will catch the
> child in its orphan list and will reap it this time for real since
> p->p_oppid will be set to 0 in the previous attempt to reap it. Gdb gets
> the exit code, the child is destroyed.
> 
No, the child no longer has any association with the debugger process, since
the block in the
    if (p->p_oppid && (t = pfind(p->p_oppid)) != NULL) {
statement destroyed it.
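
The disagreement above can be made concrete with a toy model. What follows
is a minimal user-space sketch in C, not kernel code. struct proc here is a
stand-in that carries only the fields named in this thread; p_orphan_of and
destroyed are hypothetical fields that replace the real list linkage; the
P_TRACED value is arbitrary; pfind(), proc_reparent() and proc_reap() are
simplified stand-ins; toy_scenario() and debugger_can_find() are helpers
invented for the illustration. The sketch assumes, following the reading in
the reply above, that p_pptr already points at the real parent when the
P_TRACED check runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define P_TRACED 0x1            /* toy flag value, not the kernel's */

/* Toy stand-in for struct proc; only fields named in the thread. */
struct proc {
    int          p_pid;
    int          p_flag;
    int          p_oppid;       /* real parent's pid while traced, else 0 */
    struct proc *p_pptr;        /* current parent (owner of children list) */
    struct proc *p_orphan_of;   /* hypothetical: whose p_orphans we are on */
    bool         destroyed;     /* hypothetical: set when really reaped */
};

static struct proc *proc_table[8];
static size_t nprocs;

static void proc_register(struct proc *p)
{
    assert(nprocs < 8);
    proc_table[nprocs++] = p;
}

/* Simplified pfind(): linear search of the toy process table. */
static struct proc *pfind(int pid)
{
    for (size_t i = 0; i < nprocs; i++)
        if (proc_table[i]->p_pid == pid)
            return proc_table[i];
    return NULL;
}

static void proc_reparent(struct proc *child, struct proc *parent)
{
    child->p_pptr = parent;             /* move to parent's children list */
    if (child->p_orphan_of == parent)   /* the LIST_FOREACH/LIST_REMOVE loop */
        child->p_orphan_of = NULL;
    if (child->p_flag & P_TRACED)       /* p_pptr is already the new parent */
        child->p_orphan_of = child->p_pptr;
}

/* Returns true only when the process is really destroyed. */
static bool proc_reap(struct proc *p)
{
    struct proc *t;

    if (p->p_oppid && (t = pfind(p->p_oppid)) != NULL) {
        p->p_oppid = 0;
        proc_reparent(p, t);            /* hand the zombie back, keep it */
        return false;
    }
    p->destroyed = true;
    return true;
}

/* Can a wait loop run on behalf of 'dbg' still reach 'p'? */
static bool debugger_can_find(const struct proc *dbg, const struct proc *p)
{
    return p->p_pptr == dbg || p->p_orphan_of == dbg;
}

/* Traced zombie: gdb is the current parent, 'real' is the original one. */
static void toy_scenario(struct proc *real, struct proc *gdb,
    struct proc *child)
{
    nprocs = 0;
    *real = (struct proc){ .p_pid = 10 };
    *gdb = (struct proc){ .p_pid = 20 };
    *child = (struct proc){ .p_pid = 30, .p_flag = P_TRACED,
        .p_oppid = 10, .p_pptr = gdb };
    proc_register(real);
    proc_register(gdb);
    proc_register(child);
}
```

Under these assumptions, calling toy_scenario() and then proc_reap() once on
the child, as the real parent's wait4() would, leaves p_pptr and p_orphan_of
both pointing at the real parent, and debugger_can_find() returns false for
gdb. The model says nothing about locking or about how the actual patch
orders these steps.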