Date: Tue, 11 Mar 2003 08:43:46 +1100
From: Tim Robbins
To: John Baldwin
Cc: Kris Kennaway, alfred@FreeBSD.org, current@FreeBSD.org, Poul-Henning Kamp
Subject: Re: NULL pointer problem in pid selection ?
Message-ID: <20030311084346.A63542@dilbert.robbins.dropbear.id.au>
References: <20030308213535.GE56020@rot13.obsecurity.org>

On Mon, Mar 10, 2003 at 01:00:15PM -0500, John Baldwin wrote:
> On 08-Mar-2003 Kris Kennaway wrote:
> > On Sat, Mar 08, 2003 at 11:46:34AM +0100, Poul-Henning Kamp wrote:
> >>
> >> Just got this crash on -current, and I believe I have seen similar
> >> before.  addr2line(1) reports the faulting address to be
> >>      ../../../kern/kern_fork.c:395
> >> which is in the inner loop of pid collision avoidance.
> >
> > I've been running this patch from Alfred for the past month or so on
> > bento, which has fixed a similar panic I was seeing regularly.
>
> Using just a shared lock instead of an xlock should be ok there.  You
> aren't modifying the process tree, just looking at it.  OTOH, the
> proc lock is supposed to protect p_grp and p_session, so they shouldn't
> be NULL. :(

I have a suspicion that the bug is actually in wait1():

	sx_xlock(&proctree_lock);
	[...]
		/*
		 * Remove other references to this process to ensure
		 * we have an exclusive reference.
		 */
		leavepgrp(p);

		sx_xlock(&allproc_lock);
		LIST_REMOVE(p, p_list);	/* off zombproc */
		sx_xunlock(&allproc_lock);

		LIST_REMOVE(p, p_sibling);
		sx_xunlock(&proctree_lock);

Shouldn't we be removing the process from zombproc before setting p_pgrp to
NULL via leavepgrp()? Does this even matter at all when both fork1() and
wait1() are still protected by Giant?

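The ordering question can be made concrete with a small userland model.
This is only a sketch, not kernel code: struct proc_model, struct
pgrp_model, struct proc_list and every function below are hypothetical
stand-ins, the list is a hand-rolled singly linked list rather than the
queue(3) macros, and no locking is modelled at all. It only shows which
intermediate state each ordering leaves visible on the zombie list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct pgrp_model {
	int pg_id;
};

struct proc_model {
	int p_pid;
	struct pgrp_model *p_pgrp;	/* cleared by the leavepgrp() stand-in */
	struct proc_model *p_next;	/* link on the model "zombproc" list */
};

struct proc_list {
	struct proc_model *head;
};

static void
list_insert(struct proc_list *l, struct proc_model *p)
{
	p->p_next = l->head;
	l->head = p;
}

static void
list_remove(struct proc_list *l, struct proc_model *p)
{
	struct proc_model **pp;

	for (pp = &l->head; *pp != NULL; pp = &(*pp)->p_next) {
		if (*pp == p) {
			*pp = p->p_next;
			p->p_next = NULL;
			return;
		}
	}
}

/* True if a scan of the list would meet a process whose p_pgrp is NULL. */
static bool
list_has_null_pgrp(const struct proc_list *l)
{
	const struct proc_model *p;

	for (p = l->head; p != NULL; p = p->p_next)
		if (p->p_pgrp == NULL)
			return (true);
	return (false);
}

/* Order quoted from wait1(): drop the pgrp first, then unlink. */
static bool
reap_pgrp_first(struct proc_list *zomb, struct proc_model *p)
{
	bool exposed;

	assert(zomb != NULL && p != NULL);
	p->p_pgrp = NULL;			/* stand-in for leavepgrp(p) */
	exposed = list_has_null_pgrp(zomb);	/* state a scanner could see */
	list_remove(zomb, p);
	return (exposed);
}

/* Proposed order: unlink first, then drop the pgrp. */
static bool
reap_unlink_first(struct proc_list *zomb, struct proc_model *p)
{
	bool exposed;

	assert(zomb != NULL && p != NULL);
	list_remove(zomb, p);
	exposed = list_has_null_pgrp(zomb);	/* state a scanner could see */
	p->p_pgrp = NULL;			/* stand-in for leavepgrp(p) */
	return (exposed);
}
```

With the first ordering the model reports a listed process with a NULL
p_pgrp between the two steps; with the second it does not. Whether a scan
in fork1() can actually run inside that window depends on the locking,
which this model leaves out.
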
Tim