From owner-freebsd-stable@FreeBSD.ORG Tue May 8 10:28:17 2007
X-Original-To: freebsd-stable@FreeBSD.ORG
Delivered-To: freebsd-stable@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 9B12C16A400;
	Tue, 8 May 2007 10:28:17 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 3C55513C45D;
	Tue, 8 May 2007 10:28:17 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [209.31.154.41])
	by cyrus.watson.org (Postfix) with ESMTP id 7BD9F47030;
	Tue, 8 May 2007 06:28:15 -0400 (EDT)
Date: Tue, 8 May 2007 11:28:15 +0100 (BST)
From: Robert Watson
X-X-Sender: robert@fledge.watson.org
To: "Marc G. Fournier"
Message-ID: <20070508112544.J24765@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: ups@FreeBSD.org, Oliver Fromme, freebsd-stable@FreeBSD.ORG, jhb@FreeBSD.org
Subject: Re: Socket leak (Was: Re: What triggers "No Buffer Space) Available"?
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Tue, 08 May 2007 10:28:17 -0000

On Tue, 8 May 2007, Marc G. Fournier wrote:

> So, over 7000 sockets with pretty much all processes shut down ...
>
> Shouldn't the garbage collector be cutting in somewhere here?
>
> I'm willing to shut everything down like this again the next time it happens
> (in 2-3 days) if someone has some other command / output they'd like for me
> to provide the output of?
>
> And, I have the following outputs as of the above, where everything is
> shut down and it's running on minimal processes:

I think there may be a bug in the MFC of the UNIX domain socket reference
count changes in RELENG_6:

   revision 1.155.2.8
   date: 2007/01/12 16:24:23;  author: jhb;  state: Exp;  lines: +36 -7
   MFC: Close a race between enumerating UNIX domain socket pcb structures
   via sysctl and socket teardown.  Note that we engage in a bit of trickery
   to preserve the ABI of 'struct unpcb' in 6.x.  We change the UMA zone to
   hold a 'struct unpcb_wrapper' which holds a 6.x 'struct unpcb' followed
   by the new reference count needed for handling the race.  We then cast
   'struct unpcb' pointers to 'struct unpcb_wrapper' pointers when we need
   to access the reference count.

   Submitted by:   ups (including the ABI trickery)

Could you try backing this out locally and see if the problem goes away?  I've
forwarded the information you sent to me previously to Stephan so he can take
a look.

Robert N M Watson
Computer Laboratory
University of Cambridge
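
For anyone following the thread who has not looked at the commit, the wrapper technique described in the log message above can be sketched roughly as below. This is a userland illustration only, not the code in RELENG_6: the 'struct unpcb' here is a one-field stand-in, the names unpw_unpcb, unpw_refcount, unp_alloc, unp_hold and unp_rele are hypothetical, and calloc/free take the place of the UMA zone. It shows only the layout and the cast; it does not attempt to reproduce the kernel's locking or the suspected leak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the 6.x 'struct unpcb'; the real structure has many more fields. */
struct unpcb {
	int unp_placeholder;		/* hypothetical field, for illustration only */
};

/*
 * The wrapper keeps the ABI-visible structure first, so its size and field
 * offsets are unchanged, and places the new reference count after it.
 */
struct unpcb_wrapper {
	struct unpcb	unpw_unpcb;	/* must remain the first member */
	unsigned int	unpw_refcount;	/* hypothetical name for the added count */
};

/* The cast below is only valid if the wrapped struct sits at offset 0. */
_Static_assert(offsetof(struct unpcb_wrapper, unpw_unpcb) == 0,
    "struct unpcb must be the first member of the wrapper");

/* Allocate a wrapper, but hand out only the inner 'struct unpcb' pointer. */
static struct unpcb *
unp_alloc(void)
{
	struct unpcb_wrapper *w = calloc(1, sizeof(*w));

	if (w == NULL)
		return (NULL);
	w->unpw_refcount = 1;		/* the creator holds the first reference */
	return (&w->unpw_unpcb);
}

/* Recover the reference count by casting back to the wrapper type. */
static unsigned int *
unp_refcount(struct unpcb *unp)
{
	return (&((struct unpcb_wrapper *)unp)->unpw_refcount);
}

/* Take an additional reference. */
static void
unp_hold(struct unpcb *unp)
{
	(*unp_refcount(unp))++;
}

/* Drop a reference; frees the wrapper and returns 1 when the last one goes. */
static int
unp_rele(struct unpcb *unp)
{
	unsigned int *refs = unp_refcount(unp);

	assert(*refs > 0);
	if (--(*refs) > 0)
		return (0);
	free((struct unpcb_wrapper *)unp);
	return (1);
}
```

In a scheme like this, every unp_hold() needs a matching unp_rele(); a missed release on some path would leave the structure allocated indefinitely, which is the general kind of symptom a backout test would help confirm or rule out.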