From: The Hermit Hacker
Date: Fri, 1 Aug 2003 16:37:16 -0300 (ADT)
To: Don Bowman
cc: "'freebsd-stable@freebsd.org'"
Subject: RE: kernel deadlock
Message-ID: <20030801163616.O38014@hub.org>
List-Id: Production branch of FreeBSD source code

On Fri, 1 Aug 2003, Don Bowman wrote:

> > > On Tue, 29 Jul 2003, Don Bowman wrote:
> > > > > From: Don Bowman [mailto:don@sandvine.com]
> > > > > > > > From: Robert Watson [mailto:rwatson@freebsd.org]
> > > > > On Tue, 29 Jul 2003, Dave Dolson wrote:
> > > > >
> > > > > > To follow up, I've discovered that the system has
> > > > > > exhausted its "FFS node" malloc type.
> > > > ...
> > > > >
> > > > > Some problems with this have turned up in -CURRENT on
> > > > > large-memory machines where some of the scaling factors
> > > > > have been off. In
> > > >
> > > > We currently have kern.maxvnodes=70354 set (automatically
> > > > scaled). This is a 1GB box.
> > > >
> > > > I will try re-running the test with less.
> > > > > > > > when it hits kern.maxvnodes, what will it do?
> > >
> > > After applying the fixes from RELENG_4 for kern/52425,
> > > I can still easily reproduce this hang without low memory.
> > > Further debugging shows that vnlru process is waiting on
> > > vlrup. This line is shown below. i.e. vnlru_nowhere is being
> > > incremented every 3 seconds.
>
> So what is happening here is that vnlru wakes up, runs through,
> and there is nothing to free, so it goes back to sleep having
> freed nothing. The caller doesn't wake up. There's no vnodes
> to free, and everything in the system locks up.
>
> One possible solution is to make vnlru more aggressive, so
> that before giving up, it tries to free pages that have
> many references etc (which it currently skips).
> Another option is to have it simply bump the kern.maxvnodes
> number and wake up the process which called it.
>
> Suggestions?
>

Check out 4.8-STABLE, in which Tor Egge (sp?) made modifications to the
vnlru process that sound exactly like what you are proposing ...
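
For anyone following along, the hang Don describes and his second
suggestion (bump the limit, wake the caller) can be sketched as a toy
model. This is not FreeBSD code and not what the 4.8-STABLE changes do;
VnodePool, vnlru_pass, allocate, max_passes and the +1 increment are all
made up for illustration, and max_passes only exists so the "stuck"
case terminates instead of sleeping forever.

```python
from dataclasses import dataclass


@dataclass
class VnodePool:
    """Toy stand-in for the vnode table (hypothetical, not kernel code)."""
    maxvnodes: int        # plays the role of kern.maxvnodes
    in_use: int           # vnodes currently allocated
    reclaimable: int      # vnodes the reclaimer is willing to free
    nowhere: int = 0      # plays the role of the vnlru_nowhere counter

    def vnlru_pass(self, bump_on_failure: bool = False) -> bool:
        """One wakeup of the reclaimer; True means the waiter is woken."""
        freed = self.reclaimable
        self.in_use -= freed
        self.reclaimable = 0
        if freed > 0:
            return True
        # Nothing could be freed: count the wasted pass.
        self.nowhere += 1
        if bump_on_failure:
            # Assumed fix: raise the limit and wake the caller anyway.
            self.maxvnodes += 1
            return True
        return False


def allocate(pool: VnodePool, max_passes: int = 5,
             bump_on_failure: bool = False) -> bool:
    """Try to take one vnode; False models the caller never waking up."""
    passes = 0
    while pool.in_use >= pool.maxvnodes:
        if passes == max_passes:
            return False
        pool.vnlru_pass(bump_on_failure)
        passes += 1
    pool.in_use += 1
    return True


if __name__ == "__main__":
    stuck = VnodePool(maxvnodes=4, in_use=4, reclaimable=0)
    print("without bump:", allocate(stuck), "nowhere =", stuck.nowhere)

    fixed = VnodePool(maxvnodes=4, in_use=4, reclaimable=0)
    print("with bump:", allocate(fixed, bump_on_failure=True),
          "maxvnodes =", fixed.maxvnodes)
```

In the first case the counter just keeps climbing while the allocation
never succeeds, which matches the symptom reported above. The second
case trades the hang for an ever-growing limit, so the real fix
presumably needs a cap or a more aggressive reclaim as well.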