From owner-freebsd-hackers Sat Feb 10 14:22:51 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3)
	id OAA14269 for hackers-outgoing; Sat, 10 Feb 1996 14:22:51 -0800 (PST)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
	by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id OAA14262 for ;
	Sat, 10 Feb 1996 14:22:49 -0800 (PST)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9)
	id PAA16797; Sat, 10 Feb 1996 15:19:44 -0700
From: Terry Lambert
Message-Id: <199602102219.PAA16797@phaeton.artisoft.com>
Subject: Re: can't free vnode
To: nwestfal@Vorlon.odc.net (Neal Westfall)
Date: Sat, 10 Feb 1996 15:19:44 -0700 (MST)
Cc: hackers@FreeBSD.org
In-Reply-To: <199602102055.MAA07072@Vorlon.odc.net> from "Neal Westfall" at Feb 10, 96 12:55:32 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@FreeBSD.org
Precedence: bulk

> I have a news server here running FreeBSD-stable and INN 1.4-unoff and I
> am having a consistent problem of kernel panics. The reason for the
> panic is usually something to the effect of "can't free vnode; free vnode
> isn't".
>
> This has been a problem ever since we first installed FreeBSD 2.0.5 on
> the machine and we have tried upgrading everything, even to the point
> of downloading the src tree for -stable and all the ctm deltas and
> running a make world. All to no avail, we still have the same
> problems as before.

[ ... ]

> We have heard that FreeBSD is a more appropriate operating system
> for a news server than Linux because of performance and stability
> issues. However, if we can't figure out what is causing these
> problems, we will have to switch to Linux because this occurs
> almost every day.
>
> Any help is appreciated.

There are two occasions when this happens; the first is in a very low
RAM condition when you fill up swap.
The fix is to add more swap.

The second is when numvnodes == desiredvnodes and a vgone occurs when a
recursive lock is held and you get a page fault for the same vnode in
the middle of a read or a write. This is actually an endemic problem
in the use of the ihash code itself.

The fix is to change the vp lock operation to a counting semaphore, and
either move the directory cache code up above the lookup instead of
doing it per FS (and treat a cache reference as a reference count), or
to actually get rid of the ihash abstraction entirely (some non-trivial
VM cache index changes are necessary).

The easiest workaround for now is to make sure you have sufficient free
vnodes by jacking up desiredvnodes.

A less easy workaround would be to call VOP_ISLOCKED() on the vp in
vrele() in /sys/kern/vfs_subr.c, and if it is locked, go to sleep on
the address until it isn't. This would be something like:

	while( VOP_ISLOCKED(vp)) {
		(void) tsleep( (caddr_t)vp->v_data, PINOD, "vfslk", 0);
	}

This is gross, but works because vp->v_data is the same address as the
inode that VOP_UNLOCK(vp) calls the wakeup on in the FS's that use the
recursion semantics for potential dissociation of the vnode from the
underlying FS (see the vclean() comments in vfs_subr.c)... the address
of the per FS inode data.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
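
For illustration only, one reading of the counting-lock idea mentioned
above (a vp lock that counts recursive acquisitions, so the vnode is not
recycled while any level of the lock is still held) can be modelled in
userland C. Everything here is hypothetical: struct toy_vnode and the
toy_* functions are invented for this sketch, are not kernel interfaces,
and ignore lock ownership and sleeping entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for a vnode; NOT the kernel's struct vnode. */
struct toy_vnode {
    int lock_depth; /* how many times the lock is currently held */
    int usecount;   /* active references, loosely modelled on a use count */
};

/* Acquire the lock; a recursive acquire just increases the depth. */
static void toy_lock(struct toy_vnode *vp)
{
    vp->lock_depth++;
}

/* Release one level of the lock; returns true once it is fully released. */
static bool toy_unlock(struct toy_vnode *vp)
{
    assert(vp->lock_depth > 0);
    vp->lock_depth--;
    return vp->lock_depth == 0;
}

/* Toy analogue of an "is it locked?" query. */
static bool toy_islocked(const struct toy_vnode *vp)
{
    return vp->lock_depth > 0;
}

/* The vnode may be recycled only when unreferenced AND fully unlocked. */
static bool toy_can_recycle(const struct toy_vnode *vp)
{
    return vp->usecount == 0 && !toy_islocked(vp);
}

/* Drop a reference; report whether the vnode may be recycled right now. */
static bool toy_rele(struct toy_vnode *vp)
{
    assert(vp->usecount > 0);
    vp->usecount--;
    return toy_can_recycle(vp);
}
```

In this model, dropping the last reference while the lock is held at
depth two does not make the vnode recyclable; it only becomes recyclable
after both levels have been released. The real kernel would also have to
block and wake waiters, which this sketch leaves out.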