From owner-freebsd-hackers Fri Apr 4 07:43:26 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id HAA14524 for hackers-outgoing; Fri, 4 Apr 1997 07:43:26 -0800 (PST)
Received: from nlsystems.com (nlsys.demon.co.uk [158.152.125.33]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id HAA14516 for ; Fri, 4 Apr 1997 07:43:20 -0800 (PST)
Received: from herring.nlsystems.com (herring.nlsystems.com [10.0.0.2]) by nlsystems.com (8.8.5/8.8.5) with SMTP id QAA11191; Fri, 4 Apr 1997 16:43:08 +0100 (BST)
Date: Fri, 4 Apr 1997 16:43:08 +0100 (BST)
From: Doug Rabson
To: Tor Egge
cc: dg@root.com, ponds!rivers@dg-rtp.dg.com, freebsd-hackers@FreeBSD.ORG
Subject: Re: kern/3184: vnodes are used after they are freed. (dup alloc?)
In-Reply-To: <199704041503.RAA05693@pat.idt.unit.no>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Fri, 4 Apr 1997, Tor Egge wrote:

> > Uh, this is wrong since VOP_INACTIVE really wants a '0' usecount vnode,
> > and there are assumptions throughout the code that a '0' usecount also
> > implies that the vnode is on the free list. A quick code review of Tor's
> > suggested fix shows that it will fail in several places in the kernel and
> > basically needs to be re-thought...which is why it hasn't been committed
> > yet.
>
> I'm running with the modified suggested fix now, and have not seen any
> failures due to that suggested fix. The original suggested fix failed
> due to the assumptions that a `0' usecount meant that it was on the
> free list, and a NULL pointer was dereferenced when trying to move the
> vnode to the head of the free list. Adding a kludge (magic number
> 0xdeadb, used elsewhere in the code to mark that the vnode was not on
> the freelist) made the code work for my tests.
I tried testing your fix this morning and the 0xdeadb stuff just caused
vget to fault a couple of minutes into my test (simultaneous rm -rf
largetree and cvs co src, both remote).

This problem really has little to do with nfs_inactive. What is happening
is a race between vgone and vget which would normally be solved by the
vnode locks. Since NFS doesn't have vnode locks, the race happens.

I am most of the way there in implementing the right solution, which is
to use shared locks for NFS; vgone can then use the lock manager to wait
for all the shared locks to drain before recycling the vnode.

--
Doug Rabson                             Mail:  dfr@nlsystems.com
Nonlinear Systems Ltd.                  Phone: +44 181 951 1891