From owner-freebsd-stable Wed Apr 10 11:54:51 2002
Delivered-To: freebsd-stable@freebsd.org
Date: Wed, 10 Apr 2002 14:54:21 -0400 (EDT)
From: Bruce Campbell
To: Matthew Dillon
Cc: Danny Schales, Rolandas Naujikas, Doug White, Wilko Bulte,
    Paul Horechuk, stable@FreeBSD.ORG, jah4007@cs.rit.edu
Subject: Re: nfs_fsync: not dirty error in 4.5-RELEASE (possible solution)
In-Reply-To: <200203292234.g2TMYpq67679@apollo.backplane.com>
Sender: owner-freebsd-stable@FreeBSD.ORG

After experiencing 9 such panics in a 10 day period, on 3 different
machines, I applied the possible solution below to just one of the 3
systems. That was 4 days ago. Since then, none of the 3 systems has
panicked ;-(

During the last panics, I was able to determine that the only user
connected was one who was over quota. He then moved to another of the
3 servers and was the only one connected there, and then that one
panicked also. Despite a number of over-quota experiments, I was unable
to reproduce the condition.

On Fri, 29 Mar 2002, Matthew Dillon wrote:

> Ok, I am putting this back on the main list.
>
> After looking at a kernel core that Danny graciously provided, I believe
> I have located the problem.
>
> The core shows NFS panicking on a struct buf showing up on the vnode's
> v_dirtyblkhd list that is not marked B_DELWRI.
>
> After examining the core I found that the buffer was marked B_INVAL,
> and I found a case in brelse() where B_DELWRI is cleared on a buffer
> marked B_DELWRI|B_INVAL without moving it out of the vnode's v_dirtyblkhd
> list. Specifically, line 1214 of kern/vfs_bio.c:
>
>     /*
>      * If B_INVAL, clear B_DELWRI. We've already placed the buffer
>      * on the correct queue.
>      */
>     if ((bp->b_flags & (B_INVAL|B_DELWRI)) == (B_INVAL|B_DELWRI)) {
>         bp->b_flags &= ~B_DELWRI;
>         --numdirtybuffers;
>         numdirtywakeup(lodirtybuffers);
>     }
>
> I believe that the correct fix is to change this code to:
>
>     /*
>      * If B_INVAL, clear B_DELWRI. We've already placed the buffer
>      * on the correct queue.
>      */
>     if ((bp->b_flags & (B_INVAL|B_DELWRI)) == (B_INVAL|B_DELWRI))
>         bundirty(bp);
>
> I would appreciate it if everyone who is able to easily reproduce this
> panic would test this fix and post your results back to the list. If
> this solves the problem I will commit it to -current and -stable.
>
>                                         -Matt
>                                         Matthew Dillon
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-stable" in the body of the message
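
For anyone following the thread who wants to see the inconsistency in
isolation, here is a rough user-space sketch of the state described in
the quoted analysis. It is not kernel code: every toy_/TOY_ name is made
up for illustration, the flag values are arbitrary, and the
on_dirty_list field only stands in for membership of v_dirtyblkhd. It
also assumes that bundirty() takes the buffer off the dirty list when it
clears B_DELWRI, which is what the proposed fix appears to rely on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values; the real ones are defined in sys/buf.h. */
#define TOY_B_DELWRI 0x1u
#define TOY_B_INVAL  0x2u

struct toy_buf {
    unsigned flags;
    bool on_dirty_list;   /* stands in for being on v_dirtyblkhd */
};

/* Old behaviour: drop the flag but leave the buffer on the dirty list. */
static void toy_brelse_old(struct toy_buf *bp)
{
    if ((bp->flags & (TOY_B_INVAL | TOY_B_DELWRI)) ==
        (TOY_B_INVAL | TOY_B_DELWRI))
        bp->flags &= ~TOY_B_DELWRI;
}

/* Assumed stand-in for bundirty(): flag and list membership change together. */
static void toy_bundirty(struct toy_buf *bp)
{
    if (bp->flags & TOY_B_DELWRI) {
        bp->flags &= ~TOY_B_DELWRI;
        bp->on_dirty_list = false;
    }
}

/* Proposed behaviour: let the bundirty() stand-in do both steps. */
static void toy_brelse_fixed(struct toy_buf *bp)
{
    if ((bp->flags & (TOY_B_INVAL | TOY_B_DELWRI)) ==
        (TOY_B_INVAL | TOY_B_DELWRI))
        toy_bundirty(bp);
}

/* The condition the panic checks: anything on the dirty list is B_DELWRI. */
static bool toy_dirty_list_consistent(const struct toy_buf *bp)
{
    return !bp->on_dirty_list || (bp->flags & TOY_B_DELWRI) != 0;
}

/* Release one dirty, invalidated buffer and report whether the list is sane. */
static bool toy_consistent_after(void (*release)(struct toy_buf *))
{
    struct toy_buf b = { TOY_B_DELWRI | TOY_B_INVAL, true };
    release(&b);
    return toy_dirty_list_consistent(&b);
}
```

In this model, toy_consistent_after(toy_brelse_old) returns false (the
buffer is still on the dirty list without B_DELWRI, the state the panic
complains about), while toy_consistent_after(toy_brelse_fixed) returns
true.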