From owner-freebsd-current  Sun Jun 27  3:18:57 1999
Delivered-To: freebsd-current@freebsd.org
Received: from overcee.netplex.com.au (overcee.netplex.com.au [202.12.86.7])
	by hub.freebsd.org (Postfix) with ESMTP id E417314C1D
	for <current@freebsd.org>; Sun, 27 Jun 1999 03:18:44 -0700 (PDT)
	(envelope-from peter@netplex.com.au)
Received: from netplex.com.au (localhost [127.0.0.1])
	by overcee.netplex.com.au (Postfix) with ESMTP
	id 3A77B81; Sun, 27 Jun 1999 18:18:42 +0800 (WST)
	(envelope-from peter@netplex.com.au)
X-Mailer: exmh version 2.0.2 2/24/98
To: Doug Rabson <dfr@nlsystems.com>
Cc: Matthew Dillon <dillon@apollo.backplane.com>,
	current@freebsd.org, mckusick@mckusick.com
Subject: Re: BUF_LOCK() related panic.. 
In-reply-to: Your message of "Sun, 27 Jun 1999 10:51:29 +0100."
             <Pine.BSF.4.05.9906271044580.80685-100000@herring.nlsystems.com> 
Date: Sun, 27 Jun 1999 18:18:42 +0800
From: Peter Wemm <peter@netplex.com.au>
Message-Id: <19990627101842.3A77B81@overcee.netplex.com.au>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Doug Rabson wrote:
> On Sun, 27 Jun 1999, Peter Wemm wrote:
> 
> > Matthew Dillon wrote:
> > >     Ah, yes, some of us were just discussing this in a small mailing list.
> > >     Hopefully Kirk will pick up on it soon.  Ah well.. someone else gets to be
> > >     the brunt of it for a change :-).  Kirk doesn't have an SMP box so he
> > >     didn't see the bug.
> > > 
> > >     I have tentatively tracked the problem down to the apparent inability of
> > >     lockmgr() locks to function from interrupts, even when used in a
> > >     non-blocking manner, due to the simplelocks it uses internally.  The
> > >     new buffer cache code Kirk committed switched from B_BUSY (manually
> > >     implemented locks) to lockmgr() locks.  I think what is going on is
> > >     that mainline code is getting a simplelock and then an interrupt is
> > >     coming along and also trying to get the same lock, but I can't be sure
> > >     because my DDB backtraces are somewhat munged.
> > 
> > In this case, it was just a programming error..  The key to remember is that
> > the simplelocks are used to protect the state of the complex lock, they are
> > not the lock themselves.  lockmgr() holds the interlock while gaining or
> > removing references etc and then frees the simplelock so that it can sleep
> > if required etc.  The actual implementation of the simplelock routines
> > is interrupt safe (and has to be).
> 
> The simple_lock* macros don't seem to use the interrupt safe versions
> (ss_lock etc). What happens if an interrupt is received after gaining
> buftimelock and the interrupt routine also tries to call BUF_LOCK?

Good question, but I'm not sure ss_lock is what's needed either since that
does a cli for the duration of the simplelock being held..

I think the BUF_*() inlines need internal splbio() protection since
a biodone() can be called from the tail end of an interrupt, and that *does*
try and get a simplelock during a BUF_UNLOCK()... (and BUF_REFCNT()).
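
To make the race concrete, here is a rough userland model of the idea, not
the kernel code.  Every model_* name is made up for illustration, the "spl
level" is just an integer, and the interrupt is a plain function call
injected at the worst possible moment (while the interlock is held).  It
only assumes what is described above: a simplelock guards the lock state,
and biodone() may run from an interrupt and do a BUF_UNLOCK().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer; not the real struct buf. */
struct model_buf {
    bool interlock;  /* models the simplelock guarding the lock state */
    int  refcnt;     /* models the lockmgr() lock's reference count */
};

static int cur_spl;     /* 0: bio interrupts allowed, 1: masked */
static int pending;     /* completions deferred while masked */
static int collisions;  /* times the handler found the interlock held */
static struct model_buf *irq_target;  /* buffer the deferred I/O is for */

/* Try to take the interlock; a real simplelock would spin instead. */
static bool model_interlock_try(struct model_buf *bp)
{
    if (bp->interlock)
        return false;
    bp->interlock = true;
    return true;
}

/* Models biodone() doing a BUF_UNLOCK() from interrupt context. */
static void model_biodone(struct model_buf *bp)
{
    if (!model_interlock_try(bp)) {
        collisions++;  /* on one CPU the real code would spin forever */
        return;
    }
    bp->refcnt--;
    bp->interlock = false;
}

/* Models an I/O completion interrupt: deferred while bio is masked. */
static void model_interrupt(struct model_buf *bp)
{
    if (cur_spl > 0) {
        pending++;
        irq_target = bp;
        return;
    }
    model_biodone(bp);
}

/* Models splbio(): mask bio interrupts, return the previous level. */
static int model_splbio(void)
{
    int old = cur_spl;
    cur_spl = 1;
    return old;
}

/* Models splx(): restore the level, then run anything deferred. */
static void model_splx(int s)
{
    cur_spl = s;
    while (cur_spl == 0 && pending > 0) {
        pending--;
        model_biodone(irq_target);
    }
}

/* Models BUF_REFCNT(), optionally wrapped in splbio()/splx(). */
static int model_buf_refcnt(struct model_buf *bp, bool protect)
{
    int s = 0, n;
    bool ok;

    if (protect)
        s = model_splbio();
    ok = model_interlock_try(bp);
    assert(ok);
    model_interrupt(bp);  /* I/O completes while the interlock is held */
    n = bp->refcnt;
    bp->interlock = false;
    if (protect)
        model_splx(s);
    return n;
}
```

Without the splbio() wrapper the handler runs into the held interlock; with
it, the completion is held off until splx() and then goes through cleanly.
On a real SMP box this says nothing about the other CPU, of course.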

Cheers,
-Peter
--
Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au

