Date:      Sun, 27 Jun 1999 18:36:49 +0800
From:      Peter Wemm <peter@netplex.com.au>
To:        Doug Rabson <dfr@nlsystems.com>
Cc:        Matthew Dillon <dillon@apollo.backplane.com>, current@freebsd.org, mckusick@mckusick.com
Subject:   Re: BUF_LOCK() related panic.. 
Message-ID:  <19990627103649.CEB4B81@overcee.netplex.com.au>
In-Reply-To: Your message of "Sun, 27 Jun 1999 11:35:44 +0100." <Pine.BSF.4.05.9906271126390.80685-100000@herring.nlsystems.com>

Doug Rabson wrote:
> On Sun, 27 Jun 1999, Peter Wemm wrote:
> 
> > Doug Rabson wrote:
> > > On Sun, 27 Jun 1999, Peter Wemm wrote:
> > > 
> > > > Matthew Dillon wrote:
> > > > >     Ah, yes, some of us were just discussing this in a small mailing list.
> > > > >     Hopefully Kirk will pick up on it soon.  Ah well.. someone else gets to be
> > > > >     the brunt of it for a change :-).  Kirk doesn't have an SMP box so he
> > > > >     didn't see the bug.
> > > > > 
> > > > >     I have tentatively tracked the problem down to the apparent inability of
> > > > >     lockmgr() locks to function from interrupts, even when used in a
> > > > >     non-blocking manner, due to the simplelocks it uses internally.  The
> > > > >     new buffer cache code Kirk committed switched from B_BUSY (manually
> > > > >     implemented locks) to lockmgr() locks.  I think what is going on is
> > > > >     that mainline code is getting a simplelock and then an interrupt is
> > > > >     coming along and also trying to get the same lock, but I can't be sure
> > > > >     because my DDB backtraces are somewhat munged.
> > > > 
> > > > In this case, it was just a programming error..  The key to remember is that
> > > > the simplelocks are used to protect the state of the complex lock, they are
> > > > not the lock themselves.  lockmgr() holds the interlock while gaining or
> > > > removing references etc and then frees the simplelock so that it can sleep
> > > > if required etc.  The actual implementation of the simplelock routines
> > > > is interrupt safe (and has to be).
> > > 
> > > The simple_lock* macros don't seem to use the interrupt safe versions
> > > (ss_lock etc). What happens if an interrupt is received after gaining
> > > buftimelock and the interrupt routine also tries to call BUF_LOCK?
> > 
> > Good question, but I'm not sure ss_lock is what's needed either since that
> > does a cli for the duration of the simplelock being held..
> > 
> > I think the BUF_*() inlines need internal splbio() protection since
> > a biodone() can be called from the tail end of an interrupt, and that *does*
> > try and get a simplelock during a BUF_UNLOCK()... (and BUF_REFCNT()).
> 
> In the long term, we probably need an spl-aware simplelock or maybe the
> cunning no-cost interrupt thread scheme which BSDi are using.

Yes.  I will also note that this approach is what John Dyson wanted to do
for the G2 kernel.
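
To make the splbio() point quoted above concrete, here is a rough user-space
sketch in C.  It is not kernel code and it is not the real BUF_LOCK()
implementation: every name ending in _model, the scenario driver, and the
single-flag "simplelock" are hypothetical stand-ins.  It only models the
window in which an interrupt-time biodone() tries to take an interlock that
mainline code already holds, and how masking the interrupt for the duration
closes that window.

```c
#include <assert.h>

/* Hypothetical user-space model; none of this is the real kernel code. */
struct simplelock_model { int held; };

static struct simplelock_model interlock; /* protects buf_refcnt          */
static int buf_refcnt;                    /* state of the "complex" lock  */
static int cur_spl;                       /* 0 = open, 1 = "splbio"       */
static int pending_intr;                  /* interrupts deferred by spl   */
static int deadlocks;                     /* handler found interlock held */

/* Interrupt-time completion: needs the interlock to drop a reference. */
static void biodone_model(void)
{
    if (interlock.held) {
        /* A real CPU would spin here forever; the model just counts it. */
        deadlocks++;
        return;
    }
    interlock.held = 1;
    buf_refcnt--;
    interlock.held = 0;
}

/* Deliver an interrupt now, or defer it if the spl level masks it. */
static void raise_interrupt_model(void)
{
    if (cur_spl > 0)
        pending_intr++;
    else
        biodone_model();
}

static int splbio_model(void)
{
    int old = cur_spl;
    cur_spl = 1;
    return old;
}

/* Restore the old level and run anything that was deferred. */
static void splx_model(int s)
{
    cur_spl = s;
    while (cur_spl == 0 && pending_intr > 0) {
        pending_intr--;
        biodone_model();
    }
}

/* Mainline lock path; the interrupt fires while the interlock is held. */
static void buf_lock_model(int protect)
{
    int s = 0;

    if (protect)
        s = splbio_model();
    assert(!interlock.held);
    interlock.held = 1;
    raise_interrupt_model();    /* worst-case timing for the interrupt */
    buf_refcnt++;
    interlock.held = 0;
    if (protect)
        splx_model(s);
}

/* Reset the model, run one lock attempt, report how often it wedged. */
static int run_scenario_model(int protect)
{
    interlock.held = 0;
    buf_refcnt = 0;
    cur_spl = 0;
    pending_intr = 0;
    deadlocks = 0;
    buf_lock_model(protect);
    return deadlocks;
}
```

Calling run_scenario_model(0) models the unprotected case, where the handler
runs inside the window and finds the interlock held.  Calling
run_scenario_model(1) models the protected case, where the interrupt is
deferred until splx_model() and completes normally.  The cost is the same one
raised about ss_lock above: interrupts of that class stay masked for as long
as the interlock is held.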

Cheers,
-Peter
--
Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au


