Date:      Sun, 20 Oct 2002 17:44:10 -0700 (PDT)
From:      Don Lewis <dl-freebsd@catspoiler.org>
To:        zipzippy@sonic.net
Cc:        current@FreeBSD.ORG
Subject:   Re: kernel: lock order reversal
Message-ID:  <200210210044.g9L0iAvU075096@gw.catspoiler.org>
In-Reply-To: <20021020102654.GA47626@blarf.homeip.net>

On 20 Oct, Alex Zepeda wrote:
> I see this on a fairly regular basis (at least once per 24 hour period):
> 
> lock order reversal
>  1st 0xc0490ca0 spechash (spechash) @ ../../../kern/vfs_subr.c:2748
>  2nd 0xc1ed2818 vnode interlock (vnode interlock) @ ../../../kern/vfs_subr.c:2751
> 
> Dunno if it's any help, but it happened at 03:08:50, so right towards the
> middle of the daily scripts (takes about 20mins here), fairly heavy disk
> access.

The quick and dirty fix would be to drop the vnode interlock before
calling vcount(), and to always grab the interlock inside vcount(). Many
of vcount()'s callers don't bother grabbing the interlock anyway.
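
A minimal user-space sketch of that ordering, assuming pthread mutexes as
stand-ins for the kernel locks.  Every name here (model_vnode,
model_spechash, model_vcount, model_caller) is hypothetical and only
models the idea; it is not the code in vfs_subr.c.  The caller drops its
interlock first, and the counting routine always takes the hash lock and
then each alias's interlock, so the two locks are only ever taken in one
order:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a device vnode; not the kernel's struct vnode. */
struct model_vnode {
    pthread_mutex_t interlock;        /* models the vnode interlock */
    int usecount;                     /* reference count of this alias only */
    struct model_vnode *next_alias;   /* next alias of the same device */
};

/* Hypothetical stand-in for the global spechash lock. */
static pthread_mutex_t model_spechash = PTHREAD_MUTEX_INITIALIZER;

static void model_vnode_init(struct model_vnode *vp, int usecount,
                             struct model_vnode *next_alias)
{
    pthread_mutex_init(&vp->interlock, NULL);
    vp->usecount = usecount;
    vp->next_alias = next_alias;
}

/*
 * Sum the reference counts of all aliases.  The caller must not hold any
 * interlock on entry: the order here is always hash lock, then interlock.
 */
static int model_vcount(struct model_vnode *first_alias)
{
    int count = 0;

    assert(first_alias != NULL);
    pthread_mutex_lock(&model_spechash);
    for (struct model_vnode *vq = first_alias; vq != NULL; vq = vq->next_alias) {
        pthread_mutex_lock(&vq->interlock);
        count += vq->usecount;
        pthread_mutex_unlock(&vq->interlock);
    }
    pthread_mutex_unlock(&model_spechash);
    return count;
}

/* A caller that was holding the interlock drops it before counting. */
static int model_caller(struct model_vnode *vp, struct model_vnode *first_alias)
{
    pthread_mutex_lock(&vp->interlock);
    /* ... work that needs only this vnode's interlock ... */
    pthread_mutex_unlock(&vp->interlock);   /* drop before model_vcount() */
    return model_vcount(first_alias);
}
```

Note that the count returned by model_caller() can already be stale by
the time it is used, since nothing is held after the interlock is
dropped.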

The bigger problem is that this makes the race condition problem worse.
It looks like the correct fix would be to somehow protect the special
device reference count with a lock rather than attempting to rely on the
vnode interlock, which only protects the reference count for one of
potentially many aliases.  The code would have to do something like:

	Grab the lock.
	
	Examine the device reference count.
	
	Execute any code that depends on the reference count and adjust
	the reference count if necessary.

	Release the lock.
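
The four steps above could be sketched roughly as follows in user-space
C.  This assumes a per-device pthread mutex and a stored count;
model_cdev, ref_lock, and model_cdev_release are hypothetical names for
illustration, not existing kernel interfaces.  The point is only that
the decision ("was this the last reference?") and the adjustment happen
under the same lock:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; not the kernel's struct cdev. */
struct model_cdev {
    pthread_mutex_t ref_lock;   /* assumed lock covering the device-wide count */
    int refcount;               /* references summed over all aliases */
    bool closed;                /* set once the last-close work has run */
};

static void model_cdev_init(struct model_cdev *dev, int refs)
{
    pthread_mutex_init(&dev->ref_lock, NULL);
    dev->refcount = refs;
    dev->closed = false;
}

/*
 * Drop one reference.  If it was the last one, do the last-close work
 * while still holding the lock.  Returns true if last-close work ran.
 */
static bool model_cdev_release(struct model_cdev *dev)
{
    bool last;

    pthread_mutex_lock(&dev->ref_lock);     /* Grab the lock. */
    assert(dev->refcount > 0);
    last = (dev->refcount == 1);            /* Examine the reference count. */
    if (last)
        dev->closed = true;                 /* Code that depends on the count. */
    dev->refcount--;                        /* Adjust the count. */
    pthread_mutex_unlock(&dev->ref_lock);   /* Release the lock. */
    return last;
}
```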

Where this gets ugly is that this affects any code that adjusts vnode
reference counts if the vnode happens to be a device vnode.  The device
reference count probably has to be locked at a high level rather than in
vref() etc., since the above code will be using the same low level code
as everything else.  This means that lots of places in the code will
have to know about this lock.

The lock could either be global, possibly even spechash_mtx, or it could
be attached to struct cdev.  The reference count could either be a
member of struct cdev, or we could continue to calculate it with
vcount().
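
To make the two combinations concrete, here is a rough user-space sketch
with pthread mutexes.  All names (sketch_cdev, sketch_alias,
sketch_global_lock, and the two count functions) are hypothetical, and
the real struct cdev looks nothing like this.  Option A keeps a single
global lock and computes the count by walking the aliases; option B puts
both the lock and a stored count in the device:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-alias record; models one vnode aliasing the device. */
struct sketch_alias {
    int usecount;
    struct sketch_alias *next;
};

/* Hypothetical device record; not the kernel's struct cdev. */
struct sketch_cdev {
    pthread_mutex_t dev_lock;        /* option B: lock attached to the device */
    int refcount;                    /* option B: count stored in the device */
    struct sketch_alias *aliases;    /* option A: list walked to compute it */
};

/* Option A: one global lock shared by every device. */
static pthread_mutex_t sketch_global_lock = PTHREAD_MUTEX_INITIALIZER;

static void sketch_cdev_init(struct sketch_cdev *dev,
                             struct sketch_alias *aliases, int refcount)
{
    pthread_mutex_init(&dev->dev_lock, NULL);
    dev->refcount = refcount;
    dev->aliases = aliases;
}

/*
 * Option A: global lock, count calculated on demand.  Anything that
 * changes an alias's usecount would also have to hold the global lock.
 */
static int sketch_count_computed(struct sketch_cdev *dev)
{
    int count = 0;

    assert(dev != NULL);
    pthread_mutex_lock(&sketch_global_lock);
    for (struct sketch_alias *a = dev->aliases; a != NULL; a = a->next)
        count += a->usecount;
    pthread_mutex_unlock(&sketch_global_lock);
    return count;
}

/* Option B: per-device lock, count kept as a member. */
static int sketch_count_stored(struct sketch_cdev *dev)
{
    int count;

    assert(dev != NULL);
    pthread_mutex_lock(&dev->dev_lock);
    count = dev->refcount;
    pthread_mutex_unlock(&dev->dev_lock);
    return count;
}
```

Both functions return a snapshot only.  Code that acts on the count would
need to keep the lock held across the decision, as in the steps listed
earlier.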

