Date:      Sat, 27 Aug 2005 11:40:13 -0600 (MDT)
From:      "M. Warner Losh" <imp@bsdimp.com>
To:        rwatson@FreeBSD.org
Cc:        bzeeb-lists@lists.zabbadoz.net, freebsd-current@FreeBSD.org, dandee@volny.cz
Subject:   Re: LOR route vr0
Message-ID:  <20050827.114013.35047360.imp@bsdimp.com>
In-Reply-To: <20050827181827.O24510@fledge.watson.org>
References:  <Pine.BSF.4.53.0508270912550.969@e0-0.zab2.int.zabbadoz.net> <20050827.104631.10908351.imp@bsdimp.com> <20050827181827.O24510@fledge.watson.org>

In message: <20050827181827.O24510@fledge.watson.org>
            Robert Watson <rwatson@FreeBSD.org> writes:
: 
: On Sat, 27 Aug 2005, M. Warner Losh wrote:
: 
: > In message: <Pine.BSF.4.53.0508270912550.969@e0-0.zab2.int.zabbadoz.net>
: >            "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net> writes:
: > : > lock order reversal
: > : >  1st 0xc17621ec rtentry (rtentry) @ /usr/src/sys/net/route.c:1269
: > : >  2nd 0xc15ec938 vr0 (network driver) @ /usr/src/sys/pci/if_vr.c:1391
: > :
: > : added with ID 140: http://sources.zabbadoz.net/freebsd/lor.html#140
: >
: > I've noticed a *HUGE* number of LORs that look like this:
: >
: > lock order reversal
: > 1st 0xc17490e4 rtentry (rtentry) @ sys/netinet/if_ether.c:445
: > 2nd 0xc15c94b0 rl1 (network driver) @ sys/pci/if_rl.c:1451
: 
: Generally speaking, network interface device driver locks follow network 
: stack locks in the lock order.  However, I've not really looked much at 
: the route table locking so can't speak to whether that is the case 
: specifically for routing locks.  If it is, the below traces reflect the 
: correct order, and you might want to add a hard-coded entry to witness in 
: order to catch the reverse order.

Can you post a quick summary of how to do that?  I tried last night
and was unsuccessful...

: Lock order reversals between the 
: network stack and device drivers tend to occur as a result of the device 
: driver calling into the network stack while holding the device driver 
: mutex.

I'm as sure as I can be that no locks are held when I call INTO the
network layer.  As far as I can tell, I only do that when I call
ifp->if_input, and I drop the locks to do that.
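
The drop-the-lock-around-the-upcall pattern described above can be
sketched in user space roughly as follows.  This is an illustration
only, not the ed(4) code: fake_softc, fake_lock/fake_unlock, fake_input
and fake_rx_drain are all hypothetical stand-ins (for the softc, the
driver mutex, ifp->if_input and the receive path), and the "lock" is
just a flag so the sketch stays single-threaded and self-contained.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver softc; not the real ed(4) one. */
struct fake_softc {
	bool	locked;			/* stand-in for the driver mutex */
	int	pending;		/* packets waiting in the "ring" */
	int	delivered;		/* packets handed to the stack */
	int	delivered_locked;	/* upcalls made with the lock held */
};

static void
fake_lock(struct fake_softc *sc)
{
	assert(!sc->locked);		/* non-recursive, like a leaf mutex */
	sc->locked = true;
}

static void
fake_unlock(struct fake_softc *sc)
{
	assert(sc->locked);
	sc->locked = false;
}

/* Stand-in for ifp->if_input: records whether the lock was held. */
static void
fake_input(struct fake_softc *sc)
{
	sc->delivered++;
	if (sc->locked)
		sc->delivered_locked++;
}

/* Drain the ring, dropping the driver lock around every upcall. */
static void
fake_rx_drain(struct fake_softc *sc)
{
	fake_lock(sc);
	while (sc->pending > 0) {
		sc->pending--;
		fake_unlock(sc);	/* never call up with the lock held */
		fake_input(sc);
		fake_lock(sc);		/* retake it to touch driver state */
	}
	fake_unlock(sc);
}
```

Calling fake_rx_drain() on a softc with a few pending packets should
leave delivered_locked at zero, which is the property being claimed for
the if_input path.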

: Someone (tm) should work out if the right order is route locks -> 
: device driver locks, as it's likely a common class of bugs across many 
: drivers.

I just discovered the problem in my code.  I'm not sure where the
other order happens, but in my code I do the following:

	ED_LOCK(sc);
	ed_setrcr(sc);				/* which calls... */
	    ed_ds_getmcst(sc);			/* ...which does: */
		IF_ADDR_LOCK(sc->ifp);
		TAILQ_FOREACH(ifma, &sc->ifp->if_multiaddrs, ifma_link) {
		...
		}
		IF_ADDR_UNLOCK(sc->ifp);
	ED_UNLOCK(sc);

Since the lock for ED should be a leaf lock, this causes problems.
I'm guessing that the network layer calls into the driver with this
lock held.  Without hard coding the locking into witness (see above),
I'm unsure where this happens.  A quick grep of the code doesn't
reveal anything obvious...
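
To make the ordering problem concrete, here is a toy user-space order
checker, very loosely in the spirit of what witness does.  It is a
sketch under assumptions, not how witness is implemented: the rank
values, held_locks, track_acquire/track_release and the two path
functions are all hypothetical, and it assumes (as argued above) that
the interface address lock comes before the driver's leaf lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed order: a lower rank must be acquired before a higher one. */
enum lock_rank {
	RANK_IF_ADDR = 1,	/* stand-in for IF_ADDR_LOCK */
	RANK_DRIVER = 2		/* stand-in for the driver (leaf) lock */
};

#define	MAX_HELD	8

struct held_locks {
	enum lock_rank	stack[MAX_HELD];	/* locks currently held */
	size_t		depth;
	unsigned	reversals;		/* order violations seen */
};

/* Record an acquire; return false if it reverses the assumed order. */
static bool
track_acquire(struct held_locks *h, enum lock_rank r)
{
	bool ok = true;

	for (size_t i = 0; i < h->depth; i++)
		if (h->stack[i] > r)
			ok = false;	/* a later-ranked lock is held */
	if (!ok)
		h->reversals++;
	assert(h->depth < MAX_HELD);
	h->stack[h->depth++] = r;
	return (ok);
}

/* Record a release; locks must be released in LIFO order here. */
static void
track_release(struct held_locks *h, enum lock_rank r)
{
	assert(h->depth > 0 && h->stack[h->depth - 1] == r);
	h->depth--;
}

/* The path from the message: driver lock first, then IF_ADDR. */
static unsigned
driver_then_addr(void)
{
	struct held_locks h = { .depth = 0, .reversals = 0 };

	track_acquire(&h, RANK_DRIVER);
	track_acquire(&h, RANK_IF_ADDR);	/* flagged as a reversal */
	track_release(&h, RANK_IF_ADDR);
	track_release(&h, RANK_DRIVER);
	return (h.reversals);
}

/* The assumed-correct path: IF_ADDR first, driver lock last. */
static unsigned
addr_then_driver(void)
{
	struct held_locks h = { .depth = 0, .reversals = 0 };

	track_acquire(&h, RANK_IF_ADDR);
	track_acquire(&h, RANK_DRIVER);
	track_release(&h, RANK_DRIVER);
	track_release(&h, RANK_IF_ADDR);
	return (h.reversals);
}
```

Under these assumed ranks, driver_then_addr() reports one reversal and
addr_then_driver() reports none.  The real checker learns most orders
at run time; a fixed rank table like this one only mirrors the idea of
a hard-coded entry.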

When I comment out the above IF_ADDR locks, I have no more LORs, but
I think maybe other problems :-).

Warner


