From: Robert Watson
Date: Sat, 27 Aug 2005 18:44:51 +0100 (BST)
To: "M. Warner Losh"
Cc: bzeeb-lists@lists.zabbadoz.net, freebsd-current@FreeBSD.org, dandee@volny.cz
Subject: Re: LOR route vr0
Message-ID: <20050827184153.A24510@fledge.watson.org>
In-Reply-To: <20050827.114013.35047360.imp@bsdimp.com>
References: <20050827.104631.10908351.imp@bsdimp.com> <20050827181827.O24510@fledge.watson.org> <20050827.114013.35047360.imp@bsdimp.com>

On Sat, 27 Aug 2005, M. Warner Losh wrote:

> : Generally speaking, network interface device driver locks follow network
> : stack locks in the lock order.  However, I've not really looked much at
> : the route table locking so can't speak to whether that is the case
> : specifically for routing locks.
> : If it is, the below traces reflect the
> : correct order, and you might want to add a hard-coded entry to witness in
> : order to catch the reverse order.
>
> Can you pose a quickie summary on how to do that?  I tried last night and
> was unsuccessful...

You need to add an entry to subr_witness.c creating a graph edge between
the softc lock and the routing lock.  An example of an entry in
subr_witness.c:

        /*
         * TCP/IP
         */
        { "tcp", &lock_class_mtx_sleep },
        { "tcpinp", &lock_class_mtx_sleep },
        { "so_snd", &lock_class_mtx_sleep },
        { NULL, NULL },

Note that sets of ordered entries are terminated with a double-null.  This
declares that locks of type "tcp" precede "tcpinp", which precede "so_snd".

> : Lock order reversals between the
> : network stack and device drivers tend to occur as a result of the device
> : driver calling into the network stack while holding the device driver
> : mutex.
>
> I'm as sure as I can be that no locks are held when I call INTO the
> network layer.  As far as I can tell, I only do that when I call
> ifp->if_input, and I drop the locks to do that.

If I had to guess, you do a media status update, which can cause routing
socket events indicating the link went up or down.

> : Someone (tm) should work out if the right order is route locks ->
> : device driver locks, as it's likely a common class of bugs across many
> : drivers.
>
> I just discovered the problem in my code.  I'm not sure where the
> other order happens, but in my code I do the following:
>
>         ED_LOCK(sc);
>         ed_setrcr(sc);
>         ed_ds_getmcst(sc);
>         IF_ADDR_LOCK(sc->ifp);
>         TAILQ_FOREACH(ifma, &sc->ifp->if_multiaddrs, ifma_link) {
> ...
>         IF_ADDR_UNLOCK(sc->ifp);
>         ED_UNLOCK(sc);
>
> since the lock for ED should be a leaf lock, this causes problems.  I'm
> guessing that the network layer calls into the driver with this lock
> held.  Without hard coding the locking into witness (see above), I'm
> unsure where this happens.  A quick grep of the code doesn't reveal
> anything obvious...
I think this case should be OK, and we should document that as being the
case using a hard-coded witness entry.

> When I comment out the above IF_ADDR locks, I have no more LORs, but I
> think maybe other problems :-).

Hmmm.  I was thinking that it was a separate issue.  Could you try adding a
graph edge to witness forcing the ifaddrmtx's to fall before the driver
mutexes, in order to identify a path by which ifaddrmtx precedes the driver
mutex?

Robert N M Watson
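
To make the table convention described above concrete, here is a minimal
userland sketch of how an order table with NULL-terminated groups can be
read.  It is a toy model, not the code in subr_witness.c: the struct names,
the order_lists array and the declared_before() helper are all made up for
illustration.  The first group is the TCP/IP example from this message.  The
second group is a hypothetical entry for the edge suggested above, and the
lock type names in it ("if_addr_mtx", "network driver") are assumptions that
would need checking against the mtx_init() calls in the tree.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the kernel's lock class; only its address is used here. */
struct lock_class {
	const char *lc_name;
};
static struct lock_class lock_class_mtx_sleep = { "sleep mutex" };

/* One row of the toy order table: a lock type name and its class. */
struct order_entry {
	const char *w_name;
	struct lock_class *w_class;
};

/*
 * Each group lists lock types in the order they may be acquired and ends
 * with { NULL, NULL }.  In this toy model a second { NULL, NULL } right
 * after a group terminator ends the whole table.
 */
static struct order_entry order_lists[] = {
	/* TCP/IP (the example from the message) */
	{ "tcp", &lock_class_mtx_sleep },
	{ "tcpinp", &lock_class_mtx_sleep },
	{ "so_snd", &lock_class_mtx_sleep },
	{ NULL, NULL },
	/* Hypothetical: interface address lock before driver lock */
	{ "if_addr_mtx", &lock_class_mtx_sleep },	/* assumed name */
	{ "network driver", &lock_class_mtx_sleep },	/* assumed name */
	{ NULL, NULL },
	{ NULL, NULL }
};

/*
 * Return 1 if `first` is listed before `second` within the same group of
 * the table, 0 otherwise.  Entries in different groups are unrelated.
 */
static int
declared_before(const struct order_entry *tbl, const char *first,
    const char *second)
{
	const struct order_entry *e = tbl;

	while (e->w_name != NULL) {		/* start of a group */
		int seen_first = 0;

		for (; e->w_name != NULL; e++) {
			if (seen_first && strcmp(e->w_name, second) == 0)
				return (1);
			if (strcmp(e->w_name, first) == 0)
				seen_first = 1;
		}
		e++;				/* skip the group terminator */
	}
	return (0);
}
```

With this table, declared_before(order_lists, "tcp", "so_snd") reports an
ordering, while the reverse query and any query across the two groups do
not.  The real witness code builds a graph from such a table and checks lock
acquisitions against it at run time; the sketch only shows how the groups
and their terminators encode the declared order.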