From: John Baldwin
To: Jack Vogel
Cc: pyunyh@gmail.com, freebsd-current@freebsd.org, Mike Tancsa
Date: Mon, 12 Apr 2010 12:56:12 -0400
Subject: Re: LOR on em in HEAD ( was Re: em driver regression
Message-Id: <201004121256.12161.jhb@freebsd.org>
References: <201004081313.o38DD4JM041821@lava.sentex.ca> <201004121052.42350.jhb@freebsd.org>
List-Id: Discussions about the use of FreeBSD-current

On Monday 12 April 2010 12:26:06 pm Jack Vogel wrote:
> On Mon, Apr 12, 2010 at 7:52 AM, John Baldwin wrote:
>
> > On Friday 09 April 2010 3:09:24 pm Jack Vogel wrote:
> > > Someone else also pointed this out. I'm dubious about its claim.
> > > This happens because there is an RX lock taken in rxeof, it's held
> > > thru the call into the stack, it then encounters another lock there
> > > and hence this complaint. I've had the RX hold as it is for a long
> > > while and would rather not have to give it up, can someone look
> > > at it and advise?
> >
> > I've seen it happen with igb. I suspect it is a transitive lock order.
> > That is, you probably never have the UDP lock acquired before an em/igb
> > RX lock. However, if you have an em/igb adapter TX lock held when you
> > acquire an em/igb RX lock in one place, and in if_start() you acquire
> > the TX lock while the UDP lock is held, that can trigger the LOR.
> > Specifically, those two paths would give you these two orders:
> >
> > TX -> RX
> > UDP -> TX
> >
> > which implies the order
> >
> > UDP -> RX
> >
> > (lock order relationships are transitive, just like a > b and b > c
> > implies a > c).
> >
> > However, I haven't been able to track down what the raw orders are that
> > might lead to this transitive order. Attilio added some sysctls to dump
> > all the raw lock orders in one of the debug.witness sysctls. You can
> > also try hardcoding the 'RX -> UDP' order using WITNESS_DEFINEORDER()
> > before any of the em/igb RX/TX locks are acquired to see what different
> > LOR is triggered. If that LOR looks valid then you can keep hardcoding
> > valid orders until you find the invalid one.
> >
>
> Do you think releasing the RX lock before the stack entry would get rid of
> the problem?
>
> Other ideas?

Well, while that might quiet the LOR, I suspect it would be masking another
problem that is the "real" LOR.

-- 
John Baldwin
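
To make the transitive-order argument above concrete, here is a rough sketch of how the two raw orders (TX -> RX and UDP -> TX) imply UDP -> RX, and why taking the UDP lock while holding RX then looks like a reversal. This is only a toy model in Python, not how witness(4) is actually implemented; the function names implied_orders and is_reversal are made up for illustration, and the lock names are just the labels used in the message.

```python
from collections import defaultdict


def implied_orders(raw_orders):
    """Return every (before, after) pair implied by the raw lock orders.

    raw_orders is an iterable of (before, after) pairs, meaning "before"
    was held when "after" was acquired. The result is the transitive
    closure of those pairs.
    """
    graph = defaultdict(set)
    for before, after in raw_orders:
        graph[before].add(after)

    closure = set()
    for start in list(graph):
        # Depth-first walk over everything reachable from this lock.
        stack = list(graph[start])
        seen = set()
        while stack:
            lock = stack.pop()
            if lock in seen:
                continue
            seen.add(lock)
            closure.add((start, lock))
            stack.extend(graph.get(lock, ()))
    return closure


def is_reversal(raw_orders, held, wanted):
    """True if acquiring "wanted" while holding "held" contradicts an
    order (direct or transitive) already implied by raw_orders."""
    return (wanted, held) in implied_orders(raw_orders)


# The two raw orders hypothesized in the message.
raw = [("TX", "RX"), ("UDP", "TX")]
closure = implied_orders(raw)

# rxeof holds the RX lock and then the stack takes the UDP lock.
lor = is_reversal(raw, held="RX", wanted="UDP")
print(sorted(closure))
print("RX -> UDP reported as LOR:", lor)
```

Note that neither raw order mentions UDP and RX together, which is why the report can look bogus until the intermediate TX orders are found.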