From owner-freebsd-current@FreeBSD.ORG Thu May 14 21:39:17 2009 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 443CD1065677 for ; Thu, 14 May 2009 21:39:17 +0000 (UTC) (envelope-from delphij@delphij.net) Received: from tarsier.delphij.net (delphij-pt.tunnel.tserv2.fmt.ipv6.he.net [IPv6:2001:470:1f03:2c9::2]) by mx1.freebsd.org (Postfix) with ESMTP id CB6018FC17 for ; Thu, 14 May 2009 21:39:16 +0000 (UTC) (envelope-from delphij@delphij.net) Received: from tarsier.geekcn.org (tarsier.geekcn.org [211.166.10.233]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by tarsier.delphij.net (Postfix) with ESMTPS id C4C065C024 for ; Fri, 15 May 2009 05:39:15 +0800 (CST) Received: from localhost (tarsier.geekcn.org [211.166.10.233]) by tarsier.geekcn.org (Postfix) with ESMTP id 6F90B55D1755; Fri, 15 May 2009 05:39:15 +0800 (CST) X-Virus-Scanned: amavisd-new at geekcn.org Received: from tarsier.geekcn.org ([211.166.10.233]) by localhost (mail.geekcn.org [211.166.10.233]) (amavisd-new, port 10024) with ESMTP id 0caat1ItyvyH; Fri, 15 May 2009 05:38:16 +0800 (CST) Received: from charlie.delphij.net (adsl-76-237-33-62.dsl.pltn13.sbcglobal.net [76.237.33.62]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by tarsier.geekcn.org (Postfix) with ESMTPSA id 5EC0B55D1754; Fri, 15 May 2009 05:38:10 +0800 (CST) DomainKey-Signature: a=rsa-sha1; s=default; d=delphij.net; c=nofws; q=dns; h=message-id:date:from:reply-to:organization:user-agent: mime-version:to:cc:subject:references:in-reply-to: x-enigmail-version:openpgp:content-type:content-transfer-encoding; b=sYDLMgbGeSYEel0a+QQlqafZnsK2BpXTK3uLsnoS5EgXRsB8JunaDFHH4ybHQCg0r 3pKN1m4WsHFw9lcQnQKyA== Message-ID: <4A0C8F30.8080404@delphij.net> Date: Thu, 14 May 2009 14:37:52 -0700 From: Xin LI Organization: The FreeBSD Project User-Agent: Thunderbird 2.0.0.21 (X11/20090408) MIME-Version: 1.0 To: Alexander Sack References: <3c0b01820905141202w113966dp4bfbab73d84d585@mail.gmail.com> <4A0C7544.6010304@delphij.net> <3c0b01820905141301h1b08fc0ay1e6a1676b5a149d4@mail.gmail.com> In-Reply-To: <3c0b01820905141301h1b08fc0ay1e6a1676b5a149d4@mail.gmail.com> X-Enigmail-Version: 0.95.7 OpenPGP: id=18EDEBA0; url=http://www.delphij.net/delphij.asc Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-current@freebsd.org Subject: Re: Broadcom bge(4) panics while shutting down X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: d@delphij.net List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 May 2009 21:39:17 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, Alexander, Alexander Sack wrote: [...] >>> Preliminary testing shows no more panics start and stopping ports >>> under heavy load (panics were almost immediate otherwise). >>> >>> Thoughts? >> I think this would solve the problem but I'm not sure whether this would >> increase some overhead on the RX path. It seems that there is a race >> between bge_release_resources() and bge_intr(), I mean, it might be a >> good idea to "drain" bge_intr() instead? > > Are you talking about detach time? Because bge_stop() gets called > before bge_release_resources() and stops host interrupts so where is > the race again? I mean at this point no more interrupts should be > delivered to bge_intr() (I can confirm from spec since BGE has > released it in the wild). So why would you "drain" it at this > point....(the hardware is down including the firmware). > > I agree it adds a little overhead to the standard bge_rxeof() path > which I agree is very sensitive to change. However, I think the check > at top is tolerable since the other recourse is crash. I mean its > very easy to reproduce. Flood a Broadcom card with traffic then stop > the card and let the race begin...it will go down in bge_rxeof() after > bge_stop releases the lock. > > I actually did not look at changing anything structurally to perhaps > make this whole predicament better but minimally there should be a > shield against this no? I'm all for a workaround at this point, but I really want to make sure that we are doing things in a safer manner if we can reach it timely. What about something like this, in my opinion, because bge_rxeof() would unlock sc in the middle, it would be probably better to re-check the IFF_DRV_RUNNING flag and avoid calling txeof after that (I've moved the condition to after we re-grab the lock to match the rest of the code: ===== [delphij@charlie] /sys/dev/bge> svn diff -x -p Index: if_bge.c =================================================================== - --- if_bge.c (revision 191995) +++ if_bge.c (working copy) @@ -3073,7 +3073,7 @@ bge_rxeof(struct bge_softc *sc) bus_dmamap_sync(sc->bge_cdata.bge_rx_jumbo_ring_tag, sc->bge_cdata.bge_rx_jumbo_ring_map, BUS_DMASYNC_POSTREAD); - - while(sc->bge_rx_saved_considx != + while (sc->bge_rx_saved_considx != sc->bge_ldata.bge_status_block->bge_idx[0].bge_rx_prod_idx) { struct bge_rx_bd *cur_rx; uint32_t rxidx; @@ -3193,6 +3193,9 @@ bge_rxeof(struct bge_softc *sc) BGE_UNLOCK(sc); (*ifp->if_input)(ifp, m); BGE_LOCK(sc); + + if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) + return; } if (stdcnt > 0) @@ -3301,6 +3304,10 @@ bge_poll(struct ifnet *ifp, enum poll_cmd cmd, int sc->rxcycles = count; bge_rxeof(sc); + if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) { + BGE_UNLOCK(sc); + return; + } bge_txeof(sc); if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) bge_start_locked(ifp); @@ -3370,7 +3377,9 @@ bge_intr(void *xsc) if (ifp->if_drv_flags & IFF_DRV_RUNNING) { /* Check RX return ring producer/consumer. */ bge_rxeof(sc); + } + if (ifp->if_drv_flags & IFF_DRV_RUNNING) { /* Check TX ring producer/consumer. */ bge_txeof(sc); } Cheers, - -- Xin LI http://www.delphij.net/ FreeBSD - The Power to Serve! -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.11 (FreeBSD) iEYEARECAAYFAkoMjy8ACgkQi+vbBBjt66DgLwCeMtfBGPr9J5MIrSN2ZLADIzf2 8lsAnj6JSw/A55agiJWwMQVZtZiq1uQ+ =oXJB -----END PGP SIGNATURE-----