From: Jung-uk Kim
To: "Alexander Sack"
Cc: freebsd-net@FreeBSD.org, Dieter
Date: Wed, 16 Apr 2008 14:56:18 -0400
Subject: Re: bge dropping packets issue
Message-Id: <200804161456.20823.jkim@FreeBSD.org>
In-Reply-To: <3c0b01820804161120udb54ab3tea4bf7baade0061f@mail.gmail.com>

[CC trimmed]

On Wednesday 16 April 2008 02:20 pm, Alexander Sack wrote:
> Dieter: Thanks, at 20Mbps! That's pretty awful.
>
> JK: Thanks again. Wow, I searched the list and didn't see much
> discussion with respect to bge and packet loss! I will try the
> rest of that patch including pushing the TCP receive buffer up
> (though I don't think that's going to help in this case). The
> above is based on just looking at code....
>
> I guess some follow-up questions would be:
>
> 1) Why isn't BGE_SSLOTS tunable (to a point)? Why can't that be
> added to the driver? I noticed that CURRENT has added a lot more
> SYSCTL information. Moreover it seems the Linux driver can set it
> up to 1024.

IIRC, Linux tg3 uses one ring for both standard and jumbo.

> 2) bge_tick() uses the same global mutex for its callout as the
> rest of the driver. Moreover, it really doesn't have to hold it
> while updating statistics, they are reads of volatile registers
> anyway (blocking the ISR doesn't prevent the firmware from updating
> its stat struct). Would there be any interest in using a separate
> mutex for the callout itself and then just hold the lock for the
> other small calls (bge_asf_driver_up(), bge_watchdog())? I'm
> experimenting right now with just dropping the BGE mutex around the
> bge_stats_update() calls to give more time to bge_rxeof() to drain
> rx_bd's. I admit that bge_tick doesn't do a whole lot but it seems
> this driver is very sensitive to resource starvation and I'm trying
> to get the BGE driver to drain as many rx_bd's as possible to avoid
> drops due to the firmware having no place to put them!
>
> 3) How does interrupt non-DEVICE_POLLING perform?

If you just use default values, it won't perform very well. There are
some patches around the net to automatically adjust these numbers,
e.g.,

http://docs.freebsd.org/cgi/mid.cgi?20071117194615.L67319
http://mail-index.netbsd.org/tech-kern/2004/03/16/0000.html

Jung-uk Kim

> Thanks guys!
>
> -aps
>
> On Wed, Apr 16, 2008 at 5:32 AM, Dieter wrote:
> > > I'm investigating an issue we are seeing with 6.1-RELEASE and
> > > the bge driver dropping packets sporadically at 100MBps speed.
> > >
> > > It gets mainly aggravated when heavy disk I/O occurs
> > >
> > > Has anyone seen this problem before with bge? Am I barking up
> > > the wrong tree with my initial investigation? Does anyone know if
> > > it's even possible to achieve 100% packet capture with bge at
> > > its supported speeds (10/100/1000)?
> >
> > I had a similar problem with bge and 6.0-RELEASE. Bge works
> > great for me in 6.2-RELEASE. So far 7.0-RELEASE looks good as
> > well (bge-wise, I do have unresolved issues with 7).
> >
> > My app is TCP, cranking the TCP receive buffer way up helped, as
> > did turning off Nagle.
> >
> > My bge is 1000, but connected at 100 since that is what the
> > other end is. I saw problems at less than 20 Mbps.
> >
> > There is still a problem that other drivers can lock out bge for
> > too long. For example kern/118093: firewire bus reset.
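
For reference, the application-side workaround Dieter mentions (a larger TCP receive buffer plus Nagle turned off) could look roughly like the sketch below. This is only an illustration, not anything taken from the thread: the helper name make_tuned_socket, the parameter rcvbuf_bytes and the 256 KiB value are made up for the example, and the kernel is free to clamp or refuse the requested size. It does nothing about drops inside the driver itself.

```python
import socket


def make_tuned_socket(rcvbuf_bytes=256 * 1024):
    """Create a TCP socket with a larger receive buffer and Nagle disabled.

    Hypothetical helper for illustration; the default size is an assumption,
    not a recommended value.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Request a bigger receive buffer. The kernel may clamp the value
        # or reject it if it exceeds the system-wide limit.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    except OSError:
        # Request refused: keep the system default buffer size.
        pass
    # Disable Nagle's algorithm so small writes are sent without delay.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    s = make_tuned_socket()
    # Print what the kernel actually granted; it can differ from the request.
    print("SO_RCVBUF:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
    print("TCP_NODELAY:", s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    s.close()
```

The options are set right after the socket is created, before any connect() or listen(), since the receive buffer size should be in place before the connection is established.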