Date: Sun, 23 Mar 2014 20:47:06 -0400 (EDT)
From: Rick Macklem
To: Christopher Forgeron
Cc: FreeBSD Net, Garrett Wollman, Jack Vogel, Markus Gebert
Subject: Re: 9.2 ixgbe tx queue hang
Message-ID: <1164414873.1690348.1395622026185.JavaMail.root@uoguelph.ca>

Christopher Forgeron wrote:
> Update:
>
> For giggles, I set IP_MAXPACKET = 32768.
>
Well, I'm pretty sure you don't want to do that, except as an experiment.
You can just set if_hw_tsomax to whatever you want to try, at the place
my ixgbe.patch put it (just before the call to ether_ifattach()).

> Over an hour of runtime, and no issues. This is better than with the
> TSO patch and the 9.2 ixgbe, as that was just a drastic reduction in
> errors.
>
So now the question becomes "how much does if_hw_tsomax need to be
reduced from 65535 to get this?". If reducing it by the additional
4 bytes for a vlan header is sufficient, then I understand what is going
on. If it needs to be reduced by more than that, then there is something
going on that I still don't understand.

> Still have an 'angry' netstat -m on boot, and I'm still incrementing
> denied netbuf calls, so something else is wrong.
>
> I'm going to modify Rick's printf in ixgbe to also output when we're
> over 32768. I'm sure it's still happening, but with an extra 32k of
> space, we're not busting like we did before.
>
> I notice a few interesting ip->ip_len changes since 9.2 - Like here,
> at line 720
>
> http://fxr.watson.org/fxr/diff/netinet/ip_output.c?v=FREEBSD10;im=kwqeqdhhvovqn;diffval=FREEBSD92;diffvar=v
>
> Looks like older code didn't byteswap with ntohs - I see that often
> in tcp_output.c, and in tcp_options.c.
>
> I'm also curious about this: Line 524
> http://fxr.watson.org/fxr/diff/netinet/ip_options.c?v=FREEBSD10;diffval=FREEBSD92;diffvar=v
>
> New 10 code:
>
>   ip->ip_len = htons(ntohs(ip->ip_len) + optlen);
>
> Old 9.2 code:
>
>   ip->ip_len += optlen;
>
Well, TSO segments aren't generated when optlen > 0, so I doubt this
matters for our issue (and I would find it hard to believe that this
would have been broken?). You can always look at the svn commit logs to
see why/how something was changed.

> I wonder if there are any unexpected consequences of these changes,
> or perhaps a line someplace that doesn't make the change.
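
As a small illustration of the two ip_len updates quoted above, here is a standalone userland sketch, not the kernel code. It assumes, as the htons()/ntohs() wrapping suggests, that the 10 code keeps the field in network byte order while the 9.2 code kept it in host byte order. The helper names add_optlen_net() and add_optlen_host() are made up for this example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the "new 10" form: the length field is
 * held in network byte order, so convert, add, and convert back.
 */
static uint16_t
add_optlen_net(uint16_t ip_len_net, uint16_t optlen)
{
	uint16_t len = ntohs(ip_len_net);

	assert(len <= UINT16_MAX - optlen);	/* no 16-bit wraparound */
	return (htons((uint16_t)(len + optlen)));
}

/*
 * Hypothetical helper mirroring the "old 9.2" form: the length field is
 * held in host byte order, so a plain addition is enough.
 */
static uint16_t
add_optlen_host(uint16_t ip_len_host, uint16_t optlen)
{
	assert(ip_len_host <= UINT16_MAX - optlen);	/* no wraparound */
	return ((uint16_t)(ip_len_host + optlen));
}
```

Both forms give the same logical length as long as each is applied to a field stored in the byte order it expects. The failure mode to look for would be a leftover host-order "+=" applied to a field that is now in network byte order.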
>
> Is there a dtrace command I could use to watch these functions and
> compare the new ip_len with ip->ip_len or other variables?
>
> On Sun, Mar 23, 2014 at 12:25 PM, Christopher Forgeron
> <csforgeron@gmail.com> wrote:
>
> On Sat, Mar 22, 2014 at 11:58 PM, Rick Macklem <rmacklem@uoguelph.ca>
> wrote:
>
> Christopher Forgeron wrote:
>
> > Also should we not also subtract ETHER_VLAN_ENCAP_LEN from tsomax
> > to make sure VLANs fit?
>
> I took a look and, yes, this does seem to be needed. It will only be
> needed for the case where a vlan is in use and hwtagging is disabled,
> if I read the code correctly.
>
> Yes, or in the rare case where you configure your switch to pass the
> vlan header through to the NIC.
>
> Do you use vlans?
>
> (Answered in above email)
>
> I've attached an updated patch.
>
> It might be nice to have the printf() patch in the driver too, so
> we can see how big the ones that are too big are?
>
> Yes, I'm going to leave those in until I know we have this fixed.
> I will probably leave them in a while longer, as it should only have a
> minor performance impact to iter-loop like that, and I'd like to see
> what the story is a few months down the road.
>
> Thanks for the patches, will have to start giving them code-names so
> we can keep them straight. :-) I guess we have printf, tsomax, and
> this one.
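
For reference, the arithmetic behind the tsomax discussion can be sketched as a standalone C fragment. This is only a sketch: it assumes the tsomax patch already subtracts the Ethernet header length from IP_MAXPACKET (this message only says "the additional 4 bytes", so that part is an assumption), and tso_budget() is a made-up helper, not a function in the driver. The three constants carry the values from the FreeBSD headers and are copied here so the fragment builds outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the FreeBSD headers so this builds standalone. */
#define IP_MAXPACKET		65535	/* netinet/ip.h: max IP packet size */
#define ETHER_HDR_LEN		14	/* net/ethernet.h: dst + src + type */
#define ETHER_VLAN_ENCAP_LEN	4	/* net/ethernet.h: 802.1Q tag */

/*
 * Hypothetical helper: largest TSO length such that the segment plus
 * its link-layer header still fits in IP_MAXPACKET bytes. Pass nonzero
 * when the vlan tag is inserted in software (hwtagging disabled).
 */
static uint32_t
tso_budget(int vlan_in_software)
{
	uint32_t overhead = ETHER_HDR_LEN;

	if (vlan_in_software)
		overhead += ETHER_VLAN_ENCAP_LEN;
	assert(overhead < IP_MAXPACKET);
	return (IP_MAXPACKET - overhead);
}
```

Under those assumptions the budget is 65521 bytes without a software vlan tag and 65517 bytes with one. In the driver, the chosen value would be assigned to if_hw_tsomax just before the call to ether_ifattach().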