From owner-freebsd-net@FreeBSD.ORG Tue Mar 25 19:25:02 2014
Date: Tue, 25 Mar 2014 16:25:00 -0300
Subject: Re: 9.2 ixgbe tx queue hang
From: Christopher Forgeron
To: Rick Macklem
Cc: FreeBSD Net, Garrett Wollman, Jack Vogel, Markus Gebert
In-Reply-To: <1236110257.2510701.1395709458870.JavaMail.root@uoguelph.ca>
References: <0BC10908-2081-45AC-A1C8-14220D81EC0A@hostpoint.ch>
 <1236110257.2510701.1395709458870.JavaMail.root@uoguelph.ca>
List-Id: Networking and TCP/IP with FreeBSD

I'm quite positive that IP_MAXPACKET = 65518 would fix this, as I've
never seen a packet overshoot by more than 11 bytes, although that's just
in my case. It's next up on my test list.

BTW, to answer the next message: I am experiencing the error with a raw ix
or lagg interface. Originally I was on lagg, but have dropped down to a
single ix for testing.

Thanks for your continued help.

On Mon, Mar 24, 2014 at 10:04 PM, Rick Macklem wrote:

> Markus Gebert wrote:
> >
> > On 24.03.2014, at 16:21, Christopher Forgeron wrote:
> >
> > > This is regarding the TSO patch that Rick suggested earlier. (With many
> > > thanks for his time and suggestion)
> > >
> > > As I mentioned earlier, it did not fix the issue on a 10.0 system. It did
> > > make it less of a problem on 9.2, but either way, I think it's not needed,
> > > and shouldn't be considered as a patch for testing/etc.
> > >
> > > Patching TSO to anything other than a max value (and by default the code
> > > gives it IP_MAXPACKET) is confusing the matter, as the packet length
> > > ultimately needs to be adjusted for many things on the fly like TCP
> > > Options, etc. Using static header sizes won't be a good idea.
> > >
> > > Additionally, it seems that setting nic TSO will/may be ignored by code
> > > like this in sys/netinet/tcp_output.c:
> > >
> > > 10.0 Code (lines 780-783):
> > >
> > >     if (len > tp->t_tsomax - hdrlen) {      !!
> > >             len = tp->t_tsomax - hdrlen;    !!
> > >             sendalot = 1;
> > >     }
> > >
> > > I've put debugging here, set the nic's max TSO as per Rick's patch (set to
> > > say 32k), and have seen that tp->t_tsomax == IP_MAXPACKET.
> > > It's being set someplace else, and thus our attempts to set TSO on the
> > > nic may be in vain.
> > >
> > > It may have mattered more in 9.2, as I see the code doesn't use
> > > tp->t_tsomax in some locations, and may actually default to what the nic
> > > is set to.
> > >
> > > The NIC may still win, I didn't walk through the code to confirm, it was
> > > enough to suggest to me that setting TSO wouldn't fix this issue.
> >
> > I just applied Rick's ixgbe TSO patch and additionally wanted to be
> > able to easily change the value of hw_tsomax, so I made a sysctl out
> > of it.
> >
> > While doing that, I asked myself the same question. Where and how
> > will this value actually be used, and how come tcp_output()
> > uses that other value in struct tcpcb?
> >
> > The only place tcpcb->t_tsomax gets set, that I have found so far, is
> > in tcp_input.c's tcp_mss() function. Some subfunctions get called:
> >
> >     tcp_mss() -> tcp_mss_update() -> tcp_maxmtu()
> >
> > Then tcp_maxmtu() indeed uses the interface's hw_tsomax value (line 1746):
> >
> >     cap->tsomax = ifp->if_hw_tsomax;
> >
> > It gets passed back to tcp_mss() where it is set on the connection
> > level which will be used in tcp_output() later on.
> >
> > tcp_mss() gets called from multiple places, I'll look into that
> > later. I will let you know if I find out more.
> >
> > Markus
> >
> Well, if tp->t_tsomax isn't set to a value of 65518, then the ixgbe.patch
> isn't doing what I thought it would.
>
> The only explanation I can think of for this is that there might be
> another net interface driver stacked on top of the ixgbe.c one and
> that the setting doesn't get propagated up.
> Does this make any sense?
>
> IP_MAXPACKET can't be changed from 65535, but I can see an argument
> for setting the default value of if_hw_tsomax to a smaller value.
> For example, in sys/net/if.c change it from (lines 657-658):
>     if (ifp->if_hw_tsomax == 0)
>             ifp->if_hw_tsomax = IP_MAXPACKET;
> to
>     if (ifp->if_hw_tsomax == 0)
>             ifp->if_hw_tsomax = 65536 - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN);
>
> This is a slightly smaller default which won't have much impact unless
> the hardware device can only handle 32 mbuf clusters for transmit of
> a segment and there are several of those.
>
> Christopher, can you do your test run with IP_MAXPACKET set to 65518,
> which should be the same as the above? If that gets rid of all the
> EFBIG error replies, then I think the above patch will have the same
> effect.
>
> Thanks, rick
>
> > > However, this is still a TSO related issue, it's just not one related to
> > > the setting of TSO's max size.
> > >
> > > A 10.0-STABLE system with tso disabled on ix0 doesn't have a single packet
> > > over IP_MAXPACKET in 1 hour of runtime. I'll let it go a bit longer to
> > > increase confidence in this assertion, but I don't want to waste time on
> > > this when I could be logging problem packets on a system with TSO enabled.
> > >
> > > Comments are very welcome..
> > > _______________________________________________
> > > freebsd-net@freebsd.org mailing list
> > > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"