Date:      Mon, 24 Mar 2014 18:52:44 -0300
From:      Christopher Forgeron <csforgeron@gmail.com>
To:        Markus Gebert <markus.gebert@hostpoint.ch>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>, Rick Macklem <rmacklem@uoguelph.ca>, Garrett Wollman <wollman@freebsd.org>, Jack Vogel <jfvogel@gmail.com>
Subject:   Re: 9.2 ixgbe tx queue hang
Message-ID:  <CAB2_NwDBmJqSD6s-2J7e-GWOvfPQkVqfqsLSribsXEOmZ1cn=Q@mail.gmail.com>
In-Reply-To: <CAB2_NwDk2Aw78BsUWG%2B6uFd6=TnMv72Aga6ReExS_r7_i-6LTQ@mail.gmail.com>
References:  <CAB2_NwAcDPM6YKNLQMC0=YSp%2Bn9nBpXGJQR9ajbgbfcQFoWYPw@mail.gmail.com> <1164414873.1690348.1395622026185.JavaMail.root@uoguelph.ca> <CAB2_NwAbHzFqa8RM5pwV7Yy5t=96JwzaF%2BSdjJN9kK3uhKKn_w@mail.gmail.com> <CAB2_NwCHM9D1HZSMsuQQ-dYNAt-t2721jKqfO=2h3M4qdumY7w@mail.gmail.com> <CAB2_NwDKkgTfNuapm2gA5xhuBgVK6jE2uHwb2Nu-vsRvw_NwKQ@mail.gmail.com> <0BC10908-2081-45AC-A1C8-14220D81EC0A@hostpoint.ch> <CAB2_NwDk2Aw78BsUWG%2B6uFd6=TnMv72Aga6ReExS_r7_i-6LTQ@mail.gmail.com>

Well, after a few more hours of running, it's fairly easy to catch the
packets with tcpdump, but harder to see whether there is a pattern to them,
or what distinguishes them from the other packets that pass with normal
sizes.

I'm using:

 tcpdump -ennvvvSuxx -i ix0 -s 64 greater 65495

Here's some output:

18:41:41.311025 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65502: (tos 0x0, ttl 64, id 37273, offset 0, flags [DF], proto TCP (6), length 65488, bad cksum 0 (->50ee)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [P.], seq 3009729118:3009794554, ack 3477042952, win 28478, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500

18:42:11.284028 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65502: (tos 0x0, ttl 64, id 52388, offset 0, flags [DF], proto TCP (6), length 65488, bad cksum 0 (->15e3)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [.], seq 1533469358:1533534794, ack 478673276, win 29127, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500

18:42:31.385082 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65498: (tos 0x0, ttl 64, id 25808, offset 0, flags [DF], proto TCP (6), length 65484, bad cksum 0 (->7dbb)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [P.], seq 3658906462:3658971894, ack 1460462120, win 29127, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500

18:42:45.200094 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65502: (tos 0x0, ttl 64, id 43985, offset 0, flags [DF], proto TCP (6), length 65488, bad cksum 0 (->36b6)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [P.], seq 805280454:805345890, ack 2122788052, win 29127, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500

18:43:16.601738 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65502: (tos 0x0, ttl 64, id 5657, offset 0, flags [DF], proto TCP (6), length 65488, bad cksum 0 (->cc6e)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [.], seq 3978046962:3978112398, ack 3596907688, win 29127, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500

18:43:37.345685 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65506: (tos 0x0, ttl 64, id 41062, offset 0, flags [DF], proto TCP (6), length 65492, bad cksum 0 (->421d)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [P.], seq 1419570518:1419635958, ack 104148460, win 29127, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500

18:45:50.266944 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65506: (tos 0x0, ttl 64, id 5853, offset 0, flags [DF], proto TCP (6), length 65492, bad cksum 0 (->cba6)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [P.], seq 2161102562:2161168002, ack 2086338240, win 29127, options [nop,nop,TS[|tcp]>
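
As a cross-check, the numbers in the capture above are self-consistent: each
frame length is the IP total length plus the 14-byte Ethernet header, and
each IP total length is the TCP payload (the width of the seq range) plus 52
bytes of IP and TCP headers. Also worth noting: pcap-filter documents
"greater N" as "len >= N" on the whole frame, so the filter above should
match IP total lengths from 65481 up, not only those above 65495. A minimal C
sketch of that arithmetic follows; the function and macro names are made up
for illustration and do not come from tcpdump or the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETHER_HDR_LEN 14u    /* dst MAC (6) + src MAC (6) + ethertype (2), no VLAN tag */
#define IP_TOTLEN_MAX 65535u /* largest value the 16-bit IPv4 total-length field holds */

/* Frame length as tcpdump prints it: IPv4 total length plus the Ethernet header. */
static inline uint32_t
frame_len(uint32_t ip_total_len)
{
	assert(ip_total_len <= IP_TOTLEN_MAX);
	return (ip_total_len + ETHER_HDR_LEN);
}

/* TCP payload bytes covered by a "seq start:end" range (wraps modulo 2^32). */
static inline uint32_t
tcp_payload_len(uint32_t seq_start, uint32_t seq_end)
{
	return (seq_end - seq_start);
}

/* pcap-filter describes "greater N" as equivalent to "len >= N". */
static inline bool
matches_greater(uint32_t frame, uint32_t n)
{
	return (frame >= n);
}

/* Smallest IPv4 total length whose frame would match "greater n". */
static inline uint32_t
min_ip_len_matching(uint32_t n)
{
	return (n > ETHER_HDR_LEN ? n - ETHER_HDR_LEN : 0);
}
```

For the first packet, frame_len(65488) gives the reported 65502, and
65488 - tcp_payload_len(3009729118u, 3009794554u) leaves the 52 header bytes.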

With IP_MAXPACKET set to 65495, I've had zero networking problems.
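
To illustrate why the lower limit could matter, here is a userland sketch of
the length clamp from tcp_output.c that is quoted further down in this
message. It assumes t_tsomax ends up equal to the lowered IP_MAXPACKET (the
quoted text reports that t_tsomax defaulted to IP_MAXPACKET) and uses the
52-byte header size seen in the capture. struct tso_clamp and
clamp_tso_len() are hypothetical names for this sketch, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Result of applying the TSO size limit to one pass of the send loop. */
struct tso_clamp {
	uint32_t len;      /* payload bytes to send in this pass */
	bool     sendalot; /* true when data was left over for another pass */
};

/*
 * Mirror of the quoted check:
 *     if (len > tp->t_tsomax - hdrlen) { len = ...; sendalot = 1; }
 * tsomax and hdrlen are plain parameters here (an assumption of this
 * sketch); in the kernel they come from the tcpcb and the header setup.
 */
static inline struct tso_clamp
clamp_tso_len(uint32_t len, uint32_t tsomax, uint32_t hdrlen)
{
	struct tso_clamp r = { len, false };

	assert(hdrlen <= tsomax);
	if (len > tsomax - hdrlen) {
		r.len = tsomax - hdrlen;
		r.sendalot = true;
	}
	return (r);
}
```

With tsomax = 65495 and hdrlen = 52, a 70000-byte request is cut to 65443
payload bytes, so payload plus headers never exceeds 65495.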


On Mon, Mar 24, 2014 at 1:23 PM, Christopher Forgeron <csforgeron@gmail.com> wrote:

> I think making hw_tsomax a sysctl would be a good patch to commit - It
> could enable easy debugging/performance testing for the masses.
>
> I'm curious to hear how your environment is working with TSO turned off
> on your nics.
>
> My testbed just hit the 2 hour mark. With TSO off, I don't get a single
> packet over IP_MAXPACKET.  That puts my confidence at around 95% in the
> statement 'turning off tso negates this issue for me'.
>
> I'm now rebooting into a +tso env to see if I can capture the bad packets.
>
> I am also sure that the netstat -m mbuf denied is a completely separate
> issue. I'm going around the lab and powering up different boxes with
> 10.0-RELEASE, and they all have mbuf/mbuf clusters denied on boot, and that
> number increases with network traffic.  It's probably not helping the
> IP_MAXPACKET issue.
>
>
>
> I'll create a separate thread for that one shortly.
>
>
> On Mon, Mar 24, 2014 at 1:14 PM, Markus Gebert <markus.gebert@hostpoint.ch> wrote:
>
>>
>> On 24.03.2014, at 16:21, Christopher Forgeron <csforgeron@gmail.com>
>> wrote:
>>
>> > This is regarding the TSO patch that Rick suggested earlier. (With many
>> > thanks for his time and suggestion)
>> >
>> > As I mentioned earlier, it did not fix the issue on a 10.0 system. It did
>> > make it less of a problem on 9.2, but either way, I think it's not needed,
>> > and shouldn't be considered as a patch for testing/etc.
>> >
>> > Patching TSO to anything other than a max value (and by default the code
>> > gives it IP_MAXPACKET) is confusing the matter, as the packet length
>> > ultimately needs to be adjusted for many things on the fly like TCP
>> > Options, etc. Using static header sizes won't be a good idea.
>> >
>> > Additionally, it seems that setting nic TSO will/may be ignored by code
>> > like this in sys/netinet/tcp_output.c:
>> >
>> > 10.0 Code:
>> >
>> >         if (len > tp->t_tsomax - hdrlen) {       !!
>> >                 len = tp->t_tsomax - hdrlen;     !!
>> >                 sendalot = 1;
>> >         }
>> >
>> >
>> > I've put debugging here, set the nic's max TSO as per Rick's patch (set to
>> > say 32k), and have seen that tp->t_tsomax == IP_MAXPACKET. It's being set
>> > someplace else, and thus our attempts to set TSO on the nic may be in vain.
>> >
>> > It may have mattered more in 9.2, as I see the code doesn't use
>> > tp->t_tsomax in some locations, and may actually default to what the nic is
>> > set to.
>> >
>> > The NIC may still win; I didn't walk through the code to confirm. It was
>> > enough to suggest to me that setting TSO wouldn't fix this issue.
>>
>>
>> I just applied Rick's ixgbe TSO patch and additionally wanted to be able
>> to easily change the value of hw_tsomax, so I made a sysctl out of it.
>>
>> While doing that, I asked myself the same question: where and how will
>> this value actually be used, and how come tcp_output() uses that other
>> value in struct tcpcb?
>>
>> The only place tcpcb->t_tsomax gets set, that I have found so far, is in
>> tcp_input.c's tcp_mss() function. Some subfunctions get called:
>>
>> tcp_mss() -> tcp_mss_update() -> tcp_maxmtu()
>>
>> Then tcp_maxmtu() indeed uses the interface's hw_tsomax value:
>>
>>         cap->tsomax = ifp->if_hw_tsomax;
>>
>> It gets passed back to tcp_mss(), where it is set at the connection
>> level, to be used in tcp_output() later on.
>>
>> tcp_mss() gets called from multiple places, I'll look into that later. I
>> will let you know if I find out more.
>>
>>
>> Markus
>>
>>
>> > However, this is still a TSO related issue, it's just not one related to
>> > the setting of TSO's max size.
>> >
>> > A 10.0-STABLE system with tso disabled on ix0 doesn't have a single packet
>> > over IP_MAXPACKET in 1 hour of runtime. I'll let it go a bit longer to
>> > increase confidence in this assertion, but I don't want to waste time on
>> > this when I could be logging problem packets on a system with TSO enabled.
>> >
>> > Comments are very welcome.
>> > _______________________________________________
>> > freebsd-net@freebsd.org mailing list
>> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>> >
>>
>>
>