Date:      Wed, 13 Feb 2013 23:54:24 -0800
From:      Doug Hardie <bc979@lafn.org>
To:        pyunyh@gmail.com
Cc:        Jeremy Chadwick <jdc@koitsu.org>, freebsd-stable@freebsd.org, Eugene Grosbein <egrosbein@rdtc.ru>, yongari@freebsd.org
Subject:   Re: Unusual TCP/IP Packet Size
Message-ID:  <3BB4EC29-0FD5-4F5D-9189-51770E2B55D5@lafn.org>
In-Reply-To: <20130214064521.GA1464@michelle.cdnetworks.com>
References:  <96AE8BD1-79C2-4743-854F-B8386C54E4A1@lafn.org> <511B6B21.5030606@rdtc.ru> <20130213130059.GA57337@icarus.home.lan> <20130214013723.GB2945@michelle.cdnetworks.com> <CAN6yY1v9oc7BEQXDkAwSCxi65ibuApP6geXA1hi0fzQZRXVjxQ@mail.gmail.com> <20130214064521.GA1464@michelle.cdnetworks.com>

On 13 February 2013, at 22:45, YongHyeon PYUN <pyunyh@gmail.com> wrote:

> On Wed, Feb 13, 2013 at 09:10:36PM -0800, Kevin Oberman wrote:
>> On Wed, Feb 13, 2013 at 5:37 PM, YongHyeon PYUN <pyunyh@gmail.com> wrote:
>>> On Wed, Feb 13, 2013 at 05:00:59AM -0800, Jeremy Chadwick wrote:
>>>> On Wed, Feb 13, 2013 at 05:29:53PM +0700, Eugene Grosbein wrote:
>>>>> 13.02.2013 17:25, Doug Hardie wrote:
>>>>>> Monitoring a tcpdump between two systems, a FreeBSD 9.1 system has the following interface:
>>>>>>
>>>>>> msk0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>>>>  options=c011b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,VLAN_HWTSO,LINKSTATE>
>>>>>>  ether 00:11:2f:2a:c7:03
>>>>>>  inet 10.0.1.199 netmask 0xffffff00 broadcast 10.0.1.255
>>>>>>  inet6 fe80::211:2fff:fe2a:c703%msk0 prefixlen 64 scopeid 0x1
>>>>>>  nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>>>>>  media: Ethernet autoselect (100baseTX <full-duplex,flowcontrol,rxpause,txpause>)
>>>>>>  status: active
>>>>>>
>>>>>>
>>>>>> It sent the following packet:  (data content abbreviated)
>>>>>>
>>>>>> 02:14:42.081617 IP 10.0.1.199.443 > 10.0.1.2.61258: Flags [P.], seq 930:4876, ack 846, win 1040, options [nop,nop,TS val 401838072 ecr 920110183], length 3946
>>>>>>  0x0000:  4500 0f9e ea89 4000 4006 2a08 0a00 01c7  E.....@.@.*.....
>>>>>>  0x0010:  0a00 0102 01bb ef4a ece1 680b ae37 1bbc  .......J..h..7..
>>>>>>  0x0020:  8018 0410 3407 0000 0101 080a 17f3 8ff8  ....4...........
>>>>>>
>>>>>>
>>>>>> The indicated packet length is 3946 and the load of data shown is
>>>>>> that size.  The MTU on both interfaces is 1500.  The receiving
>>>>>> system received 3 packets.  There is a router and switch between
>>>>>> them.  One of them fragmented that packet.  This is part of an
>>>>>> SSL/TLS exchange and one side or the other is hanging on this and
>>>>>> just dropping the connection.  I suspect the packet size is the
>>>>>> issue.  ssldump complains about the packet too and stops
>>>>>> monitoring.  Could this possibly be related to the hardware
>>>>>> checksums?
>>>>>
>>>>> You have TSO enabled on the interface, so a large outgoing TCP packet is pretty normal.
>>>>> It will be split by the NIC. Disable TSO with ifconfig if it interferes with your ssldump.
>>>>
>>>> This is not the behaviour I see with em(4) on a 82573E with all defaults
>>>> used (which includes TSO4).  Note that Doug is using msk(4).
>>>>
>>>> I can provide packet captures on both ends of a LAN segment using both
>>>> tcpdump (on the FreeBSD side) and Wireshark (on the Windows side) that
>>>> show a difference in behaviour compared to what Doug sees.
>>>
>>> This is strange. tcpdump sees a (big) TCP segment right before the
>>> controller actually transmits it. So if TSO is active for the TCP
>>> segment, you should see a series of small TCP packets on the receiver
>>> side (i.e. 3 TCP packets in Doug's case). If you don't see a big TCP
>>> segment with tcpdump on the TX path, TSO was probably not used for
>>> that TCP segment.
>>> It's possible for the controller to corrupt the TCP segment during
>>> segmentation, but Doug's tcpdump looks completely normal to me since
>>> tcpdump sees the segment before TCP segmentation.
>>>
>>>>
>>>> What I see on the FreeBSD side with tcpdump is repeated "bad-len 0"
>>>> messages for payloads which are chunked or segmented as a result of TSO.
>>>> I do not see a 1:1 ratio of "bad-len" entries to chunked payloads; I
>>>> only see one "bad-len" entry for all chunks (up until the next ACK or
>>>> PSH+ACK of course).
>>>>
>>>
>>> I vaguely recall that some users reported similar TSO issues on
>>> various drivers. The root cause of the issue was not identified
>>> though. Personally I couldn't reproduce the issue at that time.
>>> It could be a driver or network stack bug.
>>
>> Beware TSO. It can significantly improve throughput on high speed
>> networks, but it really has issues.
>>
>> TSO segments the data and transmits all of them back-to-back with no
>> delay beyond IFG (the 802.3 mandated space between frames). TSO does
>> not understand congestion control. If there is congestion and TSO
>> sends several frames in a row, it is entirely possible that a queue is
>> full or getting close enough to full to start dropping packets and
>> these segmented frames are excellent candidates.
>
> I'm not talking about the drawbacks of TSO.  Sometimes segmented packets
> have a malformed IP header length under certain circumstances, such
> that these packets are dropped on the receiver side.
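
As a cross-check on the numbers earlier in the thread (one segment with 3946 bytes of payload seen by tcpdump, 3 packets seen by the receiver), the arithmetic can be sketched in sh. This is only an illustration: it assumes the 1500-byte MTU applies along the whole path, a 20-byte IP header with no options, and a 32-byte TCP header (20 bytes plus the 12 bytes of nop,nop,timestamp options visible in the capture). The variable names are made up for the example.

```shell
#!/bin/sh
# Assumed values, taken from the capture quoted above.
mtu=1500        # interface MTU reported by ifconfig
ip_hdr=20       # IPv4 header without options (assumption)
tcp_hdr=32      # 20-byte TCP header + 12 bytes of options (nop,nop,TS)
payload=3946    # "length 3946" in the tcpdump line

# Largest TCP payload that fits in one frame under these assumptions.
mss=$((mtu - ip_hdr - tcp_hdr))

# Number of on-wire segments (ceiling division) and the size of the last one.
segments=$(( (payload + mss - 1) / mss ))
last=$(( payload - (segments - 1) * mss ))

echo "per-segment payload: $mss"
echo "segments: $segments"
echo "last segment payload: $last"
```

Under those assumptions the payload splits into 1448 + 1448 + 1050 bytes, which matches the three packets reported at the receiver.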

How do I configure the msk0 interface in rc.conf to disable tso4?  I can
easily do it with ifconfig, but don't see how to make sure it's disabled
after a boot.
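
One way this is commonly handled, shown here as an untested sketch: rc.conf hands the contents of the ifconfig_msk0 variable to ifconfig at boot, so appending -tso4 there should have the same effect as running the command by hand. The static address and netmask are copied from the ifconfig output quoted above and are an assumption about how the interface is actually configured.

```shell
# /etc/rc.conf fragment (sketch; assumes msk0 uses the static address shown earlier)
# Manual equivalent at runtime: ifconfig msk0 -tso4
ifconfig_msk0="inet 10.0.1.199 netmask 0xffffff00 -tso4"
```

After a reboot, TSO4 should no longer appear in the options= line of the ifconfig msk0 output if the setting took effect.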



