Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 9 May 2014 16:22:36 +0200
From:      Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
To:        pyunyh@gmail.com
Cc:        FreeBSD Net <freebsd-net@freebsd.org>, "Bjoern A. Zeeb" <bz@FreeBSD.org>
Subject:   Re: RX checksum offloading problem
Message-ID:  <DE46DC96-6D5F-4CB4-91EC-F884298BD508@lurchi.franken.de>
In-Reply-To: <B4F16B70-825A-441A-82AA-F7ED28C9D254@lurchi.franken.de>
References:  <0EB8F4F6-65C2-4B90-8101-FCC53A15C6F9@lurchi.franken.de> <A345E6A0-D6FF-4E69-AFBD-9BB67B82F02E@FreeBSD.org> <B149FC4B-4F15-4619-A04F-F1A08DDC1741@lurchi.franken.de> <20140507075612.GA1376@michelle.cdnetworks.com> <36469814-FAC8-4172-A792-487E2AB8ECB9@lurchi.franken.de> <20140507083751.GB1376@michelle.cdnetworks.com> <415C1CB5-3AF9-44E4-943A-74116037980E@lurchi.franken.de> <20140509013556.GA3014@michelle.cdnetworks.com> <B4F16B70-825A-441A-82AA-F7ED28C9D254@lurchi.franken.de>

next in thread | previous in thread | raw e-mail | index | archive | help

On 09 May 2014, at 12:46, Michael Tuexen =
<Michael.Tuexen@lurchi.franken.de> wrote:

> On 09 May 2014, at 03:35, Yonghyeon PYUN <pyunyh@gmail.com> wrote:
>=20
>> On Thu, May 08, 2014 at 08:40:22PM +0200, Michael Tuexen wrote:
>>> On 07 May 2014, at 10:37, Yonghyeon PYUN <pyunyh@gmail.com> wrote:
>>>=20
>>>> On Wed, May 07, 2014 at 10:07:09AM +0200, Michael Tuexen wrote:
>>>>> On 07 May 2014, at 09:56, Yonghyeon PYUN <pyunyh@gmail.com> wrote:
>>>>>=20
>>>>>> On Sat, May 03, 2014 at 11:52:47AM +0200, Michael Tuexen wrote:
>>>>>>> On 02 May 2014, at 16:02, Bjoern A. Zeeb <bz@FreeBSD.org> wrote:
>>>>>>>=20
>>>>>>>>=20
>>>>>>>> On 02 May 2014, at 10:22 , Michael Tuexen =
<Michael.Tuexen@lurchi.franken.de> wrote:
>>>>>>>>=20
>>>>>>>>> Dear all,
>>>>>>>>>=20
>>>>>>>>> during testing I found that FreeBSD head (on a raspberry pi) =
accepts SCTP packet
>>>>>>>>> with bad checksums. After debugging this I figured out that =
this is a problem with
>>>>>>>>> the csum_flags defined in mbuf.h.
>>>>>>>>>=20
>>>>>>>>> The SCTP code on its input path checks for CSUM_SCTP_VALID, =
which is defined in mbuf.h:
>>>>>>>>> #define CSUM_SCTP_VALID         CSUM_L4_VALID
>>>>>>>>> This makes sense: If CSUM_SCTP_VALID is set in csum_flags, the =
packet is considered
>>>>>>>>> to have a correct checksum.
>>>>>>>>>=20
>>>>>>>>> For UDP and TCP some drivers calculate the UDP/TCP checksum =
and set CSUM_DATA_VALID in
>>>>>>>>> csum_flags to indicate that the UDP/TCP should consider =
csum_data to figure out if
>>>>>>>>> the packet has a correct checksum. The problem is that =
CSUM_DATA_VALID is defined as
>>>>>>>>> #define CSUM_DATA_VALID         CSUM_L4_VALID
>>>>>>>>> In this case the semantic is not that the packet has a valid =
checksum, but the csum_data
>>>>>>>>> field contains information.
>>>>>>>>>=20
>>>>>>>>> Now the following happens (on the raspberry pi the driver used =
is
>>>>>>>>> dev/usb/net/if_smsc.c
>>>>>>>>>=20
>>>>>>>>> 1. A packet is received and if it is not too short, the =
checksum computed
>>>>>>>>> is stored in csum_data and the flag CSUM_DATA_VALID is set. =
This happens
>>>>>>>>> for all IP packets, not only for UDP and TCP packets.
>>>>>>>>> 2. In case of SCTP packets, the SCTP interprets =
CSUM_DATA_VALID as CSUM_SCTP_VALID
>>>>>>>>> and accepts the packet. So no SCTP checksum check ever =
happened.
>>>>>>>>>=20
>>>>>>>>> Alternatives to fix this:
>>>>>>>>>=20
>>>>>>>>> 1. Change all drivers to set CSUM_DATA_VALID only in case of =
UDP or TCP packets, since
>>>>>>>>> it only makes sense in these cases.
>>>>>>>>=20
>>>>>>>> Wait, or for SCTP in cad the crc32 (I think it was)  was =
actually checked but not otherwise.   This is how it should be imho.  It =
seems like a driver bug.
>>>>>>> I went through the list of drivers and you are right, it seems =
to be a bug
>>>>>>> in if_smsc.c. Most of the other drivers check for UDP/TCP, a =
small set I can't tell.
>>>>>>>=20
>>>>>>=20
>>>>>> I'm not sure how the controller computes TCP/UDP checksum values.
>>>>>> It seems the publicly available data sheet was highly sanitized =
so
>>>>>> it was useless to me.  The comment in the driver says that the
>>>>> Same for me...
>>>>>> controller computes RX checksum after the IPv4 header to the end =
of
>>>>>> ethernet frame. After seeing that comment, three questions popped
>>>>>> up:
>>>>>>=20
>>> OK, I did some testing. It looks like the card is just computing the
>>> checksum over the IP payload taking the correct IP header length =
into account.
>>>>>> 1. Is the controller smart enough to skip IP options header in
>>>>>> TCP/UDP checksum offloading?
>>> Yes, I can send fragmented and un-fragmented UDP packets with IP =
options
>>> and they are handled correctly. Even if the last fragment is too =
short.
>>=20
>> I'm assuming you're taking about receiving fragmented UDP packets
>> with RX checksum offloading, right?
> Correct.
>>=20
>>>>>> 2. How controller handles UDP checksum value 0x0000(i.e. sender
>>>>>> didn't compute UDP checksum)?
>>> This case isn't handled. However, udp_input() looks first for zero =
checksums
>>> and only after that in the csum_flags. So it doesn't result in any =
problems.
>>> Would you prefer not to set CSUM_DATA_VALID in this case?
>>=20
>> At least, it correctly updates UDP stats of netstat(1).
> Let me double check that...
I double checked it. The statistic counters are incremented.
Please note that we had a bug in the sending code of head, which
made it impossible to send UDP packets with 0 checksum. That is
fixed in
http://svnweb.freebsd.org/base?view=3Drevision&revision=3D265776

So any preference whether to report CSUM_DATA_VALID if a UDP packet
with checksum 0 is received or not? I'm pretty open, since it does
not have any effect right now...

Best regards
Michael
>>=20
>>>>>> 3. How the controller can compute TCP checksum of fragmented
>>>>>> packets?
>>> At least it does it right for UDP...
>>>=20
>>=20
>> Hmm, CSUM_DATA_VALID indicates driver computed RX TCP/UDP checksum
>> without pseudo header. As you know, controller can't compute
>> TCP/UDP checksum until all its fragmented payload are read from
>> wire.  Packets may arrive out of order and may be mixed with other
> I'm not sure I understand this... Please note that the pseudo header
> is not taken into account. So the card can compute the checksum over
> the payload of IP for each fragment. This is stored in the csum_data
> field. During reassembly the csum_data fields of the fragments are
> combined in
> =
http://svnweb.freebsd.org/base/head/sys/netinet/ip_input.c?view=3Dmarkup#l=
1075
> This looks OK to me. I'm not sure why you think the cards needs
> to keep state for this. I understand it needs state if it wants
> to take the pseudoheader into account, but this is not done here.
>> packets. Advanced controllers with enough memory may be able to
>> compute TCP/UDP checksums by tracking each connections(e.g LRO) but
>> low-end controllers may be not. It seems the controller does not
>> even support RX TCP/UDP pseudo header checksum offloading so I
>> wonder how this controller can support RX TCP/UDP checksum
>> offloading for fragmented TCP/UDP packets.
> I'm not sure what I'm missing here... You compute the sum over
> the parts and add the sums of the parts. That should work for
> the UDP/TCP checksum.
>>=20
>> Some controllers maintain two bits for TCP/UDP checksum offloading
>> result in status word.  One bit is used to indicate whether
>> controller performed TCP/UDP checksum offloading and the other bit
>> is used to indicate whether the computed checksum is correct or
> For SCTP we only get a bit that the checksum was computed and is =
correct...
>> not. For UDP checksum value 0x0000 and fragmented TCP/UDP packets,
>> these controllers do not attempt to compute TCP/UDP checksum.
> I think it "just" computes it and leaves it up to the upper layer
> to use the result or not...
>=20
> Best regards
> Michael
>>=20
>>> Best regards
>>> Michael
>>>>>>=20
>>>>>> Since you have the controller I guess it's easy to verify all
>>>>>> cases.  For case 3, I believe the controller can't handle
>>>>>> fragmented frames so driver should have to explicitly check =
ip_off
>>>>>> field of IPv4 header.  See how gem(4)/sk(4)/hme(4) and fxp(4)
>>>>>> handle it.
>>>>> Let me check this. Is there a tool to send UDP/TCP with IP level =
options
>>>>> or do I need to write a small test program myself?
>>>>>=20
>>>>=20
>>>> I recall I used buggy ipsend of ipfilter package in the past but it
>>>> would be more easy to write a simple test program or patch driver
>>>> to generate those frames.
>>>>=20
>>>=20
>>>=20
>>=20
>=20
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>=20




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?DE46DC96-6D5F-4CB4-91EC-F884298BD508>