Date:      Sun, 26 Sep 2021 02:53:15 +0200
From:      Harry Schmalzbauer <freebsd@omnilan.de>
To:        Kevin Bowling <kbowling@FreeBSD.org>
Cc:        "Andrey V. Elsukov" <bu7cher@yandex.ru>, "net@FreeBSD.org" <net@freebsd.org>
Subject:   Re: git: 1a72c3d76aea - stable/13 - e1000: always enable PCSD when RSS hashing [Was: TCP6 regression for MTU path on stable/13]
Message-ID:  <57df3182-a7ec-112c-c8d8-a8faa21a97a8@omnilan.de>
In-Reply-To: <14f7348c-a11f-9ae8-8a4e-77e0333ba478@omnilan.de>
References:  <8e4f78e5-0717-8002-5364-44df5c8d7dad@omnilan.de> <36d9d998-c484-a4f6-6c62-c6ec103aeb33@yandex.ru> <14f7348c-a11f-9ae8-8a4e-77e0333ba478@omnilan.de>

On 13.09.2021 at 13:18, Harry Schmalzbauer wrote:
> On 13.09.2021 at 12:37, Andrey V. Elsukov wrote:
>> On 12.09.2021 at 14:12, Harry Schmalzbauer wrote:
>>> Will try to further track it down, but in case anybody has an idea
>>> what change during the last few months in stable/13 could have caused
>>> this real-world problem with the resulting TCP6 throughput, I'm happy
>>> to start testing at that point.
>>
>> Hi,
>>
>> Take a look at:
>>
>>    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=255749
>>    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248005
>>
>> Is the problem described in these PRs the same as yours?
>
> Hi, thank you very much for your attention!
> Most likely these are unrelated to the regression I'm suffering from,
> because those affect 13-release and earlier, while mine arose within
> the last few months.
> And it doesn't seem to be a jumbo frame problem.
:
> Hope to get back to you soon with more info.


Since the setup was hard to replicate, it took some time.
Here's the commit causing the heavy IPv6 performance drop with Intel 
Powerville (i350):

> The branch stable/13 has been updated by kbowling (ports committer):
>
> URL: 
> https://cgit.FreeBSD.org/src/commit/?id=1a72c3d76aeafe4422ff20f81c4142efb983b7d7
>
> commit 1a72c3d76aeafe4422ff20f81c4142efb983b7d7
> Author:     Kevin Bowling <kbowling@FreeBSD.org>
> AuthorDate: 2021-08-16 17:17:34 +0000
> Commit:     Kevin Bowling <kbowling@FreeBSD.org>
> CommitDate: 2021-08-23 16:23:43 +0000
>
>     e1000: always enable PCSD when RSS hashing
>
>     To enable RSS hashing in the NIC, the PCSD bit must be set.
>
>     By default, this is never set when RXCSUM is disabled - which
>     causes problems higher up in the stack.
>
>     While here improve the RXCSUM flag assignments when enabling or
>     disabling IFCAP_RXCSUM.
>
>     See also: 
> https://lists.freebsd.org/pipermail/freebsd-current/2020-May/076148.html
>
>     Reviewed by:    markj, Franco Fichtner <franco@opnsense.org>,
>                     Stephan de Wit <stephan.dewt@yahoo.co.uk>
>     Obtained from:  OPNsense
>     MFC after:      1 week
>     Differential Revision:  https://reviews.freebsd.org/D31501
>     Co-authored-by: Stephan de Wit <stephan.dewt@yahoo.co.uk>
>     Co-authored-by: Franco Fichtner <franco@opnsense.org>
>
>     (cherry picked from commit 69e8e8ea3d4be9da6b5bc904a444b51958128ff5)
> :

Noticed and confirmed twice (against both a8446d412 and f72cdea25) with 
an i350 Powerville, device=0x1521.
*Reverting commit 1a72c3d76aea against today's stable/13 
(f72cdea25-dirty) solves the issue, which seems to be IPv6 related only.*
(A kernel built from a8446d412, dated 2021-09-25, shows the issue as 
well, and reverting this commit fixes that older kernel too.)

A very brief check suggests IPv4 on identical paths is unaffected, but I 
can't guarantee that, since v4 isn't in use here (where I first noticed 
and suffer from the problem) and I only did one comparison in order to 
narrow things down (the FIB setup is asymmetric regarding inet and inet6).

What made this complicated: ng_bridge(4), mpd5/pppoe, ppt and bhyve are 
involved as well (and vlan(4), lagg(4), vtnet(4), etc.), but it seems to 
be purely an e1000 driver issue.
There were many changes/improvements/cleanups between July and 
September, but I tracked this commit down as the root cause of my IPv6 
issue (throughput dropping from 33 MB/s to <=0.3 MB/s).


That being said, it was hard to find the time to replicate the setup, 
and I have nothing to offer as a solution.  I haven't semantically 
checked anything yet and didn't run any tests besides my single IPv6 
performance test.  Contrary to my first suspicion, at least in my cloned 
lab, it isn't MTU/jumbo-frame related, just a plain e1000/i350 IPv6 
regression.


Happy to test anything; I can test-drive swiftly, but without further 
diagnostics during work days.

Thanks,
-harry




