Date:      Mon, 23 Jan 2017 16:39:36 +0100
From:      Olivier Cochard-Labbé <olivier@freebsd.org>
To:        Matthew Macy <mmacy@nextbsd.org>
Cc:        Sean Bruno <sbruno@freebsd.org>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,  "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
Subject:   Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending
Message-ID:  <CA%2Bq%2BTcpHmuOGyp5A290WmUvGTnOSse7v8gj4=R8kZ=m51-_s4A@mail.gmail.com>
In-Reply-To: <159902b73ed.10775291e21533.7488368455500235608@nextbsd.org>
References:  <30f21c75-d3a2-edcd-1999-d5ed9f970c06@freebsd.org> <b000a957-8d17-a04d-6275-0d3920aa8a17@freebsd.org> <CA%2Bq%2BTcramTrYgYT-s%2B=aBZzRJV8FmKQqGt=1twPhLBR7AoXkcQ@mail.gmail.com> <1598d97bf2a.c6bcb76838987.6501340920645175463@nextbsd.org> <574a7ac7-4842-9518-8286-a4d89a9f7a27@freebsd.org> <CA%2Bq%2BTco-dcoU8EZnDEzgoK-v2Q2=U5GF6ASMSj0kwzd_wB5xig@mail.gmail.com> <6c6cb534-73c7-464b-8af1-7445a9c0188c@freebsd.org> <1598f29d379.ea6360351471.8752933472741761813@nextbsd.org> <CA%2Bq%2BTcpUXXPEQtdMFup6EZzyCKs9Ep%2BnS5SB%2Bfm6bSJSDs34_w@mail.gmail.com> <1598f3f8588.d20017893749.339651164872952258@nextbsd.org> <1598f42ad77.eeec05be4113.9201780237587761460@nextbsd.org> <CA%2Bq%2BTcp5LwrnXt75tNpYpAr1KWx9YpLx5kMHhPR%2BYgAs__n1eA@mail.gmail.com> <159902b73ed.10775291e21533.7488368455500235608@nextbsd.org>

On Thu, Jan 12, 2017 at 1:54 AM, Matthew Macy <mmacy@nextbsd.org> wrote:

>  >  A flame graph for the core cycle count and a flame graph with cache
> miss stats from pmc would be a great start.
>  >
>  >
>  > I didn't know the exact event name to use for cache miss stats, but
> here are the flame graphs for CPU_CLK_UNHALTED_CORE:
>  > http://dev.bsdrp.net/netgate.r311848.CPU_CLK_UNHALTED_CORE.svg
>  > http://dev.bsdrp.net/netgate.r311849.CPU_CLK_UNHALTED_CORE.svg
>
> Thanks. Having twice as many txqs would definitely help. It's also clear
> that there may be some sort of performance issue in iflib_txq_drain.
> Although it could just be non-stop cache misses on the packet headers.
>
>
> Any news about the performance issue in iflib_txq_drain?

On different hardware (PC Engines APU2), I see a 20% performance drop:

x head r311848: packets per second
+ head r311849: packets per second
+--------------------------------------------------------------------------+
| ++                                                                      x|
|+++                                                                 x xx x|
|                                                                     |_A_||
||A|                                                                       |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5        580021        588650        585676      585406.1     3550.8673
+   5        463865        467599        465428      465638.6     1437.9347
Difference at 95.0% confidence
        -119768 +/- 3950.78
        -20.4589% +/- 0.558328%
        (Student's t, pooled s = 2708.9)
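
For anyone who wants to re-check the ministat figures, here is a minimal
sketch that recomputes the pooled standard deviation, the absolute
difference, and its 95% margin from the N/Avg/Stddev columns above. The
function name and the hard-coded t value (2.306, the two-sided 95% value for
8 degrees of freedom from a standard t table) are my own assumptions and not
taken from ministat itself; the relative error term is left out on purpose.

```python
import math

# Summary statistics copied from the ministat table: (N, Avg, Stddev).
R311848 = (5, 585406.1, 3550.8673)
R311849 = (5, 465638.6, 1437.9347)

# Assumption: two-sided 95% critical value of Student's t for 8 degrees of
# freedom, taken from a standard t table. Only valid for 5 + 5 samples.
T_95_DF8 = 2.306


def compare(ref, new, t_crit):
    """Pooled-variance comparison of two samples from their summary stats.

    Hypothetical helper for illustration; returns
    (difference, margin, percent_difference, pooled_stddev).
    """
    n1, avg1, sd1 = ref
    n2, avg2, sd2 = new
    dof = n1 + n2 - 2
    # Pooled standard deviation, weighting each variance by its own dof.
    pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / dof)
    diff = avg2 - avg1
    # Half-width of the confidence interval on the difference of the means.
    margin = t_crit * pooled * math.sqrt(1 / n1 + 1 / n2)
    # Difference relative to the reference (r311848) average.
    percent = 100 * diff / avg1
    return diff, margin, percent, pooled


diff, margin, percent, pooled = compare(R311848, R311849, T_95_DF8)
print(f"pooled s   = {pooled:.1f}")
print(f"difference = {diff:.1f} +/- {margin:.2f} pps")
print(f"relative   = {percent:.4f}%")
```

The interval on the difference stays far from zero, so the drop between
r311848 and r311849 is well outside the run-to-run noise.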

Because it's an AMD processor, I couldn't find the pmc equivalent of
CPU_CLK_UNHALTED_CORE, so I used BU_CPU_CLK_UNHALTED, but I have no idea
whether it's the right one.

http://dev.bsdrp.net/apu2.r311848.BU_CPU_CLK_UNHALTED.svg
http://dev.bsdrp.net/apu2.r311849.BU_CPU_CLK_UNHALTED.svg

Thanks


