Date:      Tue, 27 Jan 2015 17:38:11 -0600
From:      Jim Thompson <jim@netgate.com>
To:        Antoine Beaupré <anarcat@koumbit.org>
Cc:        freebsd-net@freebsd.org, wishmaster <artemrts@ukr.net>
Subject:   Re: is polling still a thing?
Message-ID:  <6BB47230-9AB8-4F0B-843B-7C51330F8306@netgate.com>
In-Reply-To: <87pp9zc1wk.fsf@marcos.anarc.at>
References:  <871tmgceup.fsf@marcos.anarc.at> <1422384769.867067950.y2iiuu53@frv34.fwdcdn.com> <87pp9zc1wk.fsf@marcos.anarc.at>

> On Jan 27, 2015, at 4:08 PM, Antoine Beaupré <anarcat@koumbit.org> wrote:
>
> On 2015-01-27 13:57:20, wishmaster wrote:
>> Have you consider to use netmap-based ipfw instead pf in DDoS
>> mitigation? I think you should. And without any network ''haks'' like
>> polling.
>
> My understanding of netmap was that it wasn't useful for packet
> forwarding, because its design is for transmitting packets directly to
> userland faster, whereas routers dataflow stay mostly in the router…

the problem is that the “data flow” in freebsd isn’t very fast.
(I’d go so far as to say “broken”, but that’s throwing rocks.)

But as long as the window is already broken:
the rtentry locking is a good example of how the stack is broken,
the lack of FIB caching is another issue,
and packet-at-a-time-to-completion processing (no batching) is another.

So do ‘N’ packets’ worth of address lookups (ACLs, …, etc.) at a time,
just like “Click” showed a decade ago (and where the polling mode was
of use).
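
A minimal sketch of the difference between packet-at-a-time and batched lookups follows. It is not FreeBSD's or Click's code: `struct pkt`, `fib_lookup()`, `forward_one()` and `forward_burst()` are hypothetical names, and the "FIB" is a stand-in that only counts how often the expensive path runs. The batched loop keeps a one-entry cache of the last destination, so a burst of packets to the same address pays for a single lookup.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet descriptor: only the fields this sketch needs. */
struct pkt {
    uint32_t dst;      /* destination IPv4 address, host byte order */
    int      out_port; /* filled in by the forwarder */
};

/* Counts calls to the expensive lookup, so the saving can be observed. */
static unsigned long slow_lookups;

/* Stand-in for an expensive route lookup (hypothetical: picks a port
 * from the top address byte; a real FIB does a longest-prefix match). */
static int fib_lookup(uint32_t dst)
{
    slow_lookups++;
    return (int)(dst >> 24) & 1;
}

/* Packet-at-a-time: one full lookup per packet. */
static void forward_one(struct pkt *p)
{
    p->out_port = fib_lookup(p->dst);
}

/* Batched: walk a burst of n packets and reuse the previous result when
 * consecutive packets share a destination (a one-entry lookup cache). */
static void forward_burst(struct pkt *burst, size_t n)
{
    uint32_t cached_dst = 0;
    int cached_port = -1;
    int have_cache = 0;

    for (size_t i = 0; i < n; i++) {
        if (!have_cache || burst[i].dst != cached_dst) {
            cached_port = fib_lookup(burst[i].dst);
            cached_dst = burst[i].dst;
            have_cache = 1;
        }
        burst[i].out_port = cached_port;
    }
}
```

With a burst of four packets where the first three share a destination, `forward_burst()` calls `fib_lookup()` twice, while calling `forward_one()` on each packet calls it four times. Real traffic is less friendly than that, which is why the cache here is only an illustration of the idea.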

But it’s trivial to build a packet forwarder (more L2 than L3, but all
things are possible) using netmap (or dpdk) that smacks the freebsd
(and linux) stacks with a large stick.
The netmap code comes with a “bridge.c” example that is just that, a
dead-simple bridge.  Another example, “netmap-fwd”, runs at 14.88Mpps
between two 10Gbps interfaces.
(Neither pf nor the kernel-resident ipfw will come close; both are more
than an order of magnitude slower.)
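
For context, 14.88Mpps matches the theoretical packet rate of 10Gbps Ethernet with minimum-size (64-byte) frames. The arithmetic is sketched below; `line_rate_pps()` and the macro name are made up for this illustration, and the 20 bytes of per-frame overhead are the standard preamble, start-of-frame delimiter and inter-frame gap.

```c
/* Per-frame overhead on the wire that is not part of the frame itself:
 * 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte
 * inter-frame gap. */
#define ETH_WIRE_OVERHEAD_BYTES 20u

/* Maximum frames per second for a link of link_bps bits per second
 * carrying back-to-back frames of frame_bytes (FCS included). */
static double line_rate_pps(double link_bps, unsigned frame_bytes)
{
    double bits_per_frame = 8.0 * (frame_bytes + ETH_WIRE_OVERHEAD_BYTES);
    return link_bps / bits_per_frame;
}
```

`line_rate_pps(10e9, 64)` works out to 10e9 / 672, roughly 14.88 million packets per second.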

Here’s something a bit more than “dead simple”:
https://github.com/caladri/brilter

This would be even faster if Juli used one of the Lua JITs, e.g.:
http://wingolog.org/archives/2014/09/02/high-performance-packet-filtering-with-pflua

And if you want to go ‘full tilt’, Click has run on top of netmap
since 2012:
https://github.com/kohler/click/commits/netmap
(the code is in the master branch, too.  use master.)

As for the netmap-ipfw code, it’s 6.5Mpps to 10Mpps (later editions of
the code):
http://freebsd.1045724.n5.nabble.com/ipfw-meets-netmap-6-5-Mpps-in-userspace-td5734014.html


> I'm hesitant in switching back to ipfw, considering how nice the
> featureset and syntax of pf is. But if that's what's needed to restore
> sanity…

pf is sane?  No, I don’t think so.

(yes, it does say “pf” at the front of “pfSense”.  so what?  I mean,
have you looked at the code?)

Turn off polling, unless you know you need it.   You’ll know you
‘need it’ if you start making changes to the stack.

There is a lot of “mystery meat” in most fields, and the field of
computers / operating systems contains its share.

As a somewhat associated example, Intel says, “hyperthreading helps
(networking) performance!”  6wind says this too.   freebsd developers
say, “hyperthreading hurts performance”.

In the end, it depends what is stalling the CPU.  Hyper-threading is a
trick to share the write pipes on the core, and traditional
implementations of memcpy() will fill these pipes (call them buffers if
you like.)
And the stack does a lot of “memcpy()”.  (I’m waiting for the yowls of
“we zero-copy!”, because anyone who asserts this just hasn’t looked at
the stack.)  There are tricks: if your code is interleaving access to
the write pipes well, you’ll see more benefit.  This really wants
cache-aligned data structures, etc.
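
As one illustration of what “cache-aligned data structures” can look like in C11, here is a sketch of per-ring counters that each occupy their own cache line, so two cores (or two hyperthreads) updating different rings never write to the same line. The 64-byte line size is an assumption about the target CPU, and `struct ring_stats`, `NRINGS` and `ring_stats_add()` are names invented for this example; it shows the alignment technique only, not the write-pipe behaviour described above.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumption: 64-byte cache lines on the target CPU */
#define NRINGS     4  /* hypothetical number of rings/queues */

/* Aligning the first member to CACHE_LINE aligns the whole struct and
 * pads its size up to a multiple of CACHE_LINE, so each array element
 * below starts on its own cache line. */
struct ring_stats {
    alignas(CACHE_LINE) uint64_t packets;
    uint64_t bytes;
};

static struct ring_stats stats[NRINGS];

/* Account one packet of len bytes against a ring's private counters. */
static void ring_stats_add(unsigned ring, uint64_t len)
{
    stats[ring].packets += 1;
    stats[ring].bytes += len;
}
```

The cost is memory: each element uses a full line even though only 16 bytes of it hold data.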

So, that’s just a long-winded “YMMV”.

Jim




