Date:      Fri, 30 May 2014 15:08:21 +0400
From:      "Alexander V. Chernikov" <melifaro@FreeBSD.org>
To:        Nadav Har'El <nyh@math.technion.ac.il>
Cc:        freebsd-net@freebsd.org, osv-dev@googlegroups.com
Subject:   Re: Route caching
Message-ID:  <538866A5.9050901@FreeBSD.org>
In-Reply-To: <20140529123306.GA16644@fermat.math.technion.ac.il>
References:  <20140529123306.GA16644@fermat.math.technion.ac.il>

On 29.05.2014 16:33, Nadav Har'El wrote:
> Hi,
Hello!
> I'm working on the OSv project (http://osv.io/), a new BSD-licensed
> operating system for virtual machines. OSv's networking code is based
> on that of FreeBSD.
>
> I recently noticed an inefficiency that I believe exists also in
> FreeBSD's networking code, and I was wondering why this was done,
> and whether FreeBSD can also be improved in the same way by fixing
> this problem.
>
> My issue is that, for example, when running a UDP server answering
> hundreds of thousands of requests per second, I get the same number of
> calls to the routing table lookup function (rtalloc_ign_fib(), etc.).
> These calls are relatively slow: Each involves several mutex locks and
> unlocks (a rwlock for the radix tree, and a mutex for the individual
> route), which are relatively slow in the uncontended case, but even worse
> when several CPUs start to access the network heavily, and we start to see
> context switches hurting the performance of the server even further.

Yes, that's true.
> Looking at FreeBSD's udp_output(), I see it does the following:
>
>     error = ip_output(m, inp->inp_options, NULL, ipflags,
>                       inp->inp_moptions, inp);
>
> Note how NULL is passed as the third parameter. This tells ip_output
> that it can't cache the previously found route, and needs to look for
> it again and again on every packet output - even in the common case
> where a socket will only ever send packets on one interface.
>
> It seems that this change was done around FreeBSD 5.4. In the original
> UCB code (4.4Lite), I see this:
>
> 	error = ip_output(m, inp->inp_options, &inp->inp_route,
>                  inp->inp_socket->so_options & (SO_DONTROUTE | SO_BROADCAST),
>                  inp->inp_moptions);
>
> So the last-found route was cached in inp->inp_route, and possibly
> reused on the next packet to be sent.
>
> Does anyone have any idea why inp->inp_route was removed in FreeBSD?
> Doesn't this also hurt FreeBSD's network performance?
Well, there are two problems.
First, cached routes make it harder to change the routing stack to make
it more efficient.
Second, using cached routes is simply incorrect: nothing protects or
notifies you when an interface is removed. That's why we've removed
cached route support from various tunneling schemes.
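
To illustrate the second point, here is a minimal user-space sketch (not
FreeBSD code) of one conceivable way to make a per-socket cached route
notice changes: the routing table carries a generation counter that is
bumped on every route or interface change, and the cache is trusted only
while its recorded generation still matches. All names (toy_ifnet,
toy_rtable, toy_route_cache, toy_cached_lookup, ...) are hypothetical, the
"table" holds a single route, and the code is single-threaded; a real
kernel would also need locking and reference counting on the interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; these are not FreeBSD kernel types. */
struct toy_ifnet {
    const char *name;
};

struct toy_rtable {
    struct toy_ifnet *ifp;  /* outgoing interface of the only route, or NULL */
    uint32_t gen;           /* bumped on every route or interface change */
};

struct toy_route_cache {
    struct toy_ifnet *ifp;  /* result of the last full lookup */
    uint32_t gen;           /* table generation that result belongs to */
    unsigned long lookups;  /* number of full lookups performed */
};

/* Stand-in for the expensive lookup (radix tree walk, locks, ...). */
static struct toy_ifnet *
toy_lookup(const struct toy_rtable *rt)
{
    return rt->ifp;
}

/* Every change to the table, including interface removal, goes through here. */
static void
toy_rtable_set_ifp(struct toy_rtable *rt, struct toy_ifnet *ifp)
{
    rt->ifp = ifp;
    rt->gen++;              /* invalidates every cache filled before this point */
}

/* Return the cached interface if still valid, otherwise look it up again. */
static struct toy_ifnet *
toy_cached_lookup(const struct toy_rtable *rt, struct toy_route_cache *rc)
{
    assert(rt != NULL && rc != NULL);
    if (rc->ifp == NULL || rc->gen != rt->gen) {
        rc->ifp = toy_lookup(rt);
        rc->gen = rt->gen;
        rc->lookups++;
    }
    return rc->ifp;
}
```

With a cache that starts out empty, repeated calls to toy_cached_lookup()
do one full lookup and then reuse the result until toy_rtable_set_ifp()
changes the table, after which the next call looks the route up again.
Without the generation check, the cache would keep returning the old
interface pointer after the interface was removed.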

There is ongoing work to change the rte_* API and eliminate the need
for the per-rte mutex (and to use different, more efficient lookup mechanisms).
There is also an alternative you can use right now: flowtable
(not included in GENERIC).
It was fixed recently and should work better in your case.

> Thanks,
> Nadav.
>
>



