Date:      Wed, 14 Mar 2007 18:45:20 -0700
From:      "Kip Macy" <kip.macy@gmail.com>
To:        "Kris Kennaway" <kris@obsecurity.org>, net@freebsd.org
Subject:   Re: Scalability problem from route refcounting
Message-ID:  <b1fa29170703141845x2be165d5ief6ecd5938f3aee2@mail.gmail.com>
In-Reply-To: <20070315011511.GA55003@xor.obsecurity.org>
References:  <20070315011511.GA55003@xor.obsecurity.org>

Apologies in advance if you have already answered this question
elsewhere - can you point me to a HOWTO for replicating the test in my
local environment?

     -Kip

On 3/14/07, Kris Kennaway <kris@obsecurity.org> wrote:
> I have recently started looking at database performance over gigabit
> ethernet, and there seems to be a bottleneck coming from the way route
> reference counting is implemented.  On an 8-core system it looks like
> we spend a lot of time waiting for the rtentry mutex:
>
>    max      total  wait_total    count  avg  wait_avg  cnt_hold  cnt_lock  name
> [...]
>    408     950496     1135994   301418    3         3     24876     55936  net/if_ethersubr.c:397 (sleep mutex:bge1)
>    974     968617     1515169   253772    3         5     14741     60581  dev/bge/if_bge.c:2949 (sleep mutex:bge1)
>   2415   18255976     1607511   253841   71         6    125174      3131  netinet/tcp_input.c:770 (sleep mutex:inp)
>    233    1850252     2080506   141817   13        14         0    126897  netinet/tcp_usrreq.c:756 (sleep mutex:inp)
>    384    6895050     2737492   299002   23         9     92100     73942  dev/bge/if_bge.c:3506 (sleep mutex:bge1)
>    626    5342286     2760193   301477   17         9     47616     54158  net/route.c:147 (sleep mutex:radix node head)
>    326    3562050     3381510   301477   11        11    133968    110104  net/route.c:197 (sleep mutex:rtentry)
>    146     947173     5173813   301477    3        17     44578    120961  net/route.c:1290 (sleep mutex:rtentry)
>    146     953718     5501119   301476    3        18     63285    121819  netinet/ip_output.c:610 (sleep mutex:rtentry)
>     50    4530645     7885304  1423098    3         5    642391    788230  kern/subr_turnstile.c:489 (spin mutex:turnstile chain)
>
> i.e. during a 30-second sample we spend a total of >14 seconds (summed
> over all CPUs) waiting to acquire the rtentry mutex.
>
> This appears to be because (among other things), we increment and then
> decrement the route refcount for each packet we send, each of which
> requires acquiring the rtentry mutex for that route before adjusting
> the refcount.  So multiplexing traffic for lots of connections over a
> single route is being partly rate-limited by those mutex operations.
>
> This is not the end of the story, though: the bge driver is a serious
> bottleneck on its own. For example, I nulled out the route locking,
> since it is not relevant in my environment (at least for the purposes
> of this test), and that exposed bge as the next problem. Other drivers
> may not be so bad.
>
> Kris
>
>
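
As a sanity check on the ">14 seconds" figure in the quoted message, the wait_total values of the three rtentry rows can simply be added up. This is only a sketch: it assumes wait_total is reported in microseconds, which the message does not state but which makes the numbers agree with the quoted figure. The dictionary and function names are made up for illustration.

```python
# wait_total for the three "sleep mutex:rtentry" rows in the quoted profile.
# ASSUMPTION: the unit is microseconds; the message itself does not say.
RTENTRY_WAIT_TOTAL_US = {
    "net/route.c:197": 3381510,
    "net/route.c:1290": 5173813,
    "netinet/ip_output.c:610": 5501119,
}
SAMPLE_SECONDS = 30  # sample length given in the quoted message


def total_wait_seconds(wait_totals_us):
    """Sum per-site wait totals (microseconds) and convert to seconds."""
    return sum(wait_totals_us.values()) / 1_000_000


rtentry_wait_s = total_wait_seconds(RTENTRY_WAIT_TOTAL_US)
print(f"{rtentry_wait_s:.2f} s waiting on rtentry in a {SAMPLE_SECONDS} s sample")
```

The total is summed over all CPUs, so it is not directly comparable to 30 seconds of wall-clock time on one core.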

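The per-packet pattern described in the quoted message (take a reference, send, drop the reference, each step under the route's mutex) can be modelled in userspace roughly as follows. This is a toy model and not kernel code: the Route class, send_packets() and run() are hypothetical names, and a Python lock stands in for the rtentry mutex. It only illustrates that every sender sharing one route serializes on the same lock twice per packet.

```python
import threading


class Route:
    """Hypothetical stand-in for an rtentry: a refcount guarded by one mutex."""

    def __init__(self):
        self.lock = threading.Lock()
        self.refcount = 1           # reference held by the routing table
        self.lock_acquisitions = 0  # bookkeeping for this model only

    def ref(self):
        with self.lock:
            self.lock_acquisitions += 1
            self.refcount += 1

    def unref(self):
        with self.lock:
            self.lock_acquisitions += 1
            self.refcount -= 1


def send_packets(route, n):
    """Model one connection sending n packets over a shared route."""
    for _ in range(n):
        route.ref()    # taken before the packet is handed to the driver
        # ... the packet would be transmitted here ...
        route.unref()  # dropped once the send path is done with the route


def run(num_connections, packets_each):
    """Run several senders concurrently against a single shared route."""
    route = Route()
    threads = [
        threading.Thread(target=send_packets, args=(route, packets_each))
        for _ in range(num_connections)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return route


route = run(num_connections=8, packets_each=1000)
print(route.lock_acquisitions, route.refcount)
```

In this model the number of lock acquisitions is always twice the number of packets, regardless of how many connections share the route, which is the scaling concern the quoted message raises.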

