Date:      Thu, 9 Jan 2014 10:44:51 -0800
From:      Adrian Chadd <adrian@freebsd.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
Subject:   Re: Acquiring a lock on the same CPU that holds it - what can be done?
Message-ID:  <CAJ-Vmo=rayYvUYsNLs2A-T=a7WbrSA+TUPgDoGCHdbQjeJ9ynw@mail.gmail.com>
In-Reply-To: <9508909.MMfryVDtI5@ralph.baldwin.cx>
References:  <CAJ-Vmok-AJkz0THu72ThTdRhO2h1CnHwffq=cFZGZkbC=cWJZA@mail.gmail.com> <9508909.MMfryVDtI5@ralph.baldwin.cx>

On 9 January 2014 10:31, John Baldwin <jhb@freebsd.org> wrote:
> On Friday, January 03, 2014 04:55:48 PM Adrian Chadd wrote:
>> Hi,
>>
>> So here's a fun one.
>>
>> When doing TCP traffic + socket affinity + thread pinning experiments,
>> I seem to hit this very annoying scenario that caps my performance and
>> scalability.
>>
>> Assume I've lined up everything relating to a socket to run on the
>> same CPU (ie, TX, RX, TCP timers, userland thread):
>
> Are you sure this is really the best setup?  Especially if you have free CPUs
> in the system the time you lose in context switches fighting over the one
> assigned CPU for a flow when you have idle CPUs is quite wasteful.  I know
> that tying all of the work for a given flow to a single CPU is all the rage
> right now, but I wonder if you had considered assigning a pair of CPUs to a
> flow, one CPU to do the top-half (TX and userland thread) and one CPU to
> do the bottom-half (RX and timers).  This would remove the context switches
> you see and replace it with spinning in the times when the two cores actually
> contend.  It may also be fairly well suited to SMT (which I suspect you might
> have turned off currently).  If you do have SMT turned off, then you can get
> a pair of CPUs for each queue without having to reduce the number of queues
> you are using.  I'm not sure this would work better than creating one queue
> for every CPU, but I think it is probably something worth trying for your use
> case at least.
>
> BTW, the problem with just slapping critical_enter() into mutexes is you will
> run afoul of assertions the first time you contend on a mutex and have to
> block.  It may be that only the assertions would break and nothing else, but
> I'm not certain there aren't other assumptions about critical sections and
> not ever context switching for any reason, voluntary or otherwise.

It's all the rage because it turns out to bound the system's behaviour rather nicely.
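
To make "lined up" concrete: derive a CPU from the flow's 4-tuple and
have every piece of the connection's work agree on that answer. A toy
sketch (real setups use the NIC's Toeplitz RSS hash rather than this
XOR, and flow_cpu() is just an illustrative name):

#include <stdint.h>

/*
 * Map a flow's 4-tuple to a CPU so that TX, RX, TCP timers and the
 * userland thread all land in the same place.  Illustrative only;
 * on real hardware the NIC's RSS hash picks the RX queue.
 */
static inline int
flow_cpu(uint32_t saddr, uint32_t daddr, uint16_t sport, uint16_t dport,
    int ncpus)
{
        uint32_t h;

        h = saddr ^ daddr ^ (((uint32_t)sport << 16) | dport);
        h ^= h >> 16;
        return (h % ncpus);
}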

The idea is to scale to upwards of 60,000 active TCP sockets; some
people are looking at upwards of 100,000 active concurrent sockets. At
that scale the lock contention is non-trivial if the work for each
flow isn't lined up on one CPU.
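
The userland half of that lining-up is just cpuset_setaffinity(2); a
minimal sketch, error handling elided (the RX interrupt is steered
separately, e.g. "cpuset -l <cpu> -x <irq>", and for John's paired
scheme you'd point it at the SMT sibling instead of the same CPU):

#include <sys/param.h>
#include <sys/cpuset.h>

/* Pin the calling thread to a single CPU. */
static int
pin_to_cpu(int cpu)
{
        cpuset_t mask;

        CPU_ZERO(&mask);
        CPU_SET(cpu, &mask);
        /* An id of -1 means "the calling thread" for CPU_WHICH_TID. */
        return (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
            sizeof(mask), &mask));
}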

And yeah, I'm aware of the problem with just slapping critical
sections around mutexes. I've run into it on Linux, and it's why doing
this kind of work is much more fragile there. :-P
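
For the archives, here's the failure mode John is pointing at, sketched
in kernel context (sc/sc_mtx are stand-ins for any softc with a default
mutex, and the exact assertion wording may differ by branch):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lock.h>
#include <sys/mutex.h>

struct sc { struct mtx sc_mtx; };

static void
broken_pinned_path(struct sc *sc)
{
        critical_enter();               /* td_critnest++ */
        mtx_lock(&sc->sc_mtx);          /* fine while uncontended... */
        /*
         * ...but once sc_mtx is contended, mtx_lock() blocks on a
         * turnstile and ends up in mi_switch(), which (if I remember
         * right) KASSERTs "mi_switch: switch in a critical section".
         * So under INVARIANTS the first contended acquire panics
         * outright, and who knows what subtler no-context-switch
         * assumptions break once the assertions are out of the way.
         */
        mtx_unlock(&sc->sc_mtx);
        critical_exit();
}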


-a


