Date:      Mon, 13 Jan 2014 14:43:52 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
Subject:   Re: Acquiring a lock on the same CPU that holds it - what can be done?
Message-ID:  <201401131443.52550.jhb@freebsd.org>
In-Reply-To: <CAJ-Vmo=rayYvUYsNLs2A-T=a7WbrSA%2BTUPgDoGCHdbQjeJ9ynw@mail.gmail.com>
References:  <CAJ-Vmok-AJkz0THu72ThTdRhO2h1CnHwffq=cFZGZkbC=cWJZA@mail.gmail.com> <9508909.MMfryVDtI5@ralph.baldwin.cx> <CAJ-Vmo=rayYvUYsNLs2A-T=a7WbrSA%2BTUPgDoGCHdbQjeJ9ynw@mail.gmail.com>

On Thursday, January 09, 2014 1:44:51 pm Adrian Chadd wrote:
> On 9 January 2014 10:31, John Baldwin <jhb@freebsd.org> wrote:
> > On Friday, January 03, 2014 04:55:48 PM Adrian Chadd wrote:
> >> Hi,
> >>
> >> So here's a fun one.
> >>
> >> When doing TCP traffic + socket affinity + thread pinning experiments,
> >> I seem to hit this very annoying scenario that caps my performance and
> >> scalability.
> >>
> >> Assume I've lined up everything relating to a socket to run on the
> >> same CPU (ie, TX, RX, TCP timers, userland thread):
> >
> > Are you sure this is really the best setup?  Especially if you have free CPUs
> > in the system the time you lose in context switches fighting over the one
> > assigned CPU for a flow when you have idle CPUs is quite wasteful.  I know
> > that tying all of the work for a given flow to a single CPU is all the rage
> > right now, but I wonder if you had considered assigning a pair of CPUs to a
> > flow, one CPU to do the top-half (TX and userland thread) and one CPU to
> > do the bottom-half (RX and timers).  This would remove the context switches
> > you see and replace it with spinning in the times when the two cores actually
> > contend.  It may also be fairly well suited to SMT (which I suspect you might
> > have turned off currently).  If you do have SMT turned off, then you can get
> > a pair of CPUs for each queue without having to reduce the number of queues
> > you are using.  I'm not sure this would work better than creating one queue
> > for every CPU, but I think it is probably something worth trying for your use
> > case at least.
> >
> > BTW, the problem with just slapping critical enter into mutexes is you will
> > run afoul of assertions the first time you contend on a mutex and have to
> > block.  It may be that only the assertions would break and nothing else, but
> > I'm not certain there aren't other assumptions about critical sections and
> > not ever context switching for any reason, voluntary or otherwise.
> 
> It's the rage because it turns out it bounds the system behaviour rather nicely.

Yes, but are you willing to try the suggestion?  This doesn't restrict you to
a single queue-pair.  It might net you 1 per core (instead of 1 per thread),
but that's still more than 1.
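
To make the pairing concrete, here is a rough sketch of how a queue index
could map to a pair of CPUs, one for the top half (TX and the userland
thread) and one for the bottom half (RX and timers).  The names
queue_to_cpu_pair and struct flow_cpus are hypothetical and not an existing
FreeBSD interface, and the code assumes SMT siblings are numbered adjacently
(2n and 2n+1), which is not guaranteed; a real implementation would need to
query the CPU topology instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical illustration only; not a FreeBSD API. */
struct flow_cpus {
	int top_half;		/* TX and the userland thread */
	int bottom_half;	/* RX and TCP timers */
};

/*
 * Map a queue index to a pair of CPUs, one queue per core.
 *
 * ASSUMPTION: the two hardware threads of core n are CPUs 2n and 2n+1.
 * Real systems may number siblings differently, so this mapping is a
 * placeholder for a proper topology lookup.
 *
 * Returns false if the queue index does not fit in ncpus / 2 pairs.
 */
static bool
queue_to_cpu_pair(int queue, int ncpus, struct flow_cpus *out)
{
	assert(out != NULL);

	if (queue < 0 || ncpus < 2 || queue >= ncpus / 2)
		return (false);

	out->top_half = 2 * queue;
	out->bottom_half = 2 * queue + 1;
	return (true);
}
```

With 8 hardware threads this yields 4 queues, so queue 3 would land on CPUs
6 and 7 and a request for queue 4 would be rejected.  The resulting CPU
numbers would then be fed to whatever pinning mechanism is already in use
for the interrupt threads, timers, and the userland thread.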

-- 
John Baldwin


