From owner-freebsd-arch@FreeBSD.ORG Tue Jan 14 05:08:45 2014
Date: Mon, 13 Jan 2014 21:08:43 -0800
From: Adrian Chadd <adrian.chadd@gmail.com>
To: John Baldwin
Cc: "freebsd-arch@freebsd.org"
Subject: Re: Acquiring a lock on the same CPU that holds it - what can be done?

On 13 January 2014 11:43, John Baldwin wrote:
> On Thursday, January 09, 2014 1:44:51 pm Adrian Chadd wrote:
>> On 9 January 2014 10:31, John Baldwin wrote:
>> > On Friday, January 03, 2014 04:55:48 PM Adrian Chadd wrote:
>> >> Hi,
>> >>
>> >> So here's a fun one.
>> >>
>> >> When doing TCP traffic + socket affinity + thread pinning
>> >> experiments, I seem to hit this very annoying scenario that caps my
>> >> performance and scalability.
>> >>
>> >> Assume I've lined up everything relating to a socket to run on the
>> >> same CPU (ie, TX, RX, TCP timers, userland thread):
>> >
>> > Are you sure this is really the best setup?  Especially if you have
>> > free CPUs in the system, the time you lose in context switches
>> > fighting over the one CPU assigned to a flow while other CPUs sit
>> > idle is quite wasteful.  I know that tying all of the work for a
>> > given flow to a single CPU is all the rage right now, but have you
>> > considered assigning a pair of CPUs to a flow: one CPU for the top
>> > half (TX and the userland thread) and one CPU for the bottom half
>> > (RX and timers)?  This would remove the context switches you see and
>> > replace them with spinning in the times when the two cores actually
>> > contend.
>> > It may also be fairly well suited to SMT (which I suspect you might
>> > have turned off currently).  If you do have SMT turned off, then you
>> > can get a pair of CPUs for each queue without having to reduce the
>> > number of queues you are using.  I'm not sure this would work better
>> > than creating one queue for every CPU, but I think it is probably
>> > worth trying for your use case at least.
>> >
>> > BTW, the problem with just slapping critical_enter() into the mutex
>> > code is that you will run afoul of assertions the first time you
>> > contend on a mutex and have to block.  It may be that only the
>> > assertions would break and nothing else, but I'm not certain there
>> > aren't other assumptions that a critical section never context
>> > switches for any reason, voluntary or otherwise.
>>
>> It's the rage because it turns out it bounds the system behaviour
>> rather nicely.
>
> Yes, but are you willing to try the suggestion?  This doesn't restrict
> you to a single queue pair.  It might net you one queue pair per core
> (instead of one per SMT thread), but that's still more than one.

Sure. I can also try your suggestion of binding them to SMT pairs and
see if that has any effect.

But I'm specifically looking to _avoid_ contention in the main data
path entirely, not to merely have the cores occasionally spin.

-a
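
[Editor's note: for concreteness, a minimal userland sketch of the kind
of top-half/bottom-half pinning John suggests above, using FreeBSD's
cpuset_setaffinity(2).  This was not posted in the thread; the CPU ids,
thread roles, and function names are illustrative assumptions.]

/*
 * Sketch: pin a "top half" thread (TX plus userland work) and a
 * "bottom half" thread (RX plus timer work) to a pair of CPUs with
 * cpuset_setaffinity(2).  CPU ids 2 and 3 are *assumed* to be an SMT
 * sibling pair; check the kern.sched.topology_spec sysctl for the
 * real topology before reusing them.  Build with -lpthread.
 */
#include <sys/param.h>
#include <sys/cpuset.h>

#include <err.h>
#include <pthread.h>

static void
pin_self(int cpu)
{
	cpuset_t mask;

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	/* An id of -1 with CPU_WHICH_TID means "the calling thread". */
	if (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
	    sizeof(mask), &mask) != 0)
		err(1, "cpuset_setaffinity(%d)", cpu);
}

static void *
top_half(void *arg)
{
	pin_self(2);		/* assumed: first CPU of the pair */
	/* ... socket TX and userland protocol work ... */
	return (NULL);
}

static void *
bottom_half(void *arg)
{
	pin_self(3);		/* assumed: SMT sibling of CPU 2 */
	/* ... RX processing and timer-driven work ... */
	return (NULL);
}

int
main(void)
{
	pthread_t tx, rx;

	if (pthread_create(&tx, NULL, top_half, NULL) != 0 ||
	    pthread_create(&rx, NULL, bottom_half, NULL) != 0)
		errx(1, "pthread_create");
	pthread_join(tx, NULL);
	pthread_join(rx, NULL);
	return (0);
}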
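
[Editor's note: likewise, a hedged kernel-side sketch of the failure
mode John describes when critical_enter() is wrapped around a blocking
mutex acquisition.  Not runnable standalone; example_mtx and the
function name are hypothetical.]

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>

/* Assume mtx_init(&example_mtx, "example", NULL, MTX_DEF) ran earlier. */
static struct mtx example_mtx;

static void
critical_plus_mutex(void)
{
	critical_enter();	/* td_critnest++: preemption now forbidden */

	/*
	 * Uncontended, this is a single atomic compare-and-set and
	 * appears to work.  Contended, a default (MTX_DEF) mutex may
	 * have to block, which ends in mi_switch(), and mi_switch()
	 * asserts that it is not called from inside a caller's critical
	 * section.  That is the assertion John expects to fire first;
	 * other code may also assume a critical section never context
	 * switches.
	 */
	mtx_lock(&example_mtx);
	/* ... */
	mtx_unlock(&example_mtx);

	critical_exit();
}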