From owner-freebsd-arch@FreeBSD.ORG Tue Jan 14 05:08:45 2014
Date: Mon, 13 Jan 2014 21:08:43 -0800
From: Adrian Chadd <adrian.chadd@gmail.com>
To: John Baldwin
Cc: "freebsd-arch@freebsd.org"
Subject: Re: Acquiring a lock on the same CPU that holds it - what can be done?

On 13 January 2014 11:43, John Baldwin wrote:
> On Thursday, January 09, 2014 1:44:51 pm Adrian Chadd wrote:
>> On 9 January 2014 10:31, John Baldwin wrote:
>> > On Friday, January 03, 2014 04:55:48 PM Adrian Chadd wrote:
>> >> Hi,
>> >>
>> >> So here's a fun one.
>> >>
>> >> When doing TCP traffic + socket affinity + thread pinning
>> >> experiments, I seem to hit this very annoying scenario that caps my
>> >> performance and scalability.
>> >>
>> >> Assume I've lined up everything relating to a socket to run on the
>> >> same CPU (ie, TX, RX, TCP timers, userland thread):
>> >
>> > Are you sure this is really the best setup?  Especially if you have
>> > free CPUs in the system, the time you lose in context switches
>> > fighting over the one CPU assigned to a flow while other CPUs sit
>> > idle is quite wasteful.  I know that tying all of the work for a
>> > given flow to a single CPU is all the rage right now, but have you
>> > considered assigning a pair of CPUs to a flow: one CPU for the top
>> > half (TX and the userland thread) and one CPU for the bottom half
>> > (RX and timers)?  This would remove the context switches you see and
>> > replace them with spinning in the times when the two cores actually
>> > contend.
>> > It may also be fairly well suited to SMT (which I suspect you might
>> > have turned off currently).  If you do have SMT turned off, then you
>> > can get a pair of CPUs for each queue without having to reduce the
>> > number of queues you are using.  I'm not sure this would work better
>> > than creating one queue for every CPU, but I think it is probably
>> > worth trying for your use case at least.
>> >
>> > BTW, the problem with just slapping critical_enter() into the mutex
>> > code is that you will run afoul of assertions the first time you
>> > contend on a mutex and have to block.  It may be that only the
>> > assertions would break and nothing else, but I'm not certain there
>> > aren't other assumptions that a critical section never context
>> > switches for any reason, voluntary or otherwise.
>>
>> It's the rage because it turns out it bounds the system behaviour
>> rather nicely.
>
> Yes, but are you willing to try the suggestion?  This doesn't restrict
> you to a single queue pair.  It might net you one queue pair per core
> (instead of one per SMT thread), but that's still more than one.

Sure. I can also try your suggestion of binding them to SMT pairs and
see if that has any effect.

But I'm specifically looking to _avoid_ contention in the main data
path entirely, not to merely have the cores occasionally spin.

-a
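
[Editor's note: for concreteness, a minimal userland sketch of the kind
of top-half/bottom-half pinning John suggests above, using FreeBSD's
cpuset_setaffinity(2).  This was not posted in the thread; the CPU ids,
thread roles, and function names are illustrative assumptions.]

/*
 * Sketch: pin a "top half" thread (TX plus userland work) and a
 * "bottom half" thread (RX plus timer work) to a pair of CPUs with
 * cpuset_setaffinity(2).  CPU ids 2 and 3 are *assumed* to be an SMT
 * sibling pair; check the kern.sched.topology_spec sysctl for the
 * real topology before reusing them.  Build with -lpthread.
 */
#include <sys/param.h>
#include <sys/cpuset.h>

#include <err.h>
#include <pthread.h>

static void
pin_self(int cpu)
{
	cpuset_t mask;

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	/* An id of -1 with CPU_WHICH_TID means "the calling thread". */
	if (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
	    sizeof(mask), &mask) != 0)
		err(1, "cpuset_setaffinity(%d)", cpu);
}

static void *
top_half(void *arg)
{
	pin_self(2);		/* assumed: first CPU of the pair */
	/* ... socket TX and userland protocol work ... */
	return (NULL);
}

static void *
bottom_half(void *arg)
{
	pin_self(3);		/* assumed: SMT sibling of CPU 2 */
	/* ... RX processing and timer-driven work ... */
	return (NULL);
}

int
main(void)
{
	pthread_t tx, rx;

	if (pthread_create(&tx, NULL, top_half, NULL) != 0 ||
	    pthread_create(&rx, NULL, bottom_half, NULL) != 0)
		errx(1, "pthread_create");
	pthread_join(tx, NULL);
	pthread_join(rx, NULL);
	return (0);
}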
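
[Editor's note: likewise, a hedged kernel-side sketch of the failure
mode John describes when critical_enter() is wrapped around a blocking
mutex acquisition.  Not runnable standalone; example_mtx and the
function name are hypothetical.]

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>

/* Assume mtx_init(&example_mtx, "example", NULL, MTX_DEF) ran earlier. */
static struct mtx example_mtx;

static void
critical_plus_mutex(void)
{
	critical_enter();	/* td_critnest++: preemption now forbidden */

	/*
	 * Uncontended, this is a single atomic compare-and-set and
	 * appears to work.  Contended, a default (MTX_DEF) mutex may
	 * have to block, which ends in mi_switch(), and mi_switch()
	 * asserts that it is not called from inside a caller's critical
	 * section.  That is the assertion John expects to fire first;
	 * other code may also assume a critical section never context
	 * switches.
	 */
	mtx_lock(&example_mtx);
	/* ... */
	mtx_unlock(&example_mtx);

	critical_exit();
}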