From owner-svn-src-projects@FreeBSD.ORG  Sat Nov 24 15:43:29 2012
Return-Path: <owner-svn-src-projects@FreeBSD.ORG>
Delivered-To: svn-src-projects@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 022A2ECF;
 Sat, 24 Nov 2012 15:43:29 +0000 (UTC)
 (envelope-from asmrookie@gmail.com)
Received: from mail-la0-f54.google.com (mail-la0-f54.google.com
 [209.85.215.54])
 by mx1.freebsd.org (Postfix) with ESMTP id 6AC0D8FC12;
 Sat, 24 Nov 2012 15:43:26 +0000 (UTC)
Received: by mail-la0-f54.google.com with SMTP id j13so9459690lah.13
 for <multiple recipients>; Sat, 24 Nov 2012 07:43:26 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:reply-to:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=q8msky12cZxfMj7V7Dbx5wOeQ9FmGKtoWoWAbZ3FdME=;
 b=AdS2dKMh46CWqVrrFydfnzvvtwzmfI1ogcgmzxxUUfgMqhby02mQLauEUMhVQJwsLO
 UERr8Kzftaz2MTn+ci3chmQyEtA8nfxPI1ei82n+h0ap9cpU8Mg3yJyH5UdYz5CCZV+l
 l1cPmUHY0kjUFUHhFFbIjBjIwE+TznyF1tOtCiF9vRdXWDkd615inKB9UyoRp2Xp1/SZ
 Aacdly30/7Ax6xTHrztyEoXyCuNbHQv11EfB4+oVs33JW806m7lDHuonoP04b61uJk8Q
 DJvK3AEFUdAUhQOubkwuziJjC+rD5Rw3OKpeF5w7gyHx8iG96rN/+qCzK2g7+nXFZXA3
 E2jg==
MIME-Version: 1.0
Received: by 10.152.104.50 with SMTP id gb18mr6424578lab.9.1353771806090; Sat,
 24 Nov 2012 07:43:26 -0800 (PST)
Sender: asmrookie@gmail.com
Received: by 10.112.134.5 with HTTP; Sat, 24 Nov 2012 07:43:26 -0800 (PST)
In-Reply-To: <20121109034942.C5338@besplex.bde.org>
References: <201207301350.q6UDobCI099069@svn.freebsd.org>
 <CAJ-FndBj8tpC_BJXs_RH8sG2TBG8yA=Lxu3-GTVT9Ap_zOCuVQ@mail.gmail.com>
 <CAJ-FndDnO7wjnWPV0tTu+UGHjsxa3YDarMxmyei3ZmjLAFvRkQ@mail.gmail.com>
 <201207301732.33474.jhb@freebsd.org>
 <CAJ-FndD5EO12xsWOAe6u0EvX00q33wxO4OivnGjzj0=T2Oe8uA@mail.gmail.com>
 <CAJ-FndCRg0UCThFkatp=tw7rUWWCvhsApLE=iztLpxpGBC1F9w@mail.gmail.com>
 <CAJ-FndBqV2uD8Th9ePtxyJwhMAPzY3AXA5cQ7HszLp=+fSpHTA@mail.gmail.com>
 <CAJ-FndDPLmkpAJeGVN2wgbhdgHYezyUV-PPvH9e-CA7Go7HG3A@mail.gmail.com>
 <20121029155136.O943@besplex.bde.org>
 <CAJ-FndDpB2vQ7nBgYS4mNHb3_G4sXi3SfU7Y7kdVHMQZYKapig@mail.gmail.com>
 <20121109034942.C5338@besplex.bde.org>
Date: Sat, 24 Nov 2012 15:43:26 +0000
X-Google-Sender-Auth: rzYqKIwVjZsP611JMYi_sF7e2Pw
Message-ID: <CAJ-FndCZKpHaRcRMBT5XTRh0nRDjK3f-aVT3rjRzKpMRLU+DzA@mail.gmail.com>
Subject: Re: svn commit: r238907 - projects/calloutng/sys/kern
From: Attilio Rao <attilio@freebsd.org>
To: Bruce Evans <brde@optusnet.com.au>
Content-Type: text/plain; charset=UTF-8
Cc: src-committers@freebsd.org, John Baldwin <jhb@freebsd.org>,
 Jeff Roberson <jeff@freebsd.org>, Florian Smeets <flo@freebsd.org>,
 Bruce Evans <bde@freebsd.org>, svn-src-projects@freebsd.org
X-BeenThere: svn-src-projects@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: attilio@FreeBSD.org
List-Id: "SVN commit messages for the src &quot; projects&quot;
 tree" <svn-src-projects.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/svn-src-projects>, 
 <mailto:svn-src-projects-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-projects>
List-Post: <mailto:svn-src-projects@freebsd.org>
List-Help: <mailto:svn-src-projects-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-projects>, 
 <mailto:svn-src-projects-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 24 Nov 2012 15:43:29 -0000

On Thu, Nov 8, 2012 at 5:26 PM, Bruce Evans <brde@optusnet.com.au> wrote:
> On Fri, 2 Nov 2012, Attilio Rao wrote:
>
>> On 10/29/12, Bruce Evans <brde@optusnet.com.au> wrote:
>>>
>>> On Mon, 29 Oct 2012, Attilio Rao wrote:
>>>
>>>> Now that sched_pin()/sched_unpin() are fixed I would like to introduce
>>>> this new patch, making critical_enter()/critical_exit() inline:
>>>> http://www.freebsd.org/~attilio/inline_critical.patch
>>>
>>> ...
>>>
>>> My version goes the other way and uninlines mtx_lock_spin() and
>>> mtx_unlock_spin().  Then it inlines (open codes) critical_enter() and
>>> ...
>>
>>
>> So, I thought more about this and I think that inlining
>> critical_exit() is not really going to bring any benefit here but
>> bloat.
>> This is because nested critical sections are rare rather not, which
>
>
> Rather rare !not? :-)
>
>
>> means you will end up in doing the function call most of the time and
>> plus we have the possible pessimization introduced by the memory
>> clobber (__compiler_membar()) and as you note possible deficiency
>> caming from the branch misprediction*.
>
>
> This seems best.
>
> I see a point about the rareness of the branch in critical_exit().
> Not sure if it is the one you are making: since the nested case is
> rare, then the branch will normally be correctly predicted.  If the
> function remains uninlined, then the branch still has a good chance
> of being correctly predicted.  This depends on the nested case being
> so rare across all callers, that the non-nested case doesn't mess
> up the prediction by happening often.  The branch predictor only
> has to maintain history for 1 branch for this.  However, if the
> call is inlined and there are many callers, there might be too many
> to maintain history for them all.

Yes, that's basically the same conclusion I came up with.

It seems you are not opposed to this version of the patch.
I made some in-kernel benchmarks but they didn't really show a
performance improvements, bigger than 1-2%, which on SMP system is
basically accountable to thirdy-part effects.

However we already employ the same code for sched_pin() and I then
think that we can just commit this patch as-is.

>
>> I feel very unsure style-wise about my add to sys/systm.h because it
>> is already very messy that I cannot give a logical sense to that
>> chunk.
>
>
> It is less bad than most places.  Some of the inlines in libkern.h
> are misplaced.  However, the bitcount inlines in systm.h belong
> closer to libkern (or /dev/null).  The split between systm.h and
> kernel.h is bogus.  But systemy things like critical* don't
> belong near libkern.
>
>
>> * In theory we could use __predict_false() or similar but I don't
>> think this will really help as x86 doesn't have explicit instructions
>> to control branch prediction
>
>
> It can help as follows on x86:
> - modern x86 has a default for when the branch predictor is cold.  The
>   compiler should arrange the code so that the default matches the
>   usual case.  Compilers can either follow the code order or guess
>   which branch is usual (-fpredict-branch-probability IIRC).  The
>   branches in critical_exit() and x86 spinlock_enter() seem to have
>   a good order for the first, but might be mispredicted by the compiler.
>   The branch in spinlock_exit() seems to be in a bad order for the first.
>   This doesn't matter if the CPU's branch predictor stays warm.
> - compilers can help the CPU's branch predictor and instruction prefetcher
>   by moving the unusual case far away.  This helps mainly for large code.
>   The code in critical_exit() is not large, except possibly if it is
>   inlined,
>
> But __predict_false() never did much for me, perhaps because I test mainly
> non-large code in loops where branch predictors stay warm.

Shouldn't large code benefit less likely by correct branch prediction?
As the pipeline are not huge wrongly predicting a branch should have
no effect if the jump label is far enough to not be pre-fetched.
Anyway, I never saw __predict_false() giving a measurable performance
bump on x86, neither x86 has explicit instructions to control it.
That's what I wanted to actually say.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein