Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 9 Dec 2012 04:55:50 +0000
From:      Attilio Rao <attilio@freebsd.org>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        src-committers@freebsd.org, John Baldwin <jhb@freebsd.org>, Jeff Roberson <jeff@freebsd.org>, Florian Smeets <flo@freebsd.org>, Bruce Evans <bde@freebsd.org>, svn-src-projects@freebsd.org
Subject:   Re: svn commit: r238907 - projects/calloutng/sys/kern
Message-ID:  <CAJ-FndAfmV7GTRzZfRTbOf0Fw2Bq6JJeCqHgybdm04GmN8p-mg@mail.gmail.com>
In-Reply-To: <20121207125401.S1231@besplex.bde.org>
References:  <201207301350.q6UDobCI099069@svn.freebsd.org> <CAJ-FndBj8tpC_BJXs_RH8sG2TBG8yA=Lxu3-GTVT9Ap_zOCuVQ@mail.gmail.com> <CAJ-FndDnO7wjnWPV0tTu%2BUGHjsxa3YDarMxmyei3ZmjLAFvRkQ@mail.gmail.com> <201207301732.33474.jhb@freebsd.org> <CAJ-FndD5EO12xsWOAe6u0EvX00q33wxO4OivnGjzj0=T2Oe8uA@mail.gmail.com> <CAJ-FndCRg0UCThFkatp=tw7rUWWCvhsApLE=iztLpxpGBC1F9w@mail.gmail.com> <CAJ-FndBqV2uD8Th9ePtxyJwhMAPzY3AXA5cQ7HszLp=%2BfSpHTA@mail.gmail.com> <CAJ-FndDPLmkpAJeGVN2wgbhdgHYezyUV-PPvH9e-CA7Go7HG3A@mail.gmail.com> <20121029155136.O943@besplex.bde.org> <CAJ-FndDpB2vQ7nBgYS4mNHb3_G4sXi3SfU7Y7kdVHMQZYKapig@mail.gmail.com> <20121109034942.C5338@besplex.bde.org> <CAJ-FndCZKpHaRcRMBT5XTRh0nRDjK3f-aVT3rjRzKpMRLU%2BDzA@mail.gmail.com> <CAJ-FndCnj9hKV9f%2BRmF-OCijtRpKUWv88AktvrsBYp5_hFUmVA@mail.gmail.com> <20121207125401.S1231@besplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Dec 7, 2012 at 2:42 AM, Bruce Evans <brde@optusnet.com.au> wrote:
> On Wed, 5 Dec 2012, Attilio Rao wrote:
>
>> On Sat, Nov 24, 2012 at 3:43 PM, Attilio Rao <attilio@freebsd.org> wrote:
>>>
>>> On Thu, Nov 8, 2012 at 5:26 PM, Bruce Evans <brde@optusnet.com.au> wrote:
>>>>
>>>> On Fri, 2 Nov 2012, Attilio Rao wrote:
>>>>
>>>>> On 10/29/12, Bruce Evans <brde@optusnet.com.au> wrote:
>>>>>>
>>>>>>
>>>>>> On Mon, 29 Oct 2012, Attilio Rao wrote:
>>>>>>
>>>>>>> Now that sched_pin()/sched_unpin() are fixed I would like to
>>>>>>> introduce
>>>>>>> this new patch, making critical_enter()/critical_exit() inline:
>>>>>>> http://www.freebsd.org/~attilio/inline_critical.patch
>>>>>>
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> My version goes the other way and uninlines mtx_lock_spin() and
>>>>>> mtx_unlock_spin().  Then it inlines (open codes) critical_enter() and
>>>>>> ...
>>>>>
>>>>>
>>>>>
>>>>> So, I thought more about this and I think that inlining
>>>>> critical_exit() is not really going to bring any benefit here but
>>>>> bloat.
>>>>> This is because nested critical sections are rare rather not, which
>>>>
>>>>
>>>>
>>>> Rather rare !not? :-)
>>>>
>>>>
>>>>> means you will end up in doing the function call most of the time and
>>>>> plus we have the possible pessimization introduced by the memory
>>>>> clobber (__compiler_membar()) and as you note possible deficiency
>>>>> caming from the branch misprediction*.
>>>>
>>>>
>>>>
>>>> This seems best.
>>>>
>>>> I see a point about the rareness of the branch in critical_exit().
>>>> Not sure if it is the one you are making: since the nested case is
>>>> rare, then the branch will normally be correctly predicted.  If the
>>>> function remains uninlined, then the branch still has a good chance
>>>> of being correctly predicted.  This depends on the nested case being
>>>> so rare across all callers, that the non-nested case doesn't mess
>>>> up the prediction by happening often.  The branch predictor only
>>>> has to maintain history for 1 branch for this.  However, if the
>>>> call is inlined and there are many callers, there might be too many
>>>> to maintain history for them all.
>>>
>>>
>>> Yes, that's basically the same conclusion I came up with.
>>>
>>> It seems you are not opposed to this version of the patch.
>>> I made some in-kernel benchmarks but they didn't really show a
>>> performance improvements, bigger than 1-2%, which on SMP system is
>>> basically accountable to thirdy-part effects.
>
>
> 1-2% is about the best that can be hoped for from a single change,
> but is too hard to measure with confidence.
>
>
>>> However we already employ the same code for sched_pin() and I then
>>> think that we can just commit this patch as-is.
>>
>>
>> I made up my mind on instead not committing this patch, as I cannot
>> prove a real performance gain, as also Jeff agreed with privately.
>> Instead, I would like to commit this small comment explaining why it
>> is not inlined (see below). Let me know what you think.
>
>
> Good.
>
> Minor grammar and fixes:
>
>
>> Index: sys/kern/kern_switch.c
>> ===================================================================
>> --- sys/kern/kern_switch.c      (revision 243911)
>> +++ sys/kern/kern_switch.c      (working copy)
>> @@ -176,6 +176,11 @@ retry:
>> /*
>>  * Kernel thread preemption implementation.  Critical sections mark
>>  * regions of code in which preemptions are not allowed.
>> + * It would seem a good idea to inline such function but, in order
>
>
> s/would/might/
>
> s/such/such a/  or better (?),  s/such function/crticical_enter()
>
> I think the new part of the comment only applies to critical_enter().
> critical_exit() is much larger, so it never seemed such a good idea
> to inline it.
>
> If this sentence begins a new paragraph, then it should be preceded by
> an empty line.  Otherwise, it should not begin on a new line.  I think
> the latter applies.  In fact, the comment should be separate.  The old
> part of the comment applies to both critical_enter() and critical_exit(),
> but it is bogusly attached to only the former.  After separating it from
> the former, the new part of the comment can be better attached to the
> former alone.
>
>
>> + * to prevent instructions reordering by the compiler, a
>> __compiler_membar()
>
>
> s/instructions/instructions/
>
> s/a __compiler.../__compiler.../
>
>
>> + * must be used here (look at sched_pin() case).  The performance penalty
>
>
> s/must/would have to/
>
> s/look at sched_pin() case/the same as for sched_pin()/
>
> Looking at sched_pin() doesn't show any comment about this.  I seem to
> remember discussions of this.  Maybe the details are in a log message.
>
> Technical points:
> - is the function being extern really enough to force the equivalent of
>   __compiler_membar()?  Inter-file optimization might break this.  It
>   would be easy to add an explicit __compiler_membar() to the beginning
>   of critical_enter(), but there are probably many other functions that
>   would need the same treatment for inter-file optimization.
> - compilers already do intra-file optimization giving automatic inlining
>   of (static) functions that are only called once, unless this is
>   disabled by -fno-inline-functions-called-once.  It should be disabled
>   by default, since it also breaks debugging including stack traces,
>   and profiling.  Perhaps it also breaks implicit membars.
> - IIRC, inlining is not permitted to change function call semantics, so
>   it may be a bug for the membar in sched_pin() to have any effect.
>   Anyway, extern functions are not required to give stricter ordering
>   than static [inline] ones.  Function calls are required to give
>   sequence points (after their parameters have been evaluated), and
>   sequence points are required to have all side effects of previous
>   evaluations complete and no side effects of subsequent evaluations
>   begun.  Is that any different from a membar?  I think membars are
>   a little more magic, and it is hard to see how anything can give
>   stricter ordering requirements on the compiler than a sequence
>   point except by magic.
>
>
>> + * imposed by the membar could, then, produce slower code than
>> + * the function call itself, for most cases.
>>  */
>
>
> The punctuation given by all those commas seems to be correct, but oit
> is painfully formal.

I've integrated your suggestions and committed as r244046, thanks.

For your concern about sched_pin() please check notes reported in this
same thread and commit log.

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-FndAfmV7GTRzZfRTbOf0Fw2Bq6JJeCqHgybdm04GmN8p-mg>