Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 17 Jun 2013 23:20:53 +0200
From:      Ed Schouten <ed@80386.nl>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r251803 - head/sys/kern
Message-ID:  <CAJOYFBD6JE1%2Bn6uX%2BSfHU-WdCnAYDMW6gB_%2BDQ_DRVNfAQS11A@mail.gmail.com>
In-Reply-To: <51BDCEE0.8050000@freebsd.org>
References:  <201306160930.r5G9UZfE059294@svn.freebsd.org> <51BDCEE0.8050000@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
Hi Nathan,

2013/6/16 Nathan Whitehorn <nwhitehorn@freebsd.org>:
> I'm a little worried about these kinds of changes from a performance
> standpoint when using GCC 4.2. In particular, from the GCC manual: "In most
> cases, these builtins are considered a full barrier." This is much more
> synchronization that many of the atomic ops in machine/atomic.h actually
> require. I'm worried this could lead to serious performance regressions on
> e.g. PowerPC. gcc mostly seems to do the right thing, but I'm not completely
> sure and it probably needs extensive testing. One way to accomplish that
> could be to implement atomic(9) in terms of stdatomic. If nothing breaks or
> becomes slow, then we will know we are in the clear permanently.
> -Nathan

Agreed. I did indeed implement <machine/atomic.h> on top of
<stdatomic.h> as a test a couple of weeks ago. What is nice, is that
if I look at amd64/i386, the emitted machine code is almost identical,
with the exception that in certain cases, <stdatomic.h> generates more
compact instructions (e.g. "lock inc" instead of adding an immediate
1).

On armv6 the trend is similar, with the exception that in some cases
Clang manages to emit slightly more intelligent code. It seems that
one of our pieces of inline assembly causes the compiler to zero out
certain registers before inserting the inline assembly, even though
these registers tend to be overwritten by the assembly anyway. Weird.

Replacement of <machine/atomic.h> used on amd64:

http://80386.nl/pub/machine-atomic-wrapped.txt

Still, you were actually interested in knowing the difference in
performance when using GCC 4.2. I have to confess, I don't have any
numbers on this, but I suspect there will be a dip, of course. But let
me be clear on this; I am not proposing that we migrate our existing
codebase to C11 atomics within the nearby future. This is something
that should be considered by the time most of the platforms use Clang
(or, unlikely GCC 4.6+).

The reason why I made this chance, was that I at least want to have
some coverage of the C11 atomics both in kernelspace and userspace. My
goal is that C11 atomics work correctly on FreeBSD 10.0. My fear is
that this likely cannot be achieved if there are exactly 0 pieces of
code in our tree that use this. By not doing so, breakage of
<stdatomic.h> could go by unnoticed, maybe already when someone makes
a tiny "harmless" modification to <sys/cdefs.h> or <sys/_types.h>.

Correct me if I'm wrong, but I think it's extremely unlikely that this
specific change will noticeably regress performance of the system as a
whole. If I wanted to cripple performance on these architectures, I
would have changed mtx(9) to use C11 atomics instead.

Unrelated to this, there is something about this specific piece of
code that is actually very interesting if you look at it into more
detail. Notice how I took the liberty of changing filt_timerattach()
to use a compare-and-exchange, instead of the two successive atomic
operations it used to do. Maybe a smart compiler could consider
rewriting this piece of code to something along the lines of this (on
armv6):

ldr r0, [kq_calloutmax]
ldrex r1, [kq_ncallouts]
cmp r0, r1
blt ...
add r2, r1, #1
strex r1, r2, [kq_ncallouts]

In other words, convert this to a "compare-less-than-and-increment",
which is not offered by <machine/atomic.h>. It'll be interesting to
see whether Clang will reach such a level of code quality.

--
Ed Schouten <ed@80386.nl>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJOYFBD6JE1%2Bn6uX%2BSfHU-WdCnAYDMW6gB_%2BDQ_DRVNfAQS11A>