Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 18 Sep 2012 11:29:31 +0100
From:      David Chisnall <theraven@FreeBSD.org>
To:        Dimitry Andric <dim@FreeBSD.org>
Cc:        Davide Italiano <davide@FreeBSD.org>, src-committers@FreeBSD.org, John Baldwin <jhb@FreeBSD.org>, Jeff Roberson <jeff@FreeBSD.org>, attilio@FreeBSD.org, svn-src-projects@FreeBSD.org, Konstantin Belousov <kostikbel@gmail.com>
Subject:   Re: svn commit: r238907 - projects/calloutng/sys/kern
Message-ID:  <EAB6CB55-C1A4-4099-AD86-6FDE3593A867@FreeBSD.org>
In-Reply-To: <505849DB.3090704@FreeBSD.org>
References:  <201207301350.q6UDobCI099069@svn.freebsd.org> <CAJ-FndBj8tpC_BJXs_RH8sG2TBG8yA=Lxu3-GTVT9Ap_zOCuVQ@mail.gmail.com> <CAJ-FndDnO7wjnWPV0tTu%2BUGHjsxa3YDarMxmyei3ZmjLAFvRkQ@mail.gmail.com> <201207301732.33474.jhb@freebsd.org> <CAJ-FndD5EO12xsWOAe6u0EvX00q33wxO4OivnGjzj0=T2Oe8uA@mail.gmail.com> <CAJ-FndCRg0UCThFkatp=tw7rUWWCvhsApLE=iztLpxpGBC1F9w@mail.gmail.com> <505849DB.3090704@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On 18 Sep 2012, at 11:15, Dimitry Andric wrote:

> Please use gcc's __sync_synchronize() builtin[1] instead, which is
> specifically for this purpose.  Clang also supports it.
>=20
> The builtin will emit actual memory barrier instructions, if the =
target
> architecture supports it, otherwise it will emit the same asm =
statement
> you show above.  See contrib/gcc/builtins.c, around line 5584, =
function
> expand_builtin_synchronize().

=46rom Attilio's description of the problem in IRC, I believe that =
atomic_signal_fence() is the correct thing to use here.  He stated that =
he cares about reordering of memory access with regard to the current =
CPU, but not with regard to other CPUs / threads.  He also said that he =
only cares about the compiler performing the reordering, not about the =
CPU, but I suspect that is incorrect as there are numerous subtle bugs =
that can creep in on weakly-ordered architectures (e.g. Alpha, ARMv8) if =
you only have a compiler barrier.

That said, this is likely to be incorrect, because it's very unusual for =
that to actually be the requirement, especially in multithreaded code =
(where the atomic.h stuff is actually important).  In most of the cases =
where __compiler_membar() is being used, you actually want at least a =
partial barrier. =20

On 18 Sep 2012, at 11:22, Konstantin Belousov wrote:

> We do not need CPU barriers there, which are already handled by the =
atomic
> asms. It is only to prevent compiler from exploiting the reorder.

If the atomic asm does not state that it clobbers memory, then it is a =
bug.  If it does clobber memory, then following it with an empty asm =
statement that also clobbers memory is redundant.  Looking in atomic.h =
on amd64, they all do already clobber memory...

The atomic operations are memory barriers themselves, although our =
versions are often much stronger barriers than are required.  I would =
like to see us slowly deprecate atomic.h in favour of stdatomic.h, which =
has three significant advantages:

1) It's part of the C standard
2) It provides a well-defined set of barrier types for every operation
3) It's implemented with compiler assistance (including GCC 4.2.1 =
support in our current version, although that version always enforces =
sequentially consistent ordering) and allows the compiler more freedom =
to perform safe optimisations around it

David=



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?EAB6CB55-C1A4-4099-AD86-6FDE3593A867>