Date:      Sat, 28 Mar 2015 15:21:07 +0000
From:      David Chisnall <theraven@FreeBSD.org>
To:        Julian Elischer <julian@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: SSE in libthr
Message-ID:  <FDC008DE-6B3A-4D02-A250-67DEFB1E0B1D@FreeBSD.org>
In-Reply-To: <5516B280.6060002@freebsd.org>
References:  <5515AED9.8040408@FreeBSD.org> <3A96AAEC-9C1C-444E-9A73-3CD2AED33116@me.com> <20150327214452.GR2379@kib.kiev.ua> <5516B280.6060002@freebsd.org>

On 28 Mar 2015, at 13:54, Julian Elischer <julian@freebsd.org> wrote:
>
> the point is that clang will do this anywhere it can, because it isn't
> taking into account the side effects, just the speed of the commands
> themselves.

This is also something that is not going to decrease.  Clang now enables
the SLP vectoriser by default and this code is constantly being
improved.  Current generation vector units are explicitly designed as
targets for compiler autovectorisation, not for hand-tuned DSP code
(which, increasingly, runs on the GPU anyway).  This means that we're
increasingly going to see SSE/AVX/NEON usage in CPU-bound code, even
without an explicit programmer decision to do so.  Optimising for the
case when the vector unit is not used is about as sensible as optimising
for the single-core case: it will affect some people, but generally not
those who care about performance, and a decreasing number of people over
time.
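
As a minimal sketch of the kind of code this affects: the function below
contains no loop and no intrinsics, only four independent scalar additions
on adjacent elements.  That is the pattern an SLP vectoriser looks for, and
on x86-64 at -O2 Clang may well pack it into a single SSE add.  The name
add4 and its signature are hypothetical, chosen for illustration; nothing
here is taken from libthr.

```c
/*
 * Hypothetical example, not code from libthr or from this thread.
 * Four independent additions on adjacent array elements: straight-line
 * scalar code that an SLP vectoriser can combine into one vector add.
 */
void add4(float *restrict out, const float *a, const float *b)
{
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
    out[3] = a[3] + b[3];
}
```

Compiling this with "clang -O2 -S" and again with -fno-slp-vectorize added
is one way to compare the generated assembly; whether vector instructions
actually appear depends on the target and the compiler version.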

David



