Date:      Fri, 28 Jun 2013 10:42:49 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        enh <enh@google.com>
Cc:        freebsd-numerics@freebsd.org, Steve Kargl <sgk@troutmask.apl.washington.edu>
Subject:   Re: sincos?
Message-ID:  <20130628103209.H1008@besplex.bde.org>
In-Reply-To: <CAJgzZoqbF-bS6M8OYmVx7=eKfpNmavXXZXX0Zgvsxr07CUfC0w@mail.gmail.com>
References:  <CAJgzZopTzfYXecu7zRKhVNEEBOCtz8Z2qK8ka74c5LKZxC8mEw@mail.gmail.com> <20130627013502.GA37295@troutmask.apl.washington.edu> <CAJgzZoqbF-bS6M8OYmVx7=eKfpNmavXXZXX0Zgvsxr07CUfC0w@mail.gmail.com>

On Thu, 27 Jun 2013, enh wrote:

> well, that was Intel and the code's not been accepted, but yes --- that's
> another reason for me not to accept their patch!
>
> Intel claimed "The reason for this fix [beside workaround for O0 switch] -
> it helps to remove some sin[f]+cos[f] code duplication (which is the whole
> reason for introduction of such function at all), which results in
> 1.58-1.81x performance gain on intervals |x|<100." i've not seen their
> benchmark code, so i don't know what their distribution of values was, and
> i don't understand why they covered a range as large as +/- 100.

+-2*Pi may be a bit too small, but most uses won't require very large
angles.

> when looking at i7 performance though, remember that x86 Android will
> usually be running on Atom (and most Android devices are actually ARM, not
> x86).

Hardware trig may actually be best for Atom (as it was on x86 before about
PPro for float precision and AthlonXP for double precision).  Some of my
optimizations in software libm depend on out-of-order execution, so they
will be pessimizations (hopefully small) on Atom and other in-order
execution CPUs.

Bruce


