Date:      Thu, 27 Jun 2013 12:25:38 -0700
From:      Steve Kargl <>
To:        enh <>
Subject:   Re: sincos?
Message-ID:  <>
In-Reply-To: <>
References:  <> <> <>

On Thu, Jun 27, 2013 at 09:12:04AM -0700, enh wrote:
> well, that was Intel and the code's not been accepted, but yes --- that's
> another reason for me not to accept their patch!
> Intel claimed "The reason for this fix [beside workaround for O0 switch] -
> it helps to remove some sin[f]+cos[f] code duplication (which is the whole
> reason for introduction of such function at all), which results in
> 1.58-1.81x performance gain on intervals |x|<100." i've not seen their
> benchmark code, so i don't know what their distribution of values was, and
> i don't understand why they covered a range as large as +/- 100.

The duplicated code that is removed is the argument reduction that
sin and cos each perform for |x| > pi/4.  If you have

void sincos(double x, double *s, double *c)
{
   *s = sin(x);
   *c = cos(x);
}

then both sin and cos call rem_pio2 (or whatever the function is
called) if |x| > pi/4.  The code in question removes one of the
argument reduction calls, and so you get a speedup of 1.5x to 2x.

As Bruce noted, he would like to see some additional optimizations
for -2*pi < x < 2*pi (I may have the range incorrect here) integrated
into sin, cos, sinl, and cosl before we worry about sincos[fl].  I'll
get to those hopefully in August, but coshl, sinhl, and tanhl are
on my plate.

