Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 4 Oct 2005 19:31:43 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Steve Kargl <sgk@troutmask.apl.washington.edu>
Cc:        freebsd-standards@FreeBSD.org
Subject:   Re: complex.h math functions
Message-ID:  <20051004164618.U47262@delplex.bde.org>
In-Reply-To: <20051002191841.GA40568@troutmask.apl.washington.edu>
References:  <20050929195552.GA14982@troutmask.apl.washington.edu> <20050930221846.T34595@delplex.bde.org> <20050930152734.GA20655@troutmask.apl.washington.edu> <20051001204005.N37704@delplex.bde.org> <20051002191841.GA40568@troutmask.apl.washington.edu>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 2 Oct 2005, Steve Kargl wrote:

>>> ...
>>> OK, I'll use the implementation details of ccosh to infer the
>>> handling of exceptional values.
>>>
>
> At the risk of embarassing myself once again on freebsd-standards,
> here's an implementation of ccosh.  If this is acceptable, I'll
> proceed with implementations for csinh, ctanh, ccos, csin, and
> ctan.
>
> One question:  Can we use a #define for complex float or do we
> need an trivial C program?  That is, in math.h can we do
>
> #define ccoshf(z)  ccosh((float complex)(z))

A program is needed, since (ccoshf)(z) and &(ccoshf) must work even if
ccoshf() is a macro.

I converted to ccoshf() for testing it and comparing with a simpler
implementation that uses the formula, then converted back.  There are
lots of special cases for -0, Infs and NaNs, and there were lots of
bugs, including compiler bugs (gcc is very far from supporting the IEC
60559 extension).  It turns out that the version using formula handles
most of the special cases automatically and is much easier to fix.  I
fixed both versions and worked around the compiler bugs, and compared
differences to find the best version of the variant using the formula.

First, fixes for your version:

% --- s_ccosh.c~	Mon Oct  3 07:10:59 2005
% +++ s_ccosh.c	Tue Oct  4 16:24:06 2005
% @@ -40,11 +40,11 @@
%  #include <math.h>
% 
% -#include "math_private.h"
% +#include "../math_private.h"

Temporary hack.

% 
%  double complex
%  ccosh(double complex z)
%  {
% -	int con = 1;
% -	double x, y;
% +	double complex f;
% +	double con, x, y;
%  	int32_t hx, hy, ix, iy, lx, ly;
%

Optimize `con' a little, and fix some style bugs (initialization in
declaration, and disordered declarations; most other style bugs are not
fixed or noted).  `f' is used to work around gcc bugs.

% @@ -52,15 +52,4 @@
%  	y = cimag(z);
% 
% -	if (x < 0.) {
% -	    /*  Even function: ccosh(-z) = ccosh(z). */
% -	    x = -x;
% -	    y = -y;
% -	    /* Required identity: ccosh(conj(z)) = conj(ccosh(z)). */
% -	    if (y < 0.) {
% -		con = -1;
% -		y = -y;
% -	    }
% -	}
% -

Move this to after extracting words so that we can look at the sign
bit in ix and iy to classify -0.  Note that n869.pdf only describes
the behaviour explicitly for +0 and +Inf so that it doesn't have to
describe many more cases in detail, but ut is clear that the evenness and
conjugacy rules to the sign bit in +-0 although this isn't stated
explicitly at least near the spec for ccosh().  Evenness is stated
explicitly for cos(+-0) and oddness is stated explicitly for sin(+-0)
(e.g., sin(-0) is specified to be -0, and this is implemented by fdlibm
and i387 hardware sin).

Although changing signs so that all the sign bits are 1 simplifies
n869.pdf, it mostly complicates things for us since we can depend on
multiplication to do the right things with signs.  We need complications
to avoid flipping the sign bit for NaNs so as to set the sign bits in
the correct MD way for the cases where the standard doesn't specify the
sign.  Most operations on NaNs don't change the sign bit on most (?)
machines, but -NaN changes it on i386's at least (0-NaN is different
from -NaN on i386's!).

%  	EXTRACT_WORDS(hx, lx, x);
%  	EXTRACT_WORDS(hy, ly, y);
% @@ -69,10 +58,37 @@
%          iy = 0x7fffffff&hy;  /* if iy >= 0x7ff00000, then inf or NaN */
% 
% -	/* Filter out the non-exceptional cases.  x and y are finite. */

This was changed to match the code, but didn't really move.

% +	/* Even function: ccosh(-z) = ccosh(z). */
% +	if (hx & 0x80000000 && !isnan(x)) {
% +	    x = -x;
% +	    hx = ix;
% +	    if (!isnan(y)) {
% +		y = -y;
% +		hy ^= 0x80000000;
% +	    }
% +	}
% +
% +	/* Required identity: ccosh(conj(z)) = conj(ccosh(z)). */
% +	con = 1;
% +	if (hy & 0x80000000 && !isnan(y)) {
% +	    con = -1;
% +	    y = -y;
% +	    hy = iy;
% +	}

The symmetry for conjugacy is independent of the one for evenness, so
flipping the sign for it must be moved out of the if for flipping the
sign(s) for evenness.

Avoid flipping the sign bit for NaNs.

I didn't bother optimizing isnan(x), etc. using hx, etc.

% +
% +	/* Handle the nearly-non-exceptional cases where x and y are finite. */

Change comment to match code.

%  	if (ix < 0x7ff00000 && iy < 0x7ff00000) {
% -	   /*  cosh(+0 + I 0) = 1.  */
% -	   if ((ix|lx) == 0 && (iy|ly) == 0)
% -		return 1.;

Adjust this to handle all cases where x is 0 (and y is finite).  This
fixes handling of cosh(+-0 + I*+-0) and handles nonzero y at negative
extra cost.  The result of cosh(+-0 + I*+-0) must be
(1. + I*(+-0)*(+-0)).  We have forgotten the original x and y, else we
could return (1. + I*x*y) after working around the compiler bugs.
(x * con) would give a correctly-signed 0, but it is simpler to use
this in new code that handles all finite y (and zero x).

% -	   return (cosh(x) * (cos(y) + I * con * tanh(x) * sin(y)));

This changes but doesn't really move.

% +	    if ((ix | lx) == 0) {

Handle x == 0, y finite.  There are 2 unobvious details:
- getting the sign right when y == 0...

% +		creal(f) = cos(y);
% +		cimag(f) = x * con;
% +		return f;

...
- working around gcc bugs.  gcc's support for IEC 60559 is so noexistent
   that u + I*v is unusable in general to construct a double complex from
   2 doubles.  Here the main bug is that I*-0 gets corrupted to I*+0.  gcc
   generates code that does extra work to implement this bug.  gcc's
   handling of full complex operations is more reasonable -- it generates
   code that doesn't do the extra, very expensive work to avoid related
   bugs.  See n869.pdf section G.4.1 for how difficult it is to implement
   complex multiplication perfectly correctly.  The example there handles
   Infs and and NaNs and some overflows, but not underflows or signed 0's.
   The complications are related to the ones here and to the ones that I
   meantioned for x*x.  E.g., the real part of (a+ib)(c+id) is ac-bd on
   real numbers, but for finite floating point numbers the multiplications
   can overflow without the their difference overflowing, and their
   difference can underflow; Infs, NaNs and signed 0's give additional
   complications.

Fortunately, we can construct u + I*v by assigning to the real and complex
parts.  There should be a macro or inline function to do this.

% +	    }
% +	    if ((iy | ly) == 0) {
% +		creal(f) = cosh(x);
% +		cimag(f) = con * y;
% +		return f;
% +	    }

This special case for x finite and y zero is much more critical.  cosh(x)
can be infinite for finite x, and then we must not perform or use the
multiplication (cosh(x) * tanh(x) * con * sin(y)), since that would be
(Inf * +-0) so it would raise the invalid exception and give result NaN,
but we want a result of +-0 for the imaginary part (just like for x
infinite and y 0).

We work around the compiler bug as usual.

% +	    creal(f) = cosh(x) * cos(y);
% +	    cimag(f) = cosh(x) * tanh(x) * con * sin(y);
% +	    return f;
%  	}
%   	/*

The only change for finite nonzero x and y is to avoid the compiler bug.
When sinh(x) is Inf and y is nonzero, it is correct for the imaginary part
to be NaN.

The (cosh(x) * tanh(x)) term should be written as sinh(x), especially now
that cosh(x) is evaluated twice.  The old way of writing the return value:

     cosh(x) * (cos(y) + I * con * tanh(x) * sin(y))

has the interesting property of avoiding some of the compiler bugs.
The second term is always finite, so gcc doesn't mess up Infs and Nans
in it, and multiplying it by an infinite real coshx() does less damage
than multiplying I by an infinite real (sinh(x) * con * siny())..

Using cosh(x) * tanh(x) instead of sinh(x) is also wrong because in
theory cosh(x) might be Inf when sinh(x) is finite.  I think this
doesn't happen in practice.  Similarly, sinh(x) might be infinite
while the infinite precision sinh(x) * sin(y) is finite -- this is
a subcase of the problem of multiplying 2 double complexes perfectly
correctly.  Regression tests showed that ccoshf() implemented using
float precision differed significantly from ccosh() implemented using
double precision only because the double precision version mostly
overflow here (except when the result overflows a float).

% @@ -81,6 +97,19 @@
%  	 *  cosh(+0 + I NaN) = NaN +- I 0, sign of 0 is unspecified.
%  	 */
% -	if ((ix|lx) == 0 && iy >= 0x7ff00000)
% -		return (y - y) / (y - y);
% +	if ((ix|lx) == 0 && iy >= 0x7ff00000) {
% +		creal(f) = y - y;

Work around compiler bugs as usual.

y is Inf or Nan, so we can use the simpler formula y - y to generate a
NaN for it.  ((y - y) / (y - y)) is only needed to generate a NaN from
a possibly-finite y.

% +		/*
% +		 * Use the sign of (sinh(original_x) * sin(original_y)) for
% +		 * the optional sign.  This expression reduces to (y - y) for
% +		 * all cases checked (mainly i386's).  The fdlibm sin(y) is
% +		 * certainly (y - y), but in the +-Inf cases we may have
% +		 * flipped the sign of original_y to get y, and we want to
% +		 * multiply by the sinh() term.  These complications have no
% +		 * effect with normal NaN semantics (-Inf - -Inf) = same NaN
% +		 * as (Inf - Inf), and (finite * NaN) = same NaN).
% +		 */
% +		cimag(f) = copysign(0, y - y);
% +		return f;
% +	}
%  	/*
%  	 *  cosh(x + I Inf) = NaN + I NaN, finite x != 0, invalid exception.

Intricacies to set the sign to the least surprising/most reproducible value.

% @@ -88,5 +117,5 @@
%  	 */
%  	if (ix < 0x7ff00000 && iy >= 0x7ff00000)
% -	    return (x - x) / (x - x) + I * (x - x) / (x - x);
% +	    return y - y + I * (y - y);
%  	/*
%  	 *  cosh(+Inf + I 0)   = +Inf + I 0

Use the right arg in the return value so that NaNs get propagated to the
result.  Here x is finite (nonzero) and y is +-Inf or NaN.  For y = +-Inf,
y - y gives a standard fixed NaN "real indefinite" on i386's at least,
the same one as the expression involving x, but for y = NaN we want to
return the same NaN and not change to "real indefinite".

Use a simpler expression for the resulting NaN, as above.

Testing showed that working around the compiler bugs is not needed here or
in some other places.  Here this is because everything ends up as NaN, so
the spurious NaNs introduced by gcc have low enough precedence and/or
differences that they don't affect the result.

% @@ -96,8 +125,10 @@
%  	 */
%  	if (ix >= 0x7ff00000 && ((hx&0xfffff)|lx) == 0) {
% -	    if ((iy|ly) == 0)
% -		return (x * x);
% -	    else if (iy >= 0x7ff00000)
% -		return (x * x) + I * (y - y) / (y - y);
% +	    if ((iy|ly) == 0) {
% +		creal(f) = x * x;
% +		cimag(f) = con * y;
% +		return f;

The usual workaround, plus fix the sign for the imaginary part.

% +	    } else if (iy >= 0x7ff00000)
% +		return x * x + I * (y - y);

Use fewer parentheses for the real part and a simpler formula for the
imaginary part.

%  	    return (x * x) * (cos(y) + I * con * sin(y));
%  	}
% @@ -108,7 +139,11 @@
%  	 */
%  	if (ix >= 0x7ff00000 && ((hx&0xfffff)|lx) != 0) {
% -	    if ((iy|ly) == 0)
% -		return x * x;
% -	    return x * x + I * x * x;
% +	    if ((iy|ly) == 0) {
% +		creal(f) = x - x;
% +		cimag(f) = copysign(0, x - x);
% +		return f;

The usual workaround, plus fix the sign for the imaginary part.  x is
NaN, so `con' doesn't apply.

% +	    }
% +	    return (x - x) + (y - y) + I * ((x - x) + (y - y));

Here x and y are both NaNs.  Mix both of them into the output.  The
parentheses are essential, perhaps due to extra precision in intermediate
values.  On i386's, this results in the largest (with the NaNs considered
to be (signed?) 64-bit integers) being propagated.

Propagation of NaNs doesn't always preserve them.  On i386's the only
changes is to convert signaling NaNs to quiet NaNs by setting the
quietness bit.  We use the formula (y - y) extensively to convert
infinities to NaNs and get whatever MD conversions the hardware wants
to do to propagate NaNs.

%  	}
% +	abort();

This should be unreachble, so the previous if statement must always be
redundant.

%  }

---

Second, a simple version:

% /*
%  * Hyperbolic cosine of a complex argument.
%  *
%  * Most exceptional values are automatically correctly handled by the
%  * standard formula.  See n689.pdf for details.
%  */
% 
% #include <complex.h>
% #include <math.h>
% 
% double complex
% ccosh1(double complex z)
% {
% 	double complex f;
% 	double x, y;
% 
% 	x = creal(z);
% 	y = cimag(z);
% 
% 	/*
% 	 * This is subtler than it looks.  We handle the cases x == 0 and
% 	 * y == 0 directly not for efficiency, but to avoid multiplications
% 	 * that don't work like we need.  The result must be almost pure
% 	 * real or a pure imaginary, except it has type double complex and
% 	 * its impure part may be -0.  Multiplication of +-0 by an infinity
% 	 * or a NaN would give a NaN for the impure part, and would raise
% 	 * an unwanted exception.
% 	 *
% 	 * In theory, any of sinh(x), cos(y) and sin(y) may be +-0 for a
% 	 * nonzero arg, because although the infinitely precise value cannot
% 	 * even be rational (Gelfond-Schneider theorem?), the rounded result
% 	 * may be 0.  For such values, we would have to avoid multiplying
% 	 * +-0 by an infinity or a NaN, and cos() and sin() would have to
% 	 * be careful to get the sign right.  Exhaustive checking shows that
% 	 * this never happens for args that are representable as floats.
% 	 *
% 	 * Note: the expression u + I*v is unusable since gcc introduces
% 	 * many overflow, underflow, sign and efficiency bugs by rewriting
% 	 * I*v as (0+I)*(v+0*I) and laboriously computing the full complex
% 	 * product.  In particular, I*Inf is corrupted to NaN+I*Inf, and
% 	 * I*-0 is corrupted to 0+I*0.
% 	 */
% 	if (x == 0) {
% 		creal(f) = cos(y);
% 		cimag(f) = isfinite(y) ? x * copysign(1, y) :
% 		    copysign(0, y - y);
% 		return (f);
% 	}
% 	if (y == 0) {
% 		creal(f) = cosh(x);
% 		cimag(f) = !isnan(x) ? copysign(1, x) * y :
% 		    copysign(0, x - x);
% 		return (f);
% 	}
% 
% 	creal(f) = cosh(x) * cos(y);
% 	cimag(f) = sinh(x) * sin(y);
% 	return (f);
% }

I hope this is described in too much detail in its main comment.

Third, a simple test program that compares implementations of ccosh()
on a reasonably large set of float complex args (including +0, +-Inf
and many NaNs):

% #include <complex.h>
% #include <math.h>
% #include <stdint.h>
% #include <stdio.h>
% #include <stdlib.h>
% 
% #ifndef FUNC
% #define	FUNC	ccosh
% #endif
% 
% double complex ccosh(double complex z);
% double complex ccosh1(double complex z);
% float complex ccoshf(float complex z);
% float complex ccosh1f(float complex z);
% 
% int
% main(void)
% {
% 	float complex vf0, vf1;
% 	float complex z;
% 	volatile float vf0i, vf0r, vf1i, vf1r;
% 	float x, y;
% 	uintmax_t na, tot_er;
% 	uint32_t eri, err, max_er, stride, vf0iu, vf0ru, vf1iu, vf1ru, xu, yu;
% 
% 	max_er = 0;
% 	na = 0;
% 	stride = 0x80000;
% 	tot_er = 0;
% 	for (xu = 0;; xu += stride) {
% 	    for (yu = 0;; yu += stride) {
% 		x = *(float *)&xu;
% 		y = *(float *)&yu;
% 		/* `z = x + I * y;', unbroken: */
% 		crealf(z) = x;
% 		cimagf(z) = y;
% 		vf0 = FUNC(z);
% 		vf1 = __CONCAT(FUNC, 1)(z);
% 		vf0r = crealf(vf0);
% 		vf0ru = *(volatile uint32_t *)&vf0r;
% 		vf0i = cimagf(vf0);
% 		vf0iu = *(volatile uint32_t *)&vf0i;
% 		vf1r = crealf(vf1);
% 		vf1ru = *(volatile uint32_t *)&vf1r;
% 		vf1i = cimagf(vf1);
% 		vf1iu = *(volatile uint32_t *)&vf1i;
% 		na++;
% 		err = abs(vf1ru - vf0ru);
% 		eri = abs(vf1iu - vf0iu);
% 		tot_er += err + eri;
% 		if (max_er < err)
% 			max_er = err;
% 		if (max_er < eri)
% 			max_er = eri;
% 		if (err > 2 || eri > 2) {
% 			printf("z =            %#8x+I%-#8x %.15g+I%-.15g\n",
% 			    xu, yu, x, y);
% 			printf(
% 		__XSTRING(FUNC) "(z) =   %#8x+I%-#8x %.15g+I%-.15g\n",
% 			    vf0ru, vf0iu, vf0r, vf0i);
% 			printf(
% 		__XSTRING(FUNC) "1(z) =  %#8x+I%-#8x %.15g+I%-.15g\n",
% 			    vf1ru, vf1iu, vf1r, vf1i);
% 			printf("err = %s%#x+I%s.%#x\n",
% 			    vf0ru <= vf1ru ? "+" : "-", err,
% 			    vf0iu <= vf1iu ? "+" : "-", eri);
% 			fflush(stdout);
% 		}
% 		if (na % 10000000 == 0) {
% 			printf(
% 		"%ju: %#x+I*%#x: %g+I*%g: max_error = %#x, avg_error = %.3f\n",
% 			    na, xu, yu, x, y, max_er, (double)tot_er / na);
% 			fflush(stdout);
% 		}
% 		if (yu == -stride)
% 			break;
% 	    }
% 	    if (xu == -stride)
% 		    break;
% 	}
% 	printf("%ju cases: max_error = %#x, avg_error = %.3f\n",
% 	    na, max_er, (double)tot_er / na);
% }
% 
% #include "s_ccosh.c"
% #include "s_ccosh1.c"
#ifdef nothere
% #include "s_ccoshf.c"
% #include "s_ccosh1f.c"
#endif
% 
% /* For ccosh(), must avoid using the broken hardware cos() and sin(): */
% #undef rcsid
% #define	rcsid rcsid6
% #include "../s_cos.c"
% 
% #undef rcsid
% #define	rcsid rcsid7
% #include "../s_sin.c"
% 
% #ifdef DEBUG
% #undef rcsid
% #define	rcsid rcsid0
% #include "../e_coshf.c"
% 
% #undef one
% #define	one one0
% #undef rcsid
% #define	rcsid rcsid1
% #include "../e_sinhf.c"
% 
% #undef one
% #define	one one1
% #undef rcsid
% #define	rcsid rcsid2
% #include "../s_tanhf.c"
% #endif

This must be built in a subdir of lib/msun/src.

I used this mainly with ccosh[1]f() and hacked it to work with ccosh[1]().
float complex arg space is too large to check exhaustively, and double
complex arg space is much larger and harder to step through and display
in the error messages, so I kept using float complex arg space.  The
coverage with stride 0x80000 should be more than enough for NaNs.

Testing with stride 0x80000 showed a maximum error of 2 ulps (for the
real and imaginary parts of the result) from using coshf(x) * tanhf(x)
instead of sinhf(x) if the non-complex float functions are compiled with
-O, but the error is 3-4 ulps if they are compiled with -O0.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20051004164618.U47262>