Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 9 Mar 2019 00:51:07 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Steve Kargl <sgk@troutmask.apl.washington.edu>
Cc:        freebsd-numerics@freebsd.org
Subject:   cexpl() (was: Re: Update ENTERI() macro)
Message-ID:  <20190308225807.D6410@besplex.bde.org>
In-Reply-To: <20190307195436.GA20856@troutmask.apl.washington.edu>
References:  <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> <20190307044447.GA16298@troutmask.apl.washington.edu> <20190307163958.V1293@besplex.bde.org> <20190307195436.GA20856@troutmask.apl.washington.edu>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 7 Mar 2019, Steve Kargl wrote:

> On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote:
>> On Wed, 6 Mar 2019, Steve Kargl wrote:
>>
>>> On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote:
>* ...
>>> 	/*
>>> 	 *  exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac
>>> 	 * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c
>>> 	 */

I checked that these are correct.  Any rounding should be down for
exp_ovfl and up for cexp_ovfl.

>>> 	if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) ||
>>> 	    (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) {
>>
>> Don't duplicate the hex constants in the comment.  The limits can be fuzzy
>> so don't need so many decimal digits or nonzero nonzero bits or LD80C()
>> to round them (s_cexp.c uses only 1 decimal digit and doesn't test the
>> lower 32 bits in the mantissa; here we have more bits than we need in lx).
>
> Being fuzzy was the movitation for the new macro, I was only using
> the upper 32 bits of lx in the original cexpl() I posted.

IIRC, that was done by extracting into a 32-bit lx.  This could be written
as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better
to do that anyway.

>> Maybe not use bit-fiddling at all for this (in other precisions too).
>> The more important exp() functions check thresholds in FP.  Here x can
>> be NaN so FP comparisons with it don't work right.  That seems to be
>> the only reason to check in bits here.
>>
>>> 		/*
>>> 		 * x is between exp_ovfl and cexp_ovfl, so we must scale to
>>> 		 * avoid overflow in exp(x).
>>> 		 */
>>> 		RETURNI(__ldexp_cexpl(z, 1));
>>
>> Did your tests cover this case, and does it use the fixed version of the
>> old __ldexp_cexpl() in k_expl.h?
>
> No.  The test I reported was restricted to -11350 < x < 11350.
> Spot checked x outside that range.  I had the conditional wrong
> for x < -exp_ovfl.  Also, the comment is actaully a little optimistic.
> One cannot avoid overflow.  For example, x = 11357 is just above
> exp_ovfl, and exp(x) = inf.  __ldexp_cexpl() does scaling and exploits
> |cos(y)| <= 1 and |sin(y)| <= 1 to try to extend the range of cexpl().

I understand this better now.  __ldexp_cexpl() doesn't work for all large
x, so callers must only use it for a range of values above cexp_ovl.
The correctness of this is unobvious.  It depends on sin(y) and cos(y)
never underflowing to 0.  Otherwise, this method would give Inf * 0 = NaN
for finite x and y, and the scaling method would give finite * 0 * 2**k = 0.

>>> 	} else {
>>> 		/*
>>> 		 * Cases covered here:
>>> 		 *  -  x < exp_ovfl and exp(x) won't overflow (common case)
>>> 		 *  -  x > cexp_ovfl, so exp(x) * s overflows for all s > 0
>>> 		 *  -  x = +-Inf (generated by exp())
>>> 		 *  -  x = NaN (spurious inexact exception from y)
>>> 		 */
>>> 		exp_x = expl(x);
>>> 		sincosl(y, &s, &c);
>>> 		RETURNI(CMPLXL(exp_x * c, exp_x * s));
>>> 	}
>>> }
>>
>> I think the last clause can be simplfied and even optimized by using the
>> kernel in all cases:

I was wrong.

>> - the kernel produces exp_x in the form kexp_x * 2**k where kexp_x is near
>>    1.  Just multiply kexp_x by c and s, and later multiply by 2**k.  This
>>    cost an extra multiplication by 2**k (for the real and imaginary parts
>>    separately), but avoids other work so may be faster.
>> - hopefully kexp_x is actually >= 1 (and < 2).  Otherwise we have to
>>    worry about spurious underflow
>> - the above method seems to avoid spurious underflow automatically, and
>>    doesn't even mention this (multiplying by exp_x may underflow, but not
>>    spuriously since c and s are <= 1).
>
> You might be right about using the kernel.  I'll leave that
> to others on freebsd-numerics@; although you and I seem to be
> the only subscribers.

Forget about that.  __ldexp_cexp*() is not really a kernel.  It is used
for ccosh*() and csinh*() too.  Those can only use the kernel for large
positive x.  cexp*() may as well do the same.  Also, the kernel currently
doesn't support very large positive x, so the second overflow threshold
is needed in callers.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20190308225807.D6410>