Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 10 Jun 2013 09:37:09 -0700
From:      Steve Kargl <sgk@troutmask.apl.washington.edu>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        David Schultz <das@FreeBSD.org>, freebsd-numerics@FreeBSD.org
Subject:   Re: Implementation for coshl.
Message-ID:  <20130610163709.GB20856@troutmask.apl.washington.edu>
In-Reply-To: <20130610235038.D790@besplex.bde.org>
References:  <20130610003645.GA16444@troutmask.apl.washington.edu> <20130610110740.V24058@besplex.bde.org> <20130610055834.GA68643@zim.MIT.EDU> <20130610235038.D790@besplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Jun 11, 2013 at 01:43:17AM +1000, Bruce Evans wrote:
> On Sun, 9 Jun 2013, David Schultz wrote:
>> On Mon, Jun 10, 2013, Bruce Evans wrote:
>>> On Sun, 9 Jun 2013, Steve Kargl wrote:
>>>> The former would require a refactoring of
>>>> s_expl.c into a kernel __kernel_expl(x, hi, lo).  I have no plans on
>>>> pursuing this at the this time.
>>>
>>> But you need this soon for __ldexp_exp() and __ldexp_cexp(), which are
>>> needed for hyperbolic functions (the large args already fixed for float
>>> and double precision) and for cexp() and trig and hyperbolic complex
>>> functions.  It is much easier to implement these using a kernel.  I do
>>> this only for float precision.
>>
>> That's fine for now.  We need a better k_exp, too.  I believe
>> Bruce was working on one.
> 
> This is especially easy for long double precision, since expl() only
> has 1 return that needs to be changed:
> 
> @ 	/* Evaluate expl(endpoint[n2] + r1 + r2) = tbl[n2] * expl(r1 + r2). */
> @ 	z = r * r;
> @ 	q = r2 + z * (A2 + r * A3) + z * z * (A4 + r * A5) + z * z * z * A6;
> @ 	t = (long double)tbl[n2].lo + tbl[n2].hi;
> @ 	t = tbl[n2].lo + t * (q + r1) + tbl[n2].hi;
> 
> Return the hi and low parts tbl[n2].hi and tbl[n2].lo + t * (q + r1) here
> and add them up in callers.
> 
> @ 
> @ 	/* Scale by 2**k. */
> @ 	if (k >= LDBL_MIN_EXP) {
> @ 		if (k == LDBL_MAX_EXP)
> @ 			RETURNI(t * 2 * 0x1p16383L);
> @ 		RETURNI(t * twopk);
> @ 	} else {
> @ 		RETURNI(t * twopkp10000 * twom10000);
> @ 	}
> 
> Return k and/or twopk here and multiply by them in callers.

Ah, so that's the answer to my unasked question.  I spent a
few hours last night working on this, but I was trying to
apply the scaling to hi and lo before adding them in the
caller.  The result was that the max observed ULP in my
tests went from 0.51 to 0.73, and I believe, but did not
test, that the overflow threshold was effected.

> > I have a few comments on the code... mostly a subset of Bruce's
> > comments:
> > ...
> > - I think the way you handle the tiny-x case is an improvement
> >  over the double version.  However, float-to-int conversion is
> >  very slow on x86, so you might instead try something like
> >  RETURNI(1.0L + tiny), where tiny is volatile and e.g., 0x1p-1000.
> >  That will also return the correct result in all rounding modes.
> 
> No, it is an unimprovement.  It micro-optimizes for an unusual case,
> and breaks the rounding in that case.

OK.  I'll revisit this section of code.

-- 
Steve



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130610163709.GB20856>