Date: Thu, 15 Nov 2007 12:16:25 -0800 From: Steve Kargl <sgk@troutmask.apl.washington.edu> To: Bruce Evans <brde@optusnet.com.au> Cc: freebsd-standards@freebsd.org Subject: Re: long double broken on i386? Message-ID: <20071115201625.GA18739@troutmask.apl.washington.edu> In-Reply-To: <20071103144151.U671@besplex.bde.org> References: <20070928152227.GA39233@troutmask.apl.washington.edu> <20071001173736.U1985@besplex.bde.org> <20071002001154.GA3782@troutmask.apl.washington.edu> <20071002172317.GA95181@VARK.MIT.EDU> <20071002173237.GA12586@troutmask.apl.washington.edu> <20071003103519.X14175@delplex.bde.org> <20071010204249.GA7446@troutmask.apl.washington.edu> <20071103144151.U671@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, Nov 03, 2007 at 03:53:19PM +1100, Bruce Evans wrote: > This thread is old, so I quoted too much. I quote very little, but it remains an old thread. I'm sure I have attributions wrong. > % Index: msun/src/s_cosl.c > % ... > % +/* > % + * Table of constants for 2/pi, 396 Hex digits (476 decimal) of 2/pi. > % + * The table was taken from e_rem_pio2.c. > % + */ > % +static const int32_t two_over_pi[] = { > % +0xA2F983, 0x6E4E44, 0x1529FC, 0x2757D1, 0xF534DD, 0xC0DB62, > % +0x95993C, 0x439041, 0xFE5163, 0xABDEBB, 0xC561B7, 0x246E3A, > > This gives far too many copies of this table (instead of just too many). > The float and double versions avoid having a copy for each of cos/sin/tan > by doing arg reduction in e_rem_pio2[f].c. The extra layer for this file > is a pessimization but seems to be worth it to avoid the duplication. > > Are 396 hex digits enough? It can't be right for all of the float, double > and long double cases. I hope it is enough for 128-bit long doubles, and > the float case is just sloppy in using the same table as the double case. > If 396 hex digits is enough for all cases, then the table should be in > k_rem_pio2.c. I started to rework my sinl(), cosl(), and tanl() (and __kernel_*) routines. In doing so, I've taken, for example, __kernel_cos() and changed it to __kernel_cosl() keeping the "extra precision" algorithm. For reference, see code below. Testing shows that I have ULPs exceeding 38 for cosl() with this algorithm. If the "extra precision" algorithm is correct, then perhaps your question "Are 396 hex digits enough?" is relevant. So, I poked around and found http://docs.sun.com/app/docs/doc/801-7639/6i1ucu1uh?a=view which contains the statement: libm and libsunmath trigonometric functions use an "infinitely" precise pi for argument reduction. The value 2/pi is computed to 916 hexadecimal digits and stored in a lookup table to use during argument reduction. I'm looking into this now. -- Steve static const long double xxx[10] = { -0x8.000000000000000p-4L, /* The - here, changes the sign of hz. */ 0xa.aaaaaaaaaaaaaabp-8L, -0x5.b05b05b05b05b05p-12L, 0x1.a01a01a01a019dbp-16L, -0x4.9f93edde27cbed7p-24L, 0x8.f76c77fc4bbff37p-32L, -0xc.9cba54236b4e97fp-40L, 0xd.73f9aa92060766ap-48L, -0xb.4105ba69deeed1ap-56L, 0x7.7e10d7644cea922p-64L, }; #define C xxx long double __kernel_cosl(long double x, long double y) { long double hz, r, z, w; #if 1 /* * XXX - Although y supposely carries extra precision to allow * better accuracy, the following lines are needed to get <= 1 ULP. * Note, very simply testing with arg in [-2000:2000] shows ULPs * that exceed 38 without this. */ x += y; y = 0; #endif z = x * x; r = z * (C[1] + z * (C[2] + z * (C[3] + z * (C[4] + z * (C[5] + z * (C[6] + z * (C[7] + z * (C[8] + z * C[9])))))))); hz = (float)C[0] * z; w = 1 + hz; return (w + (((1 - w) + hz) + (z * r - x * y))); }
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20071115201625.GA18739>