Date: Fri, 16 Nov 2007 08:51:06 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Steve Kargl <sgk@troutmask.apl.washington.edu> Cc: freebsd-standards@FreeBSD.org Subject: Re: long double broken on i386? Message-ID: <20071116080356.S10107@besplex.bde.org> In-Reply-To: <20071115201625.GA18739@troutmask.apl.washington.edu> References: <20070928152227.GA39233@troutmask.apl.washington.edu> <20071001173736.U1985@besplex.bde.org> <20071002001154.GA3782@troutmask.apl.washington.edu> <20071002172317.GA95181@VARK.MIT.EDU> <20071002173237.GA12586@troutmask.apl.washington.edu> <20071003103519.X14175@delplex.bde.org> <20071010204249.GA7446@troutmask.apl.washington.edu> <20071103144151.U671@besplex.bde.org> <20071115201625.GA18739@troutmask.apl.washington.edu>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 15 Nov 2007, Steve Kargl wrote: > On Sat, Nov 03, 2007 at 03:53:19PM +1100, Bruce Evans wrote: >> Are 396 hex digits enough? It can't be right for all of the float, double >> and long double cases. I hope it is enough for 128-bit long doubles, and > ... > So, I poked around and found > > http://docs.sun.com/app/docs/doc/801-7639/6i1ucu1uh?a=view > > which contains the statement: > > libm and libsunmath trigonometric functions use an "infinitely" precise > pi for argument reduction. The value 2/pi is computed to 916 hexadecimal > digits and stored in a lookup table to use during argument reduction. That is a lot. Try looking at www.validgh.com for more references. There is a program for calculating the number of digits needed somewhere there or pointed to there. > static const long double xxx[10] = { > -0x8.000000000000000p-4L, /* The - here, changes the sign of hz. */ Why put the minus here? ... > long double > __kernel_cosl(long double x, long double y) > { > long double hz, r, z, w; > > #if 1 > /* > * XXX - Although y supposely carries extra precision to allow > * better accuracy, the following lines are needed to get <= 1 ULP. > * Note, very simply testing with arg in [-2000:2000] shows ULPs > * that exceed 38 without this. > */ > x += y; > y = 0; > #endif y should adjust x by at most 1 ULP (maybe < 0.5 ULPs) so this change should have little (maybe no) effect. The bug is probably in the caller, in adding up the doubles to form a long double. For double precision, y[0] gives the double x directly, and it is up to __ieee754_rem_pio2() to keep y[1] (y here) small. For amd and i386 long double precision, x here must be calculated from y[0] and y[1]: x = (double)(y[0] + y[1]); and y here is the residue (somthing like (double)(x - y[0]) + y[1]). Here it is essential to avoid the gcc bug of casts and assignments not discarding extra precision on i386's. With this calculation of x and y, x += y always has no effect, since the sum has already been calculated as accurately as possible in x. > z = x * x; > > r = z * (C[1] + z * (C[2] + z * (C[3] + z * (C[4] + z * (C[5] + > z * (C[6] + z * (C[7] + z * (C[8] + z * C[9])))))))); > > hz = (float)C[0] * z; Oops, this float cast is a copy of the one in k_cos.c, which is only there due to mismerging the k_cosf.c to k_cos.c. This should make no difference. > w = 1 + hz; > > return (w + (((1 - w) + hz) + (z * r - x * y))); ... hz means half of z. Why change it to mean minus half of z? C[1] etc vs C1 etc in k_cos.c is a regression. C1 is easier to type in debuggers. My program for determining the best polynomials prints the coefficients in a standard form, and I decided to not use the array form and have had second thoughts about the hex values (I now prefer decimal values in code and hex values in comments, as in fdlibm, partly because the C99 hex values are too different from the raw MD hex values to be useful for debugging). Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20071116080356.S10107>