From owner-freebsd-numerics@FreeBSD.ORG Fri Dec 6 00:37:04 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 75750D04 for ; Fri, 6 Dec 2013 00:37:04 +0000 (UTC) Received: from mail108.syd.optusnet.com.au (mail108.syd.optusnet.com.au [211.29.132.59]) by mx1.freebsd.org (Postfix) with ESMTP id 2437A1862 for ; Fri, 6 Dec 2013 00:37:03 +0000 (UTC) Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23]) by mail108.syd.optusnet.com.au (Postfix) with ESMTPS id 2C8131A4774; Fri, 6 Dec 2013 11:36:53 +1100 (EST) Date: Fri, 6 Dec 2013 11:36:52 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl Subject: Re: dead code in lgamma_r[f]? In-Reply-To: <20131205195225.GA65732@troutmask.apl.washington.edu> Message-ID: <20131206102724.W1033@besplex.bde.org> References: <20131205173700.GA64575@troutmask.apl.washington.edu> <20131205182324.GA65135@troutmask.apl.washington.edu> <20131205195225.GA65732@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=DstvpgP+ c=1 sm=1 tr=0 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 a=PO7r1zJSAAAA:8 a=m8NKqXhiB1EA:10 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=ElNGO9NIuDsA:10 a=xXsPXvZvRc-c0by79MAA:9 a=CjuIK1q_8ugA:10 Cc: freebsd-numerics@FreeBSD.org, Filipe Maia X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Dec 2013 00:37:04 -0000 On Thu, 5 Dec 2013, Steve Kargl wrote: > On Thu, Dec 05, 2013 at 08:06:22PM +0100, Filipe Maia wrote: >> What if ix == 0x433xxxxx? There are also 8 more x's that aren't explicitly tested for the double case, and more for the long double case. > See below. > >> Wouldn't the if clause be false? > > See below. > >>> Index: src/e_lgamma_r.c >>> =================================================================== >>> --- src/e_lgamma_r.c (revision 1427) >>> +++ src/e_lgamma_r.c (working copy) >>> @@ -173,19 +173,15 @@ >>> */ >>> z = floor(y); BTW, floor() is a fairly pessimal way to convert to an integer. It doesn't even avoid overflow by rounding towards zero (so callers must limit the arg a little, and I think they do for at least the problematic negative args as a side effect of underflowing early for such args). rint() is probably better. On i386, both floor() and rint() are in asm, but floor() has to do a slow switch of the rounding mode to force rounding towards minus infinity. On amd64, both are in C and take about the same amount of code. On amd64, this not optimal for at least rint(). >>> if(z!=y) { /* inexact anyway */ >>> - y *= 0.5; >>> - y = 2.0*(y - floor(y)); /* y = |x| mod 2.0 */ >>> - n = (int) (y*4.0); >>> + y /= 2; >>> + y = 2*(y - floor(y)); /* y = |x| mod 2.0 */ >>> + n = (int)(y*4); > > If you have a y value that corresponds to ix=0x433xxxxx where > one of the x is nonzero, I believe that you end up going through > this portion of code. Perhaps for the float case where all the bits are in the ix word, but not in the double case assuming that 0x43400000 is the correct threshold for detecting even numbers. Indeed, the double with bits 0x4340000000000001 is odd. It is x = 0x1.0000000000001p+52 x = 4503599627370497.00 We see that the 0x43400000 is the correct threshold for evenness, and 1 below that for the exponent gives oddness by the least significant bit. > >>> } else { > > You end up here if and only if z==y, which means y is integral. > As noted in the original email, the 'if(ix>=0x43400000)' below > is dead code because this code in __ieee754_lgamma_r(): > > if(ix>=0x43300000) /* |x|>=2**52, must be -integer */ > return one/zero; > t = sin_pi(x); > > prevents the call to sin_pi(x). Only for negative integers. > >>> - if(ix>=0x43400000) { >>> - y = zero; n = 0; /* y must be even */ This case might be just an optimization, but it is necessary to avoid overflow somewhere. Adding two52 would preserve evenness, but it might give spurious overflow. >>> - } else { >>> - if(ix<0x43300000) z = y+two52; /* exact */ This case is needed for putting the oddness bit in the least significant bit. When ix == 0x43300000, it is already there, as the above example shows. OTOH, when ix == 0x43300000, two52 must not be added, since it moves the oddness bit up by 1 place. E.g., adding 0x1p52 in the above example gives: x = 0x1p+53 x = 9007199254740992.00 I just noticed that this addition of two52 barely escapes problems with extra precision. The same method is used elswhere and requires more care. It might still not take enough care with double rounding. > If we again look at the code from __ieee754_lgamma_r(), we see > that sin_pi() is called if ix < 0x43300000, so by the time we > arrive at the 'if(ix<0x43300000)' statement we already know that > the condition is true. No, only for negative integers. hx<0 classifies negative values, and then ix>=0x43300000 classifies numbers that are so large negative that they must be integers, and the result is a sort of pole error. We are just filtering out this case, perhaps as an optimization. > >>> - GET_LOW_WORD(n,z); >>> - n &= 1; >>> - y = n; >>> - n<<= 2; >>> - } >>> + z = y+two52; /* exact */ The main bug that this introduces is misclassifying all odd (positive) intgers with ix = 0x433xxxxx. Similarly in float precision, except the misclassified odd integers are easier to describe. They are ones with ix between 0x4b000000 and 0x4b7fffff with low bit 1. >>> + GET_LOW_WORD(n,z); >>> + n &= 1; >>> + y = n; >>> + n<<= 2; >>> } >>> switch (n) { >>> case 0: y = __kernel_sin(pi*y,zero,0); break; >From a previous reply: > Note the first part of the diff allows us to remove the > cast to float in e_lgamma_r.c, which minimizes the diff > between e_lgamma_r.c and e_lgammaf_r.c (and e_lgammal_r.c :). The gamma functions are not good candidates for conversion to long double, since they are very inaccurate (especially near zeros for lgamma()) in float and double precision. Bruce