From owner-freebsd-numerics@FreeBSD.ORG  Fri Dec  6 00:37:04 2013
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 75750D04
 for <freebsd-numerics@FreeBSD.org>; Fri,  6 Dec 2013 00:37:04 +0000 (UTC)
Received: from mail108.syd.optusnet.com.au (mail108.syd.optusnet.com.au
 [211.29.132.59]) by mx1.freebsd.org (Postfix) with ESMTP id 2437A1862
 for <freebsd-numerics@FreeBSD.org>; Fri,  6 Dec 2013 00:37:03 +0000 (UTC)
Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au
 (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23])
 by mail108.syd.optusnet.com.au (Postfix) with ESMTPS id 2C8131A4774;
 Fri,  6 Dec 2013 11:36:53 +1100 (EST)
Date: Fri, 6 Dec 2013 11:36:52 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
Subject: Re: dead code in lgamma_r[f]?
In-Reply-To: <20131205195225.GA65732@troutmask.apl.washington.edu>
Message-ID: <20131206102724.W1033@besplex.bde.org>
References: <20131205173700.GA64575@troutmask.apl.washington.edu>
 <20131205182324.GA65135@troutmask.apl.washington.edu>
 <CAN5hRiV6N1arc4RDv=9JbRiXp-J9o3WzAbeZSOpAxks2ZeG+_w@mail.gmail.com>
 <20131205195225.GA65732@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.1 cv=DstvpgP+ c=1 sm=1 tr=0
 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 a=PO7r1zJSAAAA:8 a=m8NKqXhiB1EA:10
 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=ElNGO9NIuDsA:10
 a=xXsPXvZvRc-c0by79MAA:9 a=CjuIK1q_8ugA:10
Cc: freebsd-numerics@FreeBSD.org, Filipe Maia <filipe.c.maia@gmail.com>
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 00:37:04 -0000

On Thu, 5 Dec 2013, Steve Kargl wrote:

> On Thu, Dec 05, 2013 at 08:06:22PM +0100, Filipe Maia wrote:
>> What if ix == 0x433xxxxx?

There are also 8 more x's that aren't explicitly tested for the double
case, and more for the long double case.

> See below.
>
>> Wouldn't the if clause be false?
>
> See below.
>
>>> Index: src/e_lgamma_r.c
>>> ===================================================================
>>> --- src/e_lgamma_r.c    (revision 1427)
>>> +++ src/e_lgamma_r.c    (working copy)
>>> @@ -173,19 +173,15 @@
>>>       */
>>>         z = floor(y);

BTW, floor() is a fairly pessimal way to convert to an integer.  It
doesn't even avoid overflow by rounding towards zero (so callers must
limit the arg a little, and I think they do for at least the problematic
negative args as a side effect of underflowing early for such args).
rint() is probably better.  On i386, both floor() and rint() are in
asm, but floor() has to do a slow switch of the rounding mode to force
rounding towards minus infinity.  On amd64, both are in C and take
about the same amount of code.  On amd64, this not optimal for at least
rint().

>>>         if(z!=y) {                              /* inexact anyway */
>>> -           y  *= 0.5;
>>> -           y   = 2.0*(y - floor(y));           /* y = |x| mod 2.0 */
>>> -           n   = (int) (y*4.0);
>>> +           y /= 2;
>>> +           y  = 2*(y - floor(y));              /* y = |x| mod 2.0 */
>>> +           n  = (int)(y*4);
>
> If you have a y value that corresponds to ix=0x433xxxxx where
> one of the x is nonzero, I believe that you end up going through
> this portion of code.

Perhaps for the float case where all the bits are in the ix word, but not
in the double case assuming that 0x43400000 is the correct threshold for
detecting even numbers.  Indeed, the double with bits 0x4340000000000001
is odd.  It is

 	x = 0x1.0000000000001p+52
 	x = 4503599627370497.00

We see that the 0x43400000 is the correct threshold for evenness, and 1
below that for the exponent gives oddness by the least significant bit.

>
>>>         } else {
>
> You end up here if and only if z==y, which means y is integral.
> As noted in the original email, the 'if(ix>=0x43400000)' below
> is dead code because this code in __ieee754_lgamma_r():
>
>   if(ix>=0x43300000) 	/* |x|>=2**52, must be -integer */
>       return one/zero;
>   t = sin_pi(x);
>
> prevents the call to sin_pi(x).

Only for negative integers.

>
>>> -            if(ix>=0x43400000) {
>>> -                y = zero; n = 0;                 /* y must be even */

This case might be just an optimization, but it is necessary to avoid
overflow somewhere.  Adding two52 would preserve evenness, but it might
give spurious overflow.

>>> -            } else {
>>> -                if(ix<0x43300000) z = y+two52; /* exact */

This case is needed for putting the oddness bit in the least significant
bit.  When ix == 0x43300000, it is already there, as the above example
shows.

OTOH, when ix == 0x43300000, two52 must not be added, since it moves the
oddness bit up by 1 place.  E.g., adding 0x1p52 in the above example gives:

 	x = 0x1p+53
 	x = 9007199254740992.00

I just noticed that this addition of two52 barely escapes problems with
extra precision.  The same method is used elswhere and requires more
care.  It might still not take enough care with double rounding.

> If we again look at the code from __ieee754_lgamma_r(), we see
> that sin_pi() is called if ix < 0x43300000, so by the time we
> arrive at the 'if(ix<0x43300000)' statement we already know that
> the condition is true.

No, only for negative integers.  hx<0 classifies negative values, and
then ix>=0x43300000 classifies numbers that are so large negative that
they must be integers, and the result is a sort of pole error.  We
are just filtering out this case, perhaps as an optimization.

>
>>> -               GET_LOW_WORD(n,z);
>>> -               n &= 1;
>>> -                y  = n;
>>> -                n<<= 2;
>>> -            }
>>> +           z  = y+two52;                       /* exact */

The main bug that this introduces is misclassifying all odd (positive)
intgers with ix = 0x433xxxxx.

Similarly in float precision, except the misclassified odd integers are
easier to describe.  They are ones with ix between 0x4b000000 and
0x4b7fffff with low bit 1.

>>> +           GET_LOW_WORD(n,z);
>>> +           n &= 1;
>>> +           y  = n;
>>> +           n<<= 2;
>>>          }
>>>         switch (n) {
>>>             case 0:   y =  __kernel_sin(pi*y,zero,0); break;

>From a previous reply:

> Note the first part of the diff allows us to remove the
> cast to float in e_lgamma_r.c, which minimizes the diff
> between e_lgamma_r.c and e_lgammaf_r.c (and e_lgammal_r.c :).

The gamma functions are not good candidates for conversion to
long double, since they are very inaccurate (especially near
zeros for lgamma()) in float and double precision.

Bruce