From owner-freebsd-numerics@FreeBSD.ORG  Sun Dec  8 01:29:41 2013
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 4B08DE6D
 for <freebsd-numerics@FreeBSD.org>; Sun,  8 Dec 2013 01:29:41 +0000 (UTC)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 2458B16FF
 for <freebsd-numerics@FreeBSD.org>; Sun,  8 Dec 2013 01:29:41 +0000 (UTC)
Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu
 [127.0.0.1])
 by troutmask.apl.washington.edu (8.14.7/8.14.7) with ESMTP id rB81TeuQ080225; 
 Sat, 7 Dec 2013 17:29:40 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.14.7/8.14.7/Submit) id rB81Te0h080224;
 Sat, 7 Dec 2013 17:29:40 -0800 (PST) (envelope-from sgk)
Date: Sat, 7 Dec 2013 17:29:39 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Subject: Re: dead code in lgamma_r[f]?
Message-ID: <20131208012939.GA80145@troutmask.apl.washington.edu>
References: <20131205173700.GA64575@troutmask.apl.washington.edu>
 <20131205182324.GA65135@troutmask.apl.washington.edu>
 <CAN5hRiV6N1arc4RDv=9JbRiXp-J9o3WzAbeZSOpAxks2ZeG+_w@mail.gmail.com>
 <20131205195225.GA65732@troutmask.apl.washington.edu>
 <20131206102724.W1033@besplex.bde.org>
 <20131206114426.I1329@besplex.bde.org>
 <20131207064602.GA76042@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20131207064602.GA76042@troutmask.apl.washington.edu>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-numerics@FreeBSD.org, Filipe Maia <filipe.c.maia@gmail.com>
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 08 Dec 2013 01:29:41 -0000

On Fri, Dec 06, 2013 at 10:46:02PM -0800, Steve Kargl wrote:
> On Fri, Dec 06, 2013 at 12:55:16PM +1100, Bruce Evans wrote:
> > On Fri, 6 Dec 2013, Bruce Evans wrote:
> > 
> > > On Thu, 5 Dec 2013, Steve Kargl wrote:
> > > ...
> > >> If we again look at the code from __ieee754_lgamma_r(), we see
> > >> that sin_pi() is called if ix < 0x43300000, so by the time we
> > >> arrive at the 'if(ix<0x43300000)' statement we already know that
> > >> the condition is true.
> > >
> > > No, only for negative integers.  hx<0 classifies negative values, and
> > > then ix>=0x43300000 classifies numbers that are so large negative that
> > > they must be integers, and the result is a sort of pole error.  We
> > > are just filtering out this case, perhaps as an optimization.
> > 
> > Oops, sin_pi() is only called for negative integers, so your change
> > seems to be correct.  Just add a comment about the limited domain
> > of sin_pi() (it already has one saying that "x is assumed negative".
> > 
> 
> I wish to retract my earlier statement that after 2 additional
> years of reading fdlibm code that it was easier to work with.
> I spent the better part of Friday giving myself a headache trying
> to understand the algorithm for lgamma_r().  The code for x in
> the interval (0,2) does not match any comment in lgamma_r().
> 
> I also think, but can't prove yet, that like erff() the 
> polynomial and rational approximations in lgammaf_r() have
> too many terms.
> 

I really need to stop looking at fdlibm code.  The threshold(s) in
e_lgammaf.c could be chosen better (and actually documented).  In
the float code

    /* 8.0 <= x < 2**58 */
	} else if (ix < 0x4e800000) { // (ix < 0x5c800000) {
	    t = __ieee754_logf(x);
	    z = one/x;
	    y = z*z;
	    w = w0+z*(w1+y*(w2+y*(w3+y*(w4+y*(w5+y*w6)))));
	    r = (x-half)*(t-one)+w;
	} else
    /* 2**58 <= x <= inf */
	    r =  x*(__ieee754_logf(x)-one);

the 2**58 is much too large.  Asymptotically, we have

lgamma(x)~(x-0.5)log(x)-x = x*(log(x)-1) - 0.5*log(x)

where the 0.5*log(x) is insignificant when compared to 2**58.
I suspect that 2**58 is a sloppy threshold for double, which
can be reduced to 2**30 (or smaller) for float.  I suppose I
could look for a tighter bound, but 2**(p+slop) with p, the
precision, probably is sufficient.

-- 
Steve

From owner-freebsd-numerics@FreeBSD.ORG  Sun Dec  8 04:21:15 2013
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 9AA5E67F
 for <freebsd-numerics@FreeBSD.org>; Sun,  8 Dec 2013 04:21:15 +0000 (UTC)
Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au
 [211.29.132.246])
 by mx1.freebsd.org (Postfix) with ESMTP id 484051027
 for <freebsd-numerics@FreeBSD.org>; Sun,  8 Dec 2013 04:21:14 +0000 (UTC)
Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au
 (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23])
 by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 8D33442032F;
 Sun,  8 Dec 2013 15:21:06 +1100 (EST)
Date: Sun, 8 Dec 2013 15:21:04 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
Subject: Re: dead code in lgamma_r[f]?
In-Reply-To: <20131208012939.GA80145@troutmask.apl.washington.edu>
Message-ID: <20131208141627.F1181@besplex.bde.org>
References: <20131205173700.GA64575@troutmask.apl.washington.edu>
 <20131205182324.GA65135@troutmask.apl.washington.edu>
 <CAN5hRiV6N1arc4RDv=9JbRiXp-J9o3WzAbeZSOpAxks2ZeG+_w@mail.gmail.com>
 <20131205195225.GA65732@troutmask.apl.washington.edu>
 <20131206102724.W1033@besplex.bde.org>
 <20131206114426.I1329@besplex.bde.org>
 <20131207064602.GA76042@troutmask.apl.washington.edu>
 <20131208012939.GA80145@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.1 cv=YYGEuWhf c=1 sm=1 tr=0
 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 a=PO7r1zJSAAAA:8 a=m8NKqXhiB1EA:10
 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=ElNGO9NIuDsA:10
 a=YmPyuLAzrPIqYOFI3MIA:9 a=7i44fyHmEgJOJrlA:21 a=KHVXADVR9NTKJXXW:21
 a=CjuIK1q_8ugA:10
Cc: freebsd-numerics@FreeBSD.org, Filipe Maia <filipe.c.maia@gmail.com>
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 08 Dec 2013 04:21:15 -0000

On Sat, 7 Dec 2013, Steve Kargl wrote:

> I really need to stop looking at fdlibm code.  The threshold(s) in
> e_lgammaf.c could be chosen better (and actually documented).  In
> the float code
>
>    /* 8.0 <= x < 2**58 */
> 	} else if (ix < 0x4e800000) { // (ix < 0x5c800000) {
> 	    t = __ieee754_logf(x);
> 	    z = one/x;
> 	    y = z*z;
> 	    w = w0+z*(w1+y*(w2+y*(w3+y*(w4+y*(w5+y*w6)))));
> 	    r = (x-half)*(t-one)+w;
> 	} else
>    /* 2**58 <= x <= inf */
> 	    r =  x*(__ieee754_logf(x)-one);
>
> the 2**58 is much too large.  Asymptotically, we have

That's Cygnus code, not fdlibm :-).

> lgamma(x)~(x-0.5)log(x)-x = x*(log(x)-1) - 0.5*log(x)
>
> where the 0.5*log(x) is insignificant when compared to 2**58.
> I suspect that 2**58 is a sloppy threshold for double, which
> can be reduced to 2**30 (or smaller) for float.  I suppose I
> could look for a tighter bound, but 2**(p+slop) with p, the
> precision, probably is sufficient.

The Cygnus translation to float didn't look closely enough at
the thresholds.

gamma() was the first part of libm that I looked closely at (even
before trig functions in 2004/5).  I didn't understand it at first and
tried to write my own.  My first idea was to use the asymptotic
expansion as a kernel for all cases, especially for complex args.
To use the asymptotic expansion, the real part of the arg must be
pushed up to about 16 or greater using repeated multiplications.
It is difficult to do these multiplications accurately enough,
but the float case should be very easy using double precision
multiplication.  I tried extra precision in software, but this
was too slow.  This method also works poorly for args with negative
real part, especially for lgamma() near its zeros.  Even using
doubled double precision didn't get near 1 ulp of accuracy in float
precision for lgamma() near its zeros, and the only fix that I
know of is to use a special approximation near each zero down
to about real part -70 (70 cases).  fdlibm is also bad for lgamma()
near its zeros.

Here is a pari version of this (the C version is harder to understand
due to its technical details and less suitable language; pari syntax
is worse than C syntax for most things (including things that pari
should be able to do well) but slightly better for this use:

% g(z) =
% {
% 	local(r);
% 
% 	r = 1;
% 	while (abs(z) < M,
% 		r = r * z;
% 		z = z + 1;
% 	);
% 	return (exp(lgb(z)) / r);
% }

This is cgamma().  The whole thing, modulo not handling exceptional args
and depending on subroutines.  It first pushes up Real(z).  It uses
abs(z) instead of Real(z) for some reason.  Apparently it only works
for real z and is buggy for large negative z (small negative z get
pushed up towards +Inf).  Then it calls the local routine lgb() to
do the asymptotic expansion and the pari builtin exp() to convert from
clgamma() to cgamma().

% 
% lg(z) =
% {
% 	local(bol, rnew);
% 
% 	if (abs(z - 1) < 1.e-1, return (lngamma(z)););
% 	if (abs(z - 2) < 1.e-1, return (lngamma(z)););

Near 1 and 2, I planned to use a poly approximation, but that is not
so easy to do in pari so I call the pari builtin lgamma() to fake it.

% 	bol = 0;
% 	r = 1;
% 	while (abs(z) < M,
% 		rnew = r * z;
% 		if (imag(r) > 0 && imag(rnew) < 0, bol += 1);
% 		if (imag(r) < 0 && imag(rnew) > 0, bol -= 1);
% 		r = rnew;
% 		z = z + 1;
% 	);

This pushes up z, almost as above.

% 	bol = 0;
% 	return (lgb(z) - log(r) - 2 * Pi * bol);

Then use the asymptotic expansion.

% }
% 
% lgb(z) = (z - 0.5) * log(z) - z + 0.5 * log(2 * Pi) + \
%     sum(n = 1, N, b[n] / z^(2 * n - 1))

This is the asymptotic expansion.  Pari syntax is only perfect for
numeric things like this (and probably for some number theory things
which I don't know well).   b[n] is a table of coefficients.

I had forgotten the details of even the first term in this.  You
mentioned using an approximation of just the first term or two above
a huge threshold.  It might be better to go much lower using another
term or two (but not as many as medium-sized args).

% 
% setb(N) =
% {
% 	b = vector(N);
% 	for (n = 1, N,
% 		b[n] = roundn(bernreal(2 * n) / (2 * n * (2 * n - 1)), P);
% 	);
% 	b = precision(b, 100);
% }

This initializes b[] using the builtin bernreal() which gives Bernoulli
numbers.  roundn() is a local function (not shown) that rounds to 64
bits in IEEE round-to-nearest mode.  IEEE arithmetic is too hard to
fake everywhere here.  it would need a rounding step after every operation.
I only adding these steps while developing clog(), mainly to understand
the limitations of tripled double precision.

% 
% pg() = plot(x = A, B, abs(g(x + C * I) / gamma(x + C * I) - 1))
% pq() = plot(x = A, B, imag(g(x + C * I)) / imag(gamma(x + C * I)) - 1)
% xgamma(z) =
% {
% 	if ((truncate(z) - z) < 1e-20, return (100000););
% 	if (abs(gamma(z) - 1) < 1.e-40, return (1 + 1.e-40););
% 	return (gamma(z));
% }
% pr() = plot(x = A, B, abs(1 / (xgamma(x) - 1)))
% 
% plg() = plot(x = A, B, abs(lg(x + C * I) / lngamma(x + C * I) - 1))
% 
% A = 1;
% B = 100;
% C = 0;
% M = 17;
% N = 8;
% P = 64;
% setb(100);
% gamma(10)
% g(10)

Testing stuff.  Only M, N and P are of interest.  M gives the threshold
for the asymptotic expansion.  N gives the number of terms in it that
are used, not counting the furst couple.  P gives the precision.
Results:

362880.0000000000000000000000
362879.9999999999999999714634

Pari gamma(10) gave the exact result (9 factorial).  g(10) got 22 digits
corect, which is more than can be hoped for (there would be more rounding
errors with finite precision throughout).  g() is sloppy for complex
args, but after changing 10 to 10+I it works well:

-217255.6436196325446121126363 + 267132.0341468011581535241915*I (gamma)
-217255.6436196325446121611110 + 267132.0341468011581534939537*I (g)

Cephes has a very complicated lgamma() or gamma(), with many special
poly approximations for 0 < x < 8 or similar.  glibc adopted this.
It seems too hard to scale this for complex args.  You just can't
write a special approximation that even converges in a strip
1 < Re(z) < 8 or similar.  But pushing up the real part until the
asymptotic expansion works is a very general method.  For complex
long double args, maybe we should just use this method and not worry
about the losses of accuracy from the multiplications.  16 multiplications
would lose about 32 ulps.  This would give imprecise.c ;-(.  A few more
than 16 would be needed for more than float precision.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sun Dec  8 18:33:46 2013
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id EF2624B3
 for <freebsd-numerics@FreeBSD.org>; Sun,  8 Dec 2013 18:33:45 +0000 (UTC)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id C10CB15B5
 for <freebsd-numerics@FreeBSD.org>; Sun,  8 Dec 2013 18:33:45 +0000 (UTC)
Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu
 [127.0.0.1])
 by troutmask.apl.washington.edu (8.14.7/8.14.7) with ESMTP id rB8IXdep083385; 
 Sun, 8 Dec 2013 10:33:39 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.14.7/8.14.7/Submit) id rB8IXdXb083384;
 Sun, 8 Dec 2013 10:33:39 -0800 (PST) (envelope-from sgk)
Date: Sun, 8 Dec 2013 10:33:39 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Subject: Re: dead code in lgamma_r[f]?
Message-ID: <20131208183339.GA83010@troutmask.apl.washington.edu>
References: <20131205173700.GA64575@troutmask.apl.washington.edu>
 <20131205182324.GA65135@troutmask.apl.washington.edu>
 <CAN5hRiV6N1arc4RDv=9JbRiXp-J9o3WzAbeZSOpAxks2ZeG+_w@mail.gmail.com>
 <20131205195225.GA65732@troutmask.apl.washington.edu>
 <20131206102724.W1033@besplex.bde.org>
 <20131206114426.I1329@besplex.bde.org>
 <20131207064602.GA76042@troutmask.apl.washington.edu>
 <20131208012939.GA80145@troutmask.apl.washington.edu>
 <20131208141627.F1181@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20131208141627.F1181@besplex.bde.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-numerics@FreeBSD.org, Filipe Maia <filipe.c.maia@gmail.com>
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 08 Dec 2013 18:33:46 -0000

On Sun, Dec 08, 2013 at 03:21:04PM +1100, Bruce Evans wrote:
> On Sat, 7 Dec 2013, Steve Kargl wrote:
> 
> > I really need to stop looking at fdlibm code.  The threshold(s) in
> > e_lgammaf.c could be chosen better (and actually documented).  In
> > the float code
> >
> >    /* 8.0 <= x < 2**58 */
> > 	} else if (ix < 0x4e800000) { // (ix < 0x5c800000) {
> > 	    t = __ieee754_logf(x);
> > 	    z = one/x;
> > 	    y = z*z;
> > 	    w = w0+z*(w1+y*(w2+y*(w3+y*(w4+y*(w5+y*w6)))));
> > 	    r = (x-half)*(t-one)+w;
> > 	} else
> >    /* 2**58 <= x <= inf */
> > 	    r =  x*(__ieee754_logf(x)-one);
> >
> > the 2**58 is much too large.  Asymptotically, we have
> 
> That's Cygnus code, not fdlibm :-).
> 
> > lgamma(x)~(x-0.5)log(x)-x = x*(log(x)-1) - 0.5*log(x)
> >
> > where the 0.5*log(x) is insignificant when compared to 2**58.
> > I suspect that 2**58 is a sloppy threshold for double, which
> > can be reduced to 2**30 (or smaller) for float.  I suppose I
> > could look for a tighter bound, but 2**(p+slop) with p, the
> > precision, probably is sufficient.
> 
> The Cygnus translation to float didn't look closely enough at
> the thresholds.

It also did not recompute minimax coefficients as there are too
many terms in the approximation.  Looks like Cygnus simply rounded
the double coefficients to float.

> gamma() was the first part of libm that I looked closely at (even
> before trig functions in 2004/5).  I didn't understand it at first and
> tried to write my own.  My first idea was to use the asymptotic
> expansion as a kernel for all cases, especially for complex args.
> To use the asymptotic expansion, the real part of the arg must be
> pushed up to about 16 or greater using repeated multiplications.
> It is difficult to do these multiplications accurately enough,
> but the float case should be very easy using double precision
> multiplication.  I tried extra precision in software, but this
> was too slow.  This method also works poorly for args with negative
> real part, especially for lgamma() near its zeros.  Even using
> doubled double precision didn't get near 1 ulp of accuracy in float
> precision for lgamma() near its zeros, and the only fix that I
> know of is to use a special approximation near each zero down
> to about real part -70 (70 cases).  fdlibm is also bad for lgamma()
> near its zeros.

Interesting idea.  For x in [2,8), fdlibm code (as you probably
know) uses recursion to reduce the evaluation to a log of a 
product and lgamma(2+s) with s in [0,1).  One can then show that
lgamma(2+s) = s * (1 - E) + s**2 * P(s) where E is Euler's constant
and P(s) is a polynomial in s.  But, instead of using a minimax
polynomial approximation fdlibm code uses a rational approximation
lgamma(2+s) = 0.5*s + p(s) / q(s).  This is the origin of the 
s0, ..., s6, r1, ..., r6 coefficients.

> Here is a pari version of this (the C version is harder to understand
> due to its technical details and less suitable language; pari syntax
> is worse than C syntax for most things (including things that pari
> should be able to do well) but slightly better for this use:

I suppose one of these days, I'll learn how to use pari.  
A quick scan of n1256.pdf shows that C99 does not require
lgamma of a complex argument, but it does reserve the 
names clgamma[fl].   

> % lg(z) =
> % {
> % 	local(bol, rnew);
> % 
> % 	if (abs(z - 1) < 1.e-1, return (lngamma(z)););
> % 	if (abs(z - 2) < 1.e-1, return (lngamma(z)););
> 
> Near 1 and 2, I planned to use a poly approximation, but that is not
> so easy to do in pari so I call the pari builtin lgamma() to fake it.

fdlibm's lgammaf_r does not appear to have an issue near 1 or 2.
Starting at the lower bound of each interval and using nextafterf
to scan up to the upper bound, I'm seeing

% make testf && ./testf
Interval:                       Time    ULP      Value
[4.7683716e-07, 2.0000000e+00): 0.04590 0.98491  1.231645e+00
[2.0000000e+00, 8.0000000e+00): 0.08754 0.82825  4.012439e+00
[8.0000000e+00, 2.8823038e+17): 0.05841 1.43789  8.942273e+06
[2.8823038e+17, 2.6584560e+36): 0.05095 0.50000  1.491249e+23

where the reference value is from lgamma_r.  The different 
intervals test specific branches in lgammaf_r.  The upper
bound of 2.6584560e36 is 0x1p121.  Time is the average value
for all calls in the interval in usec/call.

> % 	bol = 0;
> % 	r = 1;
> % 	while (abs(z) < M,
> % 		rnew = r * z;
> % 		if (imag(r) > 0 && imag(rnew) < 0, bol += 1);
> % 		if (imag(r) < 0 && imag(rnew) > 0, bol -= 1);
> % 		r = rnew;
> % 		z = z + 1;
> % 	);
> 
> This pushes up z, almost as above.
> 
> % 	bol = 0;
> % 	return (lgb(z) - log(r) - 2 * Pi * bol);
> 
> Then use the asymptotic expansion.
> 
> % }
> % 
> % lgb(z) = (z - 0.5) * log(z) - z + 0.5 * log(2 * Pi) + \
> %     sum(n = 1, N, b[n] / z^(2 * n - 1))
> 
> This is the asymptotic expansion.  Pari syntax is only perfect for
> numeric things like this (and probably for some number theory things
> which I don't know well).   b[n] is a table of coefficients.
> 
> I had forgotten the details of even the first term in this.  You
> mentioned using an approximation of just the first term or two above
> a huge threshold.  It might be better to go much lower using another
> term or two (but not as many as medium-sized args).

Well, I was only interested in the poorly chosen 2**58 threshold. 
In the [8,2**58) interval the asymptotic expansion is used, but
its written in such a way that the

>%  + 0.5 * log(2 * Pi) +  sum(n = 1, N, b[n] / z^(2 * n - 1))

portion of your above code is approximated by a minimax polynomial.

    /* 8.0 <= x < 2**58 */
	} else if (ix < 0x5c800000) {
	    t = __ieee754_logf(x);
	    z = one/x;
	    y = z*z;
	    w = w0+z*(w1+y*(w2+y*(w3+y*(w4+y*(w5+y*w6)))));
	    r = (x-half)*(t-one)+w;
	} else

fdlibm use only the first 2 terms for the x >= 2**58.

> Cephes has a very complicated lgamma() or gamma(), with many special
> poly approximations for 0 < x < 8 or similar.

fdlibm's approximations in the [0,2) range are mostly black magic
to me. 

--  
Steve

From owner-freebsd-numerics@FreeBSD.ORG  Sun Dec  8 18:59:42 2013
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 01CC7D00
 for <freebsd-numerics@FreeBSD.org>; Sun,  8 Dec 2013 18:59:42 +0000 (UTC)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id B47DF16E9
 for <freebsd-numerics@FreeBSD.org>; Sun,  8 Dec 2013 18:59:41 +0000 (UTC)
Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu
 [127.0.0.1])
 by troutmask.apl.washington.edu (8.14.7/8.14.7) with ESMTP id rB8Ixfn4083492; 
 Sun, 8 Dec 2013 10:59:41 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.14.7/8.14.7/Submit) id rB8IxfuJ083491;
 Sun, 8 Dec 2013 10:59:41 -0800 (PST) (envelope-from sgk)
Date: Sun, 8 Dec 2013 10:59:41 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Subject: Re: dead code in lgamma_r[f]?
Message-ID: <20131208185941.GA83484@troutmask.apl.washington.edu>
References: <20131205173700.GA64575@troutmask.apl.washington.edu>
 <20131205182324.GA65135@troutmask.apl.washington.edu>
 <CAN5hRiV6N1arc4RDv=9JbRiXp-J9o3WzAbeZSOpAxks2ZeG+_w@mail.gmail.com>
 <20131205195225.GA65732@troutmask.apl.washington.edu>
 <20131206102724.W1033@besplex.bde.org>
 <20131206114426.I1329@besplex.bde.org>
 <20131207064602.GA76042@troutmask.apl.washington.edu>
 <20131208012939.GA80145@troutmask.apl.washington.edu>
 <20131208141627.F1181@besplex.bde.org>
 <20131208183339.GA83010@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20131208183339.GA83010@troutmask.apl.washington.edu>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-numerics@FreeBSD.org, Filipe Maia <filipe.c.maia@gmail.com>
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 08 Dec 2013 18:59:42 -0000

On Sun, Dec 08, 2013 at 10:33:39AM -0800, Steve Kargl wrote:
> 
> fdlibm's lgammaf_r does not appear to have an issue near 1 or 2.
> Starting at the lower bound of each interval and using nextafterf
> to scan up to the upper bound, I'm seeing
> 
> % make testf && ./testf
> Interval:                       Time    ULP      Value
> [4.7683716e-07, 2.0000000e+00): 0.04590 0.98491  1.231645e+00
> [2.0000000e+00, 8.0000000e+00): 0.08754 0.82825  4.012439e+00
> [8.0000000e+00, 2.8823038e+17): 0.05841 1.43789  8.942273e+06
> [2.8823038e+17, 2.6584560e+36): 0.05095 0.50000  1.491249e+23
> 
> where the reference value is from lgamma_r.  The different 
> intervals test specific branches in lgammaf_r.  The upper
> bound of 2.6584560e36 is 0x1p121.  Time is the average value
> for all calls in the interval in usec/call.

Correction.  The timing is for 1 million calls uniformly
distributed over the interval.

-- 
Steve

From owner-freebsd-numerics@FreeBSD.ORG  Mon Dec  9 11:06:50 2013
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id CEB5220E
 for <freebsd-numerics@FreeBSD.org>; Mon,  9 Dec 2013 11:06:50 +0000 (UTC)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id B9EE41E81
 for <freebsd-numerics@FreeBSD.org>; Mon,  9 Dec 2013 11:06:50 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id rB9B6ogF071087
 for <freebsd-numerics@FreeBSD.org>; Mon, 9 Dec 2013 11:06:50 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.7/8.14.7/Submit) id rB9B6oYS071085
 for freebsd-numerics@FreeBSD.org; Mon, 9 Dec 2013 11:06:50 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 9 Dec 2013 11:06:50 GMT
Message-Id: <201312091106.rB9B6oYS071085@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
 owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@freebsd.org>
To: freebsd-numerics@FreeBSD.org
Subject: Current problem reports assigned to freebsd-numerics@FreeBSD.org
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 09 Dec 2013 11:06:50 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o stand/175811 numerics   libstdc++ needs complex support in order use C99
o bin/170206   numerics   [msun] [patch] complex arcsinh, log, etc.
o stand/82654  numerics   C99 long double math functions are missing

3 problems total.


From owner-freebsd-numerics@FreeBSD.ORG  Mon Dec  9 22:37:28 2013
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 90CA923C
 for <freebsd-numerics@FreeBSD.org>; Mon,  9 Dec 2013 22:37:28 +0000 (UTC)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 4776216BA
 for <freebsd-numerics@FreeBSD.org>; Mon,  9 Dec 2013 22:37:28 +0000 (UTC)
Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu
 [127.0.0.1])
 by troutmask.apl.washington.edu (8.14.7/8.14.7) with ESMTP id rB9MbL31091886; 
 Mon, 9 Dec 2013 14:37:21 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.14.7/8.14.7/Submit) id rB9MbK2X091885;
 Mon, 9 Dec 2013 14:37:20 -0800 (PST) (envelope-from sgk)
Date: Mon, 9 Dec 2013 14:37:20 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Subject: Re: dead code in lgamma_r[f]?
Message-ID: <20131209223720.GA91828@troutmask.apl.washington.edu>
References: <20131205182324.GA65135@troutmask.apl.washington.edu>
 <CAN5hRiV6N1arc4RDv=9JbRiXp-J9o3WzAbeZSOpAxks2ZeG+_w@mail.gmail.com>
 <20131205195225.GA65732@troutmask.apl.washington.edu>
 <20131206102724.W1033@besplex.bde.org>
 <20131206114426.I1329@besplex.bde.org>
 <20131207064602.GA76042@troutmask.apl.washington.edu>
 <20131208012939.GA80145@troutmask.apl.washington.edu>
 <20131208141627.F1181@besplex.bde.org>
 <20131208183339.GA83010@troutmask.apl.washington.edu>
 <20131208185941.GA83484@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20131208185941.GA83484@troutmask.apl.washington.edu>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-numerics@FreeBSD.org, Filipe Maia <filipe.c.maia@gmail.com>
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 09 Dec 2013 22:37:28 -0000

On Sun, Dec 08, 2013 at 10:59:41AM -0800, Steve Kargl wrote:
> On Sun, Dec 08, 2013 at 10:33:39AM -0800, Steve Kargl wrote:
> > 
> > fdlibm's lgammaf_r does not appear to have an issue near 1 or 2.
> > Starting at the lower bound of each interval and using nextafterf
> > to scan up to the upper bound, I'm seeing
> > 
> > % make testf && ./testf
> > Interval:                       Time    ULP      Value
> > [4.7683716e-07, 2.0000000e+00): 0.04590 0.98491  1.231645e+00
> > [2.0000000e+00, 8.0000000e+00): 0.08754 0.82825  4.012439e+00
> > [8.0000000e+00, 2.8823038e+17): 0.05841 1.43789  8.942273e+06
> > [2.8823038e+17, 2.6584560e+36): 0.05095 0.50000  1.491249e+23
> > 
> > where the reference value is from lgamma_r.  The different 
> > intervals test specific branches in lgammaf_r.  The upper
> > bound of 2.6584560e36 is 0x1p121.  Time is the average value
> > for all calls in the interval in usec/call.
> 
> Correction.  The timing is for 1 million calls uniformly
> distributed over the interval.
> 

I would like to commit the attached patch.  I will do it in 
3 passes:
   1) whitespace fixes,
   2) literal constants changes (checked by md5),
   3) the code re-organization, threshold change, and dead code removed.

* lib/msun/src/e_lgamma_r.c
  . Remove trailing space and trailing blank line in Copyright.
  . Fix prototype for sin_pi to agree with the intended commit
    r97413 done some 11 years 6 month ago.
  . Remove dead code in sin_pi and remove 'if(ix<0x43300000)' as it is
    always true.
  . In the domain [0,2), move the three minimax approximations embedded
    within __ieee75_lgamma_r() into three 'static inline double' functions.
  . Remove the now unused variables p1, p2, and p3.
  . Use integer literal constants instead of double literal constants
    where possible (checked with md5).
  . Remove explicit cast of the int 'i' to double (checked with md5).

* lib/msun/src/e_lgammaf_r.c:
  . The minimax polynomials have more terms than required for the 
    precision.  Remove a8-a11, t10-t14, and w3-w6.
  . Fix prototype for sin_pif to agree with the intended commit
    r97413 done some 11 years 6 month ago.
  . Remove dead code in sin_pif and remove 'if(ix<0x4b000000)' as it is
    always true.
  . In the domain [0,2), move the three minimax approximations embedded
    within __ieee75_lgamma_r() into three 'static inline float' functions.
  . Remove the now unused variables p1, p2, and p3.
  . Use integer literal constants instead of double literal constants
    where possible (checked with md5).
  . Remove explicit cast of the int 'i' to float (checked with md5).
  . Reduce the 2**58 threshold copied from e_lgamma_r.c to 2**30.


As before and after comparison, I offer

lgammaf_r() without patch
Interval:                       Time    ULP      Value
[4.7683716e-07, 2.0000000e+00): 0.05381 2.40423  7.169912e-01
[2.0000000e+00, 8.0000000e+00): 0.07758 1.64252  2.763559e+00
[8.0000000e+00, 1.0737418e+09): 0.06948 2.56132  1.122045e+01
[1.0737418e+09, 2.1990233e+12): 0.06967 1.29037  1.693146e+09
[2.1990233e+12, 3.9876840e+36): 0.05704 1.50689  1.424986e+14

lgammaf_r() with patch
Interval:                       Time    ULP      Value
[4.7683716e-07, 2.0000000e+00): 0.04888 2.40423  7.169912e-01
[2.0000000e+00, 8.0000000e+00): 0.07471 1.64252  2.763559e+00
[8.0000000e+00, 1.0737418e+09): 0.05819 2.56132  1.122045e+01
[1.0737418e+09, 2.1990233e+12): 0.05338 1.29037  1.693146e+09
[2.1990233e+12, 3.9876840e+36): 0.07397 1.50689  1.424986e+14

lgamma_r() without patch
Interval:                       Time    ULP      Value
[8.4703295e-22, 2.0000000e+00): 0.05961 2.09187  7.3247074760047326e-01
[2.0000000e+00, 8.0000000e+00): 0.08758 1.46347  3.9813700222966872e+00

lgamma_r() with patch
Interval:                       Time    ULP      Value
[8.4703295e-22, 2.0000000e+00): 0.05761 2.09187  7.3247074760047326e-01
[2.0000000e+00, 8.0000000e+00): 0.08659 1.46347  3.9813700222966872e+00

Timing is in usec/call for 1 million values uniformly distributed in
the interval.

-- 
Steve

--- /usr/src/lib/msun/src/e_lgamma_r.c	2013-12-06 07:58:39.000000000 -0800
+++ src/e_lgamma_r.c	2013-12-09 13:05:22.000000000 -0800
@@ -6,10 +6,9 @@
  *
  * Developed at SunSoft, a Sun Microsystems, Inc. business.
  * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice 
+ * software is freely granted, provided that this notice
  * is preserved.
  * ====================================================
- *
  */
 
 #include <sys/cdefs.h>
@@ -156,7 +155,8 @@
 
 static const double zero=  0.00000000000000000000e+00;
 
-	static double sin_pi(double x)
+static double
+sin_pi(double x)
 {
 	double y,z;
 	int n,ix;
@@ -177,15 +177,11 @@
 	    y  = 2*(y - floor(y));		/* y = |x| mod 2.0 */
 	    n  = (int)(y*4);
 	} else {
-            if(ix>=0x43400000) {
-                y = zero; n = 0;                 /* y must be even */
-            } else {
-                if(ix<0x43300000) z = y+two52;	/* exact */
-		GET_LOW_WORD(n,z);
-		n &= 1;
-                y  = n;
-                n<<= 2;
-            }
+	    z  = y+two52;			/* exact */
+	    GET_LOW_WORD(n,z);
+	    n &= 1;
+	    y  = n;
+	    n<<= 2;
         }
 	switch (n) {
 	    case 0:   y =  __kernel_sin(pi*y,zero,0); break;
@@ -201,12 +197,46 @@
 }
 
 
+static inline double
+func0(double y)
+{
+	double p1, p2, z;
+
+	z = y * y;
+	p1 = a0 + z * (a2 + z * (a4 + z * (a6 + z * (a8 + z * a10))));
+	p2 = z * (a1 + z * (a3 + z * (a5 + z * (a7 + z * (a9 + z * a11)))));
+	return (y * p1 + p2 - y / 2);
+}
+
+static inline double
+func1(double y)
+{
+	double p1, p2, p3, w, z;
+
+	z = y * y;
+	w = z * y;
+	p1 = t0 + w * (t3 + w * (t6 + w * (t9  + w * t12)));
+	p2 = t1 + w * (t4 + w * (t7 + w * (t10 + w * t13)));
+	p3 = t2 + w * (t5 + w * (t8 + w * (t11 + w * t14)));
+	return (z * p1 - (tt - w * (p2 + y * p3)) + tf);
+}
+
+static inline double
+func2(double y)
+{
+	double p1, p2;
+
+	p1 = y * (u0 + y * (u1 + y * (u2 + y * (u3 + y * (u4 + y * u5)))));
+	p2 = one + y * (v1 + y * (v2 + y * (v3 + y * (v4 + y * v5))));
+	return (p1 / p2 - y / 2);
+}
+
 double
 __ieee754_lgamma_r(double x, int *signgamp)
 {
-	double t,y,z,nadj,p,p1,p2,p3,q,r,w;
+	double t,y,z,nadj,p,q,r,w;
 	int32_t hx;
-	int i,lx,ix;
+	int i,ix,lx;
 
 	EXTRACT_WORDS(hx,lx,x);
 
@@ -235,51 +265,36 @@
 	if((((ix-0x3ff00000)|lx)==0)||(((ix-0x40000000)|lx)==0)) r = 0;
     /* for x < 2.0 */
 	else if(ix<0x40000000) {
-	    if(ix<=0x3feccccc) { 	/* lgamma(x) = lgamma(x+1)-log(x) */
+	    if (ix <= 0X3FECCCCC) { 	/* lgamma(x) = lgamma(x+1)-log(x) */
 		r = -__ieee754_log(x);
-		if(ix>=0x3FE76944) {y = one-x; i= 0;}
-		else if(ix>=0x3FCDA661) {y= x-(tc-one); i=1;}
-	  	else {y = x; i=2;}
+		if (ix >= 0x3FE76944)
+		    r += func0(1 - x);
+		else if (ix >= 0x3FCDA661)
+		    r += func1(x - (tc - 1));
+	  	else
+		    r += func2(x);
 	    } else {
-	  	r = zero;
-	        if(ix>=0x3FFBB4C3) {y=2.0-x;i=0;} /* [1.7316,2] */
-	        else if(ix>=0x3FF3B4C4) {y=x-tc;i=1;} /* [1.23,1.73] */
-		else {y=x-one;i=2;}
-	    }
-	    switch(i) {
-	      case 0:
-		z = y*y;
-		p1 = a0+z*(a2+z*(a4+z*(a6+z*(a8+z*a10))));
-		p2 = z*(a1+z*(a3+z*(a5+z*(a7+z*(a9+z*a11)))));
-		p  = y*p1+p2;
-		r  += (p-0.5*y); break;
-	      case 1:
-		z = y*y;
-		w = z*y;
-		p1 = t0+w*(t3+w*(t6+w*(t9 +w*t12)));	/* parallel comp */
-		p2 = t1+w*(t4+w*(t7+w*(t10+w*t13)));
-		p3 = t2+w*(t5+w*(t8+w*(t11+w*t14)));
-		p  = z*p1-(tt-w*(p2+y*p3));
-		r += (tf + p); break;
-	      case 2:	
-		p1 = y*(u0+y*(u1+y*(u2+y*(u3+y*(u4+y*u5)))));
-		p2 = one+y*(v1+y*(v2+y*(v3+y*(v4+y*v5))));
-		r += (-0.5*y + p1/p2);
+	        if (ix >= 0x3FFBB4C3)		/* [1.7316,2] */
+		    r = func0(2 - x);
+	        else if (ix >= 0x3FF3B4C4)	/* [1.23,1.73] */
+		    r = func1(x - tc);
+		else
+		    r = func2(x - 1);
 	    }
 	}
 	else if(ix<0x40200000) { 			/* x < 8.0 */
 	    i = (int)x;
-	    y = x-(double)i;
+	    y = x-i;
 	    p = y*(s0+y*(s1+y*(s2+y*(s3+y*(s4+y*(s5+y*s6))))));
 	    q = one+y*(r1+y*(r2+y*(r3+y*(r4+y*(r5+y*r6)))));
-	    r = half*y+p/q;
+	    r = y/2+p/q;
 	    z = one;	/* lgamma(1+s) = log(s) + lgamma(s) */
 	    switch(i) {
-	    case 7: z *= (y+6.0);	/* FALLTHRU */
-	    case 6: z *= (y+5.0);	/* FALLTHRU */
-	    case 5: z *= (y+4.0);	/* FALLTHRU */
-	    case 4: z *= (y+3.0);	/* FALLTHRU */
-	    case 3: z *= (y+2.0);	/* FALLTHRU */
+	    case 7: z *= (y+6);	/* FALLTHRU */
+	    case 6: z *= (y+5);	/* FALLTHRU */
+	    case 5: z *= (y+4);	/* FALLTHRU */
+	    case 4: z *= (y+3);	/* FALLTHRU */
+	    case 3: z *= (y+2);	/* FALLTHRU */
 		    r += __ieee754_log(z); break;
 	    }
     /* 8.0 <= x < 2**58 */
--- /usr/src/lib/msun/src/e_lgammaf_r.c	2013-12-06 07:57:42.000000000 -0800
+++ src/e_lgammaf_r.c	2013-12-09 13:04:12.000000000 -0800
@@ -32,10 +32,6 @@
 a5  =  2.8905137442e-03, /* 0x3b3d6ec6 */
 a6  =  1.1927076848e-03, /* 0x3a9c54a1 */
 a7  =  5.1006977446e-04, /* 0x3a05b634 */
-a8  =  2.2086278477e-04, /* 0x39679767 */
-a9  =  1.0801156895e-04, /* 0x38e28445 */
-a10 =  2.5214456400e-05, /* 0x37d383a2 */
-a11 =  4.4864096708e-05, /* 0x383c2c75 */
 tc  =  1.4616321325e+00, /* 0x3fbb16c3 */
 tf  = -1.2148628384e-01, /* 0xbdf8cdcd */
 /* tt = -(tail of tf) */
@@ -50,11 +46,6 @@
 t7  = -3.6845202558e-03, /* 0xbb7177fe */
 t8  =  2.2596477065e-03, /* 0x3b141699 */
 t9  = -1.4034647029e-03, /* 0xbab7f476 */
-t10 =  8.8108185446e-04, /* 0x3a66f867 */
-t11 = -5.3859531181e-04, /* 0xba0d3085 */
-t12 =  3.1563205994e-04, /* 0x39a57b6b */
-t13 = -3.1275415677e-04, /* 0xb9a3f927 */
-t14 =  3.3552918467e-04, /* 0x39afe9f7 */
 u0  = -7.7215664089e-02, /* 0xbd9e233f */
 u1  =  6.3282704353e-01, /* 0x3f2200f4 */
 u2  =  1.4549225569e+00, /* 0x3fba3ae7 */
@@ -81,15 +72,12 @@
 r6  =  7.3266842264e-06, /* 0x36f5d7bd */
 w0  =  4.1893854737e-01, /* 0x3ed67f1d */
 w1  =  8.3333335817e-02, /* 0x3daaaaab */
-w2  = -2.7777778450e-03, /* 0xbb360b61 */
-w3  =  7.9365057172e-04, /* 0x3a500cfd */
-w4  = -5.9518753551e-04, /* 0xba1c065c */
-w5  =  8.3633989561e-04, /* 0x3a5b3dd2 */
-w6  = -1.6309292987e-03; /* 0xbad5c4e8 */
+w2  = -2.7777778450e-03; /* 0xbb360b61 */
 
 static const float zero=  0.0000000000e+00;
 
-	static float sin_pif(float x)
+static float
+sin_pif(float x)
 {
 	float y,z;
 	int n,ix;
@@ -110,15 +98,11 @@
 	    y  = 2*(y - floorf(y));		/* y = |x| mod 2.0 */
 	    n  = (int)(y*4);
 	} else {
-            if(ix>=0x4b800000) {
-                y = zero; n = 0;                 /* y must be even */
-            } else {
-                if(ix<0x4b000000) z = y+two23;	/* exact */
-		GET_FLOAT_WORD(n,z);
-		n &= 1;
-                y  = n;
-                n<<= 2;
-            }
+	    z  = y+two23;			/* exact */
+	    GET_FLOAT_WORD(n,z);
+	    n &= 1;
+	    y  = n;
+	    n<<= 2;
         }
 	switch (n) {
 	    case 0:   y =  __kernel_sindf(pi*y); break;
@@ -133,11 +117,44 @@
 	return -y;
 }
 
+static inline float
+func0(float y)
+{
+	float p1, p2, z;
+
+	z = y * y;
+	p1 = a0 + z * (a2 + z * (a4 + z * a6));
+	p2 = z * (a1 + z * (a3 + z * (a5 + z * a7)));
+	return (y * p1 + p2 - y / 2);
+}
+
+static inline float
+func1(float y)
+{
+	float p1, p2, p3, w, z;
+
+	z = y * y;
+	w = z * y;
+	p1 = t0 + w * (t3 + w * (t6 + w * t9));
+	p2 = t1 + w * (t4 + w * t7);
+	p3 = t2 + w * (t5 + w * t8);
+	return (z * p1 - (tt - w * (p2 + y * p3)) + tf);
+}
+
+static inline float
+func2(float y)
+{
+	float p1, p2;
+
+	p1 = y * (u0 + y * (u1 + y * (u2 + y * (u3 + y * (u4 + y * u5)))));
+	p2 = one + y * (v1 + y * (v2 + y * (v3 + y * (v4 + y * v5))));
+	return (p1 / p2 - y / 2);
+}
 
 float
 __ieee754_lgammaf_r(float x, int *signgamp)
 {
-	float t,y,z,nadj,p,p1,p2,p3,q,r,w;
+	float t,y,z,nadj,p,q,r,w;
 	int32_t hx;
 	int i,ix;
 
@@ -168,62 +185,47 @@
 	if (ix==0x3f800000||ix==0x40000000) r = 0;
     /* for x < 2.0 */
 	else if(ix<0x40000000) {
-	    if(ix<=0x3f666666) { 	/* lgamma(x) = lgamma(x+1)-log(x) */
+	    if (ix <= 0x3f666666) { 	/* lgamma(x) = lgamma(x+1)-log(x) */
 		r = -__ieee754_logf(x);
-		if(ix>=0x3f3b4a20) {y = one-x; i= 0;}
-		else if(ix>=0x3e6d3308) {y= x-(tc-one); i=1;}
-	  	else {y = x; i=2;}
+		if (ix >= 0x3f3b4a20)
+		    r += func0(1 - x);
+		else if (ix >= 0x3e6d3308)
+		    r += func1(x - (tc - 1));
+	  	else
+		    r += func2(x);
 	    } else {
-	  	r = zero;
-	        if(ix>=0x3fdda618) {y=(float)2.0-x;i=0;} /* [1.7316,2] */
-	        else if(ix>=0x3F9da620) {y=x-tc;i=1;} /* [1.23,1.73] */
-		else {y=x-one;i=2;}
-	    }
-	    switch(i) {
-	      case 0:
-		z = y*y;
-		p1 = a0+z*(a2+z*(a4+z*(a6+z*(a8+z*a10))));
-		p2 = z*(a1+z*(a3+z*(a5+z*(a7+z*(a9+z*a11)))));
-		p  = y*p1+p2;
-		r  += (p-(float)0.5*y); break;
-	      case 1:
-		z = y*y;
-		w = z*y;
-		p1 = t0+w*(t3+w*(t6+w*(t9 +w*t12)));	/* parallel comp */
-		p2 = t1+w*(t4+w*(t7+w*(t10+w*t13)));
-		p3 = t2+w*(t5+w*(t8+w*(t11+w*t14)));
-		p  = z*p1-(tt-w*(p2+y*p3));
-		r += (tf + p); break;
-	      case 2:
-		p1 = y*(u0+y*(u1+y*(u2+y*(u3+y*(u4+y*u5)))));
-		p2 = one+y*(v1+y*(v2+y*(v3+y*(v4+y*v5))));
-		r += (-(float)0.5*y + p1/p2);
+	        if (ix >= 0x3fdda618)		/* [1.7316,2] */
+		    r = func0(2 - x);
+		else if (ix >= 0x3F9da620)	/* [1.23,1.73] */
+		    r = func1(x - tc);
+		else
+		    r = func2(x - 1);
 	    }
 	}
 	else if(ix<0x41000000) { 			/* x < 8.0 */
 	    i = (int)x;
-	    y = x-(float)i;
+	    y = x-i;
 	    p = y*(s0+y*(s1+y*(s2+y*(s3+y*(s4+y*(s5+y*s6))))));
 	    q = one+y*(r1+y*(r2+y*(r3+y*(r4+y*(r5+y*r6)))));
-	    r = half*y+p/q;
+	    r = y/2+p/q;
 	    z = one;	/* lgamma(1+s) = log(s) + lgamma(s) */
 	    switch(i) {
-	    case 7: z *= (y+(float)6.0);	/* FALLTHRU */
-	    case 6: z *= (y+(float)5.0);	/* FALLTHRU */
-	    case 5: z *= (y+(float)4.0);	/* FALLTHRU */
-	    case 4: z *= (y+(float)3.0);	/* FALLTHRU */
-	    case 3: z *= (y+(float)2.0);	/* FALLTHRU */
+	    case 7: z *= (y+6);	/* FALLTHRU */
+	    case 6: z *= (y+5);	/* FALLTHRU */
+	    case 5: z *= (y+4);	/* FALLTHRU */
+	    case 4: z *= (y+3);	/* FALLTHRU */
+	    case 3: z *= (y+2);	/* FALLTHRU */
 		    r += __ieee754_logf(z); break;
 	    }
-    /* 8.0 <= x < 2**58 */
-	} else if (ix < 0x5c800000) {
+    /* 8.0 <= x < 2**30 */
+	} else if (ix < 0x4e800000) {
 	    t = __ieee754_logf(x);
 	    z = one/x;
 	    y = z*z;
-	    w = w0+z*(w1+y*(w2+y*(w3+y*(w4+y*(w5+y*w6)))));
+	    w = w0+z*(w1+y*w2);
 	    r = (x-half)*(t-one)+w;
 	} else
-    /* 2**58 <= x <= inf */
+    /* 2**30 <= x <= inf */
 	    r =  x*(__ieee754_logf(x)-one);
 	if(hx<0) r = nadj - r;
 	return r;

From owner-freebsd-numerics@FreeBSD.ORG  Tue Dec 10 05:26:27 2013
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2328CFA0
 for <freebsd-numerics@freebsd.org>; Tue, 10 Dec 2013 05:26:27 +0000 (UTC)
Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au
 [211.29.132.246])
 by mx1.freebsd.org (Postfix) with ESMTP id A9CE7138D
 for <freebsd-numerics@freebsd.org>; Tue, 10 Dec 2013 05:26:26 +0000 (UTC)
Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au
 (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23])
 by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 22DB442E040;
 Tue, 10 Dec 2013 16:26:16 +1100 (EST)
Date: Tue, 10 Dec 2013 16:26:12 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
Subject: Re: dead code in lgamma_r[f]?
In-Reply-To: <20131209223720.GA91828@troutmask.apl.washington.edu>
Message-ID: <20131210155635.J1022@besplex.bde.org>
References: <20131205182324.GA65135@troutmask.apl.washington.edu>
 <CAN5hRiV6N1arc4RDv=9JbRiXp-J9o3WzAbeZSOpAxks2ZeG+_w@mail.gmail.com>
 <20131205195225.GA65732@troutmask.apl.washington.edu>
 <20131206102724.W1033@besplex.bde.org>
 <20131206114426.I1329@besplex.bde.org>
 <20131207064602.GA76042@troutmask.apl.washington.edu>
 <20131208012939.GA80145@troutmask.apl.washington.edu>
 <20131208141627.F1181@besplex.bde.org>
 <20131208183339.GA83010@troutmask.apl.washington.edu>
 <20131208185941.GA83484@troutmask.apl.washington.edu>
 <20131209223720.GA91828@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.1 cv=Kejr72oD c=1 sm=1 tr=0
 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 a=PO7r1zJSAAAA:8 a=m8NKqXhiB1EA:10
 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=ElNGO9NIuDsA:10
 a=XGf6sHeGM9skIQnvz40A:9 a=CjuIK1q_8ugA:10
Cc: freebsd-numerics@freebsd.org, Filipe Maia <filipe.c.maia@gmail.com>
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 10 Dec 2013 05:26:27 -0000

On Mon, 9 Dec 2013, Steve Kargl wrote:

> I would like to commit the attached patch.  I will do it in
> 3 passes:
>   1) whitespace fixes,
>   2) literal constants changes (checked by md5),
>   3) the code re-organization, threshold change, and dead code removed.
>
> * lib/msun/src/e_lgamma_r.c
>  . Remove trailing space and trailing blank line in Copyright.

This undoes rev.1.8, in which previous trailing whitespace removal in
1.2 was backed out to reduce diffs against the vendor version.

>  . Fix prototype for sin_pi to agree with the intended commit
>    r97413 done some 11 years 6 month ago.
>  . Remove dead code in sin_pi and remove 'if(ix<0x43300000)' as it is
>    always true.
>  . In the domain [0,2), move the three minimax approximations embedded
>    within __ieee75_lgamma_r() into three 'static inline double' functions.

This is too invasive.  It is only a style change to a slighly worse style.

The other (unrelated) changes seem reasonable.

> --- /usr/src/lib/msun/src/e_lgamma_r.c	2013-12-06 07:58:39.000000000 -0800
> +++ src/e_lgamma_r.c	2013-12-09 13:05:22.000000000 -0800
> ...
> double
> __ieee754_lgamma_r(double x, int *signgamp)
> {
> -	double t,y,z,nadj,p,p1,p2,p3,q,r,w;
> +	double t,y,z,nadj,p,q,r,w;
> 	int32_t hx;
> -	int i,lx,ix;
> +	int i,ix,lx;

Also avoid fixing sorting errors when not changing much.  It would be a
more substantive change to fix the types of these variables (they should
be uint32_t or int32_t).  I should have done more or less in rev.1.9
where I split up this declaration to change only hx from int to int32_t.

>
> 	EXTRACT_WORDS(hx,lx,x);
>
> @@ -235,51 +265,36 @@
> 	if((((ix-0x3ff00000)|lx)==0)||(((ix-0x40000000)|lx)==0)) r = 0;
>     /* for x < 2.0 */
> 	else if(ix<0x40000000) {
> -	    if(ix<=0x3feccccc) { 	/* lgamma(x) = lgamma(x+1)-log(x) */
> +	    if (ix <= 0X3FECCCCC) { 	/* lgamma(x) = lgamma(x+1)-log(x) */

Like whitespace fixes.

> 		r = -__ieee754_log(x);
> -		if(ix>=0x3FE76944) {y = one-x; i= 0;}
> -		else if(ix>=0x3FCDA661) {y= x-(tc-one); i=1;}
> -	  	else {y = x; i=2;}
> +		if (ix >= 0x3FE76944)
> +		    r += func0(1 - x);
> +		else if (ix >= 0x3FCDA661)
> +		    r += func1(x - (tc - 1));
> +	  	else
> +		    r += func2(x);

The simplification from using func*() seems to be mainly avoiding a case
statement and the case selector variable i.

> ...
> 	    switch(i) {
> -	    case 7: z *= (y+6.0);	/* FALLTHRU */
> -	    case 6: z *= (y+5.0);	/* FALLTHRU */
> -	    case 5: z *= (y+4.0);	/* FALLTHRU */
> -	    case 4: z *= (y+3.0);	/* FALLTHRU */
> -	    case 3: z *= (y+2.0);	/* FALLTHRU */
> +	    case 7: z *= (y+6);	/* FALLTHRU */
> +	    case 6: z *= (y+5);	/* FALLTHRU */
> +	    case 5: z *= (y+4);	/* FALLTHRU */
> +	    case 4: z *= (y+3);	/* FALLTHRU */
> +	    case 3: z *= (y+2);	/* FALLTHRU */

The comment indentation would be wrong now, except this file already uses
totally inconsistent comment indentation.  The float version blindly moved
the indentation in the other direction by adding float casts.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Tue Dec 10 05:45:18 2013
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id D0EE835B
 for <freebsd-numerics@FreeBSD.org>; Tue, 10 Dec 2013 05:45:18 +0000 (UTC)
Received: from mail107.syd.optusnet.com.au (mail107.syd.optusnet.com.au
 [211.29.132.53]) by mx1.freebsd.org (Postfix) with ESMTP id 56B47150C
 for <freebsd-numerics@FreeBSD.org>; Tue, 10 Dec 2013 05:45:17 +0000 (UTC)
Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au
 (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23])
 by mail107.syd.optusnet.com.au (Postfix) with ESMTPS id 7F258D447C0;
 Tue, 10 Dec 2013 16:45:15 +1100 (EST)
Date: Tue, 10 Dec 2013 16:45:14 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
Subject: Re: dead code in lgamma_r[f]?
In-Reply-To: <20131208183339.GA83010@troutmask.apl.washington.edu>
Message-ID: <20131210162729.A1022@besplex.bde.org>
References: <20131205173700.GA64575@troutmask.apl.washington.edu>
 <20131205182324.GA65135@troutmask.apl.washington.edu>
 <CAN5hRiV6N1arc4RDv=9JbRiXp-J9o3WzAbeZSOpAxks2ZeG+_w@mail.gmail.com>
 <20131205195225.GA65732@troutmask.apl.washington.edu>
 <20131206102724.W1033@besplex.bde.org>
 <20131206114426.I1329@besplex.bde.org>
 <20131207064602.GA76042@troutmask.apl.washington.edu>
 <20131208012939.GA80145@troutmask.apl.washington.edu>
 <20131208141627.F1181@besplex.bde.org>
 <20131208183339.GA83010@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.1 cv=Kejr72oD c=1 sm=1 tr=0
 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 a=PO7r1zJSAAAA:8 a=m8NKqXhiB1EA:10
 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=ElNGO9NIuDsA:10
 a=UvcvclIWdY0omR4vDwoA:9 a=mfOIlMnV6EL5W_DY:21 a=H_VbkGgs1e0FD1jH:21
 a=CjuIK1q_8ugA:10
Cc: freebsd-numerics@FreeBSD.org, Filipe Maia <filipe.c.maia@gmail.com>
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 10 Dec 2013 05:45:19 -0000

On Sun, 8 Dec 2013, Steve Kargl wrote:

> On Sun, Dec 08, 2013 at 03:21:04PM +1100, Bruce Evans wrote:
>> ...
>> gamma() was the first part of libm that I looked closely at (even
>> before trig functions in 2004/5).  I didn't understand it at first and
>> tried to write my own.  My first idea was to use the asymptotic
>> expansion as a kernel for all cases, especially for complex args.
>> ...

> Interesting idea.  For x in [2,8), fdlibm code (as you probably
> know) uses recursion to reduce the evaluation to a log of a
> product and lgamma(2+s) with s in [0,1).  One can then show that

I remember a bit more about why I tried to use the asymptotic expansion.
The fdlibm method in [0,8] only works for real args.

> lgamma(2+s) = s * (1 - E) + s**2 * P(s) where E is Euler's constant
> and P(s) is a polynomial in s.  But, instead of using a minimax

This works for complex s, but only for s near 0.

> polynomial approximation fdlibm code uses a rational approximation
> lgamma(2+s) = 0.5*s + p(s) / q(s).  This is the origin of the
> s0, ..., s6, r1, ..., r6 coefficients.

It is indeed hard to see why the code matches the comment.  The 0.5*s
expansion is presumably to calculate the leading term exactly.  (All
ration approximations should do this, since p(s) / q(s) has an error
of a couple of ulps.)  s * (1 - E) would need extra precision.

> A quick scan of n1256.pdf shows that C99 does not require
> lgamma of a complex argument, but it does reserve the
> names clgamma[fl].

Same in C11 (n1570.pdf).

>> % lg(z) =
>> % {
>> % 	local(bol, rnew);
>> %
>> % 	if (abs(z - 1) < 1.e-1, return (lngamma(z)););
>> % 	if (abs(z - 2) < 1.e-1, return (lngamma(z)););
>>
>> Near 1 and 2, I planned to use a poly approximation, but that is not
>> so easy to do in pari so I call the pari builtin lgamma() to fake it.
>
> fdlibm's lgammaf_r does not appear to have an issue near 1 or 2.
> Starting at the lower bound of each interval and using nextafterf
> to scan up to the upper bound, I'm seeing

I mean that these cases are easy (even for complex z) so I didn't bother
testing various implementations of them for accuracy.

>> % lgb(z) = (z - 0.5) * log(z) - z + 0.5 * log(2 * Pi) + \
>> %     sum(n = 1, N, b[n] / z^(2 * n - 1))
>>
>> This is the asymptotic expansion.  Pari syntax is only perfect for
>> numeric things like this (and probably for some number theory things
>> which I don't know well).   b[n] is a table of coefficients.
> ...
> portion of your above code is approximated by a minimax polynomial.
>
>    /* 8.0 <= x < 2**58 */
> 	} else if (ix < 0x5c800000) {
> 	    t = __ieee754_logf(x);
> 	    z = one/x;
> 	    y = z*z;
> 	    w = w0+z*(w1+y*(w2+y*(w3+y*(w4+y*(w5+y*w6)))));
> 	    r = (x-half)*(t-one)+w;
> 	} else

This is essentially the same -- the approximation is mostly a minimax
poly in 1/z.

Another problem for complex args is that it would be surpising if
minimax polys worked for them.  The polys use delicate cancelations
that probably only work for real args.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Tue Dec 10 07:38:41 2013
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id B4443905
 for <freebsd-numerics@freebsd.org>; Tue, 10 Dec 2013 07:38:41 +0000 (UTC)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 885B61CF1
 for <freebsd-numerics@freebsd.org>; Tue, 10 Dec 2013 07:38:41 +0000 (UTC)
Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu
 [127.0.0.1])
 by troutmask.apl.washington.edu (8.14.7/8.14.7) with ESMTP id rBA7ceQZ094134; 
 Mon, 9 Dec 2013 23:38:40 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.14.7/8.14.7/Submit) id rBA7ceaZ094133;
 Mon, 9 Dec 2013 23:38:40 -0800 (PST) (envelope-from sgk)
Date: Mon, 9 Dec 2013 23:38:40 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Subject: Re: dead code in lgamma_r[f]?
Message-ID: <20131210073840.GA94002@troutmask.apl.washington.edu>
References: <20131205195225.GA65732@troutmask.apl.washington.edu>
 <20131206102724.W1033@besplex.bde.org>
 <20131206114426.I1329@besplex.bde.org>
 <20131207064602.GA76042@troutmask.apl.washington.edu>
 <20131208012939.GA80145@troutmask.apl.washington.edu>
 <20131208141627.F1181@besplex.bde.org>
 <20131208183339.GA83010@troutmask.apl.washington.edu>
 <20131208185941.GA83484@troutmask.apl.washington.edu>
 <20131209223720.GA91828@troutmask.apl.washington.edu>
 <20131210155635.J1022@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20131210155635.J1022@besplex.bde.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-numerics@freebsd.org, Filipe Maia <filipe.c.maia@gmail.com>
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 10 Dec 2013 07:38:41 -0000

On Tue, Dec 10, 2013 at 04:26:12PM +1100, Bruce Evans wrote:
> On Mon, 9 Dec 2013, Steve Kargl wrote:
> 
> > I would like to commit the attached patch.  I will do it in
> > 3 passes:
> >   1) whitespace fixes,
> >   2) literal constants changes (checked by md5),
> >   3) the code re-organization, threshold change, and dead code removed.
> >
> > * lib/msun/src/e_lgamma_r.c
> >  . Remove trailing space and trailing blank line in Copyright.
> 
> This undoes rev.1.8, in which previous trailing whitespace removal in
> 1.2 was backed out to reduce diffs against the vendor version.

It reduces the diff between e_lgamma_r.c and e_lgammaf.c.  In addition,
I think that at this point in time, the original vendor is irrelevant.
>From what I can glean from OpenBSD, NetBSD, Openlibm, and dragonflybsd,
FreeBSD is the vendor.

> >  . Fix prototype for sin_pi to agree with the intended commit
> >    r97413 done some 11 years 6 month ago.
> >  . Remove dead code in sin_pi and remove 'if(ix<0x43300000)' as it is
> >    always true.
> >  . In the domain [0,2), move the three minimax approximations embedded
> >    within __ieee75_lgamma_r() into three 'static inline double' functions.
> 
> This is too invasive.  It is only a style change to a slighly worse style.

It's only a style change to a much better style in that I now
have some idea of what to do get working ld80 and ld128 versions.  

> 
> The other (unrelated) changes seem reasonable.
> 
> > --- /usr/src/lib/msun/src/e_lgamma_r.c	2013-12-06 07:58:39.000000000 -0800
> > +++ src/e_lgamma_r.c	2013-12-09 13:05:22.000000000 -0800
> > ...
> > double
> > __ieee754_lgamma_r(double x, int *signgamp)
> > {
> > -	double t,y,z,nadj,p,p1,p2,p3,q,r,w;
> > +	double t,y,z,nadj,p,q,r,w;
> > 	int32_t hx;
> > -	int i,lx,ix;
> > +	int i,ix,lx;
> 
> Also avoid fixing sorting errors when not changing much.  It would be a
> more substantive change to fix the types of these variables (they should
> be uint32_t or int32_t).  I should have done more or less in rev.1.9
> where I split up this declaration to change only hx from int to int32_t.
> 
> >
> > 	EXTRACT_WORDS(hx,lx,x);
> >
> > @@ -235,51 +265,36 @@
> > 	if((((ix-0x3ff00000)|lx)==0)||(((ix-0x40000000)|lx)==0)) r = 0;
> >     /* for x < 2.0 */
> > 	else if(ix<0x40000000) {
> > -	    if(ix<=0x3feccccc) { 	/* lgamma(x) = lgamma(x+1)-log(x) */
> > +	    if (ix <= 0X3FECCCCC) { 	/* lgamma(x) = lgamma(x+1)-log(x) */
> 
> Like whitespace fixes.
> 
> > 		r = -__ieee754_log(x);
> > -		if(ix>=0x3FE76944) {y = one-x; i= 0;}
> > -		else if(ix>=0x3FCDA661) {y= x-(tc-one); i=1;}
> > -	  	else {y = x; i=2;}
> > +		if (ix >= 0x3FE76944)
> > +		    r += func0(1 - x);
> > +		else if (ix >= 0x3FCDA661)
> > +		    r += func1(x - (tc - 1));
> > +	  	else
> > +		    r += func2(x);
> 
> The simplification from using func*() seems to be mainly avoiding a case
> statement and the case selector variable i.

It also allows one to clearly see that here func[0-2] are
evaluations of lgamma(1+x) where subdomains in [0,2) are
being mapped into other domains.

> > ...
> > 	    switch(i) {
> > -	    case 7: z *= (y+6.0);	/* FALLTHRU */
> > -	    case 6: z *= (y+5.0);	/* FALLTHRU */
> > -	    case 5: z *= (y+4.0);	/* FALLTHRU */
> > -	    case 4: z *= (y+3.0);	/* FALLTHRU */
> > -	    case 3: z *= (y+2.0);	/* FALLTHRU */
> > +	    case 7: z *= (y+6);	/* FALLTHRU */
> > +	    case 6: z *= (y+5);	/* FALLTHRU */
> > +	    case 5: z *= (y+4);	/* FALLTHRU */
> > +	    case 4: z *= (y+3);	/* FALLTHRU */
> > +	    case 3: z *= (y+2);	/* FALLTHRU */
> 
> The comment indentation would be wrong now, except this file already uses
> totally inconsistent comment indentation.  The float version blindly moved
> the indentation in the other direction by adding float casts.

The change reduces the diff between float and double versions.

-- 
Steve