From owner-freebsd-numerics@FreeBSD.ORG  Sun May 10 19:01:17 2015
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2344E1D2
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 19:01:17 +0000 (UTC)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id DD8331E70
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 19:01:16 +0000 (UTC)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.14.9/8.14.9) with ESMTP id t4AJ1FI4085423
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Sun, 10 May 2015 12:01:15 -0700 (PDT)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.14.9/8.14.9/Submit) id t4AJ1FQt085422;
 Sun, 10 May 2015 12:01:15 -0700 (PDT) (envelope-from sgk)
Date: Sun, 10 May 2015 12:01:14 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: small cleanup patch for e_pow.c
Message-ID: <20150510190114.GA85376@troutmask.apl.washington.edu>
References: <20150510002910.GA82261@troutmask.apl.washington.edu>
 <20150510113454.O841@besplex.bde.org>
 <20150510061458.GA82518@troutmask.apl.washington.edu>
 <20150510172810.E1812@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150510172810.E1812@besplex.bde.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 10 May 2015 19:01:17 -0000

On Sun, May 10, 2015 at 08:16:14PM +1000, Bruce Evans wrote:
> On Sat, 9 May 2015, Steve Kargl wrote:
> >
> > I don't see how this can be an optimization.  The code has the form
> >
> >  if (|y| > 2**31) {
> >     if (|y| > 2**64) {
> >         if (a) return
> >         if (b) return
> >     }
> >     if (a') return
> >     if (b') return
> >     ...
> >   }
> >
> > The difference between a and a' is <= instead of <, and similar for
> > b and b' with >= and >.  If either a or b would have return, so will
> > a' and b'.  If neither a nor b return, then neither a' nor b' will
> > return.  The a' and b' lines were added by das in r141296, where the
> > commit message says the change is for fdlibm 5.3 compatibility.
> 
> Hmm, I read the comment and not the code.  According to the comment,
> there are no tests a and b.  But these are apparently needed.
> Obviously, we must return 1 when x is exactly 1, and it is apparently
> possible for overflow/underflow to occur for values near 1 even when
> y is huge.
> 
> > I'll also note that block like 'if (|y|>2**64) { }' is not present
> > in e_powf.c
> 
> Apparently just another translation error.
> 
> The tests for the double precision case are probably fuzzy just for
> efficiency -- we want to look at only the high word.  In float
> precision, we can look at all the bits.  Since even 1 bit away from 1
> is relatively large in float precision, overflow/underflow is more
> likely to happen automatically.  Indeed, (1+2**-23)**(2**30) is a large
> multiple of FLT_MAX although (1+2**-23)**(2**29) is a small fraction
> of FLT_MAX; N >= 2**30 and above also gives underflow for (1-2**-24)**N;
> N >= 2**62 gives overflow in double precision for (1+2**-52)**N, ...
> 
> So 1 is the only numbers near 1 that doesn't give overflow.
> 

Thanks for the explanation!  That help dislodge a mental block.

To find the magic numbers, it seems I need to consider 

(1-2**(-p))**(2**N) = 2**(emin-p)          for underflow
(1+2**(-p))**(2**N) = (1-2**(-p))*2**emax  for overflow

With p = [24, 53, 64, 113], emin = [-125, -1021, -16381, -16381],
emax = [128, 1024, 16384, 16384], and the use of log(1+z) = z for
|z| << 1, I find

underflow:  N = [30.7, 62.5, 77.5, 126.5]
overflow:   N = [30.5, 62.5, 77.5, 126.5] 

PS: Yes, I'm slowly poking away at powl().  I have the 19 special cases
working for ld80.  Now, comes the hard part.
-- 
Steve