From owner-freebsd-numerics@FreeBSD.ORG Sun May 10 19:01:17 2015 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2344E1D2 for ; Sun, 10 May 2015 19:01:17 +0000 (UTC) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id DD8331E70 for ; Sun, 10 May 2015 19:01:16 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.14.9/8.14.9) with ESMTP id t4AJ1FI4085423 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Sun, 10 May 2015 12:01:15 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.9/8.14.9/Submit) id t4AJ1FQt085422; Sun, 10 May 2015 12:01:15 -0700 (PDT) (envelope-from sgk) Date: Sun, 10 May 2015 12:01:14 -0700 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: small cleanup patch for e_pow.c Message-ID: <20150510190114.GA85376@troutmask.apl.washington.edu> References: <20150510002910.GA82261@troutmask.apl.washington.edu> <20150510113454.O841@besplex.bde.org> <20150510061458.GA82518@troutmask.apl.washington.edu> <20150510172810.E1812@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150510172810.E1812@besplex.bde.org> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 May 2015 19:01:17 -0000 On Sun, May 10, 2015 at 08:16:14PM +1000, Bruce Evans wrote: > On Sat, 9 May 2015, Steve Kargl wrote: > > > > I don't see how this can be an optimization. The code has the form > > > > if (|y| > 2**31) { > > if (|y| > 2**64) { > > if (a) return > > if (b) return > > } > > if (a') return > > if (b') return > > ... > > } > > > > The difference between a and a' is <= instead of <, and similar for > > b and b' with >= and >. If either a or b would have return, so will > > a' and b'. If neither a nor b return, then neither a' nor b' will > > return. The a' and b' lines were added by das in r141296, where the > > commit message says the change is for fdlibm 5.3 compatibility. > > Hmm, I read the comment and not the code. According to the comment, > there are no tests a and b. But these are apparently needed. > Obviously, we must return 1 when x is exactly 1, and it is apparently > possible for overflow/underflow to occur for values near 1 even when > y is huge. > > > I'll also note that block like 'if (|y|>2**64) { }' is not present > > in e_powf.c > > Apparently just another translation error. > > The tests for the double precision case are probably fuzzy just for > efficiency -- we want to look at only the high word. In float > precision, we can look at all the bits. Since even 1 bit away from 1 > is relatively large in float precision, overflow/underflow is more > likely to happen automatically. Indeed, (1+2**-23)**(2**30) is a large > multiple of FLT_MAX although (1+2**-23)**(2**29) is a small fraction > of FLT_MAX; N >= 2**30 and above also gives underflow for (1-2**-24)**N; > N >= 2**62 gives overflow in double precision for (1+2**-52)**N, ... > > So 1 is the only numbers near 1 that doesn't give overflow. > Thanks for the explanation! That help dislodge a mental block. To find the magic numbers, it seems I need to consider (1-2**(-p))**(2**N) = 2**(emin-p) for underflow (1+2**(-p))**(2**N) = (1-2**(-p))*2**emax for overflow With p = [24, 53, 64, 113], emin = [-125, -1021, -16381, -16381], emax = [128, 1024, 16384, 16384], and the use of log(1+z) = z for |z| << 1, I find underflow: N = [30.7, 62.5, 77.5, 126.5] overflow: N = [30.5, 62.5, 77.5, 126.5] PS: Yes, I'm slowly poking away at powl(). I have the 19 special cases working for ld80. Now, comes the hard part. -- Steve