From owner-freebsd-numerics@FreeBSD.ORG  Sun May 10 02:24:06 2015
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Date: Sun, 10 May 2015 11:59:55 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: small cleanup patch for e_pow.c
In-Reply-To: <20150510002910.GA82261@troutmask.apl.washington.edu>
Message-ID: <20150510113454.O841@besplex.bde.org>
References: <20150510002910.GA82261@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

On Sat, 9 May 2015, Steve Kargl wrote:

> In reading e_pow.c, I found a small piece of code that
> can be removed.  Anyone object?
>
> Index: src/e_pow.c
> ===================================================================
> --- src/e_pow.c	(revision 1603)
> +++ src/e_pow.c	(working copy)
> @@ -187,10 +187,6 @@ __ieee754_pow(double x, double y)
>
>     /* |y| is huge */
> 	if(iy>0x41e00000) { /* if |y| > 2**31 */
> -	    if(iy>0x43f00000){	/* if |y| > 2**64, must o/uflow */
> -		if(ix<=0x3fefffff) return (hy<0)? huge*huge:tiny*tiny;
> -		if(ix>=0x3ff00000) return (hy>0)? huge*huge:tiny*tiny;
> -	    }
> 	/* over/underflow if x is not close to one */
> 	    if(ix<0x3fefffff) return (hy<0)? s*huge*huge:s*tiny*tiny;
> 	    if(ix>0x3ff00000) return (hy>0)? s*huge*huge:s*tiny*tiny;

The removed code seems to be just an optimization.  It is a large
optimization for huge args, but those are not common, and it is at most
a tiny pessimization for non-huge args (just an extra branch which can
be predicted perfectly if non-huge args are never used).
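
For reference, the hex constants in the quoted diff are compared against
the high 32 bits of the IEEE-754 double with the sign bit cleared.  The
following is a minimal sketch, not code from e_pow.c; the helper names
high_word_abs() and check_thresholds() are made up for illustration.  It
checks which values the thresholds correspond to.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of e_pow.c): return the high 32 bits of
 * a double's IEEE-754 representation with the sign bit cleared, i.e. the
 * kind of value that ix and iy hold in the quoted code.
 */
uint32_t high_word_abs(double d)
{
    uint64_t bits;

    memcpy(&bits, &d, sizeof bits);          /* well-defined type pun */
    return (uint32_t)(bits >> 32) & 0x7fffffff;
}

/* Hypothetical self-check relating the diff's constants to values. */
void check_thresholds(void)
{
    assert(high_word_abs(ldexp(1.0, 31)) == 0x41e00000);  /* 2**31 */
    assert(high_word_abs(ldexp(1.0, 64)) == 0x43f00000);  /* 2**64 */
    assert(high_word_abs(1.0) == 0x3ff00000);             /* 1.0 */
    /* 1 - 2**-21 is the smallest double whose high word is 0x3fefffff */
    assert(high_word_abs(1.0 - ldexp(1.0, -21)) == 0x3fefffff);
}
```

Calling check_thresholds() from a small main() should pass silently on
any platform with IEEE-754 binary64 doubles.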

My tests cover huge args uniformly in float space, so they benefit from
optimizations like this more than normal programs do.  However, on some
CPUs the exceptions raised when huge*huge and tiny*tiny overflow or
underflow are the main source of slowness, so the early return doesn't
gain much by skipping the large code that has to be executed for normal
args, unless that code would generate multiple exceptions.  If you are
correct that the above code can be removed, then the exceptions
generated by the main path can't be too complicated, or it would be hard
to see that there are no spurious ones.  We can see this easily for
functions like exp(), and pow() is not very different, because the
algorithm is basically to calculate a small result and then multiply it
by 2**k.  The overflows and underflows only occur in the scaling step.
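
A minimal sketch of that last point, under the assumption that the
reduced result is already a modest-sized double and that the final step
is a multiplication by 2**k.  The names scale_by_2k() and scale_demo()
are hypothetical, and ldexp() merely stands in for whatever scaling the
real code does; this is not how e_pow.c or e_exp.c is actually written.

```c
#include <assert.h>
#include <math.h>

/*
 * Hypothetical final step of an exp()-style routine: 'small' is the
 * already-computed reduced result and k is the binary exponent split
 * off during argument reduction.
 */
double scale_by_2k(double small, int k)
{
    return ldexp(small, k);    /* small * 2**k */
}

/* Hypothetical demo: only the scaling step overflows or underflows. */
void scale_demo(void)
{
    /* The reduced result 1.5 is an ordinary finite value. */
    assert(scale_by_2k(1.5, 10) == 1536.0);      /* in range */
    assert(isinf(scale_by_2k(1.5, 1024)));       /* overflows to +Inf */
    assert(scale_by_2k(1.5, -1100) == 0.0);      /* underflows to 0 */
}
```

In this sketch the only place an overflow or underflow can occur is the
ldexp() call, which is what makes it easy to check for spurious
exceptions.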

Bruce