Date: Sun, 10 May 2015 11:59:55 +1000 (EST)
From: Bruce Evans
To: Steve Kargl
cc: freebsd-numerics@freebsd.org
Subject: Re: small cleanup patch for e_pow.c

On Sat, 9 May 2015, Steve Kargl wrote:

> In reading, e_pow.c I found a small piece of code that
> can be remove.  Anyone object?
>
> Index: src/e_pow.c
> ===================================================================
> --- src/e_pow.c (revision 1603)
> +++ src/e_pow.c (working copy)
> @@ -187,10 +187,6 @@ __ieee754_pow(double x, double y)
>
>      /* |y| is huge */
>      if(iy>0x41e00000) { /* if |y| > 2**31 */
> -        if(iy>0x43f00000){ /* if |y| > 2**64, must o/uflow */
> -            if(ix<=0x3fefffff) return (hy<0)? huge*huge:tiny*tiny;
> -            if(ix>=0x3ff00000) return (hy>0)? huge*huge:tiny*tiny;
> -        }
>      /* over/underflow if x is not close to one */
>          if(ix<0x3fefffff) return (hy<0)? s*huge*huge:s*tiny*tiny;
>          if(ix>0x3ff00000) return (hy>0)? s*huge*huge:s*tiny*tiny;

It seems to be just an optimization.  It is a large optimization for
the huge args, but those are not common, and is at most a tiny
pessimization for non-huge args (just an extra branch which can be
predicted perfectly if non-huge args are never used).

My tests cover huge args uniformly in float space, so they benefit
from optimizations like this more than normal programs.  However, on
some CPUs the exceptions for calculating huge*huge and tiny*tiny when
they overflow/underflow are the main source of slowness, so it doesn't
help much to avoid large code that has to be executed for normal args,
unless the latter would generate multiple exceptions.

Here if you are correct that the above code can be removed, then the
exceptions generated by the main path can't be too complicated or it
would be hard to see that there are no spurious ones.  We can see this
easily for functions like exp(), and pow() is not very different,
because the algorithm is basically to calculate a small result and then
multiply it by 2**k.  The overflows and underflows only occur in the
scaling step.

Bruce