From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 19:01:40 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 8AC1F106566C
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 19:01:40 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 486608FC17
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 19:01:40 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8GJ1c7N057483; Sun, 16 Sep 2012 14:01:39 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <50562213.9020400@missouri.edu>
Date: Sun, 16 Sep 2012 14:01:39 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <50297E43.7090309@missouri.edu>
	<20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
In-Reply-To: <20120917022614.R2943@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 19:01:40 -0000

On 09/16/2012 11:51 AM, Bruce Evans wrote:

>
> I don't like that.  It will be much slower on almost 1/4 of arg space.
> The only reason to consider not doing it is that the args that it
> applies to are not very likely, and optimizing for them may pessimize
> the usual case.

The pessimization when |z| is not small is tiny.  It takes no time at 
all to check that |z| is small.

On the other hand let me go through the code and see what happens when 
|x| is small or |y| is small.  There are actually specific formulas that 
work well in these two cases, and they are probably not that much slower 
than the formulas I decided to remove.  And when you chase through all 
the logic and "if" statements, you may find that these very special 
cases of small |z| do not use up much time after all, most of the extra 
time being merely the decisions made by the "if" statements.

> I just found a related optimization for atan2().  For x > 0 and
> |y|/x < 2**-(MANT_DIG+afew), atan2(y, x) is evaluated as essentially
> sign(y) * atan(|y|/x).  But in this case, its value is simply y/x
> with inexact.  Again the optimization applies to almost 1/4 of arg
> space.  It gains more than the normal overhead of an atan() call by
> avoiding secondary underflows when y/x underflows.

You see, that is exactly where I don't want to do special optimization 
in my code.  In my opinion, it is the atan function itself that should 
realize that |y|/x is small, and hence it is that function that should 
simply return |y|/x.  Or if you want to implement it at a higher level, 
atan2 should make this realization, and simply return y/x.

Similarly, I would expect log1p(x) to simply return x (inexactly) for 
small x.  And if the compiler is really good, I would hope that these 
two pieces of code:
log1p(x);
(fabs(x) < DBL_EPSILON) ? x + set_tiny() : log1p(x);
would be equivalent.  (But I am rather sure that gcc isn't that good.)

Furthermore, casinh etc. are not commonly used functions.  Putting huge 
amounts of effort into special cases to speed them up a little somehow 
feels wrong to me.  In fact, if a programmer knows that he needs casinh 
to be evaluated very fast, then he should be motivated enough to try 
simply using z when |z| is small, and see if that really speeds things 
up.

Stephen