From owner-freebsd-numerics@FreeBSD.ORG Tue Sep 18 04:02:21 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 935B41065672 for ; Tue, 18 Sep 2012 04:02:21 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 3335E8FC12 for ; Tue, 18 Sep 2012 04:02:20 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q8I42JSt005318 for ; Mon, 17 Sep 2012 23:02:19 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5057F24B.7020605@missouri.edu> Date: Mon, 17 Sep 2012 23:02:19 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:15.0) Gecko/20120827 Thunderbird/15.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org> <20120912225847.J1771@besplex.bde.org> <50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org> <5051F59C.6000603@missouri.edu> <20120914014208.I2862@besplex.bde.org> <50526050.2070303@missouri.edu> <20120914212403.H1983@besplex.bde.org> <50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org> <50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu> <5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org> <50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org> <5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org> <50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org> <50563C57.60806@missouri.edu> <20120918012459.V5094@besplex.bde.org> <5057A932.3000603@missouri.edu> In-Reply-To: <5057A932.3000603@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Sep 2012 04:02:21 -0000 On 09/17/2012 05:50 PM, Stephen Montgomery-Smith wrote: >> cacos*() and casin*() should benefit even more from an up-front raising >> of inexact, since do_hard_work() has 7 magic statements to raise inexact >> where sum_squares has only 1. > > Where is the code that raises inexact up-front? I don't see why having code upfront will make it much more efficient. Out of these 7 magic statements, at most two of them will be called. But I could put something like if ((x == 0 && y == 0) || (x == 0 && y == 1) || (int)(1+tiny) == 1) { ........ at the beginning of do_hard_work and catanh. >> ... I now understand what the threshold should be. You have >> filtered out ax == 1. This makes 1 - ax*ax at least ~2*EPSILON, so >> ay*ay can be dropped if ay is less than sqrt(2*EPSILON*EPSILON) * >> 2**-GUARD_DIGITS = EPSILON * 2**-5 say. SQRT_MIN is way smaller >> than that, so FOUR_SQRT_MIN works too. We should use a larger >> threshold for efficiency, or avoid the special case for ax == 1. >> Testing shows that this analysis is off by a factor of about >> sqrt(EPSILON), since a threshold of EPSILON * 2**7 is optimal. >> The optimization made no difference to speed; it is just an >> optimization for understanding. Maybe the special case for ax == 1 >> can be avoided, or folded together with the same special case for >> evaluation of the real part. This special case is similar to the >> one in clog(), but easier. OK, I think I made changes more or less according to your suggestions. In the case A < A_crossover, a threshold like DBL_EPSILON*DBL_EPSILON/128 is required. I think the one you set is too large. It is important that sqrt(x) + x/2 is sqrt(x). (Again I don't think your tests would pick this up, because you need to do a lot of tests where y is close to or equal to 1.)