From owner-freebsd-i386@FreeBSD.ORG Sun Feb 6 09:21:59 2005 Return-Path: Delivered-To: freebsd-i386@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1349616A4CE; Sun, 6 Feb 2005 09:21:59 +0000 (GMT) Received: from VARK.MIT.EDU (VARK.MIT.EDU [18.95.3.179]) by mx1.FreeBSD.org (Postfix) with ESMTP id A0BA043D1F; Sun, 6 Feb 2005 09:21:58 +0000 (GMT) (envelope-from das@FreeBSD.ORG) Received: from VARK.MIT.EDU (localhost [127.0.0.1]) by VARK.MIT.EDU (8.13.1/8.13.1) with ESMTP id j169Lq5K064331; Sun, 6 Feb 2005 04:21:52 -0500 (EST) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by VARK.MIT.EDU (8.13.1/8.13.1/Submit) id j169LpmI064330; Sun, 6 Feb 2005 04:21:51 -0500 (EST) (envelope-from das@FreeBSD.ORG) Date: Sun, 6 Feb 2005 04:21:51 -0500 From: David Schultz To: Bruce Evans Message-ID: <20050206092151.GA63584@VARK.MIT.EDU> References: <200406012251.i51MpkkU024224@VARK.homeunix.com> <20040602172105.T23521@gamplex.bde.org> <20050204215913.GA44598@VARK.MIT.EDU> <20050205181808.J10966@delplex.bde.org> <20050206000912.GA59372@VARK.MIT.EDU> <20050206162951.J14584@delplex.bde.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050206162951.J14584@delplex.bde.org> cc: FreeBSD-gnats-submit@FreeBSD.ORG cc: freebsd-i386@FreeBSD.ORG Subject: Re: i386/67469: src/lib/msun/i387/s_tan.S gives incorrect results for large inputs X-BeenThere: freebsd-i386@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: I386-specific issues for FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Feb 2005 09:21:59 -0000 On Sun, Feb 06, 2005, Bruce Evans wrote: > > [... too much detail to reply to :-)] > > Your suggestion of effectively special-casing large inputs and > > inputs close to multiples of pi is probably the right way to fix > > the inaccuracy. However, I'm worried that it would wipe away > > any performance benefit of using fptan, if there is a benefit > > at all. How about punting and removing s_tan.S instead? > > The problem affects at least sin() and cos() too, so I think throwing > away the optimized versions is too extreme. Perhaps a single range > check to let the MD version handle only values near 0 ([-pi/2+eps, > pi/2-eps]? would be efficient enough. > > > Other unanswered questions (ENOTIME right now): > > - What about sin() and cos()? > > - What about the float versions of these? > > I tested sin(). It misbehaves similarly (with identical results for > args (2 * n * M_PI). My question was more along the lines of, ``Is there an actual performance benefit for sin() and cos()?'' If there is little or no benefit, then there is no reason to keep the assembly routines and worry about how to fix them. I'll try to investigate this aspect in more detail at some point, unless you beat me to it... > I don't run -current and hadn't seen your code for the MD float versions... > They are buggier: > (1) the exponent can be up to 127, so more than half of the representable > values exceed 2^63 in magnitude and thus need range reduction. Results > for tan(0x1p64) using various methods: > > buggy MD tanf: 0x1p+64 1.84467e+19 > buggy MD tan: -0x1.8467b926c834bp-2 -0.379302 > fdlibm MI tanf: -0x1.82bee6p-6 -0.0236051 Whups. FWIW, they're not mine; I got them from NetBSD. Another one of the float functions in the NetBSD repository is buggy, too, but I forget which one. I only imported the ones that seemed to be correct and faster than fdlibm for input bit patterns chosen uniformly at random, but I guess I missed that problem. > (2) they return extra bits of precision. The MD double precision versions > have the same bug; the bug is just clearer for the float case, and > cannot be fixed by FreeBSD's rounding precision hack. > > BTW, I don't trust the range reduction for floating point pi/2 but > never finished debugging or fixing it. According to my old comment, > it is off by pi/2 and very slow for many values between 32768 and > 65535. However, I couldn't duplicate this misbehaviour last time I > looked at it. I used to fix it using the double version. Yeah, there seem to be many values for which __ieee754_rem_pio2f() returns the wrong quotient: -0x1.8009c6p+8 (input) d = -244 (return value of __ieee754_rem_pio2()) f = -4 (return value of __ieee754_rem_pio2f()) 0x1.389ee4p+87 d = 4 f = 3 -0x1.70bca6p+16 d = -60095 f = -7 It also gets the remainder completely wrong sometimes: 0x1.389ee4p+87 rd0 = -0x1.6ad286p-4 (y[0] from rem_pio2(), rounded to float) rf0 = 0x1.7b728cp+0 (y[0] from rem_pio2f()) 0x1.4d23ecp+95 rd0 = -0x1.ee68f8p-2 rf0 = 0x1.168578p+0 -0x1.5a172p+63 rd0 = 0x1.52f2aap-1 rf0 = -0x1.d14ccp-1 Also, rem_pio2f() is probably not much more efficient than rem_pio2(). The former would be better if it used a single double instead of two floats for increased accuracy. (The double version has two use two doubles for accuracy because there's no ``double double'' type.) Results for MI tan() differ from MI tanf() by >2 ulp for many inputs, including: 0x1.00008ep+15 0x1.0000ap+15 0x1.0000a4p+15 0x1.0000aep+15 [...] Perhaps this is due to the problem with rem_pio2f().