From owner-freebsd-numerics@FreeBSD.ORG Mon Jun 10 17:06:05 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 3B5FA2ED for ; Mon, 10 Jun 2013 17:06:05 +0000 (UTC) (envelope-from das@FreeBSD.ORG) Received: from zim.MIT.EDU (50-196-151-174-static.hfc.comcastbusiness.net [50.196.151.174]) by mx1.freebsd.org (Postfix) with ESMTP id 12ED6162A for ; Mon, 10 Jun 2013 17:06:04 +0000 (UTC) Received: from zim.MIT.EDU (localhost [127.0.0.1]) by zim.MIT.EDU (8.14.7/8.14.2) with ESMTP id r5AH63B1072698; Mon, 10 Jun 2013 10:06:03 -0700 (PDT) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by zim.MIT.EDU (8.14.7/8.14.2/Submit) id r5AH63He072697; Mon, 10 Jun 2013 10:06:03 -0700 (PDT) (envelope-from das@FreeBSD.ORG) Date: Mon, 10 Jun 2013 10:06:03 -0700 From: David Schultz To: Steve Kargl Subject: Re: Implementation for coshl. Message-ID: <20130610170603.GA72597@zim.MIT.EDU> References: <20130610003645.GA16444@troutmask.apl.washington.edu> <20130610110740.V24058@besplex.bde.org> <20130610055834.GA68643@zim.MIT.EDU> <20130610162927.GA20856@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130610162927.GA20856@troutmask.apl.washington.edu> Cc: freebsd-numerics@FreeBSD.org, Bruce Evans X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 17:06:05 -0000 On Mon, Jun 10, 2013, Steve Kargl wrote: > On Sun, Jun 09, 2013 at 10:58:34PM -0700, David Schultz wrote: > > Awesome, thanks for working on this. > > > > On Mon, Jun 10, 2013, Bruce Evans wrote: > > > On Sun, 9 Jun 2013, Steve Kargl wrote: > > > > Arch | Interval | #calls | Time (s) | Max ULP | Compiler | Value > > > > -----------+---------------------+--------+----------+---------+----------+------- > > > > i386 [1] | [ 0.00: 0.35] | 100M | 15.0198 | 0.58583 | gcc | 1 > > > > i386 [1] | [ 0.35: 24.00] | 100M | 15.1858 | 1.01504 | gcc | 2 > > > > i386 [1] | [ 24.00:11356.52] | 100M | 12.9591 | 0.51198 | gcc | 3 > > > > i386 [1] | [11356.52:11357.22] | 100M | 13.3328 | 1.90988 | gcc | 4 > > > > -----------+---------------------+--------+----------+---------+----------+------- > > > > > > Quite large errors, unfortunately much the same as in double precision. > > [...] > > > > Bruce and I exchanged emails a long time ago about possible ways > > > > to reduce the ulp in this range by either computer exp(x) with > > > > extra precision or using a table with cosh(x) = cosh(x_i) * cosh(d) > > > > + sinh(x_i) * sinh(d) with d = x - x_i. I tried the latter with > > > > disappointing results. > > > > This is fine for now. We might eventually want to improve the > > accuracy of the float, double, and long double implementations, > > but that ought to come later. It's much easier to start with > > float and double when making improvements in the algorithm. > > I do not have working test programs for float and double, which > will take a morning (or late night) to fix. So, I do not have > a method for testing the either the current cosh[f] or any changes > that I might suggest. I think you might have misinterpreted my suggestion. I meant that you should not spend too much time worrying about improvements in the algorithm at the same time as you're porting it from double to long double. Improvements in the algorithm can come separately, and when we do tackle them, it is easier to develop them in float precision first. > > > > The former would require a refactoring of > > > > s_expl.c into a kernel __kernel_expl(x, hi, lo). I have no plans on > > > > pursuing this at the this time. > > > > > > But you need this soon for __ldexp_exp() and __ldexp_cexp(), which are > > > needed for hyperbolic functions (the large args already fixed for float > > > and double precision) and for cexp() and trig and hyperbolic complex > > > functions. It is much easier to implement these using a kernel. I do > > > this only for float precision. > > I spent a couple of hours last night pulling __kernel_expl out > of expl(). The results were less then stellar, but Bruce's > email suggested the error in my method. My recollection of working on this in double precision is that Bruce was unhappy with most proposals. Either it made exp slower by having exp call the kernel, or it resulted in lots of code duplication. I think it's a very useful thing to have, but don't get stuck in a rathole for a month over it. > > - I think the way you handle the tiny-x case is an improvement > > over the double version. However, float-to-int conversion is > > very slow on x86, so you might instead try something like > > RETURNI(1.0L + tiny), where tiny is volatile and e.g., 0x1p-1000. > > That will also return the correct result in all rounding modes. > > I originally had > > if (ix < BIAS - 33) { /* |x| < 0x1p-33 */ > /* cosh(+-0) = 1. */ > if (x == 0) > return (one); > /* cosh(tiny) = one */ > return (one + tiny * tiny); > } > > because C99 requires cosh(+-0) to be exact. For |x| != the > return value will raise inexact. Aah, right, what I proposed won't handle 0 correctly. In that case, I think what you have is fine; it should handle everything except for the rounding mode issue, which we don't really support anyway. Alternatively, you could just do what fdlibm does, which is hard to argue with given that it has been acceptable for several decades. > > - Please check whether volatile hacks are needed to induce the > > compiler to do the right thing. I'm increasingly finding that > > these are needed with clang. > > I suppose I need to either look at the generated assembly or > add testing for signals to my test programs. How do you go > about looking at this? #include volatile long double x = ..., y; feclearexcept(FE_ALL_EXCEPT); y = coshl(x); printf("%x\n", fetestexcept(FE_ALL_EXCEPT)); The main cases to check are: - cosh(+-0) = 0, no exceptions - cosh(large) = inf, overflow exception (On some architectures, this also raises inexact, which is allowed.) - cosh(inf or qnan) raises no exceptions - cosh(anything else) raises an inexact exception