From owner-freebsd-numerics@FreeBSD.ORG Mon Jun 10 16:29:28 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id F329F215; Mon, 10 Jun 2013 16:29:27 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by mx1.freebsd.org (Postfix) with ESMTP id DDAC51265; Mon, 10 Jun 2013 16:29:27 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.6/8.14.6) with ESMTP id r5AGTRBn020971; Mon, 10 Jun 2013 09:29:27 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.6/8.14.6/Submit) id r5AGTRwK020970; Mon, 10 Jun 2013 09:29:27 -0700 (PDT) (envelope-from sgk) Date: Mon, 10 Jun 2013 09:29:27 -0700 From: Steve Kargl To: David Schultz Subject: Re: Implementation for coshl. Message-ID: <20130610162927.GA20856@troutmask.apl.washington.edu> References: <20130610003645.GA16444@troutmask.apl.washington.edu> <20130610110740.V24058@besplex.bde.org> <20130610055834.GA68643@zim.MIT.EDU> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130610055834.GA68643@zim.MIT.EDU> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-numerics@FreeBSD.org, Bruce Evans X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 16:29:28 -0000 On Sun, Jun 09, 2013 at 10:58:34PM -0700, David Schultz wrote: > Awesome, thanks for working on this. > > On Mon, Jun 10, 2013, Bruce Evans wrote: > > On Sun, 9 Jun 2013, Steve Kargl wrote: > > > Arch | Interval | #calls | Time (s) | Max ULP | Compiler | Value > > > -----------+---------------------+--------+----------+---------+----------+------- > > > i386 [1] | [ 0.00: 0.35] | 100M | 15.0198 | 0.58583 | gcc | 1 > > > i386 [1] | [ 0.35: 24.00] | 100M | 15.1858 | 1.01504 | gcc | 2 > > > i386 [1] | [ 24.00:11356.52] | 100M | 12.9591 | 0.51198 | gcc | 3 > > > i386 [1] | [11356.52:11357.22] | 100M | 13.3328 | 1.90988 | gcc | 4 > > > -----------+---------------------+--------+----------+---------+----------+------- > > > > Quite large errors, unfortunately much the same as in double precision. > [...] > > > Bruce and I exchanged emails a long time ago about possible ways > > > to reduce the ulp in this range by either computer exp(x) with > > > extra precision or using a table with cosh(x) = cosh(x_i) * cosh(d) > > > + sinh(x_i) * sinh(d) with d = x - x_i. I tried the latter with > > > disappointing results. > > This is fine for now. We might eventually want to improve the > accuracy of the float, double, and long double implementations, > but that ought to come later. It's much easier to start with > float and double when making improvements in the algorithm. I do not have working test programs for float and double, which will take a morning (or late night) to fix. So, I do not have a method for testing the either the current cosh[f] or any changes that I might suggest. > > > The former would require a refactoring of > > > s_expl.c into a kernel __kernel_expl(x, hi, lo). I have no plans on > > > pursuing this at the this time. > > > > But you need this soon for __ldexp_exp() and __ldexp_cexp(), which are > > needed for hyperbolic functions (the large args already fixed for float > > and double precision) and for cexp() and trig and hyperbolic complex > > functions. It is much easier to implement these using a kernel. I do > > this only for float precision. I spent a couple of hours last night pulling __kernel_expl out of expl(). The results were less then stellar, but Bruce's email suggested the error in my method. > That's fine for now. We need a better k_exp, too. I believe > Bruce was working on one. > > I have a few comments on the code... mostly a subset of Bruce's > comments: > > - You can clear the sign bit on x after the Inf/NaN check. OK. > - I think the way you handle the tiny-x case is an improvement > over the double version. However, float-to-int conversion is > very slow on x86, so you might instead try something like > RETURNI(1.0L + tiny), where tiny is volatile and e.g., 0x1p-1000. > That will also return the correct result in all rounding modes. I originally had if (ix < BIAS - 33) { /* |x| < 0x1p-33 */ /* cosh(+-0) = 1. */ if (x == 0) return (one); /* cosh(tiny) = one */ return (one + tiny * tiny); } because C99 requires cosh(+-0) to be exact. For |x| != the return value will raise inexact. > - Please check whether volatile hacks are needed to induce the > compiler to do the right thing. I'm increasingly finding that > these are needed with clang. I suppose I need to either look at the generated assembly or add testing for signals to my test programs. How do you go about looking at this? -- Steve