From owner-freebsd-numerics@FreeBSD.ORG Mon Jun 10 00:36:51 2013 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 743F4800 for ; Mon, 10 Jun 2013 00:36:51 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by mx1.freebsd.org (Postfix) with ESMTP id 4426C1149 for ; Mon, 10 Jun 2013 00:36:51 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.6/8.14.6) with ESMTP id r5A0ajD6016493 for ; Sun, 9 Jun 2013 17:36:45 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.6/8.14.6/Submit) id r5A0ajeK016492 for freebsd-numerics@freebsd.org; Sun, 9 Jun 2013 17:36:45 -0700 (PDT) (envelope-from sgk) Date: Sun, 9 Jun 2013 17:36:45 -0700 From: Steve Kargl To: freebsd-numerics@freebsd.org Subject: Implementation for coshl. Message-ID: <20130610003645.GA16444@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 00:36:51 -0000 I suspect that there will be some nits with the implementation. Anyway, testing gives Arch | Interval | #calls | Time (s) | Max ULP | Compiler | Value -----------+---------------------+--------+----------+---------+----------+------- i386 [1] | [ 0.00: 0.35] | 100M | 15.0198 | 0.58583 | gcc | 1 i386 [1] | [ 0.35: 24.00] | 100M | 15.1858 | 1.01504 | gcc | 2 i386 [1] | [ 24.00:11356.52] | 100M | 12.9591 | 0.51198 | gcc | 3 i386 [1] | [11356.52:11357.22] | 100M | 13.3328 | 1.90988 | gcc | 4 -----------+---------------------+--------+----------+---------+----------+------- amd64 [2]| [ 0.00: 0.35] | 100M | 11.7811 | 0.63075 | clang | 5 amd64 [2]| [ 0.35: 24.00] | 100M | 11.0662 | 1.01535 | clang | 6 amd64 [2]| [ 24.00:11356.52] | 100M | 9.97704 | 0.50852 | clang | 7 amd64 [2]| [11356.52:11357.22] | 100M | 10.8221 | 1.90931 | clang | 8 -----------+---------------------+--------+----------+---------+----------+------- amd64 [2]| [ 0.00: 0.35] | 100M | 10.5930 | 0.63075 | gcc | 9 amd64 [2]| [ 0.35: 24.00] | 100M | 10.0606 | 1.01535 | gcc | 10 amd64 [2]| [ 24.00:11356.52] | 100M | 8.76182 | 0.50852 | gcc | 11 amd64 [2]| [11356.52:11357.22] | 100M | 9.23632 | 1.90931 | gcc | 12 -----------+---------------------+--------+----------+---------+----------+------- sparc64 [3]| [ 0.00: 0.35] | 1M | 62.7154 | 0.57638 | gcc | 13 sparc64 [3]| [ 0.35: 41.00] | 1M | 43.2052 | 1.00709 | gcc | 14 sparc64 [3]| [ 41.00:11356.52] | 1M | 40.5336 | 0.50587 | gcc | 15 sparc64 [3]| [11356.52:11357.22] | 1M | 44.7029 | 1.90400 | gcc | 16 1 3.3932642386627052e-01, 0x1.5b7862d4b272b9c6p-2 2 1.2802453843180306e+00, 0x1.47be2958803a846cp+0 3 2.4087260438954509e+01, 0x1.81656b33b8b51fcp+4 4 1.1357216248489914e+04, 0x1.62e9bae07cffe94ap+13 5 3.4657359027997265e-01, 0x1.62e42fefa39ef35ap-2 6 1.2651166512735005e+00, 0x1.43deaf52d83fa6c4p+0 7 2.4019265291717229e+01, 0x1.804ee91f5df97166p+4 8 1.1357215893598522e+04, 0x1.62e9ba266c488adcp+13 9 3.4657359027997265e-01, 0x1.62e42fefa39ef35ap-2 10 1.2651166512735005e+00, 0x1.43deaf52d83fa6c4p+0 11 2.4019265291717229e+01, 0x1.804ee91f5df97166p+4 12 1.1357215893598522e+04, 0x1.62e9ba266c488adcp+13 13 3.35979864752159202758685843670565218e-01, 0x1.580b1b0ce66d750dad9e5773bc5bp-2 14 1.28794527359513292307645171506249973e+00, 0x1.49b6c80d20fd82749b4ffd5d2b24p+0 15 1.03131066340595824072037883331972584e+04, 0x1.4248da62f5345e862cf6d17602dcp+13 16 1.13572063870748402222198225060228912e+04, 0x1.62e9a6ae44460bffbacfe7a589c2p+13 [1] Intel(R) Core(TM)2 Duo CPU T7250@2.00GHz (1995.04-MHz 686-class CPU) [2] AMD Opteron(tm) Processor 248 (2191.60-MHz K8-class CPU) [3] Sun Microsystems UltraSparc-IIe Processor (648.00 MHz CPU) Two observations: 1. The max ulp for the intervals [0.35:24] and [0.35:41] of 1.xxx is due to the division in the expression half*exp(x) + half/exp(x). Bruce and I exchanged emails a long time ago about possible ways to reduce the ulp in this range by either computer exp(x) with extra precision or using a table with cosh(x) = cosh(x_i) * cosh(d) + sinh(x_i) * sinh(d) with d = x - x_i. I tried the latter with disappointing results. The former would require a refactoring of s_expl.c into a kernel __kernel_expl(x, hi, lo). I have no plans on pursuing this at the this time. 2. gcc produces faster code than clang on amd64. Comments welcomed. -- Steve /* * ==================================================== * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved. * * Developed at SunSoft, a Sun Microsystems, Inc. business. * Permission to use, copy, modify, and distribute this * software is freely granted, provided that this notice * is preserved. * ==================================================== * * Converted to long double by Steven G. Kargl */ #include __FBSDID("$FreeBSD$"); /* * See comments in src/e_cosh.c for a description of the algorithm. */ #include #ifdef __i386__ #include #endif #include "fpmath.h" #include "math.h" #include "math_private.h" #if LDBL_MANT_DIG == 64 static const union IEEEl2bits #define EXP_TINY -32 #define s_threshold 24 /* log(2) / 2 */ log2o2u = LD80C(0xb17217f7d1cf79ac, -2, 0.346573590279972654714L), #define log2o2 (log2o2u.e) /* x = log(LDBL_MAX - 0.5) */ o_threshold1u = LD80C(0xb17217f7d1cf79ac, 13, 11356.5234062941439497L), #define o_threshold1 (o_threshold1u.e) /* log(LDBL_MAX - 0.5) + log(2) */ o_threshold2u = LD80C(0xb174ddc031aec0ea, 13, 11357.2165534747038951L); #define o_threshold2 (o_threshold2u.e) #elif LDBL_MANT_DIG == 113 #define EXP_TINY -56 #define s_threshold 41 static long double log2o2 = 0.346573590279972654708616060729088288L, o_threshold1 = 11356.5234062941439494919310779707650L, o_threshold2 = 11357.2165534747038948013483100922230L; #else #error "Unsupported long double format" #endif static const long double huge = 0x1.p10000L; static const double half = 0.5; #define BIAS (LDBL_MAX_EXP - 1) long double coshl(long double x) { long double t, w; uint16_t hx, ix; ENTERI(); GET_LDBL_EXPSIGN(hx, x); ix = hx & 0x7fff; SET_LDBL_EXPSIGN(x, ix); /* x is +-Inf or NaN. */ if (ix == BIAS + LDBL_MAX_EXP) RETURNI(x * x); if (x < log2o2) { if (ix < BIAS + EXP_TINY) { /* |x| < 0x1pEXP_TINY */ /* cosh(x) = 1 exactly iff x = +-0. */ if ((int)x == 0) RETURNI(1.0L); } t = expm1l(x); w = 1 + t; RETURNI(1 + t * t / (w + w)); } if (x < s_threshold) { t = expl(x); RETURNI(half * t + half / t); } if (x < o_threshold1) RETURNI(half * expl(x)); if (x < o_threshold2) { t = expl(half * x); RETURNI(half * t * t); } RETURNI(huge * huge); } From owner-freebsd-numerics@FreeBSD.ORG Mon Jun 10 02:32:17 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 66A48826 for ; Mon, 10 Jun 2013 02:32:17 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail35.syd.optusnet.com.au (mail35.syd.optusnet.com.au [211.29.133.51]) by mx1.freebsd.org (Postfix) with ESMTP id 92BF5166C for ; Mon, 10 Jun 2013 02:32:16 +0000 (UTC) Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23]) by mail35.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id r5A2W54q007158 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 10 Jun 2013 12:32:07 +1000 Date: Mon, 10 Jun 2013 12:32:05 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl Subject: Re: Implementation for coshl. In-Reply-To: <20130610003645.GA16444@troutmask.apl.washington.edu> Message-ID: <20130610110740.V24058@besplex.bde.org> References: <20130610003645.GA16444@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=eqSHVfVX c=1 sm=1 a=LM0AswAWfpYA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=MtpAjnDH-vAA:10 a=uVhegijvKTqswUFXLp4A:9 a=CjuIK1q_8ugA:10 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 Cc: freebsd-numerics@FreeBSD.org X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 02:32:17 -0000 On Sun, 9 Jun 2013, Steve Kargl wrote: > I suspect that there will be some nits with the implementation. Quite a few nats :-). > Anyway, testing gives > > Arch | Interval | #calls | Time (s) | Max ULP | Compiler | Value > -----------+---------------------+--------+----------+---------+----------+------- > i386 [1] | [ 0.00: 0.35] | 100M | 15.0198 | 0.58583 | gcc | 1 > i386 [1] | [ 0.35: 24.00] | 100M | 15.1858 | 1.01504 | gcc | 2 > i386 [1] | [ 24.00:11356.52] | 100M | 12.9591 | 0.51198 | gcc | 3 > i386 [1] | [11356.52:11357.22] | 100M | 13.3328 | 1.90988 | gcc | 4 > -----------+---------------------+--------+----------+---------+----------+------- Quite large errors, unfortunately much the same as in double precision. > amd64 [2]| [ 0.00: 0.35] | 100M | 11.7811 | 0.63075 | clang | 5 > amd64 [2]| [ 0.35: 24.00] | 100M | 11.0662 | 1.01535 | clang | 6 > amd64 [2]| [ 24.00:11356.52] | 100M | 9.97704 | 0.50852 | clang | 7 > amd64 [2]| [11356.52:11357.22] | 100M | 10.8221 | 1.90931 | clang | 8 > -----------+---------------------+--------+----------+---------+----------+------- Likely a bug for the ULPs for the first domain to be so differnt. I suspect even much smaller differences are due to bugs. > ... > 1. The max ulp for the intervals [0.35:24] and [0.35:41] of 1.xxx is > due to the division in the expression half*exp(x) + half/exp(x). That's the one with the large MD difference. > Bruce and I exchanged emails a long time ago about possible ways > to reduce the ulp in this range by either computer exp(x) with > extra precision or using a table with cosh(x) = cosh(x_i) * cosh(d) > + sinh(x_i) * sinh(d) with d = x - x_i. I tried the latter with > disappointing results. The latter may be good for trig functions, but it is bad for hyperbolic functions. It is technically difficult to splice the functions. > The former would require a refactoring of > s_expl.c into a kernel __kernel_expl(x, hi, lo). I have no plans on > pursuing this at the this time. But you need this soon for __ldexp_exp() and __ldexp_cexp(), which are needed for hyperbolic functions (the large args already fixed for float and double precision) and for cexp() and trig and hyperbolic complex functions. It is much easier to implement these using a kernel. I do this only for float precision. The hyperbolic functions are also much easier with a kernel. Most of the thresholds become unnecessary. Above about |x| < 0.5, all cases are handled uniformly using code like: k_expl(x, &hi1, &lo1, &k1); k_expl(-x, &hi2, &lo2, &k2); /* KISS slow initially */ /* * Bah, that's too uniform. I don't want to deal with k. So * use a threshold for large |x| (case handled by __ldexp_expl(). * Now for 0.5 <= |x| < thresh: */ k_expl(x, &hi1, &lo1); k_expl(-x, &hi2, &lo2); /* diferent API includes 2**k */ _2sumF(hi1, lo1); /* a bit sloppy */ return (0.5 * (lo2 + lo1 + hi1 + hi2)); Error analysis: since |x| >= 0.5, the ratio exp(x)/exp(-x) is >= exp(1). Already if we we add exp(x) + exp(-x), the error is at most ~1 ulps (certainly less than 2). But our hi_lo approximations give 6-10 extra bits, so the ~1 ulp error is scaled by 2**-6 or better until the final addition. Note that this doesn't need the complexities of expm1l(x) or a kernel for that. Adding exp(-x) to exp(x) without increasing the error significantly is similar to adding -1 to exp(x) (subtracting exp(-x) for sinh() is even more similar). The addiitonal complications in expm1l() are because: - it wants to give 6-10 bits and not lose any relative to expl() - |x| >= 0.5 so large cancelations cannot occur. For |x| <= 0.5, use a poly approx. 0.5 can be reduced significantly if necessary to get less terms in the poly. This is less needed than for expm1l() since the power series about 0 converges much faster for coshl(). Similarly for sinhl(). I don't know of a similar method for tanhl(). A division seems to be necessary, and hi+lo decompositions only work well for additions. > /* > * ==================================================== > * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved. > * > * Developed at SunSoft, a Sun Microsystems, Inc. business. > * Permission to use, copy, modify, and distribute this > * software is freely granted, provided that this notice > * is preserved. > * ==================================================== > * > * Converted to long double by Steven G. Kargl > */ It changes the style completely, so diffs with the double version are unreadable. > #if LDBL_MANT_DIG == 64 > ... > #else > #error "Unsupported long double format" > #endif The complications for ld80/128 and i386 seem reasonble. > long double > coshl(long double x) > { > long double t, w; > uint16_t hx, ix; > > ENTERI(); > > GET_LDBL_EXPSIGN(hx, x); > ix = hx & 0x7fff; > SET_LDBL_EXPSIGN(x, ix); This sign frobbing is pessimal, and is not done by the double version. Signs are better cleared using fabs*() unless you are sure that clearing them in bits is more optimal. But don't optimize before getting it right. Any optimizations should be made to the double version first. > /* x is +-Inf or NaN. */ > if (ix == BIAS + LDBL_MAX_EXP) > RETURNI(x * x); The sign frobbing also clobbers the result here. > > if (x < log2o2) { The threshold comparisons are painful and probably inefficient to do in bits. Hoever you have most of the pain of using bits by using long doubles and LD80C() to declare precise thresholds. Most or all of the thresholds are fuzzy and don't need more than float precision (or maybe just a the exponent). The double version uses a fuzzy threshold here. It only tests the upper 21 bits of the mantissa, so it uses less than float precision. > if (ix < BIAS + EXP_TINY) { /* |x| < 0x1pEXP_TINY */ > /* cosh(x) = 1 exactly iff x = +-0. */ > if ((int)x == 0) > RETURNI(1.0L); > } Unnecessary algorithm change and micro-optimization. The double version uses the general case doing 1+t to set inexact here. > t = expm1l(x); > w = 1 + t; > RETURNI(1 + t * t / (w + w)); > } > ... > if (x < o_threshold2) { > t = expl(half * x); > RETURNI(half * t * t); > } This is missing use of __ldexp_expl(). Going back to the painful threshold declarations: > #if LDBL_MANT_DIG == 64 > static const union IEEEl2bits > #define EXP_TINY -32 Strange placement of macro in the middle of a declaration. This should be simply LDBL_MANT_DIG / 2. > #define s_threshold 24 I don't understand the magic for this now. It is not quite LDBL_MANT_DIG / 3. > /* log(2) / 2 */ > log2o2u = LD80C(0xb17217f7d1cf79ac, -2, 0.346573590279972654714L), > #define log2o2 (log2o2u.e) > /* x = log(LDBL_MAX - 0.5) */ > o_threshold1u = LD80C(0xb17217f7d1cf79ac, 13, 11356.5234062941439497L), > #define o_threshold1 (o_threshold1u.e) > /* log(LDBL_MAX - 0.5) + log(2) */ > o_threshold2u = LD80C(0xb174ddc031aec0ea, 13, 11357.2165534747038951L); > #define o_threshold2 (o_threshold2u.e) > #elif LDBL_MANT_DIG == 113 > #define EXP_TINY -56 > #define s_threshold 41 > static long double > log2o2 = 0.346573590279972654708616060729088288L, > o_threshold1 = 11356.5234062941439494919310779707650L, > o_threshold2 = 11357.2165534747038948013483100922230L; > #else > #error "Unsupported long double format" > #endif No need for any long doubles or LD80C()'s. The double version uses only 21 bits for all thresholds. We had to be more careful about the thresholds for technical reasons in expl(). IIRC, we needed precise thresholds to do a final filtering step after a fuzzy initial classification. This isn't needed here, since we use calculations that can't give spurious overflow. Going back a bit more: - 'huge' is missing a volatile qualifier to work around clang bugs. expl() is already out of date relative to exp2l() since it is missing the recent fix to add this qualifier. This fix is missing in the double and float versions of all the hyperbolic functions too. - 'half' doesn't need to be long double - the double version has a variable 'one'. You changed that to '1', but didn't change 'half' to 0.5. Bruce From owner-freebsd-numerics@FreeBSD.ORG Mon Jun 10 05:58:41 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B17D16F7 for ; Mon, 10 Jun 2013 05:58:41 +0000 (UTC) (envelope-from das@FreeBSD.ORG) Received: from zim.MIT.EDU (50-196-151-174-static.hfc.comcastbusiness.net [50.196.151.174]) by mx1.freebsd.org (Postfix) with ESMTP id 982191C88 for ; Mon, 10 Jun 2013 05:58:41 +0000 (UTC) Received: from zim.MIT.EDU (localhost [127.0.0.1]) by zim.MIT.EDU (8.14.7/8.14.2) with ESMTP id r5A5wYKw068908; Sun, 9 Jun 2013 22:58:34 -0700 (PDT) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by zim.MIT.EDU (8.14.7/8.14.2/Submit) id r5A5wYIm068907; Sun, 9 Jun 2013 22:58:34 -0700 (PDT) (envelope-from das@FreeBSD.ORG) Date: Sun, 9 Jun 2013 22:58:34 -0700 From: David Schultz To: Bruce Evans Subject: Re: Implementation for coshl. Message-ID: <20130610055834.GA68643@zim.MIT.EDU> References: <20130610003645.GA16444@troutmask.apl.washington.edu> <20130610110740.V24058@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130610110740.V24058@besplex.bde.org> Cc: freebsd-numerics@FreeBSD.org, Steve Kargl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 05:58:41 -0000 Awesome, thanks for working on this. On Mon, Jun 10, 2013, Bruce Evans wrote: > On Sun, 9 Jun 2013, Steve Kargl wrote: > > Arch | Interval | #calls | Time (s) | Max ULP | Compiler | Value > > -----------+---------------------+--------+----------+---------+----------+------- > > i386 [1] | [ 0.00: 0.35] | 100M | 15.0198 | 0.58583 | gcc | 1 > > i386 [1] | [ 0.35: 24.00] | 100M | 15.1858 | 1.01504 | gcc | 2 > > i386 [1] | [ 24.00:11356.52] | 100M | 12.9591 | 0.51198 | gcc | 3 > > i386 [1] | [11356.52:11357.22] | 100M | 13.3328 | 1.90988 | gcc | 4 > > -----------+---------------------+--------+----------+---------+----------+------- > > Quite large errors, unfortunately much the same as in double precision. [...] > > Bruce and I exchanged emails a long time ago about possible ways > > to reduce the ulp in this range by either computer exp(x) with > > extra precision or using a table with cosh(x) = cosh(x_i) * cosh(d) > > + sinh(x_i) * sinh(d) with d = x - x_i. I tried the latter with > > disappointing results. This is fine for now. We might eventually want to improve the accuracy of the float, double, and long double implementations, but that ought to come later. It's much easier to start with float and double when making improvements in the algorithm. > > The former would require a refactoring of > > s_expl.c into a kernel __kernel_expl(x, hi, lo). I have no plans on > > pursuing this at the this time. > > But you need this soon for __ldexp_exp() and __ldexp_cexp(), which are > needed for hyperbolic functions (the large args already fixed for float > and double precision) and for cexp() and trig and hyperbolic complex > functions. It is much easier to implement these using a kernel. I do > this only for float precision. That's fine for now. We need a better k_exp, too. I believe Bruce was working on one. I have a few comments on the code... mostly a subset of Bruce's comments: - You can clear the sign bit on x after the Inf/NaN check. - I think the way you handle the tiny-x case is an improvement over the double version. However, float-to-int conversion is very slow on x86, so you might instead try something like RETURNI(1.0L + tiny), where tiny is volatile and e.g., 0x1p-1000. That will also return the correct result in all rounding modes. - Please check whether volatile hacks are needed to induce the compiler to do the right thing. I'm increasingly finding that these are needed with clang. - I agree with Bruce about the thresholds not needing to be exact. In fact, if you try to use exact thresholds, the threshold might round the wrong way and give a subtle off-by-one-ulp error. Often just looking at the exponent is sufficient. From owner-freebsd-numerics@FreeBSD.ORG Mon Jun 10 11:06:53 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 57FF4FFF for ; Mon, 10 Jun 2013 11:06:53 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 23CBF1C90 for ; Mon, 10 Jun 2013 11:06:53 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r5AB6qaI097048 for ; Mon, 10 Jun 2013 11:06:52 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r5AB6qQH097046 for freebsd-numerics@FreeBSD.org; Mon, 10 Jun 2013 11:06:52 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 10 Jun 2013 11:06:52 GMT Message-Id: <201306101106.r5AB6qQH097046@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-numerics@FreeBSD.org Subject: Current problem reports assigned to freebsd-numerics@FreeBSD.org X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 11:06:53 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o stand/175811 numerics libstdc++ needs complex support in order use C99 o bin/170206 numerics [msun] [patch] complex arcsinh, log, etc. 2 problems total. From owner-freebsd-numerics@FreeBSD.ORG Mon Jun 10 15:43:28 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 5D5C4C9B; Mon, 10 Jun 2013 15:43:28 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail36.syd.optusnet.com.au (mail36.syd.optusnet.com.au [211.29.133.76]) by mx1.freebsd.org (Postfix) with ESMTP id D7DFC1F74; Mon, 10 Jun 2013 15:43:26 +0000 (UTC) Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23]) by mail36.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id r5AFhHfe009078 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 11 Jun 2013 01:43:18 +1000 Date: Tue, 11 Jun 2013 01:43:17 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: David Schultz Subject: Re: Implementation for coshl. In-Reply-To: <20130610055834.GA68643@zim.MIT.EDU> Message-ID: <20130610235038.D790@besplex.bde.org> References: <20130610003645.GA16444@troutmask.apl.washington.edu> <20130610110740.V24058@besplex.bde.org> <20130610055834.GA68643@zim.MIT.EDU> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=K8x6hFqI c=1 sm=1 a=LM0AswAWfpYA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=MtpAjnDH-vAA:10 a=o_Yc0WgG-dlwfkC-MugA:9 a=CjuIK1q_8ugA:10 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 Cc: freebsd-numerics@FreeBSD.org, Steve Kargl , Bruce Evans X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 15:43:28 -0000 On Sun, 9 Jun 2013, David Schultz wrote: > On Mon, Jun 10, 2013, Bruce Evans wrote: >> On Sun, 9 Jun 2013, Steve Kargl wrote: > > This is fine for now. We might eventually want to improve the > accuracy of the float, double, and long double implementations, > but that ought to come later. It's much easier to start with > float and double when making improvements in the algorithm. Yes, it would have been better to improve them before cloning their bugs. >>> The former would require a refactoring of >>> s_expl.c into a kernel __kernel_expl(x, hi, lo). I have no plans on >>> pursuing this at the this time. >> >> But you need this soon for __ldexp_exp() and __ldexp_cexp(), which are >> needed for hyperbolic functions (the large args already fixed for float >> and double precision) and for cexp() and trig and hyperbolic complex >> functions. It is much easier to implement these using a kernel. I do >> this only for float precision. > > That's fine for now. We need a better k_exp, too. I believe > Bruce was working on one. This is especially easy for long double precision, since expl() only has 1 return that needs to be changed: @ /* Evaluate expl(endpoint[n2] + r1 + r2) = tbl[n2] * expl(r1 + r2). */ @ z = r * r; @ q = r2 + z * (A2 + r * A3) + z * z * (A4 + r * A5) + z * z * z * A6; @ t = (long double)tbl[n2].lo + tbl[n2].hi; @ t = tbl[n2].lo + t * (q + r1) + tbl[n2].hi; Return the hi and low parts tbl[n2].hi and tbl[n2].lo + t * (q + r1) here and add them up in callers. @ @ /* Scale by 2**k. */ @ if (k >= LDBL_MIN_EXP) { @ if (k == LDBL_MAX_EXP) @ RETURNI(t * 2 * 0x1p16383L); @ RETURNI(t * twopk); @ } else { @ RETURNI(t * twopkp10000 * twom10000); @ } Return k and/or twopk here and multiply by them in callers. Remove all other returns (don't support special values in the kernel). The only difficulty is that twopk doesn't give the full power for cases near of after overflow and underflow, and those cases are precisely the critical ones for __ldexp_[c]exp*(). I have this sort of working for sinhf() in float precision, for the range [1, 9+]. There is no problem with twopk in this range, and I multiply by it in the kernel and don't return it. A simplified version of adding up the values in the caller is: _k_exp(fabs(x), &hi1, &lo1); _k_exp(-fabs(x), &hi2, &lo2); return h * (lo2 + lo1 + hi2 + hi1); This summation is sloppy and loses about hi2/hi1 ulps unnecessarily. hi2/hi1 <= exp(-2*|x|) < exp(-2) so this loss is never absolutely large, and is only relatively large if exp() it very accurate. This error is in addition to the error in exp(|x|) ~= lo1 + h1. > I have a few comments on the code... mostly a subset of Bruce's > comments: > ... > - I think the way you handle the tiny-x case is an improvement > over the double version. However, float-to-int conversion is > very slow on x86, so you might instead try something like > RETURNI(1.0L + tiny), where tiny is volatile and e.g., 0x1p-1000. > That will also return the correct result in all rounding modes. No, it is an unimprovement. It micro-optimizes for an unusual case, and breaks the rounding in that case. We were careful about this in expl() and expm1l(). expl(x) returns 1+x, and expm1l(x) returns a very complicated expression to get the rounding right for x + x*x/2. This is quite difficult for expm1l() since the signs alternate. In the complicated expression, we needed an explicit test for x == 0, since the rest of the complicated expression would mess up the signed for +-0 in some rounding modes. This particular float-to-int conversion isn't very slow on i386. In previous lectures, I explained why it is very efficient even on (out of order execution) i386's: no operand depend on its result, so it can be and is done in parallel. It is magic complicated expressions that are inefficient, since even a simple expression that uses the arg gives 1 dependency and complicated ones give more. I tried to get the rounding for values of +-0 and +-pi/2 right in catrig*.c. But fdlibm generally isn't very careful about this. I noticed the following types of errors: - expressions using pi/2 in such a way that it doesn't work right in all rounding modes. Returning a const pi/2 can never work in all rounding modes, so exprssions like pi/2 + tiny should be used - often, tiny isn't volatile, so compiler bugs prevent pi/2 + tiny working - sometimes, optimization broke pi/2 + tiny by replacing it by pi/2 - often, pi/2 is written as pio2_hi + pio2_lo. This (and pi for pi/2) should be rounded down so that adding tiny to it works right. IIRC, rounding to nearest gives the same result as rounding down in double precision, so everyone gets the same result, but in float precision rounding to nearest rounds up. This delicacy was not understood when the functions were converted to float precision, so there are some misrounded spittings of pi/2. Now I'm not sure if any fixed splitting can work in all rounding modes. The complications from double rounding are espcially large since compile-time rounding of the constant is mixed with runtime rounding of expressions using it. Typical code using this from e_asin.c: if(((ix-0x3ff00000)|lx)==0) /* asin(1)=+-pi/2 with inexact */ return x*pio2_hi+x*pio2_lo; The comment is bad and should say asin(-1) (for asin(1), we wouldn't use x in the expression). The result of this is pio2_hi + pio2_lo for x = 1 and -pio2_hi - pio2_lo for x = -1. This sets inexact but seems to be quite broken for non-default rounding modes. It seems too hard to round to pio/2 correctly in all cases (unless delicate rounding in the splitting does it), but the aboves breaks even asin() being an odd function when the rounding mode is towards +-infinity. Most odd functions avoid this problem by calculating func(|x|) and adjusting the sign at the end. The bugs in this were moved by dubious optimizations in e_asinf.c: if(ix==0x3f800000) /* |x| == 1 */ return x*pio2; /* asin(+-1) = +-pi/2 with inexact */ Now asinf() is odd at +-1 although not for general args; there is no longer any chance of correct rounding at +-1 in all rounding modes; inexact is no longer set in the code, but is still claimed to be set in the comment. I implemented most of these bugs :-(. Fortunately they were not copied to the long double version. - constant folding of pio2_hi + pio2_lo may cause problems even without compiler bugs. The above expression with x's instead of +-1's hopefully prevents compilers reducing to constant expressions and then folding. If we wrote the classification as (fabs(x) == 1), then the optimizer wouldn't have to be very smart to notice that x is 1 or -1 and doing the folding. That would only be an invalid optimization with FENV on. Bruce From owner-freebsd-numerics@FreeBSD.ORG Mon Jun 10 16:29:28 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id F329F215; Mon, 10 Jun 2013 16:29:27 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by mx1.freebsd.org (Postfix) with ESMTP id DDAC51265; Mon, 10 Jun 2013 16:29:27 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.6/8.14.6) with ESMTP id r5AGTRBn020971; Mon, 10 Jun 2013 09:29:27 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.6/8.14.6/Submit) id r5AGTRwK020970; Mon, 10 Jun 2013 09:29:27 -0700 (PDT) (envelope-from sgk) Date: Mon, 10 Jun 2013 09:29:27 -0700 From: Steve Kargl To: David Schultz Subject: Re: Implementation for coshl. Message-ID: <20130610162927.GA20856@troutmask.apl.washington.edu> References: <20130610003645.GA16444@troutmask.apl.washington.edu> <20130610110740.V24058@besplex.bde.org> <20130610055834.GA68643@zim.MIT.EDU> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130610055834.GA68643@zim.MIT.EDU> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-numerics@FreeBSD.org, Bruce Evans X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 16:29:28 -0000 On Sun, Jun 09, 2013 at 10:58:34PM -0700, David Schultz wrote: > Awesome, thanks for working on this. > > On Mon, Jun 10, 2013, Bruce Evans wrote: > > On Sun, 9 Jun 2013, Steve Kargl wrote: > > > Arch | Interval | #calls | Time (s) | Max ULP | Compiler | Value > > > -----------+---------------------+--------+----------+---------+----------+------- > > > i386 [1] | [ 0.00: 0.35] | 100M | 15.0198 | 0.58583 | gcc | 1 > > > i386 [1] | [ 0.35: 24.00] | 100M | 15.1858 | 1.01504 | gcc | 2 > > > i386 [1] | [ 24.00:11356.52] | 100M | 12.9591 | 0.51198 | gcc | 3 > > > i386 [1] | [11356.52:11357.22] | 100M | 13.3328 | 1.90988 | gcc | 4 > > > -----------+---------------------+--------+----------+---------+----------+------- > > > > Quite large errors, unfortunately much the same as in double precision. > [...] > > > Bruce and I exchanged emails a long time ago about possible ways > > > to reduce the ulp in this range by either computer exp(x) with > > > extra precision or using a table with cosh(x) = cosh(x_i) * cosh(d) > > > + sinh(x_i) * sinh(d) with d = x - x_i. I tried the latter with > > > disappointing results. > > This is fine for now. We might eventually want to improve the > accuracy of the float, double, and long double implementations, > but that ought to come later. It's much easier to start with > float and double when making improvements in the algorithm. I do not have working test programs for float and double, which will take a morning (or late night) to fix. So, I do not have a method for testing the either the current cosh[f] or any changes that I might suggest. > > > The former would require a refactoring of > > > s_expl.c into a kernel __kernel_expl(x, hi, lo). I have no plans on > > > pursuing this at the this time. > > > > But you need this soon for __ldexp_exp() and __ldexp_cexp(), which are > > needed for hyperbolic functions (the large args already fixed for float > > and double precision) and for cexp() and trig and hyperbolic complex > > functions. It is much easier to implement these using a kernel. I do > > this only for float precision. I spent a couple of hours last night pulling __kernel_expl out of expl(). The results were less then stellar, but Bruce's email suggested the error in my method. > That's fine for now. We need a better k_exp, too. I believe > Bruce was working on one. > > I have a few comments on the code... mostly a subset of Bruce's > comments: > > - You can clear the sign bit on x after the Inf/NaN check. OK. > - I think the way you handle the tiny-x case is an improvement > over the double version. However, float-to-int conversion is > very slow on x86, so you might instead try something like > RETURNI(1.0L + tiny), where tiny is volatile and e.g., 0x1p-1000. > That will also return the correct result in all rounding modes. I originally had if (ix < BIAS - 33) { /* |x| < 0x1p-33 */ /* cosh(+-0) = 1. */ if (x == 0) return (one); /* cosh(tiny) = one */ return (one + tiny * tiny); } because C99 requires cosh(+-0) to be exact. For |x| != the return value will raise inexact. > - Please check whether volatile hacks are needed to induce the > compiler to do the right thing. I'm increasingly finding that > these are needed with clang. I suppose I need to either look at the generated assembly or add testing for signals to my test programs. How do you go about looking at this? -- Steve From owner-freebsd-numerics@FreeBSD.ORG Mon Jun 10 16:37:09 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id CD82F321; Mon, 10 Jun 2013 16:37:09 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by mx1.freebsd.org (Postfix) with ESMTP id 97AB212B1; Mon, 10 Jun 2013 16:37:09 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.6/8.14.6) with ESMTP id r5AGb90A021017; Mon, 10 Jun 2013 09:37:09 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.6/8.14.6/Submit) id r5AGb9Ve021016; Mon, 10 Jun 2013 09:37:09 -0700 (PDT) (envelope-from sgk) Date: Mon, 10 Jun 2013 09:37:09 -0700 From: Steve Kargl To: Bruce Evans Subject: Re: Implementation for coshl. Message-ID: <20130610163709.GB20856@troutmask.apl.washington.edu> References: <20130610003645.GA16444@troutmask.apl.washington.edu> <20130610110740.V24058@besplex.bde.org> <20130610055834.GA68643@zim.MIT.EDU> <20130610235038.D790@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130610235038.D790@besplex.bde.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: David Schultz , freebsd-numerics@FreeBSD.org X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 16:37:09 -0000 On Tue, Jun 11, 2013 at 01:43:17AM +1000, Bruce Evans wrote: > On Sun, 9 Jun 2013, David Schultz wrote: >> On Mon, Jun 10, 2013, Bruce Evans wrote: >>> On Sun, 9 Jun 2013, Steve Kargl wrote: >>>> The former would require a refactoring of >>>> s_expl.c into a kernel __kernel_expl(x, hi, lo). I have no plans on >>>> pursuing this at the this time. >>> >>> But you need this soon for __ldexp_exp() and __ldexp_cexp(), which are >>> needed for hyperbolic functions (the large args already fixed for float >>> and double precision) and for cexp() and trig and hyperbolic complex >>> functions. It is much easier to implement these using a kernel. I do >>> this only for float precision. >> >> That's fine for now. We need a better k_exp, too. I believe >> Bruce was working on one. > > This is especially easy for long double precision, since expl() only > has 1 return that needs to be changed: > > @ /* Evaluate expl(endpoint[n2] + r1 + r2) = tbl[n2] * expl(r1 + r2). */ > @ z = r * r; > @ q = r2 + z * (A2 + r * A3) + z * z * (A4 + r * A5) + z * z * z * A6; > @ t = (long double)tbl[n2].lo + tbl[n2].hi; > @ t = tbl[n2].lo + t * (q + r1) + tbl[n2].hi; > > Return the hi and low parts tbl[n2].hi and tbl[n2].lo + t * (q + r1) here > and add them up in callers. > > @ > @ /* Scale by 2**k. */ > @ if (k >= LDBL_MIN_EXP) { > @ if (k == LDBL_MAX_EXP) > @ RETURNI(t * 2 * 0x1p16383L); > @ RETURNI(t * twopk); > @ } else { > @ RETURNI(t * twopkp10000 * twom10000); > @ } > > Return k and/or twopk here and multiply by them in callers. Ah, so that's the answer to my unasked question. I spent a few hours last night working on this, but I was trying to apply the scaling to hi and lo before adding them in the caller. The result was that the max observed ULP in my tests went from 0.51 to 0.73, and I believe, but did not test, that the overflow threshold was effected. > > I have a few comments on the code... mostly a subset of Bruce's > > comments: > > ... > > - I think the way you handle the tiny-x case is an improvement > > over the double version. However, float-to-int conversion is > > very slow on x86, so you might instead try something like > > RETURNI(1.0L + tiny), where tiny is volatile and e.g., 0x1p-1000. > > That will also return the correct result in all rounding modes. > > No, it is an unimprovement. It micro-optimizes for an unusual case, > and breaks the rounding in that case. OK. I'll revisit this section of code. -- Steve From owner-freebsd-numerics@FreeBSD.ORG Mon Jun 10 17:06:05 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 3B5FA2ED for ; Mon, 10 Jun 2013 17:06:05 +0000 (UTC) (envelope-from das@FreeBSD.ORG) Received: from zim.MIT.EDU (50-196-151-174-static.hfc.comcastbusiness.net [50.196.151.174]) by mx1.freebsd.org (Postfix) with ESMTP id 12ED6162A for ; Mon, 10 Jun 2013 17:06:04 +0000 (UTC) Received: from zim.MIT.EDU (localhost [127.0.0.1]) by zim.MIT.EDU (8.14.7/8.14.2) with ESMTP id r5AH63B1072698; Mon, 10 Jun 2013 10:06:03 -0700 (PDT) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by zim.MIT.EDU (8.14.7/8.14.2/Submit) id r5AH63He072697; Mon, 10 Jun 2013 10:06:03 -0700 (PDT) (envelope-from das@FreeBSD.ORG) Date: Mon, 10 Jun 2013 10:06:03 -0700 From: David Schultz To: Steve Kargl Subject: Re: Implementation for coshl. Message-ID: <20130610170603.GA72597@zim.MIT.EDU> References: <20130610003645.GA16444@troutmask.apl.washington.edu> <20130610110740.V24058@besplex.bde.org> <20130610055834.GA68643@zim.MIT.EDU> <20130610162927.GA20856@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130610162927.GA20856@troutmask.apl.washington.edu> Cc: freebsd-numerics@FreeBSD.org, Bruce Evans X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 17:06:05 -0000 On Mon, Jun 10, 2013, Steve Kargl wrote: > On Sun, Jun 09, 2013 at 10:58:34PM -0700, David Schultz wrote: > > Awesome, thanks for working on this. > > > > On Mon, Jun 10, 2013, Bruce Evans wrote: > > > On Sun, 9 Jun 2013, Steve Kargl wrote: > > > > Arch | Interval | #calls | Time (s) | Max ULP | Compiler | Value > > > > -----------+---------------------+--------+----------+---------+----------+------- > > > > i386 [1] | [ 0.00: 0.35] | 100M | 15.0198 | 0.58583 | gcc | 1 > > > > i386 [1] | [ 0.35: 24.00] | 100M | 15.1858 | 1.01504 | gcc | 2 > > > > i386 [1] | [ 24.00:11356.52] | 100M | 12.9591 | 0.51198 | gcc | 3 > > > > i386 [1] | [11356.52:11357.22] | 100M | 13.3328 | 1.90988 | gcc | 4 > > > > -----------+---------------------+--------+----------+---------+----------+------- > > > > > > Quite large errors, unfortunately much the same as in double precision. > > [...] > > > > Bruce and I exchanged emails a long time ago about possible ways > > > > to reduce the ulp in this range by either computer exp(x) with > > > > extra precision or using a table with cosh(x) = cosh(x_i) * cosh(d) > > > > + sinh(x_i) * sinh(d) with d = x - x_i. I tried the latter with > > > > disappointing results. > > > > This is fine for now. We might eventually want to improve the > > accuracy of the float, double, and long double implementations, > > but that ought to come later. It's much easier to start with > > float and double when making improvements in the algorithm. > > I do not have working test programs for float and double, which > will take a morning (or late night) to fix. So, I do not have > a method for testing the either the current cosh[f] or any changes > that I might suggest. I think you might have misinterpreted my suggestion. I meant that you should not spend too much time worrying about improvements in the algorithm at the same time as you're porting it from double to long double. Improvements in the algorithm can come separately, and when we do tackle them, it is easier to develop them in float precision first. > > > > The former would require a refactoring of > > > > s_expl.c into a kernel __kernel_expl(x, hi, lo). I have no plans on > > > > pursuing this at the this time. > > > > > > But you need this soon for __ldexp_exp() and __ldexp_cexp(), which are > > > needed for hyperbolic functions (the large args already fixed for float > > > and double precision) and for cexp() and trig and hyperbolic complex > > > functions. It is much easier to implement these using a kernel. I do > > > this only for float precision. > > I spent a couple of hours last night pulling __kernel_expl out > of expl(). The results were less then stellar, but Bruce's > email suggested the error in my method. My recollection of working on this in double precision is that Bruce was unhappy with most proposals. Either it made exp slower by having exp call the kernel, or it resulted in lots of code duplication. I think it's a very useful thing to have, but don't get stuck in a rathole for a month over it. > > - I think the way you handle the tiny-x case is an improvement > > over the double version. However, float-to-int conversion is > > very slow on x86, so you might instead try something like > > RETURNI(1.0L + tiny), where tiny is volatile and e.g., 0x1p-1000. > > That will also return the correct result in all rounding modes. > > I originally had > > if (ix < BIAS - 33) { /* |x| < 0x1p-33 */ > /* cosh(+-0) = 1. */ > if (x == 0) > return (one); > /* cosh(tiny) = one */ > return (one + tiny * tiny); > } > > because C99 requires cosh(+-0) to be exact. For |x| != the > return value will raise inexact. Aah, right, what I proposed won't handle 0 correctly. In that case, I think what you have is fine; it should handle everything except for the rounding mode issue, which we don't really support anyway. Alternatively, you could just do what fdlibm does, which is hard to argue with given that it has been acceptable for several decades. > > - Please check whether volatile hacks are needed to induce the > > compiler to do the right thing. I'm increasingly finding that > > these are needed with clang. > > I suppose I need to either look at the generated assembly or > add testing for signals to my test programs. How do you go > about looking at this? #include volatile long double x = ..., y; feclearexcept(FE_ALL_EXCEPT); y = coshl(x); printf("%x\n", fetestexcept(FE_ALL_EXCEPT)); The main cases to check are: - cosh(+-0) = 0, no exceptions - cosh(large) = inf, overflow exception (On some architectures, this also raises inexact, which is allowed.) - cosh(inf or qnan) raises no exceptions - cosh(anything else) raises an inexact exception From owner-freebsd-numerics@FreeBSD.ORG Mon Jun 10 19:14:54 2013 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 9285C585; Mon, 10 Jun 2013 19:14:54 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id 41E0A1BF2; Mon, 10 Jun 2013 19:14:53 +0000 (UTC) Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23]) by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id B7BDA3C0EA5; Tue, 11 Jun 2013 05:14:51 +1000 (EST) Date: Tue, 11 Jun 2013 05:14:46 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: David Schultz Subject: Re: Implementation for coshl. In-Reply-To: <20130610170603.GA72597@zim.MIT.EDU> Message-ID: <20130611040059.A17928@besplex.bde.org> References: <20130610003645.GA16444@troutmask.apl.washington.edu> <20130610110740.V24058@besplex.bde.org> <20130610055834.GA68643@zim.MIT.EDU> <20130610162927.GA20856@troutmask.apl.washington.edu> <20130610170603.GA72597@zim.MIT.EDU> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=Q6eKePKa c=1 sm=1 a=LM0AswAWfpYA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=MtpAjnDH-vAA:10 a=HMF5KdcwFu_V6rpZWRkA:9 a=CjuIK1q_8ugA:10 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 Cc: freebsd-numerics@freebsd.org, Steve Kargl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 19:14:54 -0000 On Mon, 10 Jun 2013, David Schultz wrote: > On Mon, Jun 10, 2013, Steve Kargl wrote: >> On Sun, Jun 09, 2013 at 10:58:34PM -0700, David Schultz wrote: > ... >>> - I think the way you handle the tiny-x case is an improvement >>> over the double version. However, float-to-int conversion is >>> very slow on x86, so you might instead try something like >>> RETURNI(1.0L + tiny), where tiny is volatile and e.g., 0x1p-1000. >>> That will also return the correct result in all rounding modes. >> >> I originally had >> >> if (ix < BIAS - 33) { /* |x| < 0x1p-33 */ >> /* cosh(+-0) = 1. */ >> if (x == 0) >> return (one); >> /* cosh(tiny) = one */ >> return (one + tiny * tiny); >> } >> >> because C99 requires cosh(+-0) to be exact. For |x| != the >> return value will raise inexact. > > Aah, right, what I proposed won't handle 0 correctly. In that > case, I think what you have is fine; it should handle everything > except for the rounding mode issue, which we don't really support > anyway. Alternatively, you could just do what fdlibm does, which > is hard to argue with given that it has been acceptable for > several decades. No, the above is quite broken. tiny * tiny gives underflow, and when the underflow is to 0, adding 1 to it doesn't give inexact. Also, the 'tiny' in the comment is confusing, since it refers to a different variable than the 'tiny' in the code. Just copy the known-good fdlibm code. It uses one+t, where t is expm1(fabs(x)). This has many subtleties: - it avoids pessimizing the usual case - it moves all the complications for setting inexact to expm1(). These include: - the expression 1+x doesn't work for tiny x since it rounds incorrectly in some modes. It also requires x >= 0 for correct rounding, so we may have to caclulate |x| earlier than we want. - the expression 1+x*x doesn't work for tiny x since it may underflow (just like 1+tiny*tiny) - the expression x doesn't work for tiny x in expm1() since it doesn't set inexact. - the expression x+x*x doesn't work for tiny x in expm1() since it may underflow. fdlibm expm1() avoids the last 2 problems using the strange expressions 't = huge+x; return x - (t-(huge+x));'. That is unlikely to be best, but it does avoid a branch for testing x == 0. We may have looked at this before. If so, we didn't like it, and use a different rather strange expression in expm1l(), and we still need a branch. fdlibm itself uses a different method for the same problem in log1p(). It uses 'if (two54+x>zero && ax<3c90000)...'. I didn't like that, and used the more common ((int)x == 0) method to set inexact after classifying x as being tiny using a test like (ax<3c90000). Benchmarks showed that the ((int)x = 0) works well. I just thought of another possible reason for this: on i386 the cast requires large code with an fenv switch. Perhaps the compiler moves the code out of the way, the same as if we used a __predict_false(). This is good since the tiny-arg case is special. The ENTERI() and RETURNI() macros generate similar large code, but much larger and uglier because of excessive carew taken to avoid trapping in fpsetprec(). gcc generates almost 100 bytes per macro invocation, so average small functions explode in object code if they have many returns in them. But compilers seem to do a good job of moving this code out of the way. >>> - Please check whether volatile hacks are needed to induce the >>> compiler to do the right thing. I'm increasingly finding that >>> these are needed with clang. >> >> I suppose I need to either look at the generated assembly or >> add testing for signals to my test programs. How do you go >> about looking at this? > > #include > volatile long double x = ..., y; > feclearexcept(FE_ALL_EXCEPT); > y = coshl(x); > printf("%x\n", fetestexcept(FE_ALL_EXCEPT)); > > The main cases to check are: > > - cosh(+-0) = 0, no exceptions > - cosh(large) = inf, overflow exception > (On some architectures, this also raises inexact, which is allowed.) > - cosh(inf or qnan) raises no exceptions > - cosh(anything else) raises an inexact exception Or in a debugger, put breakpoints at interesting points and look at the exception mask. This is easier on i386, or with long doubles. "info float" decodes the i387 exception mask. "p $mxcsr" displays the SSE mask. Later versions of gdb decode it too. I don't know any easy way to write to the i387 exception mask in gdb. $mxcsr can be assigned to. I now have the kernel expf working for sinhf like I expected. It worked better when didn't corrupt hi-lo to hi_lo, doh. To emulate the eventual high-precision version, I use this: % void % k_expf(float x, float *hip, float *lop) % { % long double r; % % r = expl(x); % *hip = r; % *lop = r - *hip; % } This gives 24+24 bits of accuracy, and sinhf() ends up being perfectly rounded in the range where it uses this. A better enulation would destroy many low bits to get only about 24+6 bits of accuracy. The version created from expf()'s internals has about 24+3 bits. Then in sinhf(x), for 1 <= x <= 12: % float hi1, hi2, lo1, lo2; % extern void k_expf(float x, float *hip, float *lop); % % t = fabsf(x); % k_expf(t, &hi1, &lo1); Returning h * (lo1 + 1 / (hi1 + lo1) + hi1) as this point works OK. Then the final error is under 1 ulp. Without this, sophisticated expressions were needed to keep it under 2 ulps. % k_expf(-t, &hi2, &lo2); % hi2 = -hi2; % lo2 = -lo2; % _2sumF(hi1, hi2); % return h * (lo1 + hi2 + lo2 + hi1); Adding up the terms like this retains almost all the accuracy provided by the kernel. Calculating expf(-x) as 1/expf(x) gives an extra inaccuracy of about 1/exp(2) = 0.135 ulps. Efficiency is currently too low when there are 2 calls to non-inline kernels, so this method is slightly slower than the old method (except using the faster expl() kernel on non-i386. On i386 it and also i387 exp are too slow due to non-null precision switches). The correct fix is unclear. Hopefully just speed up the kernels. Bruce From owner-freebsd-numerics@FreeBSD.ORG Tue Jun 11 01:22:19 2013 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5299B413; Tue, 11 Jun 2013 01:22:19 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail108.syd.optusnet.com.au (mail108.syd.optusnet.com.au [211.29.132.59]) by mx1.freebsd.org (Postfix) with ESMTP id DCA5C1ED8; Tue, 11 Jun 2013 01:22:18 +0000 (UTC) Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23]) by mail108.syd.optusnet.com.au (Postfix) with ESMTPS id B175D1A06F7; Tue, 11 Jun 2013 11:22:10 +1000 (EST) Date: Tue, 11 Jun 2013 11:22:09 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Bruce Evans Subject: Re: Implementation for coshl. In-Reply-To: <20130611040059.A17928@besplex.bde.org> Message-ID: <20130611104252.G50308@besplex.bde.org> References: <20130610003645.GA16444@troutmask.apl.washington.edu> <20130610110740.V24058@besplex.bde.org> <20130610055834.GA68643@zim.MIT.EDU> <20130610162927.GA20856@troutmask.apl.washington.edu> <20130610170603.GA72597@zim.MIT.EDU> <20130611040059.A17928@besplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=K8x6hFqI c=1 sm=1 a=LM0AswAWfpYA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=MtpAjnDH-vAA:10 a=Udj031w0ckm0zOnfpgkA:9 a=CjuIK1q_8ugA:10 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 Cc: David Schultz , freebsd-numerics@freebsd.org, Steve Kargl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Jun 2013 01:22:19 -0000 On Tue, 11 Jun 2013, Bruce Evans wrote: > On Mon, 10 Jun 2013, David Schultz wrote: > >> On Mon, Jun 10, 2013, Steve Kargl wrote: >... >>> I originally had >>> >>> if (ix < BIAS - 33) { /* |x| < 0x1p-33 */ >>> /* cosh(+-0) = 1. */ >>> if (x == 0) >>> return (one); >>> /* cosh(tiny) = one */ >>> return (one + tiny * tiny); >>> } >>> >>> because C99 requires cosh(+-0) to be exact. For |x| != the >>> return value will raise inexact. > ... > No, the above is quite broken. tiny * tiny gives underflow, and > when the underflow is to 0, adding 1 to it doesn't give inexact. > Also, the 'tiny' in the comment is confusing, since it refers to > a different variable than the 'tiny' in the code. > > Just copy the known-good fdlibm code. It uses one+t, where t is > expm1(fabs(x)). This has many subtleties: > - it avoids pessimizing the usual case > - it moves all the complications for setting inexact to expm1(). These > include: > - the expression 1+x doesn't work for tiny x since it rounds incorrectly > in some modes. It also requires x >= 0 for correct rounding, so > we may have to caclulate |x| earlier than we want. > - the expression 1+x*x doesn't work for tiny x since it may underflow > (just like 1+tiny*tiny) > - the expression x doesn't work for tiny x in expm1() since it doesn't > set inexact. > - the expression x+x*x doesn't work for tiny x in expm1() since it may > underflow. Bah, this is more magic and broken than first appeared. The fdlibm code is now known-bad :-). I stumbled across this when implementing the use of k_expf() in ccoshf(). o One subtlety in the fdlibm cosh() is that it uses an artificially small threshold of 2**-55 to "ensure" that 1+this gives 1 with inexact (actually 1+eps in some rounding modes). A threshold of about 2**26.5 is sufficient for cosh() to decay to 1, but fdlibm wants to abuse the tiny value given by each arg below the threshold for setting inexact as a side effect of evaluating the expression 'one + expm1f(x)'. This obviously requires a threshold of <= 2**-53 or 2**-54. The value of 2**-55 actually used makes no sense. o I broke this in coshf() in 2005 when fixing its threshold. The threshold of 2**-55 was blindly copied. It should have been reduced to 2**-26 for a direct translation. I changed it to the natural threshold of 2**-12. This also gives a micro-optimization. This made extra-precision bugs more apparent, and I fixed them by returning 'one' instead of 'one + expm1f(x)' for tiny x. Returning 'one' gives another micro-optimization. But this weakens the support for non-default rounding modes. o fdlibm cosh()'s large threshold is not large enough to actually work. 'one + 2**-55' may be evaluated and returned in extra precision, and it only takes 2 or 3 bits extra for it to evaluate to itself instead of 1. coshf()'s blindly copied threshold accidentally protected against using '1.0F + 2**-26F' which always evaluates to itself on i386, and is then normally returned unchanged. Assertions like assert(cosh(tiny) == 1) should then fail, but cosh() may return extra precision so is also a bug in the assertions -- essentially the same one that I found recently for exp() non-overlow. This is still a bug in cosh() which would be found by a more careful assertion -- cosh(x) may be evaluated in extra precision, but then its value must be ~1+x*x/2, not ~1+|x|. FreeBSD's default rounding precision protects against these bugs in both float and double precision provided the theshold is <= 2**-53 for both. Bruce From owner-freebsd-numerics@FreeBSD.ORG Thu Jun 13 11:12:07 2013 Return-Path: Delivered-To: freebsd-numerics@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 3C72BCD6; Thu, 13 Jun 2013 11:12:07 +0000 (UTC) (envelope-from gavin@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 16CCF16E8; Thu, 13 Jun 2013 11:12:07 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r5DBC69l078803; Thu, 13 Jun 2013 11:12:06 GMT (envelope-from gavin@freefall.freebsd.org) Received: (from gavin@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r5DBC66F078802; Thu, 13 Jun 2013 11:12:06 GMT (envelope-from gavin) Date: Thu, 13 Jun 2013 11:12:06 GMT Message-Id: <201306131112.r5DBC66F078802@freefall.freebsd.org> To: gavin@FreeBSD.org, freebsd-standards@FreeBSD.org, freebsd-numerics@FreeBSD.org From: gavin@FreeBSD.org Subject: Re: standards/82654: C99 long double math functions are missing X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Jun 2013 11:12:07 -0000 Synopsis: C99 long double math functions are missing Responsible-Changed-From-To: freebsd-standards->freebsd-numerics Responsible-Changed-By: gavin Responsible-Changed-When: Thu Jun 13 11:09:50 UTC 2013 Responsible-Changed-Why: Over to -numerics. It looks like at least half of this has already made it into releases, and the other half is in head, so it may be that this PR can be put into "patched" state now. http://www.freebsd.org/cgi/query-pr.cgi?pr=82654 From owner-freebsd-numerics@FreeBSD.ORG Fri Jun 14 06:53:58 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 9392566C; Fri, 14 Jun 2013 06:53:58 +0000 (UTC) (envelope-from das@FreeBSD.ORG) Received: from zim.MIT.EDU (50-196-151-174-static.hfc.comcastbusiness.net [50.196.151.174]) by mx1.freebsd.org (Postfix) with ESMTP id 79B81167E; Fri, 14 Jun 2013 06:53:57 +0000 (UTC) Received: from zim.MIT.EDU (localhost [127.0.0.1]) by zim.MIT.EDU (8.14.7/8.14.2) with ESMTP id r5E6rvBI084983; Thu, 13 Jun 2013 23:53:57 -0700 (PDT) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by zim.MIT.EDU (8.14.7/8.14.2/Submit) id r5E6rvQ5084982; Thu, 13 Jun 2013 23:53:57 -0700 (PDT) (envelope-from das@FreeBSD.ORG) Date: Thu, 13 Jun 2013 23:53:57 -0700 From: David Schultz To: gavin@FreeBSD.org Subject: Re: standards/82654: C99 long double math functions are missing Message-ID: <20130614065357.GA84881@zim.MIT.EDU> References: <201306131112.r5DBC66F078802@freefall.freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201306131112.r5DBC66F078802@freefall.freebsd.org> Cc: freebsd-numerics@FreeBSD.org, freebsd-standards@FreeBSD.org X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Jun 2013 06:53:58 -0000 On Thu, Jun 13, 2013, gavin@FreeBSD.org wrote: > Synopsis: C99 long double math functions are missing > > Responsible-Changed-From-To: freebsd-standards->freebsd-numerics > Responsible-Changed-By: gavin > Responsible-Changed-When: Thu Jun 13 11:09:50 UTC 2013 > Responsible-Changed-Why: > Over to -numerics. It looks like at least half of this has > already made it into releases, and the other half is in head, so > it may be that this PR can be put into "patched" state now. > > http://www.freebsd.org/cgi/query-pr.cgi?pr=82654 I think someone mentioned wanting to keep this open as a tracking bug, although I now have a wiki page for tracking the status of this work, so it's probably fine to close. kargl@ should do the honors! From owner-freebsd-numerics@FreeBSD.ORG Sat Jun 15 17:26:15 2013 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 933605E2 for ; Sat, 15 Jun 2013 17:26:15 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by mx1.freebsd.org (Postfix) with ESMTP id 5FC4810FD for ; Sat, 15 Jun 2013 17:26:15 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.6/8.14.6) with ESMTP id r5FHQEfs048084 for ; Sat, 15 Jun 2013 10:26:14 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.6/8.14.6/Submit) id r5FHQEaG048083 for freebsd-numerics@freebsd.org; Sat, 15 Jun 2013 10:26:14 -0700 (PDT) (envelope-from sgk) Date: Sat, 15 Jun 2013 10:26:14 -0700 From: Steve Kargl To: freebsd-numerics@freebsd.org Subject: FYI openlibm Message-ID: <20130615172614.GA48063@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 15 Jun 2013 17:26:15 -0000 Just FYI, It seems that the openlibm project, which is a part of the Julia language project, has grabbed msun for their OS independent libm. https://github.com/JuliaLang/openlibm A scan of the commit log suggests that the openlibm/Julia developers are not making any improvements to the accuracy or efficiency of libm. -- Steve