From owner-freebsd-numerics@freebsd.org Mon Mar 4 21:31:42 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 57EF71523E26 for ; Mon, 4 Mar 2019 21:31:42 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id C4DD38953C for ; Mon, 4 Mar 2019 21:31:40 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x24LM54d012616 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Mon, 4 Mar 2019 13:22:06 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x24LM5jp012603; Mon, 4 Mar 2019 13:22:05 -0800 (PST) (envelope-from sgk) Date: Mon, 4 Mar 2019 13:21:59 -0800 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro Message-ID: <20190304212159.GA12587@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20190226191825.GA68479@troutmask.apl.washington.edu> <20190227145002.P907@besplex.bde.org> <20190227074811.GA75972@troutmask.apl.washington.edu> <20190227201214.V1823@besplex.bde.org> <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190228060920.R4413@besplex.bde.org> User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: C4DD38953C X-Spamd-Bar: + Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [1.33 / 15.00]; ARC_NA(0.00)[]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.66)[0.665,0]; NEURAL_HAM_LONG(-0.16)[-0.157,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu]; AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; MX_GOOD(-0.01)[troutmask.apl.washington.edu]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.08)[0.083,0]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.06), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Mar 2019 21:31:42 -0000 On Thu, Feb 28, 2019 at 07:15:14AM +1100, Bruce Evans wrote: > On Wed, 27 Feb 2019, Steve Kargl wrote: > > > On Wed, Feb 27, 2019 at 09:15:52PM +1100, Bruce Evans wrote: > >> > >> ENTERI() hard-codes the long double for simplicity. Remember, it is only > >> needed for long double precision on i386. But I forgot about long double > >> complex types, and didn't dream about indirect long double types in sincosl(). > > > > That simplicity does not work for long double complex. We will > > > > need either ENTERIC as in > > > > #define ENTERIC() ENTERIT(long double complex) > > > > or a direct use of ENTERIT as you have done s_clogl.c > > I wrote ENTERIT() to work around this problem. > > >>> I'm fine with making ENTERI() only toggle precision, and adding > >>> a LEAVEI() to reset precision. RETURNI(r) would then be > >>> > >>> #define RETURNI(r) \ > >>> do { \ > >>> LEAVEI(); \ > >>> return (r); \ > >>> } while (0) > >> > >> No, may be an expression, so it must be evaluated before LEAVEI(). This > >> is the reason for existence of the variable to hold the result. > > > > So, we'll need RETURNI for long double and one for long double complex. > > Or, we give RETURNI a second parameter, which is the input parameter of > > the function > > I said to use your method of __typeof(). I tested this: > > XX --- /tmp/math_private.h Sun Nov 27 17:58:57 2005 > XX +++ ./math_private.h Thu Feb 28 06:17:26 2019 > XX @@ -474,21 +474,22 @@ > XX /* Support switching the mode to FP_PE if necessary. */ > XX #if defined(__i386__) && !defined(NO_FPSETPREC) > XX -#define ENTERI() ENTERIT(long double) > XX -#define ENTERIT(returntype) \ > XX - returntype __retval; \ > XX +#define ENTERI() \ > XX fp_prec_t __oprec; \ > XX \ > XX if ((__oprec = fpgetprec()) != FP_PE) \ > XX fpsetprec(FP_PE) > XX -#define RETURNI(x) do { \ > XX - __retval = (x); \ > XX - if (__oprec != FP_PE) \ > XX - fpsetprec(__oprec); \ > XX +#define LEAVEI() \ > XX + if ((__oprec = fpgetprec()) != FP_PE) \ > XX + fpsetprec(FP_PE) Am I reading this diff wrong? Should LEAVEI() be #define LEAVEI() \ if (__oprec != FP_PE) \ fpsetprec(__oprec) That is, we want to reset the precision to what ENTERI grabbed in its conditinal expression. -- Steve 20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4 20161221 https://www.youtube.com/watch?v=IbCHE-hONow From owner-freebsd-numerics@freebsd.org Tue Mar 5 04:48:32 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4C7AC1513542 for ; Tue, 5 Mar 2019 04:48:32 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au [211.29.132.249]) by mx1.freebsd.org (Postfix) with ESMTP id DC5CC76F6B for ; Tue, 5 Mar 2019 04:48:24 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id 7C8EB108CBD4; Tue, 5 Mar 2019 15:48:12 +1100 (AEDT) Date: Tue, 5 Mar 2019 15:48:11 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro In-Reply-To: <20190304212159.GA12587@troutmask.apl.washington.edu> Message-ID: <20190305153243.Y1349@besplex.bde.org> References: <20190226191825.GA68479@troutmask.apl.washington.edu> <20190227145002.P907@besplex.bde.org> <20190227074811.GA75972@troutmask.apl.washington.edu> <20190227201214.V1823@besplex.bde.org> <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=UJetJGXy c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=JQ0e4kNsG--EZ3fqZAAA:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: DC5CC76F6B X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates 211.29.132.249 as permitted sender) smtp.mailfrom=brde@optusnet.com.au X-Spamd-Result: default: False [-6.09 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_COUNT_TWO(0.00)[2]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23]; FREEMAIL_FROM(0.00)[optusnet.com.au]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[extmail.optusnet.com.au]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.74)[-0.738,0]; IP_SCORE(-3.04)[ip: (-7.87), ipnet: 211.28.0.0/14(-4.05), asn: 4804(-3.23), country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[]; RCVD_IN_DNSWL_LOW(-0.10)[249.132.29.211.list.dnswl.org : 127.0.5.1]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au]; ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU]; MIME_TRACE(0.00)[0:+]; FROM_EQ_ENVFROM(0.00)[] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Mar 2019 04:48:32 -0000 On Mon, 4 Mar 2019, Steve Kargl wrote: > On Thu, Feb 28, 2019 at 07:15:14AM +1100, Bruce Evans wrote: >> ... >> I said to use your method of __typeof(). I tested this: >> >> XX --- /tmp/math_private.h Sun Nov 27 17:58:57 2005 >> XX +++ ./math_private.h Thu Feb 28 06:17:26 2019 >> XX @@ -474,21 +474,22 @@ >> XX /* Support switching the mode to FP_PE if necessary. */ >> XX #if defined(__i386__) && !defined(NO_FPSETPREC) >> XX -#define ENTERI() ENTERIT(long double) >> XX -#define ENTERIT(returntype) \ >> XX - returntype __retval; \ >> XX +#define ENTERI() \ >> XX fp_prec_t __oprec; \ >> XX \ >> XX if ((__oprec = fpgetprec()) != FP_PE) \ >> XX fpsetprec(FP_PE) >> XX -#define RETURNI(x) do { \ >> XX - __retval = (x); \ >> XX - if (__oprec != FP_PE) \ >> XX - fpsetprec(__oprec); \ >> XX +#define LEAVEI() \ >> XX + if ((__oprec = fpgetprec()) != FP_PE) \ >> XX + fpsetprec(FP_PE) > > Am I reading this diff wrong? Should LEAVEI() be > > #define LEAVEI() \ > if (__oprec != FP_PE) \ > fpsetprec(__oprec) > > That is, we want to reset the precision to what ENTERI > grabbed in its conditinal expression. Oops. I wrote it wrong by copying the wrong clause. The broken version even passed quick runtime tests. This is because the tests know that most long double functions are missing ENTERI(), so they set the precision to FP_PE before the call. So both ENTERI() and LEAVEI() find (__oprec = fpgetprec() equal to FP_PE and do nothing. Bruce From owner-freebsd-numerics@freebsd.org Wed Mar 6 05:52:11 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 23AF71529A93 for ; Wed, 6 Mar 2019 05:52:11 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 4071B755DC for ; Wed, 6 Mar 2019 05:52:09 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x265q1pZ040308 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Tue, 5 Mar 2019 21:52:01 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x265q1nE040307; Tue, 5 Mar 2019 21:52:01 -0800 (PST) (envelope-from sgk) Date: Tue, 5 Mar 2019 21:52:01 -0800 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro Message-ID: <20190306055201.GA40298@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20190226191825.GA68479@troutmask.apl.washington.edu> <20190227145002.P907@besplex.bde.org> <20190227074811.GA75972@troutmask.apl.washington.edu> <20190227201214.V1823@besplex.bde.org> <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190305153243.Y1349@besplex.bde.org> User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: 4071B755DC X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [0.93 / 15.00]; ARC_NA(0.00)[]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; NEURAL_HAM_MEDIUM(-0.03)[-0.031,0]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.85)[0.850,0]; NEURAL_HAM_LONG(-0.62)[-0.624,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu]; AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; RCPT_COUNT_TWO(0.00)[2]; MX_GOOD(-0.01)[troutmask.apl.washington.edu]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.06), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2019 05:52:11 -0000 On Tue, Mar 05, 2019 at 03:48:11PM +1100, Bruce Evans wrote: > On Mon, 4 Mar 2019, Steve Kargl wrote: > > > On Thu, Feb 28, 2019 at 07:15:14AM +1100, Bruce Evans wrote: > >> ... > >> I said to use your method of __typeof(). I tested this: > >> > >> XX --- /tmp/math_private.h Sun Nov 27 17:58:57 2005 > >> XX +++ ./math_private.h Thu Feb 28 06:17:26 2019 > >> XX @@ -474,21 +474,22 @@ > >> XX /* Support switching the mode to FP_PE if necessary. */ > >> XX #if defined(__i386__) && !defined(NO_FPSETPREC) > >> XX -#define ENTERI() ENTERIT(long double) > >> XX -#define ENTERIT(returntype) \ > >> XX - returntype __retval; \ > >> XX +#define ENTERI() \ > >> XX fp_prec_t __oprec; \ > >> XX \ > >> XX if ((__oprec = fpgetprec()) != FP_PE) \ > >> XX fpsetprec(FP_PE) > >> XX -#define RETURNI(x) do { \ > >> XX - __retval = (x); \ > >> XX - if (__oprec != FP_PE) \ > >> XX - fpsetprec(__oprec); \ > >> XX +#define LEAVEI() \ > >> XX + if ((__oprec = fpgetprec()) != FP_PE) \ > >> XX + fpsetprec(FP_PE) > > > > Am I reading this diff wrong? Should LEAVEI() be > > > > #define LEAVEI() \ > > if (__oprec != FP_PE) \ > > fpsetprec(__oprec) > > > > That is, we want to reset the precision to what ENTERI > > grabbed in its conditinal expression. > > Oops. I wrote it wrong by copying the wrong clause. > > The broken version even passed quick runtime tests. This is because > the tests know that most long double functions are missing ENTERI(), > so they set the precision to FP_PE before the call. So both ENTERI() > and LEAVEI() find (__oprec = fpgetprec() equal to FP_PE and do nothing. > My math_private.h has a few additions, which meant I applied your diff manually. As I was reading it, I came across the issue. I think I have the s_cexpl.c file fixed to use LDBL_EXTRACT_WORDS instead of the new macro I introduced. I however cannot figure out what David Das did to arrive at k_exp.c, so I cannot write a similar k_cexpl.c. Yes, I added the 'c' in the name to avoid confusion in ld80/. In particular, I have no idea how he found his scaling value 'k'. Any insights? -- Steve From owner-freebsd-numerics@freebsd.org Wed Mar 6 12:56:43 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 026CE1514503 for ; Wed, 6 Mar 2019 12:56:43 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au [211.29.132.249]) by mx1.freebsd.org (Postfix) with ESMTP id B21698C948 for ; Wed, 6 Mar 2019 12:56:35 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id 78D16105B9C9; Wed, 6 Mar 2019 23:56:24 +1100 (AEDT) Date: Wed, 6 Mar 2019 23:56:23 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro In-Reply-To: <20190306055201.GA40298@troutmask.apl.washington.edu> Message-ID: <20190306225811.P2731@besplex.bde.org> References: <20190226191825.GA68479@troutmask.apl.washington.edu> <20190227145002.P907@besplex.bde.org> <20190227074811.GA75972@troutmask.apl.washington.edu> <20190227201214.V1823@besplex.bde.org> <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=UJetJGXy c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=fpzoFui-lhB3HaOsU0AA:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: B21698C948 X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates 211.29.132.249 as permitted sender) smtp.mailfrom=brde@optusnet.com.au X-Spamd-Result: default: False [-6.25 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_IN_DNSWL_LOW(-0.10)[249.132.29.211.list.dnswl.org : 127.0.5.1]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23]; FREEMAIL_FROM(0.00)[optusnet.com.au]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[extmail.optusnet.com.au]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.92)[-0.916,0]; IP_SCORE(-3.03)[ip: (-7.94), ipnet: 211.28.0.0/14(-3.98), asn: 4804(-3.17), country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au]; ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2019 12:56:43 -0000 On Tue, 5 Mar 2019, Steve Kargl wrote: > ... > I think I have the s_cexpl.c file fixed to use LDBL_EXTRACT_WORDS > instead of the new macro I introduced. I however cannot figure > out what David Das did to arrive at k_exp.c, so I cannot write Davvid A S. > a similar k_cexpl.c. Yes, I added the 'c' in the name to avoid > confusion in ld80/. In particular, I have no idea how he found > his scaling value 'k'. Any insights? bde already wrote __ldexp_cexpl() in ld*/k_expl.h, and kargl committed it in r260066. Does it not work? :-). Well, it hasn't been tested, and it indeed cannot work since it spells cosl(y) as cos(y). It is written in a more portable way than the double and float versions using '1' instead of bit fiddling to start the construction of 2**k. This makes it identical in ld80 and ld128 (since the exponent range is the same and the accessor macros hide enough details of the bit fiddling for the exponent). The duplication is noted in a comment, but the comment also says that using scalbnl() ond ld128 is probably best after all (it is slow, but multiplication is also slow). I have rewritten the double and float versions in work related to fixing the accuracy of the double and float versions of hyperbolic functions. I fixed these before writing the long double hyperbolic functions using identical methods. You committed the latter, but the double and float versions still use the old innaccurate slower methods. The rewrite improves the comments, and it is the improved comments that the reference to k_exp.c in k_expl.h is supposed to be for. XX Index: k_exp.c XX =================================================================== XX RCS file: /home/ncvs/src/lib/msun/src/k_exp.c,v XX retrieving revision 1.2 XX diff -u -2 -r1.2 k_exp.c XX --- k_exp.c 17 Nov 2012 01:50:07 -0000 1.2 XX +++ k_exp.c 30 Jun 2013 01:31:48 -0000 XX @@ -32,75 +32,46 @@ XX #include "math.h" XX #include "math_private.h" XX - XX -static const uint32_t k = 1799; /* constant for reduction */ XX -static const double kln2 = 1246.97177782734161156; /* k * ln2 */ I never liked this magic by das. I forget how it worked. I avoid using general methods. XX - XX -/* XX - * Compute exp(x), scaled to avoid spurious overflow. An exponent is XX - * returned separately in 'expt'. XX - * XX - * Input: ln(DBL_MAX) <= x < ln(2 * DBL_MAX / DBL_MIN_DENORM) ~= 1454.91 XX - * Output: 2**1023 <= y < 2**1024 XX - */ XX -static double XX -__frexp_exp(double x, int *expt) XX -{ XX - double exp_x; XX - uint32_t hx; XX - XX - /* XX - * We use exp(x) = exp(x - kln2) * 2**k, carefully chosen to XX - * minimize |exp(kln2) - 2**k|. We also scale the exponent of XX - * exp_x to MAX_EXP so that the result can be multiplied by XX - * a tiny number without losing accuracy due to denormalization. XX - */ XX - exp_x = exp(x - kln2); XX - GET_HIGH_WORD(hx, exp_x); XX - *expt = (hx >> 20) - (0x3ff + 1023) + k; XX - SET_HIGH_WORD(exp_x, (hx & 0xfffff) | ((0x3ff + 1023) << 20)); XX - return (exp_x); XX -} __frexp_exp() was only used in __ldexp_cexp(). It is not needed using the general methods. __ldexp_cexpl() already uses the general methods and has no trace of __frexp_expl(). XX +#include "k_exp.h" XX XX /* XX - * __ldexp_exp(x, expt) and __ldexp_cexp(x, expt) compute exp(x) * 2**expt. XX - * They are intended for large arguments (real part >= ln(DBL_MAX)) XX - * where care is needed to avoid overflow. XX + * _ldexp_cexp(x + I*y, expt) computes cexp(x + I*y) * 2**expt. It is XX + * intended for large arguments where care is needed to avoid overflow. XX * XX - * The present implementation is narrowly tailored for our hyperbolic and XX - * exponential functions. We assume expt is small (0 or -1), and the caller XX - * has filtered out very large x, for which overflow would be inevitable. XX + * The present implementation is tailored (not very narrowly) for our XX + * complex hyperbolic and exponential functions. The caller must filter XX + * out all x not supported by k_exp(), most negative x, and very large x, XX + * and not use very large |expt| (we only need expt = 0 or -1). The XX + * caller should filter out all x for which overflow is either impossible XX + * (x < ~ln(DBL_MAX)) or inevitable (x > ~2*ln(DBL_MAX/DBL_TRUE_MIN)). XX */ XX - XX -double XX -__ldexp_exp(double x, int expt) XX -{ XX - double exp_x, scale; XX - int ex_expt; XX - XX - exp_x = __frexp_exp(x, &ex_expt); XX - expt += ex_expt; XX - INSERT_WORDS(scale, (0x3ff + expt) << 20, 0); XX - return (exp_x * scale); XX -} XX - __ldexp_exp() was only used in cosh() and sinh(). coshl() and sinhl() always used better methods, so there is no trace of __ldexp_expl() in k_expl.h. XX double complex XX __ldexp_cexp(double complex z, int expt) XX { XX - double x, y, exp_x, scale1, scale2; XX - int ex_expt, half_expt; XX + double_t exp_x, hi, lo; XX + double x, y, scale1, scale2; XX + int half_expt, k; Also start fixing i386 pessimizations using s/double/double_t. XX XX x = creal(z); XX y = cimag(z); XX - exp_x = __frexp_exp(x, &ex_expt); XX - expt += ex_expt; XX + k_exp(x, &hi, &lo, &k); k_expl.h already has a kernel k_expl() much like this (you wrote the non-kernel expl() and I turned it into the kernel. The rest of this mail is mostly diffs and files for turning exp() into a kernel. The kernel is used a lot in hyperbolic functions, just like in the long double case). XX + XX + /* XX + * 3 scale factors are needed in general. Put one in exp_x now. XX + * exp_x must be larger than 2**DBL_MANT_DIG to avoid spurious XX + * underflows We satisfy this and maximize the range handled XX + * by putting as much as possible in exp_x. XX + */ XX + exp_x = (lo + hi) * 0x1p1022; XX + expt += k - 1022; XX XX /* XX - * Arrange so that scale1 * scale2 == 2**expt. We use this to XX - * compensate for scalbn being horrendously slow. XX + * Arrange so that scale1 * scale2 == 2**expt. As usual, scalbn() XX + * is too slow to actually use for the things it does. XX */ XX + scale1 = 1; XX half_expt = expt / 2; XX - INSERT_WORDS(scale1, (0x3ff + half_expt) << 20, 0); XX - half_expt = expt - half_expt; XX - INSERT_WORDS(scale2, (0x3ff + half_expt) << 20, 0); XX + SET_HIGH_WORD(scale1, (0x3ff + half_expt) << 20); XX + scale2 = 1; XX + SET_HIGH_WORD(scale2, (0x3ff + expt - half_expt) << 20); XX XX return (cpack(cos(y) * exp_x * scale1 * scale2, The scaling is the same as in k_expl.h, with minor differences for the bit fiddling. This started by constructing 1 without using bits too. The scaling is to avoid overlow and underflow, and it must be careful to not have its own underflow. The basic general method is that the kernel creates the value in the form r * 2**k where r is near 1. Since r is near 1, we can multiply almost any value by it without overflow or underflow. Values near +Inf might need to be divided by 2 to prepare. Values near DBL_MIN might need to be multiplied by 2 to prepare. Denormal values should be multiplied by a larger power of 2 to prepare. Adjust k to match. The rest is not a diff, but is the contents of k_exp.h. XX /* See e_exp.c for more comments. */ XX XX static const double XX ln2hi = 6.93147180369123816490e-01, /* 0x3fe62e42, 0xfee00000 */ XX ln2lo = 1.90821492927058770002e-10, /* 0x3dea39ef, 0x35793c76 */ XX invln2 = 1.44269504088896338700e+00, /* 0x3ff71547, 0x652b82fe */ XX /* XX * Domain [-0.34568, 0.34568], range ~[-1.6340e-18, 2.0047e-18]: XX * |x*(exp(x)+1)/(exp(x)-1) - p(x)| < 2**-58.7 XX */ XX P1 = 1.66666666666666019037e-01, /* 0x3FC55555, 0x5555553E */ XX P2 = -2.77777777770155933842e-03, /* 0xBF66C16C, 0x16BEBD93 */ XX P3 = 6.61375632143793436117e-05, /* 0x3F11566A, 0xAF25DE2C */ XX P4 = -1.65339022054652515390e-06, /* 0xBEBBBD41, 0xC5D26BF1 */ XX P5 = 4.13813679705723846039e-08; /* 0x3E663769, 0x72BEA4D0 */ XX XX /* XX * Kernel for exp(x). x must be finite and not tiny or huge. XX * "tiny" is anything that would make us underflow (|P4*x^8| < ~DBL_T_MIN). XX * "huge" is anything that would make t*ln2hi inexact (|x| > ~2**21*ln2). XX */ XX static inline void XX k_exp(double x, double_t *hip, double_t *lop, int *kp) XX { XX double_t c, hi, lo, rhi, rlo, t, xred; XX int32_t k; XX XX /* argument reduction */ XX #if 0 XX t = rnint((double_t)x * invln2); XX #else XX t = (double)((double_t)x * invln2 + 0x1.8p52) - 0x1.8p52; XX #endif XX k = irint(t); XX /* XX * ln2hi is rounded to 32 bits to more than cover the normal XX * range -1075 <= k <= 1024. It covers the wider range XX * |k| <= 2**21. We now need it cover the range XX * -1075 <= k <= 1024+1024+53. XX */ XX hi = x - t*ln2hi; /* t*ln2hi is exact here */ XX lo = t*ln2lo; XX xred = hi - lo; XX XX /* xred is now in primary range */ XX t = xred*xred; XX c = xred - t*P1 - t*t*(P2 + t*P3) - t*t*(t*t)*(P4 + t*P5); XX t = xred*c/(2-c); XX rhi = 1; XX rlo = hi; XX _2sumF(rhi, rlo); XX rlo += t-lo; XX *hip = rhi; XX *lop = rlo; XX *kp = k; XX } XX XX /* XX * Extra-precision exp(x)/2. x must be between the underflow threshold XX * and the overflow threshold (in float precision due to our optimization XX * of evaluating 2**(k-1) in float precision)), and not very close to XX * either, and not tiny. We only use this for x >= 1 up to the threshold XX * where cosh(x) decays to exp(x)/2. XX */ XX static inline void XX k_hexp(double x, double_t *hip, double_t *lop) XX { XX float twopkm1; XX int k; XX XX k_exp(x, hip, lop, &k); XX SET_FLOAT_WORD(twopkm1, 0x3f800000 + ((k - 1) << 23)); XX *hip *= twopkm1; XX *lop *= twopkm1; XX } XX XX /* XX * exp(x)/2 without spurious overflow. x must be well above the underflow XX * threshold and at most a little above the overflow threshold, and not XX * tiny. We only use this for x positive and above the threshold where XX * cosh(x) decays to exp(x)/2 up to a little above the overflow threshold. XX */ XX static inline double_t XX hexp(double x) XX { XX double_t hi, lo; XX double twopkm2; XX int k; XX XX twopkm2 = 1; XX k_exp(x, &hi, &lo, &k); XX SET_HIGH_WORD(twopkm2, 0x3ff00000 + ((k - 2) << 20)); XX return (lo + hi) * 2 * twopkm2; XX } k_expl.h also points to the above uncommitted comments for k_hexp() and hexp(). __ldexp_cexpl() doesn't belong in k_expl.h. It belongs in a new file k_expl.c corresponding to k_exp.c. hexp[fl]() isn't quite a kernel, but it has to be in a header file since it is inline. Using static inline functions also reduces namespace problems. We already use this to name some kernels simply kfoo() instead of uglier names like __kernel_foo() used for older methods where the kernels aren't always inline. Bruce From owner-freebsd-numerics@freebsd.org Wed Mar 6 18:48:35 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3FAF01523F76 for ; Wed, 6 Mar 2019 18:48:35 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 983B66B9EC for ; Wed, 6 Mar 2019 18:48:32 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x26ImTXm045033 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Wed, 6 Mar 2019 10:48:29 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x26ImTZH045032; Wed, 6 Mar 2019 10:48:29 -0800 (PST) (envelope-from sgk) Date: Wed, 6 Mar 2019 10:48:29 -0800 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro Message-ID: <20190306184829.GA44023@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20190226191825.GA68479@troutmask.apl.washington.edu> <20190227145002.P907@besplex.bde.org> <20190227074811.GA75972@troutmask.apl.washington.edu> <20190227201214.V1823@besplex.bde.org> <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190306225811.P2731@besplex.bde.org> User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: 983B66B9EC X-Spamd-Bar: + Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [1.17 / 15.00]; ARC_NA(0.00)[]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.90)[0.899,0]; NEURAL_HAM_LONG(-0.61)[-0.611,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu]; AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.14)[0.144,0]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2019 18:48:35 -0000 On Wed, Mar 06, 2019 at 11:56:23PM +1100, Bruce Evans wrote: > On Tue, 5 Mar 2019, Steve Kargl wrote: > > > ... > > I think I have the s_cexpl.c file fixed to use LDBL_EXTRACT_WORDS > > instead of the new macro I introduced. I however cannot figure > > out what David Das did to arrive at k_exp.c, so I cannot write > > Davvid A S. Whoops, again. Seems I conflated real name and login name. > > a similar k_cexpl.c. Yes, I added the 'c' in the name to avoid > > confusion in ld80/. In particular, I have no idea how he found > > his scaling value 'k'. Any insights? > > bde already wrote __ldexp_cexpl() in ld*/k_expl.h, and kargl committed > it in r260066. Does it not work? :-). > > Well, it hasn't been tested, and it indeed cannot work since it spells > cosl(y) as cos(y). Taking long breaks from pecking at libm issues seems to be conducive to memory loss. I'll go review k_expl.h. I simply remember it as having a kernel for expl(). > It is written in a more portable way than the double and float versions > using '1' instead of bit fiddling to start the construction of 2**k. > This makes it identical in ld80 and ld128 (since the exponent range > is the same and the accessor macros hide enough details of the bit > fiddling for the exponent). The duplication is noted in a comment, > but the comment also says that using scalbnl() ond ld128 is probably > best after all (it is slow, but multiplication is also slow). > > I have rewritten the double and float versions in work related to > fixing the accuracy of the double and float versions of hyperbolic > functions. I fixed these before writing the long double hyperbolic > functions using identical methods. You committed the latter, but > the double and float versions still use the old innaccurate slower > methods. > > The rewrite improves the comments, and it is the improved comments > that the reference to k_exp.c in k_expl.h is supposed to be for. > > XX Index: k_exp.c Any chance you'll get around to committing your WIP? Yes, I know you have a few thousand kernel patches in your queue above libm patches. :-) BTW, for the non-exceptional cases with 1M random z values where z=x+Iy and -11350 < x,y < 11350 and I only consider results that are normal, I find my cexpl() yields % ./testl -u -X 11350 Max ULP Re: 1.980535 z = (1.22918109220546510585e+03,1.03853909865862542237e+04) libm = (5.06780736805320166327e+533,-4.39418989082799451477e+533) mpfr = (5.06780736805320166270e+533,-4.39418989082799451456e+533) Max ULP Im: 2.022155 z = (4.83490728165160559637e+03,1.07778990242305355345e+04) libm = (-3.66535128319537945953e+2099,4.67021177841072936494e+2099) mpfr = (-3.66535128319537945915e+2099,4.67021177841072936441e+2099) For comparison, cexp() with -705 < x,y < 705 yields % ./testd -u -X 705 Max ULP Re: 2.215132 z = (1.49377521822925502e+02,1.79997882645095970e+01) libm = (4.93720465697268180e+64,-5.61754869856313932e+64) mpfr = (4.93720465697268310e+64,-5.61754869856313987e+64) Max ULP Im: 2.182779 z = (2.50664219501672335e+02,-4.81327697040560906e+02) libm = (-5.73256778461974670e+108,4.48612518733245315e+108) mpfr = (-5.73256778461974754e+108,4.48612518733245429e+108) Certainly, not exhaustive but encouraging. -- Steve From owner-freebsd-numerics@freebsd.org Wed Mar 6 19:30:49 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3D1621524E93 for ; Wed, 6 Mar 2019 19:30:49 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id 0AB436CF4A for ; Wed, 6 Mar 2019 19:30:47 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 427B043D302; Thu, 7 Mar 2019 06:30:44 +1100 (AEDT) Date: Thu, 7 Mar 2019 06:30:42 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro In-Reply-To: <20190306184829.GA44023@troutmask.apl.washington.edu> Message-ID: <20190307061214.R4911@besplex.bde.org> References: <20190226191825.GA68479@troutmask.apl.washington.edu> <20190227145002.P907@besplex.bde.org> <20190227074811.GA75972@troutmask.apl.washington.edu> <20190227201214.V1823@besplex.bde.org> <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=UJetJGXy c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=JKf714QeBDjTFJuGM2EA:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: 0AB436CF4A X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates 211.29.132.246 as permitted sender) smtp.mailfrom=brde@optusnet.com.au X-Spamd-Result: default: False [-6.23 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_IN_DNSWL_LOW(-0.10)[246.132.29.211.list.dnswl.org : 127.0.5.1]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23]; FREEMAIL_FROM(0.00)[optusnet.com.au]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[cached: extmail.optusnet.com.au]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.94)[-0.943,0]; IP_SCORE(-2.98)[ip: (-7.65), ipnet: 211.28.0.0/14(-4.00), asn: 4804(-3.19), country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au]; ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2019 19:30:49 -0000 On Wed, 6 Mar 2019, Steve Kargl wrote: > On Wed, Mar 06, 2019 at 11:56:23PM +1100, Bruce Evans wrote: >> On Tue, 5 Mar 2019, Steve Kargl wrote: > >>> a similar k_cexpl.c. Yes, I added the 'c' in the name to avoid >>> confusion in ld80/. In particular, I have no idea how he found >>> his scaling value 'k'. Any insights? >> >> bde already wrote __ldexp_cexpl() in ld*/k_expl.h, and kargl committed >> it in r260066. Does it not work? :-). >> >> Well, it hasn't been tested, and it indeed cannot work since it spells >> cosl(y) as cos(y). > > Taking long breaks from pecking at libm issues seems to be > conducive to memory loss. I'll go review k_expl.h. I simply > remember it as having a kernel for expl(). I now see that you implemented 2 more versions of __ldexpl_cexpl() by cloning the old double precision version. Apparently the includes are to unpolluted for the compiler to see the multiple versions :-). Using the version in k_expl.h almost forces inlining of expl()'s kernel and its large tables, just like for hyperbolic functions. This wastes a lot of space, especially for duplicating the tables. It is only a small optimization for time. It is done for the hyperbolic functions to get this optimization, and for __ldexpl_cexpl() just for convenience. >> ... >> I have rewritten the double and float versions in work related to >> fixing the accuracy of the double and float versions of hyperbolic >> functions. I fixed these before writing the long double hyperbolic >> .. >> XX Index: k_exp.c > > Any chance you'll get around to committing your WIP? Yes, > I know you have a few thousand kernel patches in your > queue above libm patches. :-) Probably not soon. > > BTW, for the non-exceptional cases with 1M random z values > where z=x+Iy and -11350 < x,y < 11350 and I only consider > results that are normal, I find my cexpl() yields > > % ./testl -u -X 11350 > Max ULP Re: 1.980535 > z = (1.22918109220546510585e+03,1.03853909865862542237e+04) > libm = (5.06780736805320166327e+533,-4.39418989082799451477e+533) > mpfr = (5.06780736805320166270e+533,-4.39418989082799451456e+533) > Max ULP Im: 2.022155 > z = (4.83490728165160559637e+03,1.07778990242305355345e+04) > libm = (-3.66535128319537945953e+2099,4.67021177841072936494e+2099) > mpfr = (-3.66535128319537945915e+2099,4.67021177841072936441e+2099) Check a few denormals, Infs and NaNs. I can only easily test for coherence with double precision. That is, evaluate in both precisions and check that the long double results rounded down are within a few ulps. > For comparison, cexp() with -705 < x,y < 705 yields > > % ./testd -u -X 705 > Max ULP Re: 2.215132 > z = (1.49377521822925502e+02,1.79997882645095970e+01) > libm = (4.93720465697268180e+64,-5.61754869856313932e+64) > mpfr = (4.93720465697268310e+64,-5.61754869856313987e+64) > Max ULP Im: 2.182779 > z = (2.50664219501672335e+02,-4.81327697040560906e+02) > libm = (-5.73256778461974670e+108,4.48612518733245315e+108) > mpfr = (-5.73256778461974754e+108,4.48612518733245429e+108) > > Certainly, not exhaustive but encouraging. We expect long double precision to be slightly more accurate, at least using the expl kernel, since the kernel accuracy is about 0.51 ulps while the exp accuracy is only about 0.8 ulps. The above shows 0.2 ulps better instead of 0.3. Still more than 2 ulps total from combining several errors of 0.5-1.0 ulps. Bruce From owner-freebsd-numerics@freebsd.org Wed Mar 6 21:42:39 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C828D1528957 for ; Wed, 6 Mar 2019 21:42:39 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 7B4517243B for ; Wed, 6 Mar 2019 21:42:37 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x26LgXAS039372 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Wed, 6 Mar 2019 13:42:34 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x26LgXTY039368; Wed, 6 Mar 2019 13:42:33 -0800 (PST) (envelope-from sgk) Date: Wed, 6 Mar 2019 13:42:33 -0800 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro Message-ID: <20190306214233.GA23159@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20190227074811.GA75972@troutmask.apl.washington.edu> <20190227201214.V1823@besplex.bde.org> <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190307061214.R4911@besplex.bde.org> User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: 7B4517243B X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [0.74 / 15.00]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; TO_DN_SOME(0.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu]; RCPT_COUNT_TWO(0.00)[2]; FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-0.37)[-0.371,0]; FROM_HAS_DN(0.00)[]; NEURAL_SPAM_SHORT(0.78)[0.779,0]; NEURAL_HAM_LONG(-0.41)[-0.409,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu]; AUTH_NA(1.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; R_SPF_NA(0.00)[]; IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2019 21:42:40 -0000 On Thu, Mar 07, 2019 at 06:30:42AM +1100, Bruce Evans wrote: > On Wed, 6 Mar 2019, Steve Kargl wrote: > > > On Wed, Mar 06, 2019 at 11:56:23PM +1100, Bruce Evans wrote: > >> On Tue, 5 Mar 2019, Steve Kargl wrote: > > > >>> a similar k_cexpl.c. Yes, I added the 'c' in the name to avoid > >>> confusion in ld80/. In particular, I have no idea how he found > >>> his scaling value 'k'. Any insights? > >> > >> bde already wrote __ldexp_cexpl() in ld*/k_expl.h, and kargl committed > >> it in r260066. Does it not work? :-). > >> > >> Well, it hasn't been tested, and it indeed cannot work since it spells > >> cosl(y) as cos(y). > > > > Taking long breaks from pecking at libm issues seems to be > > conducive to memory loss. I'll go review k_expl.h. I simply > > remember it as having a kernel for expl(). > > I now see that you implemented 2 more versions of __ldexpl_cexpl() by > cloning the old double precision version. Apparently the includes > are to unpolluted for the compiler to see the multiple versions :-). > > Using the version in k_expl.h almost forces inlining of expl()'s kernel > and its large tables, just like for hyperbolic functions. This wastes > a lot of space, especially for duplicating the tables. It is only a > small optimization for time. It is done for the hyperbolic functions > to get this optimization, and for __ldexpl_cexpl() just for convenience. The version in k_expl.h has 2 bugs. You note the first (cos instead of cosl). The second is In file included from /data/kargl/trunk/math/libm/msun/ld80/s_cexpl.c:43: /data/kargl/trunk/math/libm/msun/ld80/k_expl.h:288:22: error: magnitude of floating-point constant too large for type 'double'; maximum is 1.7976931348623157E+308 [-Werror,-Wliteral-range] exp_x = (lo + hi) * 0x1p16382; ^ 1 error generated. *** Error code 1 > >> I have rewritten the double and float versions in work related to > >> fixing the accuracy of the double and float versions of hyperbolic > >> functions. I fixed these before writing the long double hyperbolic > >> .. > >> XX Index: k_exp.c > > > > Any chance you'll get around to committing your WIP? Yes, > > I know you have a few thousand kernel patches in your > > queue above libm patches. :-) > > Probably not soon. > > > > > BTW, for the non-exceptional cases with 1M random z values > > where z=x+Iy and -11350 < x,y < 11350 and I only consider > > results that are normal, I find my cexpl() yields > > > > % ./testl -u -X 11350 > > Max ULP Re: 1.980535 > > z = (1.22918109220546510585e+03,1.03853909865862542237e+04) > > libm = (5.06780736805320166327e+533,-4.39418989082799451477e+533) > > mpfr = (5.06780736805320166270e+533,-4.39418989082799451456e+533) > > Max ULP Im: 2.022155 > > z = (4.83490728165160559637e+03,1.07778990242305355345e+04) > > libm = (-3.66535128319537945953e+2099,4.67021177841072936494e+2099) > > mpfr = (-3.66535128319537945915e+2099,4.67021177841072936441e+2099) > > Check a few denormals, Infs and NaNs. Exceptional cases give % ./testl -e cexpl(1, inf) = (nan,nan) expecting (nan,nan) cexpl(1,-inf) = (nan,nan) expecting (nan,nan) cexpl(nan,-inf) = (nan,nan) expecting (nan,nan) cexpl(nan, inf) = (nan,nan) expecting (nan,nan) cexpl(inf, inf) = (inf,nan) expecting (inf,nan) cexpl(inf,-inf) = (inf,nan) expecting (inf,nan) cexpl(inf,nan) = (inf,nan) expecting (inf,nan) cexpl(-inf,nan) = (0,0) expecting (0,0) cexpl(-inf,-inf) = (0,0) expecting (0,0) cexpl(-inf, inf) = (0,0) expecting (0,0) > I can only easily test for coherence with double precision. That is, > evaluate in both precisions and check that the long double results > rounded down are within a few ulps. I use GNU MPFR and compute exp(x)*cos(y) and exp(x)*sin(y) with a precision of 4*LDBL_MANT_DIG. Supposedly, each function is correctly rounded to this precision. I take the answers as-if it is exact when computing ULP. > > > For comparison, cexp() with -705 < x,y < 705 yields > > > > % ./testd -u -X 705 > > Max ULP Re: 2.215132 > > z = (1.49377521822925502e+02,1.79997882645095970e+01) > > libm = (4.93720465697268180e+64,-5.61754869856313932e+64) > > mpfr = (4.93720465697268310e+64,-5.61754869856313987e+64) > > Max ULP Im: 2.182779 > > z = (2.50664219501672335e+02,-4.81327697040560906e+02) > > libm = (-5.73256778461974670e+108,4.48612518733245315e+108) > > mpfr = (-5.73256778461974754e+108,4.48612518733245429e+108) > > > > Certainly, not exhaustive but encouraging. > > We expect long double precision to be slightly more accurate, at > least using the expl kernel, since the kernel accuracy is about > 0.51 ulps while the exp accuracy is only about 0.8 ulps. The > above shows 0.2 ulps better instead of 0.3. Still more than 2 > ulps total from combining several errors of 0.5-1.0 ulps. > > Bruce -- Steve From owner-freebsd-numerics@freebsd.org Thu Mar 7 04:09:13 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AD779150D42F for ; Thu, 7 Mar 2019 04:09:13 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au [211.29.132.249]) by mx1.freebsd.org (Postfix) with ESMTP id DE6C18B64F for ; Thu, 7 Mar 2019 04:09:11 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id 2E5B010598CB; Thu, 7 Mar 2019 15:09:08 +1100 (AEDT) Date: Thu, 7 Mar 2019 15:09:06 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro In-Reply-To: <20190306214233.GA23159@troutmask.apl.washington.edu> Message-ID: <20190307144315.N932@besplex.bde.org> References: <20190227074811.GA75972@troutmask.apl.washington.edu> <20190227201214.V1823@besplex.bde.org> <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=UJetJGXy c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=Et-GumYriqmTMQ_IUTQA:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: DE6C18B64F X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates 211.29.132.249 as permitted sender) smtp.mailfrom=brde@optusnet.com.au X-Spamd-Result: default: False [-6.14 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_IN_DNSWL_LOW(-0.10)[249.132.29.211.list.dnswl.org : 127.0.5.1]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23]; FREEMAIL_FROM(0.00)[optusnet.com.au]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[cached: extmail.optusnet.com.au]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.80)[-0.799,0]; IP_SCORE(-3.03)[ip: (-7.92), ipnet: 211.28.0.0/14(-4.00), asn: 4804(-3.19), country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au]; ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2019 04:09:14 -0000 On Wed, 6 Mar 2019, Steve Kargl wrote: > On Thu, Mar 07, 2019 at 06:30:42AM +1100, Bruce Evans wrote: >> ... >> I now see that you implemented 2 more versions of __ldexpl_cexpl() by >> cloning the old double precision version. Apparently the includes >> are to unpolluted for the compiler to see the multiple versions :-). >> >> Using the version in k_expl.h almost forces inlining of expl()'s kernel >> and its large tables, just like for hyperbolic functions. This wastes >> a lot of space, especially for duplicating the tables. It is only a >> small optimization for time. It is done for the hyperbolic functions >> to get this optimization, and for __ldexpl_cexpl() just for convenience. > > The version in k_expl.h has 2 bugs. You note the first (cos instead > of cosl). The second is > > In file included from /data/kargl/trunk/math/libm/msun/ld80/s_cexpl.c:43: > /data/kargl/trunk/math/libm/msun/ld80/k_expl.h:288:22: error: magnitude of > floating-point constant too large for type 'double'; maximum is > 1.7976931348623157E+308 [-Werror,-Wliteral-range] > exp_x = (lo + hi) * 0x1p16382; > ^ It is missing an L suffix. Clearly not tested. >>> [errors for cexpl()] >> >> Check a few denormals, Infs and NaNs. > > Exceptional cases give > > % ./testl -e > cexpl(1, inf) = (nan,nan) expecting (nan,nan) > cexpl(1,-inf) = (nan,nan) expecting (nan,nan) > cexpl(nan,-inf) = (nan,nan) expecting (nan,nan) > cexpl(nan, inf) = (nan,nan) expecting (nan,nan) > cexpl(inf, inf) = (inf,nan) expecting (inf,nan) > cexpl(inf,-inf) = (inf,nan) expecting (inf,nan) > cexpl(inf,nan) = (inf,nan) expecting (inf,nan) > cexpl(-inf,nan) = (0,0) expecting (0,0) > cexpl(-inf,-inf) = (0,0) expecting (0,0) > cexpl(-inf, inf) = (0,0) expecting (0,0) But does it preserve NaN bits as much as possible, and produce the same NaN bits as double precision after conversion to double precision? The report spells NaN and Inf using C99's fuzzy bad names. Even the C99 specification spells NaN and Inf correctly in Appendix F. The complex hyperbolic functions have better comments, so are almost an easier place to start. Bruce From owner-freebsd-numerics@freebsd.org Thu Mar 7 06:36:36 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CE646151C778 for ; Thu, 7 Mar 2019 06:36:35 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id 880526BCF0 for ; Thu, 7 Mar 2019 06:36:33 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id 8BED53D60D8; Thu, 7 Mar 2019 17:36:28 +1100 (AEDT) Date: Thu, 7 Mar 2019 17:36:22 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro In-Reply-To: <20190307044447.GA16298@troutmask.apl.washington.edu> Message-ID: <20190307163958.V1293@besplex.bde.org> References: <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> <20190307044447.GA16298@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=FNpr/6gs c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=NH_DFVgqrTV01CZZK1QA:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: 880526BCF0 X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates 211.29.132.42 as permitted sender) smtp.mailfrom=brde@optusnet.com.au X-Spamd-Result: default: False [-6.42 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_IN_DNSWL_LOW(-0.10)[42.132.29.211.list.dnswl.org : 127.0.5.1]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23]; FREEMAIL_FROM(0.00)[optusnet.com.au]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[cached: extmail.optusnet.com.au]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.97)[-0.967,0]; IP_SCORE(-3.14)[ip: (-8.45), ipnet: 211.28.0.0/14(-4.01), asn: 4804(-3.20), country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au]; ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2019 06:36:36 -0000 On Wed, 6 Mar 2019, Steve Kargl wrote: > On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote: >> On Wed, 6 Mar 2019, Steve Kargl wrote: >> ... >> But does it preserve NaN bits as much as possible, and produce the same >> NaN bits as double precision after conversion to double precision? The >> report spells NaN and Inf using C99's fuzzy bad names. Even the C99 >> specification spells NaN and Inf correctly in Appendix F. >> >> The complex hyperbolic functions have better comments, so are almost >> an easier place to start. > > I suppose it preserves the NaN bits as well as cexpf and cexp as > it uses the same algorthim. > ... > long double complex > cexpl (long double complex z) > { > long double c, exp_x, s, x, y; > uint64_t lx, ly; > uint16_t hx, hy, ix, iy; > > ENTERI(); > > x = creall(z); > y = cimagl(z); > > EXTRACT_LDBL80_WORDS(hx, lx, x); > EXTRACT_LDBL80_WORDS(hy, ly, y); I think you recently changed to this from your new macro. > > ix = hx&0x7fff; > iy = hy&0x7fff; Some style bugs. s_cexp.c doesn't use the fdlibm'ish style of sometimes no spaces around '&'. It puts the ix and iy calculations immediately after the extractions. They shouldn't be separated since they are also extractions. c_cexp.c extracts in a different order, and doesn't use iy. You reordered it a bit to use sincos(). I don't like the reordering much. But old s_cexp.c has its own style bug of a separating the extraction for y but not even following its usual style of a blank line before comments near the extraction of x. > > /* cexp(0 + I y) = cos(y) + I sin(y) */ > if ((ix | lx) == 0) { > sincosl(y, &s, &c); > RETURNI(CMPLXL(c, s)); > } > > /* cexp(x + I 0) = exp(x) + I 0 */ > if ((iy | ly) == 0) > RETURNI(CMPLXL(expl(x), y)); > > if (iy == 0x7fff) { s_cexp.c actually uses a clobbered hy here. Perhaps fix this too (compilers mostly ignore program order for initializing variables like iy and optimize to an expression involving the unclobbered hy here if that is optimal. The extra variables are just for readability or unreadability). > if (ix < 0x7fff || > (ix == 0x7fff && (lx & 0x7fffffffffffffffULL) != 0)) { > /* cexp(finite|NaN +- I Inf|NaN) = NaN + I NaN */ > RETURNI(CMPLXL(y - y, y - y)); Looks OK. I think ix and iy are only used here, so it is clearer to not have variables for them -- just use (hx & 0x7fff), etc. y - y and similar expressions usually give the right NaN (y = Inf -> dNaN and y = NaN -> d(y)). IIRC, the only differences for ld128 is that then normalization bit is hidden so masking out the top bit in the mantissa is not needed, but the mantissa is in 2 64-bit words. > } else if (hx & 0x8000) { This is clearest as it is with an explicit mask. > /* > * exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac > * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c > */ > if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) || > (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) { Don't duplicate the hex constants in the comment. The limits can be fuzzy so don't need so many decimal digits or nonzero nonzero bits or LD80C() to round them (s_cexp.c uses only 1 decimal digit and doesn't test the lower 32 bits in the mantissa; here we have more bits than we need in lx). Maybe not use bit-fiddling at all for this (in other precisions too). The more important exp() functions check thresholds in FP. Here x can be NaN so FP comparisons with it don't work right. That seems to be the only reason to check in bits here. > /* > * x is between exp_ovfl and cexp_ovfl, so we must scale to > * avoid overflow in exp(x). > */ > RETURNI(__ldexp_cexpl(z, 1)); Did your tests cover this case, and does it use the fixed version of the old __ldexp_cexpl() in k_expl.h? > } else { > /* > * Cases covered here: > * - x < exp_ovfl and exp(x) won't overflow (common case) > * - x > cexp_ovfl, so exp(x) * s overflows for all s > 0 > * - x = +-Inf (generated by exp()) > * - x = NaN (spurious inexact exception from y) > */ > exp_x = expl(x); > sincosl(y, &s, &c); > RETURNI(CMPLXL(exp_x * c, exp_x * s)); > } > } I think the last clause can be simplfied and even optimized by using the kernel in all cases: - the kernel produces exp_x in the form kexp_x * 2**k where kexp_x is near 1. Just multiply kexp_x by c and s, and later multiply by 2**k. This cost an extra multiplication by 2**k (for the real and imaginary parts separately), but avoids other work so may be faster. - hopefully kexp_x is actually >= 1 (and < 2). Otherwise we have to worry about spurious underflow - the above method seems to avoid spurious underflow automatically, and doesn't even mention this (multiplying by exp_x may underflow, but not spuriously since c and s are <= 1). Bruce From owner-freebsd-numerics@freebsd.org Thu Mar 7 04:44:53 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A1F5E150FABC for ; Thu, 7 Mar 2019 04:44:53 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 13C418D285 for ; Thu, 7 Mar 2019 04:44:51 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x274ilAZ016396 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Wed, 6 Mar 2019 20:44:47 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x274ilQo016395; Wed, 6 Mar 2019 20:44:47 -0800 (PST) (envelope-from sgk) Date: Wed, 6 Mar 2019 20:44:47 -0800 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro Message-ID: <20190307044447.GA16298@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190307144315.N932@besplex.bde.org> User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: 13C418D285 X-Spamd-Bar: + Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [1.05 / 15.00]; ARC_NA(0.00)[]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.96)[0.957,0]; NEURAL_HAM_LONG(-0.72)[-0.719,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu]; AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.08)[0.079,0]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2019 04:44:54 -0000 On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote: > On Wed, 6 Mar 2019, Steve Kargl wrote: > > > On Thu, Mar 07, 2019 at 06:30:42AM +1100, Bruce Evans wrote: > >> ... > >> I now see that you implemented 2 more versions of __ldexpl_cexpl() by > >> cloning the old double precision version. Apparently the includes > >> are to unpolluted for the compiler to see the multiple versions :-). > >> > >> Using the version in k_expl.h almost forces inlining of expl()'s kernel > >> and its large tables, just like for hyperbolic functions. This wastes > >> a lot of space, especially for duplicating the tables. It is only a > >> small optimization for time. It is done for the hyperbolic functions > >> to get this optimization, and for __ldexpl_cexpl() just for convenience. > > > > The version in k_expl.h has 2 bugs. You note the first (cos instead > > of cosl). The second is > > > > In file included from /data/kargl/trunk/math/libm/msun/ld80/s_cexpl.c:43: > > /data/kargl/trunk/math/libm/msun/ld80/k_expl.h:288:22: error: magnitude of > > floating-point constant too large for type 'double'; maximum is > > 1.7976931348623157E+308 [-Werror,-Wliteral-range] > > exp_x = (lo + hi) * 0x1p16382; > > ^ > > It is missing an L suffix. Clearly not tested. > > >>> [errors for cexpl()] > >> > >> Check a few denormals, Infs and NaNs. > > > > Exceptional cases give > > > > % ./testl -e > > cexpl(1, inf) = (nan,nan) expecting (nan,nan) > > cexpl(1,-inf) = (nan,nan) expecting (nan,nan) > > cexpl(nan,-inf) = (nan,nan) expecting (nan,nan) > > cexpl(nan, inf) = (nan,nan) expecting (nan,nan) > > cexpl(inf, inf) = (inf,nan) expecting (inf,nan) > > cexpl(inf,-inf) = (inf,nan) expecting (inf,nan) > > cexpl(inf,nan) = (inf,nan) expecting (inf,nan) > > cexpl(-inf,nan) = (0,0) expecting (0,0) > > cexpl(-inf,-inf) = (0,0) expecting (0,0) > > cexpl(-inf, inf) = (0,0) expecting (0,0) > > But does it preserve NaN bits as much as possible, and produce the same > NaN bits as double precision after conversion to double precision? The > report spells NaN and Inf using C99's fuzzy bad names. Even the C99 > specification spells NaN and Inf correctly in Appendix F. > > The complex hyperbolic functions have better comments, so are almost > an easier place to start. > I suppose it preserves the NaN bits as well as cexpf and cexp as it uses the same algorthim. /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2011 David Schultz * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * src/s_cexp.c converted to long double complex by Steven G. Kargl */ #include __FBSDID("$FreeBSD$"); #include #include #ifdef __i386__ #include #endif #include "fpmath.h" #include "math.h" #include "math_private.h" #include "k_expl.h" long double complex cexpl (long double complex z) { long double c, exp_x, s, x, y; uint64_t lx, ly; uint16_t hx, hy, ix, iy; ENTERI(); x = creall(z); y = cimagl(z); EXTRACT_LDBL80_WORDS(hx, lx, x); EXTRACT_LDBL80_WORDS(hy, ly, y); ix = hx&0x7fff; iy = hy&0x7fff; /* cexp(0 + I y) = cos(y) + I sin(y) */ if ((ix | lx) == 0) { sincosl(y, &s, &c); RETURNI(CMPLXL(c, s)); } /* cexp(x + I 0) = exp(x) + I 0 */ if ((iy | ly) == 0) RETURNI(CMPLXL(expl(x), y)); if (iy == 0x7fff) { if (ix < 0x7fff || (ix == 0x7fff && (lx & 0x7fffffffffffffffULL) != 0)) { /* cexp(finite|NaN +- I Inf|NaN) = NaN + I NaN */ RETURNI(CMPLXL(y - y, y - y)); } else if (hx & 0x8000) { /* cexp(-Inf +- I Inf|NaN) = 0 + I 0 */ RETURNI(CMPLXL(0.0, 0.0)); } else { /* cexp(+Inf +- I Inf|NaN) = Inf + I NaN */ RETURNI(CMPLXL(x, y - y)); } } /* * exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c */ if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) || (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) { /* * x is between exp_ovfl and cexp_ovfl, so we must scale to * avoid overflow in exp(x). */ RETURNI(__ldexp_cexpl(z, 1)); } else { /* * Cases covered here: * - x < exp_ovfl and exp(x) won't overflow (common case) * - x > cexp_ovfl, so exp(x) * s overflows for all s > 0 * - x = +-Inf (generated by exp()) * - x = NaN (spurious inexact exception from y) */ exp_x = expl(x); sincosl(y, &s, &c); RETURNI(CMPLXL(exp_x * c, exp_x * s)); } } -- Steve 20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4 20161221 https://www.youtube.com/watch?v=IbCHE-hONow From owner-freebsd-numerics@freebsd.org Thu Mar 7 19:54:46 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6B45615239FD for ; Thu, 7 Mar 2019 19:54:46 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 9331F70523 for ; Thu, 7 Mar 2019 19:54:44 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x27JsbOC021893 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Thu, 7 Mar 2019 11:54:37 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x27Jsa5Z021892; Thu, 7 Mar 2019 11:54:36 -0800 (PST) (envelope-from sgk) Date: Thu, 7 Mar 2019 11:54:36 -0800 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro Message-ID: <20190307195436.GA20856@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> <20190307044447.GA16298@troutmask.apl.washington.edu> <20190307163958.V1293@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190307163958.V1293@besplex.bde.org> User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: 9331F70523 X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [0.92 / 15.00]; ARC_NA(0.00)[]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.59)[0.591,0]; NEURAL_HAM_LONG(-0.58)[-0.582,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu]; AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; RCPT_COUNT_TWO(0.00)[2]; MX_GOOD(-0.01)[troutmask.apl.washington.edu]; NEURAL_SPAM_MEDIUM(0.17)[0.169,0]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2019 19:54:46 -0000 On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote: > On Wed, 6 Mar 2019, Steve Kargl wrote: > > > On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote: > >> On Wed, 6 Mar 2019, Steve Kargl wrote: > >> ... > >> But does it preserve NaN bits as much as possible, and produce the same > >> NaN bits as double precision after conversion to double precision? The > >> report spells NaN and Inf using C99's fuzzy bad names. Even the C99 > >> specification spells NaN and Inf correctly in Appendix F. > >> > >> The complex hyperbolic functions have better comments, so are almost > >> an easier place to start. > > > > I suppose it preserves the NaN bits as well as cexpf and cexp as > > it uses the same algorthim. > > > ... > > long double complex > > cexpl (long double complex z) > > { > > long double c, exp_x, s, x, y; > > uint64_t lx, ly; > > uint16_t hx, hy, ix, iy; > > > > ENTERI(); > > > > x = creall(z); > > y = cimagl(z); > > > > EXTRACT_LDBL80_WORDS(hx, lx, x); > > EXTRACT_LDBL80_WORDS(hy, ly, y); > > I think you recently changed to this from your new macro. Yes, the new macro is gone. > > > > ix = hx&0x7fff; > > iy = hy&0x7fff; > > Some style bugs. s_cexp.c doesn't use the fdlibm'ish style of sometimes > no spaces around '&'. It puts the ix and iy calculations immediately > after the extractions. They shouldn't be separated since they are > also extractions. c_cexp.c extracts in a different order, and doesn't > use iy. You reordered it a bit to use sincos(). I don't like the > reordering much. But old s_cexp.c has its own style bug of a separating > the extraction for y but not even following its usual style of a blank > line before comments near the extraction of x. I may unorder the re-ordering to the old order. If I understand one of the comments in the GCC bugzilla report, the optimization of converting a = cos(x); b = sin(x): to sincos(x, &b, &a). Actually, does "a = cos(x); b = sin(x)" goes to cexp(cmplx(0,x)) in the middle-end of the compiler. The backend then can recognize cexp(cmplx(0,x)) and converts this to sincos(x, &b, &a). I can't verify this for FreeBSD, because libm isn't C99 compliant. GCC requires a C99 compliant to activate the optimization. I fixed the extraction for hx, and hy. > > /* cexp(0 + I y) = cos(y) + I sin(y) */ > > if ((ix | lx) == 0) { > > sincosl(y, &s, &c); > > RETURNI(CMPLXL(c, s)); > > } > > > > /* cexp(x + I 0) = exp(x) + I 0 */ > > if ((iy | ly) == 0) > > RETURNI(CMPLXL(expl(x), y)); > > > > if (iy == 0x7fff) { > > s_cexp.c actually uses a clobbered hy here. Perhaps fix this too (compilers > mostly ignore program order for initializing variables like iy and optimize > to an expression involving the unclobbered hy here if that is optimal. The > extra variables are just for readability or unreadability). > > > if (ix < 0x7fff || > > (ix == 0x7fff && (lx & 0x7fffffffffffffffULL) != 0)) { > > /* cexp(finite|NaN +- I Inf|NaN) = NaN + I NaN */ > > RETURNI(CMPLXL(y - y, y - y)); > > Looks OK. > > I think ix and iy are only used here, so it is clearer to not have > variables for them -- just use (hx & 0x7fff), etc. > > y - y and similar expressions usually give the right NaN (y = Inf -> dNaN > and y = NaN -> d(y)). > > IIRC, the only differences for ld128 is that then normalization bit is > hidden so masking out the top bit in the mantissa is not needed, but the > mantissa is in 2 64-bit words. > > > } else if (hx & 0x8000) { > > This is clearest as it is with an explicit mask. > > > /* > > * exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac > > * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c > > */ > > if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) || > > (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) { > > Don't duplicate the hex constants in the comment. The limits can be fuzzy > so don't need so many decimal digits or nonzero nonzero bits or LD80C() > to round them (s_cexp.c uses only 1 decimal digit and doesn't test the > lower 32 bits in the mantissa; here we have more bits than we need in lx). Being fuzzy was the movitation for the new macro, I was only using the upper 32 bits of lx in the original cexpl() I posted. > Maybe not use bit-fiddling at all for this (in other precisions too). > The more important exp() functions check thresholds in FP. Here x can > be NaN so FP comparisons with it don't work right. That seems to be > the only reason to check in bits here. > > > /* > > * x is between exp_ovfl and cexp_ovfl, so we must scale to > > * avoid overflow in exp(x). > > */ > > RETURNI(__ldexp_cexpl(z, 1)); > > Did your tests cover this case, and does it use the fixed version of the > old __ldexp_cexpl() in k_expl.h? No. The test I reported was restricted to -11350 < x < 11350. Spot checked x outside that range. I had the conditional wrong for x < -exp_ovfl. Also, the comment is actaully a little optimistic. One cannot avoid overflow. For example, x = 11357 is just above exp_ovfl, and exp(x) = inf. __ldexp_cexpl() does scaling and exploits |cos(y)| <= 1 and |sin(y)| <= 1 to try to extend the range of cexpl(). ./testl -a 11357 0.1 libm: Re(cexpf) = inf mpfr: Re(cexpf) = 1.90658369228334884925e+4932 libm: Im(cexpf) = 1.91296449568717353572e+4931 mpfr: Im(cexpf) = 1.91296449568717353558e+4931 troutmask:kargl[430] ./testl -a 11357 1.57 libm: Re(cexpf) = 1.52588659766018377153e+4929 mpfr: Re(cexpf) = 1.52588659766018377147e+4929 libm: Im(cexpf) = inf mpfr: Im(cexpf) = 1.91615588587371861485e+4932 > > > } else { > > /* > > * Cases covered here: > > * - x < exp_ovfl and exp(x) won't overflow (common case) > > * - x > cexp_ovfl, so exp(x) * s overflows for all s > 0 > > * - x = +-Inf (generated by exp()) > > * - x = NaN (spurious inexact exception from y) > > */ > > exp_x = expl(x); > > sincosl(y, &s, &c); > > RETURNI(CMPLXL(exp_x * c, exp_x * s)); > > } > > } > > I think the last clause can be simplfied and even optimized by using the > kernel in all cases: > - the kernel produces exp_x in the form kexp_x * 2**k where kexp_x is near > 1. Just multiply kexp_x by c and s, and later multiply by 2**k. This > cost an extra multiplication by 2**k (for the real and imaginary parts > separately), but avoids other work so may be faster. > - hopefully kexp_x is actually >= 1 (and < 2). Otherwise we have to > worry about spurious underflow > - the above method seems to avoid spurious underflow automatically, and > doesn't even mention this (multiplying by exp_x may underflow, but not > spuriously since c and s are <= 1). You might be right about using the kernel. I'll leave that to others on freebsd-numerics@; although you and I seem to be the only subscribers. -- Steve From owner-freebsd-numerics@freebsd.org Fri Mar 8 00:22:50 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F2261152C634 for ; Fri, 8 Mar 2019 00:22:49 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 386838222D for ; Fri, 8 Mar 2019 00:22:48 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x280Mj1w026366 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Thu, 7 Mar 2019 16:22:45 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x280MjNf026365; Thu, 7 Mar 2019 16:22:45 -0800 (PST) (envelope-from sgk) Date: Thu, 7 Mar 2019 16:22:45 -0800 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro Message-ID: <20190308002245.GA26338@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20190227201214.V1823@besplex.bde.org> <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190306214233.GA23159@troutmask.apl.washington.edu> User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: 386838222D X-Spamd-Bar: + Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [1.24 / 15.00]; ARC_NA(0.00)[]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.59)[0.594,0]; NEURAL_HAM_LONG(-0.73)[-0.732,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu]; AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.64)[0.644,0]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2019 00:22:50 -0000 On Wed, Mar 06, 2019 at 01:42:33PM -0800, Steve Kargl wrote: > On Thu, Mar 07, 2019 at 06:30:42AM +1100, Bruce Evans wrote: > > On Wed, 6 Mar 2019, Steve Kargl wrote: > > > > > On Wed, Mar 06, 2019 at 11:56:23PM +1100, Bruce Evans wrote: > > >> On Tue, 5 Mar 2019, Steve Kargl wrote: > > > > > >>> a similar k_cexpl.c. Yes, I added the 'c' in the name to avoid > > >>> confusion in ld80/. In particular, I have no idea how he found > > >>> his scaling value 'k'. Any insights? > > >> > > >> bde already wrote __ldexp_cexpl() in ld*/k_expl.h, and kargl committed > > >> it in r260066. Does it not work? :-). > > >> > > >> Well, it hasn't been tested, and it indeed cannot work since it spells > > >> cosl(y) as cos(y). > > > > > > Taking long breaks from pecking at libm issues seems to be > > > conducive to memory loss. I'll go review k_expl.h. I simply > > > remember it as having a kernel for expl(). > > > > I now see that you implemented 2 more versions of __ldexpl_cexpl() by > > cloning the old double precision version. Apparently the includes > > are to unpolluted for the compiler to see the multiple versions :-). > > > > Using the version in k_expl.h almost forces inlining of expl()'s kernel > > and its large tables, just like for hyperbolic functions. This wastes > > a lot of space, especially for duplicating the tables. It is only a > > small optimization for time. It is done for the hyperbolic functions > > to get this optimization, and for __ldexpl_cexpl() just for convenience. > > The version in k_expl.h has 2 bugs. You note the first (cos instead > of cosl). The second is > > In file included from /data/kargl/trunk/math/libm/msun/ld80/s_cexpl.c:43: > /data/kargl/trunk/math/libm/msun/ld80/k_expl.h:288:22: error: magnitude of > floating-point constant too large for type 'double'; maximum is > 1.7976931348623157E+308 [-Werror,-Wliteral-range] > exp_x = (lo + hi) * 0x1p16382; > ^ Make that 3 bugs. Index: ld80/k_expl.h =================================================================== --- ld80/k_expl.h (revision 344600) +++ ld80/k_expl.h (working copy) @@ -285,7 +285,7 @@ y = cimagl(z); __k_expl(x, &hi, &lo, &k); - exp_x = (lo + hi) * 0x1p16382; + exp_x = (lo + hi) * 0x1p16382L; expt += k - 16382; scale1 = 1; @@ -292,9 +292,9 @@ half_expt = expt / 2; SET_LDBL_EXPSIGN(scale1, BIAS + half_expt); scale2 = 1; - SET_LDBL_EXPSIGN(scale1, BIAS + expt - half_expt); + SET_LDBL_EXPSIGN(scale2, BIAS + expt - half_expt); - return (CMPLXL(cos(y) * exp_x * scale1 * scale2, + return (CMPLXL(cosl(y) * exp_x * scale1 * scale2, sinl(y) * exp_x * scale1 * scale2)); } #endif /* _COMPLEX_H */ -- Steve From owner-freebsd-numerics@freebsd.org Fri Mar 8 03:04:20 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 96F3F1530091 for ; Fri, 8 Mar 2019 03:04:20 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 6234287720 for ; Fri, 8 Mar 2019 03:04:19 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x2834GIf028810 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Thu, 7 Mar 2019 19:04:16 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x2834FGU028809; Thu, 7 Mar 2019 19:04:15 -0800 (PST) (envelope-from sgk) Date: Thu, 7 Mar 2019 19:04:15 -0800 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro Message-ID: <20190308030415.GA28796@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190308002245.GA26338@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190308002245.GA26338@troutmask.apl.washington.edu> User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: 6234287720 X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [0.69 / 15.00]; ARC_NA(0.00)[]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; NEURAL_HAM_MEDIUM(-0.14)[-0.136,0]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.75)[0.751,0]; NEURAL_HAM_LONG(-0.66)[-0.657,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu]; AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu]; RCPT_COUNT_TWO(0.00)[2]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2019 03:04:20 -0000 On Thu, Mar 07, 2019 at 04:22:45PM -0800, Steve Kargl wrote: > > Make that 3 bugs. > > Index: ld80/k_expl.h > =================================================================== > --- ld80/k_expl.h (revision 344600) > +++ ld80/k_expl.h (working copy) > @@ -285,7 +285,7 @@ > y = cimagl(z); > __k_expl(x, &hi, &lo, &k); > > - exp_x = (lo + hi) * 0x1p16382; > + exp_x = (lo + hi) * 0x1p16382L; > expt += k - 16382; > > scale1 = 1; > @@ -292,9 +292,9 @@ > half_expt = expt / 2; > SET_LDBL_EXPSIGN(scale1, BIAS + half_expt); > scale2 = 1; > - SET_LDBL_EXPSIGN(scale1, BIAS + expt - half_expt); > + SET_LDBL_EXPSIGN(scale2, BIAS + expt - half_expt); This bug appears in ld128/k_expl.h. The other two bugs are not present or were fixed at some point. > > - return (CMPLXL(cos(y) * exp_x * scale1 * scale2, > + return (CMPLXL(cosl(y) * exp_x * scale1 * scale2, > sinl(y) * exp_x * scale1 * scale2)); > } > #endif /* _COMPLEX_H */ > > -- > Steve > _______________________________________________ > freebsd-numerics@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-numerics > To unsubscribe, send any mail to "freebsd-numerics-unsubscribe@freebsd.org" -- Steve 20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4 20161221 https://www.youtube.com/watch?v=IbCHE-hONow From owner-freebsd-numerics@freebsd.org Fri Mar 8 13:51:16 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 41DFC153DB88 for ; Fri, 8 Mar 2019 13:51:16 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id CED7974A61 for ; Fri, 8 Mar 2019 13:51:13 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from c122-106-155-65.carlnfd1.nsw.optusnet.com.au (c122-106-155-65.carlnfd1.nsw.optusnet.com.au [122.106.155.65]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 9EE3B432FF4; Sat, 9 Mar 2019 00:51:09 +1100 (AEDT) Date: Sat, 9 Mar 2019 00:51:07 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: freebsd-numerics@freebsd.org Subject: cexpl() (was: Re: Update ENTERI() macro) In-Reply-To: <20190307195436.GA20856@troutmask.apl.washington.edu> Message-ID: <20190308225807.D6410@besplex.bde.org> References: <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> <20190307044447.GA16298@troutmask.apl.washington.edu> <20190307163958.V1293@besplex.bde.org> <20190307195436.GA20856@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=UJetJGXy c=1 sm=1 tr=0 a=hrcDgRrNnVuIsdOiM/DCpA==:117 a=kj9zAlcOel0A:10 a=vStcYMboq_yilUtR23IA:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: CED7974A61 X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates 211.29.132.246 as permitted sender) smtp.mailfrom=brde@optusnet.com.au X-Spamd-Result: default: False [-6.16 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23]; FREEMAIL_FROM(0.00)[optusnet.com.au]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[cached: extmail.optusnet.com.au]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.92)[-0.922,0]; IP_SCORE(-2.93)[ip: (-7.64), ipnet: 211.28.0.0/14(-3.88), asn: 4804(-3.09), country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[]; RCVD_IN_DNSWL_LOW(-0.10)[246.132.29.211.list.dnswl.org : 127.0.5.1]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au]; ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU]; MIME_TRACE(0.00)[0:+]; RECEIVED_SPAMHAUS_PBL(0.00)[65.155.106.122.zen.spamhaus.org : 127.0.0.11] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2019 13:51:16 -0000 On Thu, 7 Mar 2019, Steve Kargl wrote: > On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote: >> On Wed, 6 Mar 2019, Steve Kargl wrote: >> >>> On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote: >* ... >>> /* >>> * exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac >>> * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c >>> */ I checked that these are correct. Any rounding should be down for exp_ovfl and up for cexp_ovfl. >>> if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) || >>> (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) { >> >> Don't duplicate the hex constants in the comment. The limits can be fuzzy >> so don't need so many decimal digits or nonzero nonzero bits or LD80C() >> to round them (s_cexp.c uses only 1 decimal digit and doesn't test the >> lower 32 bits in the mantissa; here we have more bits than we need in lx). > > Being fuzzy was the movitation for the new macro, I was only using > the upper 32 bits of lx in the original cexpl() I posted. IIRC, that was done by extracting into a 32-bit lx. This could be written as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better to do that anyway. >> Maybe not use bit-fiddling at all for this (in other precisions too). >> The more important exp() functions check thresholds in FP. Here x can >> be NaN so FP comparisons with it don't work right. That seems to be >> the only reason to check in bits here. >> >>> /* >>> * x is between exp_ovfl and cexp_ovfl, so we must scale to >>> * avoid overflow in exp(x). >>> */ >>> RETURNI(__ldexp_cexpl(z, 1)); >> >> Did your tests cover this case, and does it use the fixed version of the >> old __ldexp_cexpl() in k_expl.h? > > No. The test I reported was restricted to -11350 < x < 11350. > Spot checked x outside that range. I had the conditional wrong > for x < -exp_ovfl. Also, the comment is actaully a little optimistic. > One cannot avoid overflow. For example, x = 11357 is just above > exp_ovfl, and exp(x) = inf. __ldexp_cexpl() does scaling and exploits > |cos(y)| <= 1 and |sin(y)| <= 1 to try to extend the range of cexpl(). I understand this better now. __ldexp_cexpl() doesn't work for all large x, so callers must only use it for a range of values above cexp_ovl. The correctness of this is unobvious. It depends on sin(y) and cos(y) never underflowing to 0. Otherwise, this method would give Inf * 0 = NaN for finite x and y, and the scaling method would give finite * 0 * 2**k = 0. >>> } else { >>> /* >>> * Cases covered here: >>> * - x < exp_ovfl and exp(x) won't overflow (common case) >>> * - x > cexp_ovfl, so exp(x) * s overflows for all s > 0 >>> * - x = +-Inf (generated by exp()) >>> * - x = NaN (spurious inexact exception from y) >>> */ >>> exp_x = expl(x); >>> sincosl(y, &s, &c); >>> RETURNI(CMPLXL(exp_x * c, exp_x * s)); >>> } >>> } >> >> I think the last clause can be simplfied and even optimized by using the >> kernel in all cases: I was wrong. >> - the kernel produces exp_x in the form kexp_x * 2**k where kexp_x is near >> 1. Just multiply kexp_x by c and s, and later multiply by 2**k. This >> cost an extra multiplication by 2**k (for the real and imaginary parts >> separately), but avoids other work so may be faster. >> - hopefully kexp_x is actually >= 1 (and < 2). Otherwise we have to >> worry about spurious underflow >> - the above method seems to avoid spurious underflow automatically, and >> doesn't even mention this (multiplying by exp_x may underflow, but not >> spuriously since c and s are <= 1). > > You might be right about using the kernel. I'll leave that > to others on freebsd-numerics@; although you and I seem to be > the only subscribers. Forget about that. __ldexp_cexp*() is not really a kernel. It is used for ccosh*() and csinh*() too. Those can only use the kernel for large positive x. cexp*() may as well do the same. Also, the kernel currently doesn't support very large positive x, so the second overflow threshold is needed in callers. Bruce From owner-freebsd-numerics@freebsd.org Fri Mar 8 13:58:51 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 60CAE153DD79 for ; Fri, 8 Mar 2019 13:58:51 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id 7D5A774C84 for ; Fri, 8 Mar 2019 13:58:50 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from c122-106-155-65.carlnfd1.nsw.optusnet.com.au (c122-106-155-65.carlnfd1.nsw.optusnet.com.au [122.106.155.65]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 059E1432CD3; Sat, 9 Mar 2019 00:58:47 +1100 (AEDT) Date: Sat, 9 Mar 2019 00:58:47 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: freebsd-numerics@freebsd.org Subject: Re: Update ENTERI() macro In-Reply-To: <20190308030415.GA28796@troutmask.apl.washington.edu> Message-ID: <20190309005153.H6410@besplex.bde.org> References: <20190227161906.GA77785@troutmask.apl.washington.edu> <20190228060920.R4413@besplex.bde.org> <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190308002245.GA26338@troutmask.apl.washington.edu> <20190308030415.GA28796@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=P6RKvmIu c=1 sm=1 tr=0 a=hrcDgRrNnVuIsdOiM/DCpA==:117 a=kj9zAlcOel0A:10 a=JMbnHHhEAjw9J1_4nJ8A:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: 7D5A774C84 X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates 211.29.132.246 as permitted sender) smtp.mailfrom=brde@optusnet.com.au X-Spamd-Result: default: False [-6.18 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_IN_DNSWL_LOW(-0.10)[246.132.29.211.list.dnswl.org : 127.0.5.1]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23]; FREEMAIL_FROM(0.00)[optusnet.com.au]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; RECEIVED_SPAMHAUS_PBL(0.00)[65.155.106.122.zen.spamhaus.org : 127.0.0.11]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[cached: extmail.optusnet.com.au]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.92)[-0.924,0]; IP_SCORE(-2.95)[ip: (-7.71), ipnet: 211.28.0.0/14(-3.89), asn: 4804(-3.10), country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au]; ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2019 13:58:51 -0000 On Thu, 7 Mar 2019, Steve Kargl wrote: > On Thu, Mar 07, 2019 at 04:22:45PM -0800, Steve Kargl wrote: >> >> Make that 3 bugs. >> >> Index: ld80/k_expl.h >> =================================================================== >> --- ld80/k_expl.h (revision 344600) >> +++ ld80/k_expl.h (working copy) >> @@ -285,7 +285,7 @@ >> y = cimagl(z); >> __k_expl(x, &hi, &lo, &k); >> >> - exp_x = (lo + hi) * 0x1p16382; >> + exp_x = (lo + hi) * 0x1p16382L; >> expt += k - 16382; >> >> scale1 = 1; >> @@ -292,9 +292,9 @@ >> half_expt = expt / 2; >> SET_LDBL_EXPSIGN(scale1, BIAS + half_expt); >> scale2 = 1; >> - SET_LDBL_EXPSIGN(scale1, BIAS + expt - half_expt); >> + SET_LDBL_EXPSIGN(scale2, BIAS + expt - half_expt); Oops. > This bug appears in ld128/k_expl.h. The other two bugs > are not present or were fixed at some point. All 3 are in both ld*/k_expl.h. This one is is fixed or not present in my src/k_exp[f].c. >> >> - return (CMPLXL(cos(y) * exp_x * scale1 * scale2, >> + return (CMPLXL(cosl(y) * exp_x * scale1 * scale2, One like this is fixed or not present in src/k_expf.c. >> sinl(y) * exp_x * scale1 * scale2)); >> } >> #endif /* _COMPLEX_H */ Bruce From owner-freebsd-numerics@freebsd.org Fri Mar 8 16:24:37 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 457EC150F484 for ; Fri, 8 Mar 2019 16:24:37 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 025A6820A8 for ; Fri, 8 Mar 2019 16:24:35 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x28GOWLb031509 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Fri, 8 Mar 2019 08:24:32 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x28GOWtr031508; Fri, 8 Mar 2019 08:24:32 -0800 (PST) (envelope-from sgk) Date: Fri, 8 Mar 2019 08:24:32 -0800 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: cexpl() (was: Re: Update ENTERI() macro) Message-ID: <20190308162432.GA31392@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> <20190307044447.GA16298@troutmask.apl.washington.edu> <20190307163958.V1293@besplex.bde.org> <20190307195436.GA20856@troutmask.apl.washington.edu> <20190308225807.D6410@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190308225807.D6410@besplex.bde.org> User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: 025A6820A8 X-Spamd-Bar: + Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [1.13 / 15.00]; ARC_NA(0.00)[]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.64)[0.636,0]; NEURAL_HAM_LONG(-0.73)[-0.728,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu]; AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.49)[0.490,0]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2019 16:24:37 -0000 On Sat, Mar 09, 2019 at 12:51:07AM +1100, Bruce Evans wrote: > On Thu, 7 Mar 2019, Steve Kargl wrote: > > > On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote: > >> On Wed, 6 Mar 2019, Steve Kargl wrote: > >> > >>> On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote: > >* ... > >>> /* > >>> * exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac > >>> * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c > >>> */ > > I checked that these are correct. Any rounding should be down for > exp_ovfl and up for cexp_ovfl. > > >>> if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) || > >>> (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) { > >> > >> Don't duplicate the hex constants in the comment. The limits can be fuzzy > >> so don't need so many decimal digits or nonzero nonzero bits or LD80C() > >> to round them (s_cexp.c uses only 1 decimal digit and doesn't test the > >> lower 32 bits in the mantissa; here we have more bits than we need in lx). > > > > Being fuzzy was the movitation for the new macro, I was only using > > the upper 32 bits of lx in the original cexpl() I posted. > > IIRC, that was done by extracting into a 32-bit lx. This could be written > as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better > to do that anyway. This is what I do in my WIP ccosh. There are 3 thresholds of ~22.9, ~11356, and ~22755. I'm casting the result to uint32_t, so something like (uint32_t)(lx >> 32) > 0xb17000000. Do I need the cast and/or do you prefer it? > >> Maybe not use bit-fiddling at all for this (in other precisions too). > >> The more important exp() functions check thresholds in FP. Here x can > >> be NaN so FP comparisons with it don't work right. That seems to be > >> the only reason to check in bits here. > >> > >>> /* > >>> * x is between exp_ovfl and cexp_ovfl, so we must scale to > >>> * avoid overflow in exp(x). > >>> */ > >>> RETURNI(__ldexp_cexpl(z, 1)); > >> > >> Did your tests cover this case, and does it use the fixed version of the > >> old __ldexp_cexpl() in k_expl.h? > > > > No. The test I reported was restricted to -11350 < x < 11350. > > Spot checked x outside that range. I had the conditional wrong > > for x < -exp_ovfl. Also, the comment is actaully a little optimistic. > > One cannot avoid overflow. For example, x = 11357 is just above > > exp_ovfl, and exp(x) = inf. __ldexp_cexpl() does scaling and exploits > > |cos(y)| <= 1 and |sin(y)| <= 1 to try to extend the range of cexpl(). > > I understand this better now. __ldexp_cexpl() doesn't work for all large > x, so callers must only use it for a range of values above cexp_ovl. > The correctness of this is unobvious. It depends on sin(y) and cos(y) > never underflowing to 0. Otherwise, this method would give Inf * 0 = NaN > for finite x and y, and the scaling method would give finite * 0 * 2**k = 0. cos(y) can never be 0 and sin(y) is only 0 at y = 0 (see sinpi, cospi implementations). y=0 is a special case and handled separately. We can have Inf*subnormal when y is near 0. -- Steve From owner-freebsd-numerics@freebsd.org Fri Mar 8 18:03:13 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5BC57152498D for ; Fri, 8 Mar 2019 18:03:13 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id 6A1CD84D6B for ; Fri, 8 Mar 2019 18:03:09 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id 01FE73D5A3B; Sat, 9 Mar 2019 05:02:59 +1100 (AEDT) Date: Sat, 9 Mar 2019 05:02:57 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: freebsd-numerics@freebsd.org Subject: Re: cexpl() (was: Re: Update ENTERI() macro) In-Reply-To: <20190308162432.GA31392@troutmask.apl.washington.edu> Message-ID: <20190309033820.J1443@besplex.bde.org> References: <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> <20190307044447.GA16298@troutmask.apl.washington.edu> <20190307163958.V1293@besplex.bde.org> <20190307195436.GA20856@troutmask.apl.washington.edu> <20190308225807.D6410@besplex.bde.org> <20190308162432.GA31392@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=P6RKvmIu c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=hf07pYzDFtiwsiih0sgA:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: 6A1CD84D6B X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates 211.29.132.42 as permitted sender) smtp.mailfrom=brde@optusnet.com.au X-Spamd-Result: default: False [-6.39 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_IN_DNSWL_LOW(-0.10)[42.132.29.211.list.dnswl.org : 127.0.5.1]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23]; FREEMAIL_FROM(0.00)[optusnet.com.au]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[cached: extmail.optusnet.com.au]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.98)[-0.976,0]; IP_SCORE(-3.10)[ip: (-8.49), ipnet: 211.28.0.0/14(-3.89), asn: 4804(-3.10), country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au]; ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2019 18:03:13 -0000 On Fri, 8 Mar 2019, Steve Kargl wrote: > On Sat, Mar 09, 2019 at 12:51:07AM +1100, Bruce Evans wrote: >> On Thu, 7 Mar 2019, Steve Kargl wrote: >> >>> On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote: >>>> On Wed, 6 Mar 2019, Steve Kargl wrote: >>>> >>>>> On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote: >>> * ... >>>>> /* >>>>> * exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac >>>>> * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c >>>>> */ >> >> I checked that these are correct. Any rounding should be down for >> exp_ovfl and up for cexp_ovfl. >> >>>>> if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) || >>>>> (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) { >>>> >>>> Don't duplicate the hex constants in the comment. The limits can be fuzzy >>>> so don't need so many decimal digits or nonzero nonzero bits or LD80C() >>>> to round them (s_cexp.c uses only 1 decimal digit and doesn't test the >>>> lower 32 bits in the mantissa; here we have more bits than we need in lx). >>> >>> Being fuzzy was the movitation for the new macro, I was only using >>> the upper 32 bits of lx in the original cexpl() I posted. >> >> IIRC, that was done by extracting into a 32-bit lx. This could be written >> as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better >> to do that anyway. > > This is what I do in my WIP ccosh. There are 3 thresholds > of ~22.9, ~11356, and ~22755. I'm casting the result to > uint32_t, so something like (uint32_t)(lx >> 32) > 0xb17000000. > Do I need the cast and/or do you prefer it? I think you can even round to a power of 2 and not look at any mantissa bits to classify this, as for the 64 in coshl(). 22.9 here seems wrong. It is approximately the threshold where cosh() decays to exp(). The threshold for coshl() delay is much higher. coshl() uses 64, which is high enough for both ld80 and ld128. More like 32 would work for ld80. >>>> Maybe not use bit-fiddling at all for this (in other precisions too). >>>> The more important exp() functions check thresholds in FP. Here x can >>>> be NaN so FP comparisons with it don't work right. That seems to be >>>> the only reason to check in bits here. coshl() also checks thresholds in only in FP if classifications using the exponent are too fuzzy, so as to avoid the slow step of extracting the mantissa bits and the unportable steps of using them. I think classification for ld80 has to look at the mantissa to classify UNnormals, but ld128 doesn't have UnNormals so might not need the mantissa bits. Look at coshl() for details. It is simpler. >> >> I understand this better now. __ldexp_cexpl() doesn't work for all large >> x, so callers must only use it for a range of values above cexp_ovl. >> The correctness of this is unobvious. It depends on sin(y) and cos(y) >> never underflowing to 0. Otherwise, this method would give Inf * 0 = NaN >> for finite x and y, and the scaling method would give finite * 0 * 2**k = 0. > > cos(y) can never be 0 and sin(y) is only 0 at y = 0 (see sinpi, cospi > implementations). y=0 is a special case and handled separately. We > can have Inf*subnormal when y is near 0. cos(y) and sin(y) would be zero if the exact value underflows on assignment to long double, or the inexact value is miscalculated as 0. It is difficult to prove that they don't underflow to 0 or to a denormal when y is not near 0. That they are not miscalculated so as to give spurious underflow follows if we can prove that denormals don't even nearly occur and that our methods have errors less than 1 ulp in relative terms (only values at least twice as large as the largest denormal occur, and a 1-ulp relative error for these values can't reach a denormal value). Inf*denormal is not very special. It is +-Inf. The upper overflow threshold is chosen so than exp(x) * +-smallest_denormal = +-Inf for x above this threshold. It is easier to optimize this for cexp(). cexp() starts with extracting all bits of y, and then all bits of x unless y = 0, but in most cases the args are finite, nonzero and not very large, and everything except zero can be classified by looking at only the exponent. Zero can be classified using an FP comparison. Something like: /* Do this first for gcc's sin+cos pessimization? */ if (x == 0) return (cpackl(cosl(y), sinl(y)); if (y == 0) return (cpackl(expl(x), y)); /* * Maybe do the following earlier and check (hx & 0x7fff) == 0, * etc., before the FP classification of 0. */ hx = GET_LDBL_EXPSIGN(x); hy = GET_LDBL_EXPSIGN(y); if (hx & (0x7fff) < MUMBLE && (hy & 0x7fff) != 0x7fff) goto final_clause_now; /* simple formula */ /* Need more bits for a fuller classification (rarely reached). */ lx = GET_LDBL_MANTISSA(x); ly = GET_LDBL_MANTISSA(y); This is supposed to handle UnNormals without explicit checks by letting other layers and hardware handle them however it wants. E.g., if x or y is Pseudo-Zero, then the comparision with 0 either succeeds or fails depending on whether Pseudo-Zero is treated as 0 or NaN by the hardware comparison operator, and our treatment is consistent for the early returns. We only have to worry about falling through to later code which is inconsistent because it depends on Pseudo-Zeros not getting that far. Bruce From owner-freebsd-numerics@freebsd.org Fri Mar 8 19:11:57 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DE7E415267C6 for ; Fri, 8 Mar 2019 19:11:56 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id D74C587C4B for ; Fri, 8 Mar 2019 19:11:54 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x28JBppf033184 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Fri, 8 Mar 2019 11:11:51 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x28JBpax033183; Fri, 8 Mar 2019 11:11:51 -0800 (PST) (envelope-from sgk) Date: Fri, 8 Mar 2019 11:11:50 -0800 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: cexpl() (was: Re: Update ENTERI() macro) Message-ID: <20190308191150.GA32980@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> <20190307044447.GA16298@troutmask.apl.washington.edu> <20190307163958.V1293@besplex.bde.org> <20190307195436.GA20856@troutmask.apl.washington.edu> <20190308225807.D6410@besplex.bde.org> <20190308162432.GA31392@troutmask.apl.washington.edu> <20190309033820.J1443@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190309033820.J1443@besplex.bde.org> User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: D74C587C4B X-Spamd-Bar: + Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [1.35 / 15.00]; ARC_NA(0.00)[]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.85)[0.846,0]; NEURAL_HAM_LONG(-0.73)[-0.733,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu]; AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.50)[0.500,0]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(0.05)[ip: (0.09), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2019 19:11:57 -0000 On Sat, Mar 09, 2019 at 05:02:57AM +1100, Bruce Evans wrote: > On Fri, 8 Mar 2019, Steve Kargl wrote: > > On Sat, Mar 09, 2019 at 12:51:07AM +1100, Bruce Evans wrote: > >> On Thu, 7 Mar 2019, Steve Kargl wrote: > >>> On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote: > >>>> On Wed, 6 Mar 2019, Steve Kargl wrote: > >>>>> /* > >>>>> * exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac > >>>>> * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c > >>>>> */ > >> > >> I checked that these are correct. Any rounding should be down for > >> exp_ovfl and up for cexp_ovfl. > >> > >>>>> if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) || > >>>>> (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) { > >>>> > >>>> Don't duplicate the hex constants in the comment. The limits can be fuzzy > >>>> so don't need so many decimal digits or nonzero nonzero bits or LD80C() > >>>> to round them (s_cexp.c uses only 1 decimal digit and doesn't test the > >>>> lower 32 bits in the mantissa; here we have more bits than we need in lx). > >>> > >>> Being fuzzy was the movitation for the new macro, I was only using > >>> the upper 32 bits of lx in the original cexpl() I posted. > >> > >> IIRC, that was done by extracting into a 32-bit lx. This could be written > >> as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better > >> to do that anyway. > > > > This is what I do in my WIP ccosh. There are 3 thresholds > > of ~22.9, ~11356, and ~22755. I'm casting the result to > > uint32_t, so something like (uint32_t)(lx >> 32) > 0xb17000000. > > Do I need the cast and/or do you prefer it? > > I think you can even round to a power of 2 and not look at any mantissa > bits to classify this, as for the 64 in coshl(). You used your k_hexpl() and take advantage of the hi+lo splitting to use x < 64. > 22.9 here seems wrong. It is approximately the threshold where cosh() > decays to exp(). The threshold for coshl() delay is much higher. > coshl() uses 64, which is high enough for both ld80 and ld128. More > like 32 would work for ld80. I actually round up to 23 for ld80, and haven't thought too much about ld128. A value of 22 is used for s_ccosh.c, which isn't quite large enough for long double. (Yes, I looked at bits) % ./gen 22 coshl(22) = 1.79245642306579578097e+09 expl(22) / 2 = 1.79245642306579578086e+09 sinhl(22) = 1.79245642306579578074e+09 % ./gen 23 coshl(23) = 4.87240172312445130013e+09 expl(23) / 2 = 4.87240172312445130013e+09 sinhl(23) = 4.87240172312445130013e+09 in s_ccoshl.c I then have if (ix < 0x4003 || /* |x| < ~23: normal case */ (ix == 0x4003 && (uint32_t)(lx >> 32) < 0xb8000000)) RETURNI(CMPLXL(coshl(x) * c, sinhl(x) * s)); /* |x| >= 23, so cosh(x) ~= exp(|x|) / 2 */ if (ix < 0x400c || (ix == 0x400c && (uint32_t)(lx >> 32) <= 0xb1700000)) { /* x < 11356: exp(|x|) won't overflow */ h = expl(fabsl(x)) / 2; RETURNI(CMPLXL(h * c, copysignl(h, x) * s)); With |x| < 23, we have 2 function calls for coshl() and sinhl() as opposed to the 1 function call for expl() in 23 <= x <= 32 range. I can write the first conditional as if (ix < 0x4004) /* |x| < 32: normal case */ RETURNI(CMPLXL(coshl(x) * c, sinhl(x) * s)); This then raises the question on whether we should change the limit to 32 in the double complex ccosh()? -- Steve From owner-freebsd-numerics@freebsd.org Fri Mar 8 20:53:33 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D8E011528C0D for ; Fri, 8 Mar 2019 20:53:32 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id E25388B976 for ; Fri, 8 Mar 2019 20:53:29 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 13F37433BA5; Sat, 9 Mar 2019 07:53:26 +1100 (AEDT) Date: Sat, 9 Mar 2019 07:53:24 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: freebsd-numerics@freebsd.org Subject: Re: cexpl() (was: Re: Update ENTERI() macro) In-Reply-To: <20190308191150.GA32980@troutmask.apl.washington.edu> Message-ID: <20190309070413.G2539@besplex.bde.org> References: <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> <20190307044447.GA16298@troutmask.apl.washington.edu> <20190307163958.V1293@besplex.bde.org> <20190307195436.GA20856@troutmask.apl.washington.edu> <20190308225807.D6410@besplex.bde.org> <20190308162432.GA31392@troutmask.apl.washington.edu> <20190309033820.J1443@besplex.bde.org> <20190308191150.GA32980@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=P6RKvmIu c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=R02BefTfkBCJ0zfWrscA:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: E25388B976 X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates 211.29.132.246 as permitted sender) smtp.mailfrom=brde@optusnet.com.au X-Spamd-Result: default: False [-6.23 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_IN_DNSWL_LOW(-0.10)[246.132.29.211.list.dnswl.org : 127.0.5.1]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23]; FREEMAIL_FROM(0.00)[optusnet.com.au]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[cached: extmail.optusnet.com.au]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.97)[-0.968,0]; IP_SCORE(-2.95)[ip: (-7.69), ipnet: 211.28.0.0/14(-3.90), asn: 4804(-3.11), country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au]; ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2019 20:53:33 -0000 On Fri, 8 Mar 2019, Steve Kargl wrote: > On Sat, Mar 09, 2019 at 05:02:57AM +1100, Bruce Evans wrote: >> On Fri, 8 Mar 2019, Steve Kargl wrote: >>> On Sat, Mar 09, 2019 at 12:51:07AM +1100, Bruce Evans wrote: >>>> On Thu, 7 Mar 2019, Steve Kargl wrote: >>>> ... >>>> IIRC, that was done by extracting into a 32-bit lx. This could be written >>>> as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better >>>> to do that anyway. >>> >>> This is what I do in my WIP ccosh. There are 3 thresholds >>> of ~22.9, ~11356, and ~22755. I'm casting the result to >>> uint32_t, so something like (uint32_t)(lx >> 32) > 0xb17000000. >>> Do I need the cast and/or do you prefer it? >> >> I think you can even round to a power of 2 and not look at any mantissa >> bits to classify this, as for the 64 in coshl(). > > You used your k_hexpl() and take advantage of the hi+lo splitting > to use x < 64. Or I designed my k_hexpl() to support this splitting up to x < 64. Long double precision is easier because it has (committed) support functions like k_hexpl(). >> 22.9 here seems wrong. It is approximately the threshold where cosh() >> decays to exp(). The threshold for coshl() delay is much higher. >> coshl() uses 64, which is high enough for both ld80 and ld128. More >> like 32 would work for ld80. > > I actually round up to 23 for ld80, and haven't thought too much > about ld128. A value of 22 is used for s_ccosh.c, which isn't > quite large enough for long double. (Yes, I looked at bits) > > % ./gen 22 > coshl(22) = 1.79245642306579578097e+09 > expl(22) / 2 = 1.79245642306579578086e+09 > sinhl(22) = 1.79245642306579578074e+09 > > % ./gen 23 > coshl(23) = 4.87240172312445130013e+09 > expl(23) / 2 = 4.87240172312445130013e+09 > sinhl(23) = 4.87240172312445130013e+09 Look at more bits :-). For r + 1/r, ignoring the 1/r term gives an error of about 1 ulp in ld80 precision when 1/r/r is about 2**-64. This gives a threshold of about log(2**(64/2)) = 22.18 for accuracy of about 1 ulp. But we want more accuracy than that. For 0.5 ulps, log(2**(65/2) = 22.52+. We want more than that too. expl() gives almost 10 extra bits of accuracy, and the hyperbolic functions use the expl kernel to get all these extra bits. expl(x) itself uses a threshold of 2**-75 instead of 2**-65 for x being so tiny that the x**2/2 term and higher terms are negligible compared with the x term. So we should try for 11 extra bits here. This gives a threshold of log(2**(75/2)) = 25.99++. ld128 expl() is missing the change corresponding to changing 65 to 75 for the x vs x**2/2 threshold. It still uses 114. After fixing this to 124, the corresponding threshold here is log(2**124/2) = 42.97+. > in s_ccoshl.c I then have > > if (ix < 0x4003 || /* |x| < ~23: normal case */ > (ix == 0x4003 && (uint32_t)(lx >> 32) < 0xb8000000)) > RETURNI(CMPLXL(coshl(x) * c, sinhl(x) * s)); k_hexpl() can't handle this case. (But if you want efficiency or accuracy, then you should implement sinhcoshl() or just use the kernel to evaluate exp(x) only once.) I think this formula works up to 32 or 64 or even up to the overflow threshold. coshl(x) and sinhl() decay to expl(x) starting a bit above 23, but, but these functions can optimize that almost as well as here (they actually don't bother to until x reaches 64). You can use this to eliminate the check of lx. > > /* |x| >= 23, so cosh(x) ~= exp(|x|) / 2 */ > if (ix < 0x400c || > (ix == 0x400c && (uint32_t)(lx >> 32) <= 0xb1700000)) { > /* x < 11356: exp(|x|) won't overflow */ > h = expl(fabsl(x)) / 2; > RETURNI(CMPLXL(h * c, copysignl(h, x) * s)); Use a power of 2 for the upper threshold too if possible. > With |x| < 23, we have 2 function calls for coshl() and sinhl() > as opposed to the 1 function call for expl() in 23 <= x <= 32 range. > I can write the first conditional as > > if (ix < 0x4004) /* |x| < 32: normal case */ > RETURNI(CMPLXL(coshl(x) * c, sinhl(x) * s)); The general formula is indeed unecessarily slow in that range. Also for the more important range [1, 23]. coshl() in the lower range is lo + 0.25/(hi + lo) + hi after h_expl() to get hi and lo. Similarly for sinhl(). These formulas are short enough to repeat in the above, but I think this optimization is excessive. It is better for accuracy after some rearrangement. > This then raises the question on whether we should change the > limit to 32 in the double complex ccosh()? Do you mean from 64 to 32 the non-complex cosh(), or from your current limit to the above? Bruce From owner-freebsd-numerics@freebsd.org Fri Mar 8 21:48:02 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CEF2C1529CF9 for ; Fri, 8 Mar 2019 21:48:02 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 58FD18D717 for ; Fri, 8 Mar 2019 21:48:00 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x28Llpq4034409 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Fri, 8 Mar 2019 13:47:51 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x28Llop5034408; Fri, 8 Mar 2019 13:47:50 -0800 (PST) (envelope-from sgk) Date: Fri, 8 Mar 2019 13:47:50 -0800 From: Steve Kargl To: Bruce Evans Cc: freebsd-numerics@freebsd.org Subject: Re: cexpl() (was: Re: Update ENTERI() macro) Message-ID: <20190308214750.GA34090@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> <20190307044447.GA16298@troutmask.apl.washington.edu> <20190307163958.V1293@besplex.bde.org> <20190307195436.GA20856@troutmask.apl.washington.edu> <20190308225807.D6410@besplex.bde.org> <20190308162432.GA31392@troutmask.apl.washington.edu> <20190309033820.J1443@besplex.bde.org> <20190308191150.GA32980@troutmask.apl.washington.edu> <20190309070413.G2539@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190309070413.G2539@besplex.bde.org> User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: 58FD18D717 X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [0.92 / 15.00]; ARC_NA(0.00)[]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; NEURAL_HAM_MEDIUM(-0.06)[-0.060,0]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.84)[0.836,0]; NEURAL_HAM_LONG(-0.59)[-0.592,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu]; AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; MX_GOOD(-0.01)[troutmask.apl.washington.edu]; RCPT_COUNT_TWO(0.00)[2]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(0.05)[ip: (0.09), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2019 21:48:03 -0000 (reducing quote depth. I'll need time to digest your bit analysis) On Sat, Mar 09, 2019 at 07:53:24AM +1100, Bruce Evans wrote: > On Fri, 8 Mar 2019, Steve Kargl wrote: > > > This then raises the question on whether we should change the > > limit to 32 in the double complex ccosh()? > > Do you mean from 64 to 32 the non-complex cosh(), or from your current > limit to the above? > I mean ccosh(double complex). Copyright date is 2005 for s_ccosh.c while ld80/*_expl.* has datesi of 2009-2013. It seems you and I developed s_ccosh.c much earlier than the Tang-based expl(). In s_ccosh.c, we have if (ix < 0x40360000) /* |x| < 22: normal case */ return (CMPLX(cosh(x) * cos(y), sinh(x) * sin(y))); /* |x| >= 22, so cosh(x) ~= exp(|x|) */ if (ix < 0x40862e42) { /* x < 710: exp(|x|) won't overflow */ h = exp(fabs(x)) * 0.5; return (CMPLX(h * cos(y), copysign(h, x) * sin(y))); } ... Would it be beneficial to change |x| < 22 to |x| < 32? While we have kernels for exp(), I did not commit your Tang-based exp(). So, exp() has ulp of 0.7 to 0.8 instead of 0.5xx. Maybe using 32 won't buy us anything. -- Steve From owner-freebsd-numerics@freebsd.org Sat Mar 9 03:28:38 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D45551534CD9 for ; Sat, 9 Mar 2019 03:28:37 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id 3A3366B4BC for ; Sat, 9 Mar 2019 03:28:32 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id 072013D57F4; Sat, 9 Mar 2019 14:28:29 +1100 (AEDT) Date: Sat, 9 Mar 2019 14:28:28 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: freebsd-numerics@freebsd.org Subject: Re: cexpl() (was: Re: Update ENTERI() macro) In-Reply-To: <20190308214750.GA34090@troutmask.apl.washington.edu> Message-ID: <20190309134147.E968@besplex.bde.org> References: <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> <20190307044447.GA16298@troutmask.apl.washington.edu> <20190307163958.V1293@besplex.bde.org> <20190307195436.GA20856@troutmask.apl.washington.edu> <20190308225807.D6410@besplex.bde.org> <20190308162432.GA31392@troutmask.apl.washington.edu> <20190309033820.J1443@besplex.bde.org> <20190308191150.GA32980@troutmask.apl.washington.edu> <20190309070413.G2539@besplex.bde.org> <20190308214750.GA34090@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=P6RKvmIu c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=Ma-Tbq1rf8peu99H5l8A:9 a=539z-ZukNZkPh3FH:21 a=-ralKWDhgqI9mZlY:21 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: 3A3366B4BC X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates 211.29.132.42 as permitted sender) smtp.mailfrom=brde@optusnet.com.au X-Spamd-Result: default: False [-6.32 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_COUNT_TWO(0.00)[2]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23]; FREEMAIL_FROM(0.00)[optusnet.com.au]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[extmail.optusnet.com.au]; RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.89)[-0.889,0]; IP_SCORE(-3.12)[ip: (-8.53), ipnet: 211.28.0.0/14(-3.91), asn: 4804(-3.12), country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[]; RCVD_IN_DNSWL_LOW(-0.10)[42.132.29.211.list.dnswl.org : 127.0.5.1]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au]; ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU]; MIME_TRACE(0.00)[0:+]; FROM_EQ_ENVFROM(0.00)[] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2019 03:28:38 -0000 On Fri, 8 Mar 2019, Steve Kargl wrote: > (reducing quote depth. I'll need time to digest your bit analysis) > > On Sat, Mar 09, 2019 at 07:53:24AM +1100, Bruce Evans wrote: >> On Fri, 8 Mar 2019, Steve Kargl wrote: >> >>> This then raises the question on whether we should change the >>> limit to 32 in the double complex ccosh()? >> >> Do you mean from 64 to 32 the non-complex cosh(), or from your current >> limit to the above? > > I mean ccosh(double complex). Copyright date is 2005 for > s_ccosh.c while ld80/*_expl.* has datesi of 2009-2013. It > seems you and I developed s_ccosh.c much earlier than the > Tang-based expl(). In s_ccosh.c, we have > > if (ix < 0x40360000) /* |x| < 22: normal case */ > return (CMPLX(cosh(x) * cos(y), sinh(x) * sin(y))); > > /* |x| >= 22, so cosh(x) ~= exp(|x|) */ > if (ix < 0x40862e42) { > /* x < 710: exp(|x|) won't overflow */ > h = exp(fabs(x)) * 0.5; > return (CMPLX(h * cos(y), copysign(h, x) * sin(y))); > } ... I see. > Would it be beneficial to change |x| < 22 to |x| < 32? No, 22 is correct. It is copied from cosh() and sinh(). It is from allowing for exp() to have 10 or 11 extra bits of accuracy (which it might actually do for a hardware or other exp() with internal extra precision not limited by the ABI or broken to C11 spec so that it has to destroy the extra etra precision on return). This gives a threshold of x = log(2**(52+11 or 53+10)) / 2 = 21.83 for one of exp(+-x) being negligible relative to the other. Take the ceiling to get 22. ccoshl() uses the result of cosh(), so must use a threshold of 22 or more to let cosh() do the best that it can. Similarly for sinh() -- use a threshold >= the threshold of each. Each uses 22 so 22 is already large enough. 32 is >= 22, so we can use that too, but it is slower in the range [22, 32]. cosh() and sinh() usually don't get near 11 extra bits of accuracy. The threshold for 53 bits is 18.36+. They are slower than necessary between this and 22. This is more complicated for long double precision. There coshl() and sinhl() use a threshold of 64 and are slower than necessary between this and 22.5 to 26 for ld80 and 39.5 and 43 for ld128. This slowness is to simplify the code and thus optimize for more usual cases. The gap between the thresholds is much larger than for double precision, so if we copy the individual functions' threshold of 64 then we get unnecessary slowness over a larger range. But we shouldn't know so much about the functions' implementation that we intentionally optimize differently for this range. It is bad enough that we have to know their internal threshold to ensure getting consistent results. > While we have kernels for exp(), I did not commit your > Tang-based exp(). So, exp() has ulp of 0.7 to 0.8 instead > of 0.5xx. Maybe using 32 won't buy us anything. My exp() is based on old fdlibm (Ng?) since the intervals method didn't work as well as expected and I was able to get enough accuracy for < 1 ulp of error in the hyperbolic functions by exporting hi + lo decompositions using kernels. exp() still has a maximum error of about 0.75 ulps on amd64. But on i386, it has a maximum error of 0.5092 ulps in float precision, by using float_t to get all evaluations in double precision. In double precision, double_t defaults to giving only double precision, so the maximum error is usually the same 0.75 as on amd64. exp() on i386 now uses the i387 and switches it to 64-bit precision. The i387 barely delivers double precision for exp() using the extra precision. (We almost knew that it couldn't deliver long double precision for expl() when we implemented expl() in software). My error table has grown many special cases to record bad old behaviour for regression tests. For exp() it is: exp: max_er = 0x642 0.7822, avg_er = 0.007, #>=1:0.5 = 0:2566529 exp: max_er = 0x5fd 0.7485, avg_er = 0.007, #>=1:0.5 = 0:2733369 (s/w,PD) exp: max_er = 0x405 0.5024, avg_er = 0.007, #>=1:0.5 = 0:2253093 (s/w,PE) 0.7822 is for i387 exp. 0.7485 is the error for the fdlibm method without extra precision in hardware (FP_PD on i386). 0.5024 is the error for a modified fdlibm method with 11 bits of extra precision (FP_PE onl i386) and double_t instead of double to use this, and some reordering. Bruce