From owner-freebsd-numerics@freebsd.org  Mon Mar  4 21:31:42 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 57EF71523E26
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Mon,  4 Mar 2019 21:31:42 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id C4DD38953C
 for <freebsd-numerics@freebsd.org>; Mon,  4 Mar 2019 21:31:40 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x24LM54d012616
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Mon, 4 Mar 2019 13:22:06 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x24LM5jp012603;
 Mon, 4 Mar 2019 13:22:05 -0800 (PST) (envelope-from sgk)
Date: Mon, 4 Mar 2019 13:21:59 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
Message-ID: <20190304212159.GA12587@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20190226191825.GA68479@troutmask.apl.washington.edu>
 <20190227145002.P907@besplex.bde.org>
 <20190227074811.GA75972@troutmask.apl.washington.edu>
 <20190227201214.V1823@besplex.bde.org>
 <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190228060920.R4413@besplex.bde.org>
User-Agent: Mutt/1.11.2 (2019-01-07)
X-Rspamd-Queue-Id: C4DD38953C
X-Spamd-Bar: +
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [1.33 / 15.00]; ARC_NA(0.00)[];
 HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 NEURAL_SPAM_SHORT(0.66)[0.665,0];
 NEURAL_HAM_LONG(-0.16)[-0.157,0]; MIME_GOOD(-0.10)[text/plain];
 RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu];
 AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[];
 RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2];
 MX_GOOD(-0.01)[troutmask.apl.washington.edu];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.08)[0.083,0];
 R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 MIME_TRACE(0.00)[0:+];
 ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US];
 MID_RHS_MATCH_FROM(0.00)[];
 IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.06),
 country: US(-0.07)]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Mar 2019 21:31:42 -0000

On Thu, Feb 28, 2019 at 07:15:14AM +1100, Bruce Evans wrote:
> On Wed, 27 Feb 2019, Steve Kargl wrote:
> 
> > On Wed, Feb 27, 2019 at 09:15:52PM +1100, Bruce Evans wrote:
> >>
> >> ENTERI() hard-codes the long double for simplicity.  Remember, it is only
> >> needed for long double precision on i386.  But I forgot about long double
> >> complex types, and didn't dream about indirect long double types in sincosl().
> >
> > That simplicity does not work for long double complex.  We will
> > 
> > need either ENTERIC as in
> >
> > #define ENTERIC() ENTERIT(long double complex)
> >
> > or a direct use of ENTERIT as you have done s_clogl.c
> 
> I wrote ENTERIT() to work around this problem.
> 
> >>> I'm fine with making ENTERI() only toggle precision, and adding
> >>> a LEAVEI() to reset precision.  RETURNI(r) would then be
> >>>
> >>> #define RETURNI(r)	\
> >>> do {		\
> >>>   LEAVEI();		\
> >>>   return (r);	\
> >>> } while (0)
> >>
> >> No, may be an expression, so it must be evaluated before LEAVEI().  This
> >> is the reason for existence of the variable to hold the result.
> >
> > So, we'll need RETURNI for long double and one for long double complex.
> > Or, we give RETURNI a second parameter, which is the input parameter of
> > the function
> 
> I said to use your method of __typeof().  I tested this:
> 
> XX --- /tmp/math_private.h	Sun Nov 27 17:58:57 2005
> XX +++ ./math_private.h	Thu Feb 28 06:17:26 2019
> XX @@ -474,21 +474,22 @@
> XX  /* Support switching the mode to FP_PE if necessary. */
> XX  #if defined(__i386__) && !defined(NO_FPSETPREC)
> XX -#define	ENTERI() ENTERIT(long double)
> XX -#define	ENTERIT(returntype)			\
> XX -	returntype __retval;			\
> XX +#define	ENTERI()				\
> XX  	fp_prec_t __oprec;			\
> XX  						\
> XX  	if ((__oprec = fpgetprec()) != FP_PE)	\
> XX  		fpsetprec(FP_PE)
> XX -#define	RETURNI(x) do {				\
> XX -	__retval = (x);				\
> XX -	if (__oprec != FP_PE)			\
> XX -		fpsetprec(__oprec);		\
> XX +#define	LEAVEI()				\
> XX +	if ((__oprec = fpgetprec()) != FP_PE)	\
> XX +		fpsetprec(FP_PE)

Am I reading this diff wrong?  Should LEAVEI() be

#define LEAVEI()			\
	if (__oprec != FP_PE)		\
		fpsetprec(__oprec)

That is, we want to reset the precision to what ENTERI
grabbed in its conditinal expression.

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow

From owner-freebsd-numerics@freebsd.org  Tue Mar  5 04:48:32 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4C7AC1513542
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Tue,  5 Mar 2019 04:48:32 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au
 [211.29.132.249])
 by mx1.freebsd.org (Postfix) with ESMTP id DC5CC76F6B
 for <freebsd-numerics@freebsd.org>; Tue,  5 Mar 2019 04:48:24 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au
 [110.21.101.228])
 by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id 7C8EB108CBD4;
 Tue,  5 Mar 2019 15:48:12 +1100 (AEDT)
Date: Tue, 5 Mar 2019 15:48:11 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
In-Reply-To: <20190304212159.GA12587@troutmask.apl.washington.edu>
Message-ID: <20190305153243.Y1349@besplex.bde.org>
References: <20190226191825.GA68479@troutmask.apl.washington.edu>
 <20190227145002.P907@besplex.bde.org>
 <20190227074811.GA75972@troutmask.apl.washington.edu>
 <20190227201214.V1823@besplex.bde.org>
 <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=UJetJGXy c=1 sm=1 tr=0
 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17
 a=kj9zAlcOel0A:10 a=JQ0e4kNsG--EZ3fqZAAA:9 a=CjuIK1q_8ugA:10
X-Rspamd-Queue-Id: DC5CC76F6B
X-Spamd-Bar: ------
Authentication-Results: mx1.freebsd.org;
 spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates
 211.29.132.249 as permitted sender) smtp.mailfrom=brde@optusnet.com.au
X-Spamd-Result: default: False [-6.09 / 15.00]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_COUNT_TWO(0.00)[2];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23];
 FREEMAIL_FROM(0.00)[optusnet.com.au];
 MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au];
 NEURAL_HAM_LONG(-1.00)[-1.000,0];
 TO_MATCH_ENVRCPT_SOME(0.00)[];
 MX_GOOD(-0.01)[extmail.optusnet.com.au];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.74)[-0.738,0];
 IP_SCORE(-3.04)[ip: (-7.87), ipnet: 211.28.0.0/14(-4.05), asn: 4804(-3.23),
 country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[];
 RCVD_IN_DNSWL_LOW(-0.10)[249.132.29.211.list.dnswl.org : 127.0.5.1];
 R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au];
 ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU];
 MIME_TRACE(0.00)[0:+]; FROM_EQ_ENVFROM(0.00)[]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Mar 2019 04:48:32 -0000

On Mon, 4 Mar 2019, Steve Kargl wrote:

> On Thu, Feb 28, 2019 at 07:15:14AM +1100, Bruce Evans wrote:
>> ...
>> I said to use your method of __typeof().  I tested this:
>>
>> XX --- /tmp/math_private.h	Sun Nov 27 17:58:57 2005
>> XX +++ ./math_private.h	Thu Feb 28 06:17:26 2019
>> XX @@ -474,21 +474,22 @@
>> XX  /* Support switching the mode to FP_PE if necessary. */
>> XX  #if defined(__i386__) && !defined(NO_FPSETPREC)
>> XX -#define	ENTERI() ENTERIT(long double)
>> XX -#define	ENTERIT(returntype)			\
>> XX -	returntype __retval;			\
>> XX +#define	ENTERI()				\
>> XX  	fp_prec_t __oprec;			\
>> XX  						\
>> XX  	if ((__oprec = fpgetprec()) != FP_PE)	\
>> XX  		fpsetprec(FP_PE)
>> XX -#define	RETURNI(x) do {				\
>> XX -	__retval = (x);				\
>> XX -	if (__oprec != FP_PE)			\
>> XX -		fpsetprec(__oprec);		\
>> XX +#define	LEAVEI()				\
>> XX +	if ((__oprec = fpgetprec()) != FP_PE)	\
>> XX +		fpsetprec(FP_PE)
>
> Am I reading this diff wrong?  Should LEAVEI() be
>
> #define LEAVEI()			\
> 	if (__oprec != FP_PE)		\
> 		fpsetprec(__oprec)
>
> That is, we want to reset the precision to what ENTERI
> grabbed in its conditinal expression.

Oops.  I wrote it wrong by copying the wrong clause.

The broken version even passed quick runtime tests.  This is because
the tests know that most long double functions are missing ENTERI(),
so they set the precision to FP_PE before the call.  So both ENTERI()
and LEAVEI() find (__oprec = fpgetprec() equal to FP_PE and do nothing.

Bruce

From owner-freebsd-numerics@freebsd.org  Wed Mar  6 05:52:11 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 23AF71529A93
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Wed,  6 Mar 2019 05:52:11 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 4071B755DC
 for <freebsd-numerics@freebsd.org>; Wed,  6 Mar 2019 05:52:09 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x265q1pZ040308
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Tue, 5 Mar 2019 21:52:01 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x265q1nE040307;
 Tue, 5 Mar 2019 21:52:01 -0800 (PST) (envelope-from sgk)
Date: Tue, 5 Mar 2019 21:52:01 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
Message-ID: <20190306055201.GA40298@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20190226191825.GA68479@troutmask.apl.washington.edu>
 <20190227145002.P907@besplex.bde.org>
 <20190227074811.GA75972@troutmask.apl.washington.edu>
 <20190227201214.V1823@besplex.bde.org>
 <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190305153243.Y1349@besplex.bde.org>
User-Agent: Mutt/1.11.2 (2019-01-07)
X-Rspamd-Queue-Id: 4071B755DC
X-Spamd-Bar: /
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [0.93 / 15.00]; ARC_NA(0.00)[];
 HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu];
 NEURAL_HAM_MEDIUM(-0.03)[-0.031,0]; FROM_HAS_DN(0.00)[];
 TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.85)[0.850,0];
 NEURAL_HAM_LONG(-0.62)[-0.624,0]; MIME_GOOD(-0.10)[text/plain];
 RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu];
 AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[];
 RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2];
 RCPT_COUNT_TWO(0.00)[2];
 MX_GOOD(-0.01)[troutmask.apl.washington.edu]; R_SPF_NA(0.00)[];
 FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[];
 R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+];
 ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US];
 MID_RHS_MATCH_FROM(0.00)[];
 IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.06),
 country: US(-0.07)]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2019 05:52:11 -0000

On Tue, Mar 05, 2019 at 03:48:11PM +1100, Bruce Evans wrote:
> On Mon, 4 Mar 2019, Steve Kargl wrote:
> 
> > On Thu, Feb 28, 2019 at 07:15:14AM +1100, Bruce Evans wrote:
> >> ...
> >> I said to use your method of __typeof().  I tested this:
> >>
> >> XX --- /tmp/math_private.h	Sun Nov 27 17:58:57 2005
> >> XX +++ ./math_private.h	Thu Feb 28 06:17:26 2019
> >> XX @@ -474,21 +474,22 @@
> >> XX  /* Support switching the mode to FP_PE if necessary. */
> >> XX  #if defined(__i386__) && !defined(NO_FPSETPREC)
> >> XX -#define	ENTERI() ENTERIT(long double)
> >> XX -#define	ENTERIT(returntype)			\
> >> XX -	returntype __retval;			\
> >> XX +#define	ENTERI()				\
> >> XX  	fp_prec_t __oprec;			\
> >> XX  						\
> >> XX  	if ((__oprec = fpgetprec()) != FP_PE)	\
> >> XX  		fpsetprec(FP_PE)
> >> XX -#define	RETURNI(x) do {				\
> >> XX -	__retval = (x);				\
> >> XX -	if (__oprec != FP_PE)			\
> >> XX -		fpsetprec(__oprec);		\
> >> XX +#define	LEAVEI()				\
> >> XX +	if ((__oprec = fpgetprec()) != FP_PE)	\
> >> XX +		fpsetprec(FP_PE)
> >
> > Am I reading this diff wrong?  Should LEAVEI() be
> >
> > #define LEAVEI()			\
> > 	if (__oprec != FP_PE)		\
> > 		fpsetprec(__oprec)
> >
> > That is, we want to reset the precision to what ENTERI
> > grabbed in its conditinal expression.
> 
> Oops.  I wrote it wrong by copying the wrong clause.
> 
> The broken version even passed quick runtime tests.  This is because
> the tests know that most long double functions are missing ENTERI(),
> so they set the precision to FP_PE before the call.  So both ENTERI()
> and LEAVEI() find (__oprec = fpgetprec() equal to FP_PE and do nothing.
> 

My math_private.h has a few additions, which meant I applied
your diff manually.  As I was reading it, I came across the
issue.

I think I have the s_cexpl.c file fixed to use LDBL_EXTRACT_WORDS
instead of the new macro I introduced.  I however cannot figure
out what David Das did to arrive at k_exp.c, so I cannot write
a similar k_cexpl.c.  Yes, I added the 'c' in the name to avoid
confusion in ld80/.  In particular, I have no idea how he found
his scaling value 'k'.  Any insights?

-- 
Steve

From owner-freebsd-numerics@freebsd.org  Wed Mar  6 12:56:43 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 026CE1514503
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Wed,  6 Mar 2019 12:56:43 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au
 [211.29.132.249])
 by mx1.freebsd.org (Postfix) with ESMTP id B21698C948
 for <freebsd-numerics@freebsd.org>; Wed,  6 Mar 2019 12:56:35 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au
 [110.21.101.228])
 by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id 78D16105B9C9;
 Wed,  6 Mar 2019 23:56:24 +1100 (AEDT)
Date: Wed, 6 Mar 2019 23:56:23 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
In-Reply-To: <20190306055201.GA40298@troutmask.apl.washington.edu>
Message-ID: <20190306225811.P2731@besplex.bde.org>
References: <20190226191825.GA68479@troutmask.apl.washington.edu>
 <20190227145002.P907@besplex.bde.org>
 <20190227074811.GA75972@troutmask.apl.washington.edu>
 <20190227201214.V1823@besplex.bde.org>
 <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=UJetJGXy c=1 sm=1 tr=0
 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17
 a=kj9zAlcOel0A:10 a=fpzoFui-lhB3HaOsU0AA:9 a=CjuIK1q_8ugA:10
X-Rspamd-Queue-Id: B21698C948
X-Spamd-Bar: ------
Authentication-Results: mx1.freebsd.org;
 spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates
 211.29.132.249 as permitted sender) smtp.mailfrom=brde@optusnet.com.au
X-Spamd-Result: default: False [-6.25 / 15.00]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0];
 RCVD_IN_DNSWL_LOW(-0.10)[249.132.29.211.list.dnswl.org : 127.0.5.1];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23];
 FREEMAIL_FROM(0.00)[optusnet.com.au];
 MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au];
 NEURAL_HAM_LONG(-1.00)[-1.000,0];
 TO_MATCH_ENVRCPT_SOME(0.00)[];
 MX_GOOD(-0.01)[extmail.optusnet.com.au];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.92)[-0.916,0];
 IP_SCORE(-3.03)[ip: (-7.94), ipnet: 211.28.0.0/14(-3.98), asn: 4804(-3.17),
 country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 FREEMAIL_ENVFROM(0.00)[optusnet.com.au];
 ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU];
 MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2019 12:56:43 -0000

On Tue, 5 Mar 2019, Steve Kargl wrote:

> ...
> I think I have the s_cexpl.c file fixed to use LDBL_EXTRACT_WORDS
> instead of the new macro I introduced.  I however cannot figure
> out what David Das did to arrive at k_exp.c, so I cannot write

Davvid A S.

> a similar k_cexpl.c.  Yes, I added the 'c' in the name to avoid
> confusion in ld80/.  In particular, I have no idea how he found
> his scaling value 'k'.  Any insights?

bde already wrote __ldexp_cexpl() in ld*/k_expl.h, and kargl committed
it in r260066.  Does it not work? :-).

Well, it hasn't been tested, and it indeed cannot work since it spells
cosl(y) as cos(y).

It is written in a more portable way than the double and float versions
using '1' instead of bit fiddling to start the construction of 2**k.
This makes it identical in ld80 and ld128 (since the exponent range
is the same and the accessor macros hide enough details of the bit
fiddling for the exponent).  The duplication is noted in a comment,
but the comment also says that using scalbnl() ond ld128 is probably
best after all (it is slow, but multiplication is also slow).

I have rewritten the double and float versions in work related to
fixing the accuracy of the double and float versions of hyperbolic
functions.  I fixed these before writing the long double hyperbolic
functions using identical methods.  You committed the latter, but
the double and float versions still use the old innaccurate slower
methods.

The rewrite improves the comments, and it is the improved comments
that the reference to k_exp.c in k_expl.h is supposed to be for.

XX Index: k_exp.c
XX ===================================================================
XX RCS file: /home/ncvs/src/lib/msun/src/k_exp.c,v
XX retrieving revision 1.2
XX diff -u -2 -r1.2 k_exp.c
XX --- k_exp.c	17 Nov 2012 01:50:07 -0000	1.2
XX +++ k_exp.c	30 Jun 2013 01:31:48 -0000
XX @@ -32,75 +32,46 @@
XX  #include "math.h"
XX  #include "math_private.h"
XX -
XX -static const uint32_t k = 1799;		/* constant for reduction */
XX -static const double kln2 =  1246.97177782734161156;	/* k * ln2 */

I never liked this magic by das.  I forget how it worked.  I avoid using
general methods.

XX -
XX -/*
XX - * Compute exp(x), scaled to avoid spurious overflow.  An exponent is
XX - * returned separately in 'expt'.
XX - *
XX - * Input:  ln(DBL_MAX) <= x < ln(2 * DBL_MAX / DBL_MIN_DENORM) ~= 1454.91
XX - * Output: 2**1023 <= y < 2**1024
XX - */
XX -static double
XX -__frexp_exp(double x, int *expt)
XX -{
XX -	double exp_x;
XX -	uint32_t hx;
XX -
XX -	/*
XX -	 * We use exp(x) = exp(x - kln2) * 2**k, carefully chosen to
XX -	 * minimize |exp(kln2) - 2**k|.  We also scale the exponent of
XX -	 * exp_x to MAX_EXP so that the result can be multiplied by
XX -	 * a tiny number without losing accuracy due to denormalization.
XX -	 */
XX -	exp_x = exp(x - kln2);
XX -	GET_HIGH_WORD(hx, exp_x);
XX -	*expt = (hx >> 20) - (0x3ff + 1023) + k;
XX -	SET_HIGH_WORD(exp_x, (hx & 0xfffff) | ((0x3ff + 1023) << 20));
XX -	return (exp_x);
XX -}

__frexp_exp() was only used in __ldexp_cexp().  It is not needed using the
general methods.  __ldexp_cexpl() already uses the general methods and has
no trace of __frexp_expl().

XX +#include "k_exp.h"
XX 
XX  /*
XX - * __ldexp_exp(x, expt) and __ldexp_cexp(x, expt) compute exp(x) * 2**expt.
XX - * They are intended for large arguments (real part >= ln(DBL_MAX))
XX - * where care is needed to avoid overflow.
XX + * _ldexp_cexp(x + I*y, expt) computes cexp(x + I*y) * 2**expt.  It is
XX + * intended for large arguments where care is needed to avoid overflow.
XX   *
XX - * The present implementation is narrowly tailored for our hyperbolic and
XX - * exponential functions.  We assume expt is small (0 or -1), and the caller
XX - * has filtered out very large x, for which overflow would be inevitable.
XX + * The present implementation is tailored (not very narrowly) for our
XX + * complex hyperbolic and exponential functions.  The caller must filter
XX + * out all x not supported by k_exp(), most negative x, and very large x,
XX + * and not use very large |expt| (we only need expt = 0 or -1).  The
XX + * caller should filter out all x for which overflow is either impossible
XX + * (x < ~ln(DBL_MAX)) or inevitable (x > ~2*ln(DBL_MAX/DBL_TRUE_MIN)).
XX   */
XX -
XX -double
XX -__ldexp_exp(double x, int expt)
XX -{
XX -	double exp_x, scale;
XX -	int ex_expt;
XX -
XX -	exp_x = __frexp_exp(x, &ex_expt);
XX -	expt += ex_expt;
XX -	INSERT_WORDS(scale, (0x3ff + expt) << 20, 0);
XX -	return (exp_x * scale);
XX -}
XX -

__ldexp_exp() was only used in cosh() and sinh().  coshl() and sinhl()
always used better methods, so there is no trace of __ldexp_expl() in
k_expl.h.

XX  double complex
XX  __ldexp_cexp(double complex z, int expt)
XX  {
XX -	double x, y, exp_x, scale1, scale2;
XX -	int ex_expt, half_expt;
XX +	double_t exp_x, hi, lo;
XX +	double x, y, scale1, scale2;
XX +	int half_expt, k;

Also start fixing i386 pessimizations using s/double/double_t.

XX 
XX  	x = creal(z);
XX  	y = cimag(z);
XX -	exp_x = __frexp_exp(x, &ex_expt);
XX -	expt += ex_expt;
XX +	k_exp(x, &hi, &lo, &k);

k_expl.h already has a kernel k_expl() much like this (you wrote the
non-kernel expl() and I turned it into the kernel.  The rest of this
mail is mostly diffs and files for turning exp() into a kernel.  The
kernel is used a lot in hyperbolic functions, just like in the long
double case).

XX +
XX +	/*
XX +	 * 3 scale factors are needed in general.  Put one in exp_x now.
XX +	 * exp_x must be larger than 2**DBL_MANT_DIG to avoid spurious
XX +	 * underflows  We satisfy this and maximize the range handled
XX +	 * by putting as much as possible in exp_x.
XX +	 */
XX +	exp_x = (lo + hi) * 0x1p1022;
XX +	expt += k - 1022;
XX 
XX  	/*
XX -	 * Arrange so that scale1 * scale2 == 2**expt.  We use this to
XX -	 * compensate for scalbn being horrendously slow.
XX +	 * Arrange so that scale1 * scale2 == 2**expt.  As usual, scalbn()
XX +	 * is too slow to actually use for the things it does.
XX  	 */
XX +	scale1 = 1;
XX  	half_expt = expt / 2;
XX -	INSERT_WORDS(scale1, (0x3ff + half_expt) << 20, 0);
XX -	half_expt = expt - half_expt;
XX -	INSERT_WORDS(scale2, (0x3ff + half_expt) << 20, 0);
XX +	SET_HIGH_WORD(scale1, (0x3ff + half_expt) << 20);
XX +	scale2 = 1;
XX +	SET_HIGH_WORD(scale2, (0x3ff + expt - half_expt) << 20);
XX 
XX  	return (cpack(cos(y) * exp_x * scale1 * scale2,

The scaling is the same as in k_expl.h, with minor differences for the
bit fiddling.  This started by constructing 1 without using bits too.

The scaling is to avoid overlow and underflow, and it must be careful
to not have its own underflow.  The basic general method is that the
kernel creates the value in the form r * 2**k where r is near 1.  Since
r is near 1, we can multiply almost any value by it without overflow
or underflow.  Values near +Inf might need to be divided by 2 to prepare.
Values near DBL_MIN might need to be multiplied by 2 to prepare.  Denormal
values should be multiplied by a larger power of 2 to prepare.  Adjust k
to match.

The rest is not a diff, but is the contents of k_exp.h.

XX /* See e_exp.c for more comments. */
XX 
XX static const double
XX ln2hi      =  6.93147180369123816490e-01,  /* 0x3fe62e42, 0xfee00000 */
XX ln2lo      =  1.90821492927058770002e-10,  /* 0x3dea39ef, 0x35793c76 */
XX invln2     =  1.44269504088896338700e+00,  /* 0x3ff71547, 0x652b82fe */
XX /*
XX  * Domain [-0.34568, 0.34568], range ~[-1.6340e-18, 2.0047e-18]:
XX  * |x*(exp(x)+1)/(exp(x)-1) - p(x)| < 2**-58.7
XX  */
XX P1   =  1.66666666666666019037e-01, /* 0x3FC55555, 0x5555553E */
XX P2   = -2.77777777770155933842e-03, /* 0xBF66C16C, 0x16BEBD93 */
XX P3   =  6.61375632143793436117e-05, /* 0x3F11566A, 0xAF25DE2C */
XX P4   = -1.65339022054652515390e-06, /* 0xBEBBBD41, 0xC5D26BF1 */
XX P5   =  4.13813679705723846039e-08; /* 0x3E663769, 0x72BEA4D0 */
XX 
XX /*
XX  * Kernel for exp(x).  x must be finite and not tiny or huge.
XX  * "tiny" is anything that would make us underflow (|P4*x^8| < ~DBL_T_MIN).
XX  * "huge" is anything that would make t*ln2hi inexact (|x| > ~2**21*ln2).
XX  */
XX static inline void
XX k_exp(double x, double_t *hip, double_t *lop, int *kp)
XX {
XX 	double_t c, hi, lo, rhi, rlo, t, xred;
XX 	int32_t k;
XX 
XX     /* argument reduction */
XX #if 0
XX 	t = rnint((double_t)x * invln2);
XX #else
XX 	t = (double)((double_t)x * invln2 + 0x1.8p52) - 0x1.8p52;
XX #endif
XX 	k = irint(t);
XX 	/*
XX 	 * ln2hi is rounded to 32 bits to more than cover the normal
XX 	 * range -1075 <= k <= 1024.  It covers the wider range
XX 	 * |k| <= 2**21.  We now need it cover the range
XX 	 * -1075 <= k <= 1024+1024+53.
XX 	 */
XX 	hi = x - t*ln2hi;	/* t*ln2hi is exact here */
XX 	lo = t*ln2lo;
XX 	xred = hi - lo;
XX 
XX     /* xred is now in primary range */
XX 	t  = xred*xred;
XX 	c  = xred - t*P1 - t*t*(P2 + t*P3) - t*t*(t*t)*(P4 + t*P5);
XX 	t = xred*c/(2-c);
XX 	rhi = 1;
XX 	rlo = hi;
XX 	_2sumF(rhi, rlo);
XX 	rlo += t-lo;
XX 	*hip = rhi;
XX 	*lop = rlo;
XX 	*kp = k;
XX }
XX 
XX /*
XX  * Extra-precision exp(x)/2.  x must be between the underflow threshold
XX  * and the overflow threshold (in float precision due to our optimization
XX  * of evaluating 2**(k-1) in float precision)), and not very close to
XX  * either, and not tiny.  We only use this for x >= 1 up to the threshold
XX  * where cosh(x) decays to exp(x)/2.
XX  */
XX static inline void
XX k_hexp(double x, double_t *hip, double_t *lop)
XX {
XX 	float twopkm1;
XX 	int k;
XX 
XX 	k_exp(x, hip, lop, &k);
XX 	SET_FLOAT_WORD(twopkm1, 0x3f800000 + ((k - 1) << 23));
XX 	*hip *= twopkm1;
XX 	*lop *= twopkm1;
XX }
XX 
XX /*
XX  * exp(x)/2 without spurious overflow.  x must be well above the underflow
XX  * threshold and at most a little above the overflow threshold, and not
XX  * tiny.  We only use this for x positive and above the threshold where
XX  * cosh(x) decays to exp(x)/2 up to a little above the overflow threshold.
XX  */
XX static inline double_t
XX hexp(double x)
XX {
XX 	double_t hi, lo;
XX 	double twopkm2;
XX 	int k;
XX 
XX 	twopkm2 = 1;
XX 	k_exp(x, &hi, &lo, &k);
XX 	SET_HIGH_WORD(twopkm2, 0x3ff00000 + ((k - 2) << 20));
XX 	return (lo + hi) * 2 * twopkm2;
XX }

k_expl.h also points to the above uncommitted comments for k_hexp() and
hexp().

__ldexp_cexpl() doesn't belong in k_expl.h.  It belongs in a new file
k_expl.c corresponding to k_exp.c.  hexp[fl]() isn't quite a kernel, but
it has to be in a header file since it is inline.

Using static inline functions also reduces namespace problems.  We already
use this to name some kernels simply kfoo() instead of uglier names like
__kernel_foo() used for older methods where the kernels aren't always
inline.

Bruce

From owner-freebsd-numerics@freebsd.org  Wed Mar  6 18:48:35 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3FAF01523F76
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Wed,  6 Mar 2019 18:48:35 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 983B66B9EC
 for <freebsd-numerics@freebsd.org>; Wed,  6 Mar 2019 18:48:32 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x26ImTXm045033
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Wed, 6 Mar 2019 10:48:29 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x26ImTZH045032;
 Wed, 6 Mar 2019 10:48:29 -0800 (PST) (envelope-from sgk)
Date: Wed, 6 Mar 2019 10:48:29 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
Message-ID: <20190306184829.GA44023@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20190226191825.GA68479@troutmask.apl.washington.edu>
 <20190227145002.P907@besplex.bde.org>
 <20190227074811.GA75972@troutmask.apl.washington.edu>
 <20190227201214.V1823@besplex.bde.org>
 <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190306225811.P2731@besplex.bde.org>
User-Agent: Mutt/1.11.2 (2019-01-07)
X-Rspamd-Queue-Id: 983B66B9EC
X-Spamd-Bar: +
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [1.17 / 15.00]; ARC_NA(0.00)[];
 HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 NEURAL_SPAM_SHORT(0.90)[0.899,0];
 NEURAL_HAM_LONG(-0.61)[-0.611,0]; MIME_GOOD(-0.10)[text/plain];
 RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu];
 AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[];
 RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2];
 MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.14)[0.144,0];
 R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 MIME_TRACE(0.00)[0:+];
 ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US];
 MID_RHS_MATCH_FROM(0.00)[];
 IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05),
 country: US(-0.07)]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2019 18:48:35 -0000

On Wed, Mar 06, 2019 at 11:56:23PM +1100, Bruce Evans wrote:
> On Tue, 5 Mar 2019, Steve Kargl wrote:
> 
> > ...
> > I think I have the s_cexpl.c file fixed to use LDBL_EXTRACT_WORDS
> > instead of the new macro I introduced.  I however cannot figure
> > out what David Das did to arrive at k_exp.c, so I cannot write
> 
> Davvid A S.

Whoops, again.  Seems I conflated real name and login name.

> > a similar k_cexpl.c.  Yes, I added the 'c' in the name to avoid
> > confusion in ld80/.  In particular, I have no idea how he found
> > his scaling value 'k'.  Any insights?
> 
> bde already wrote __ldexp_cexpl() in ld*/k_expl.h, and kargl committed
> it in r260066.  Does it not work? :-).
> 
> Well, it hasn't been tested, and it indeed cannot work since it spells
> cosl(y) as cos(y).

Taking long breaks from pecking at libm issues seems to be
conducive to memory loss.  I'll go review k_expl.h.  I simply
remember it as having a kernel for expl().

> It is written in a more portable way than the double and float versions
> using '1' instead of bit fiddling to start the construction of 2**k.
> This makes it identical in ld80 and ld128 (since the exponent range
> is the same and the accessor macros hide enough details of the bit
> fiddling for the exponent).  The duplication is noted in a comment,
> but the comment also says that using scalbnl() ond ld128 is probably
> best after all (it is slow, but multiplication is also slow).
> 
> I have rewritten the double and float versions in work related to
> fixing the accuracy of the double and float versions of hyperbolic
> functions.  I fixed these before writing the long double hyperbolic
> functions using identical methods.  You committed the latter, but
> the double and float versions still use the old innaccurate slower
> methods.
> 
> The rewrite improves the comments, and it is the improved comments
> that the reference to k_exp.c in k_expl.h is supposed to be for.
> 
> XX Index: k_exp.c

Any chance you'll get around to committing your WIP?  Yes,
I know you have a few thousand kernel patches in your
queue above libm patches. :-)

BTW, for the non-exceptional cases with 1M random z values
where z=x+Iy and -11350 < x,y < 11350 and I only consider
results that are normal, I find my cexpl() yields

% ./testl -u -X 11350
Max ULP Re: 1.980535
   z = (1.22918109220546510585e+03,1.03853909865862542237e+04)
libm = (5.06780736805320166327e+533,-4.39418989082799451477e+533)
mpfr = (5.06780736805320166270e+533,-4.39418989082799451456e+533)
Max ULP Im: 2.022155 
   z = (4.83490728165160559637e+03,1.07778990242305355345e+04)
libm = (-3.66535128319537945953e+2099,4.67021177841072936494e+2099)
mpfr = (-3.66535128319537945915e+2099,4.67021177841072936441e+2099)

For comparison, cexp() with -705 < x,y < 705 yields

%  ./testd -u -X 705
Max ULP Re: 2.215132
   z = (1.49377521822925502e+02,1.79997882645095970e+01)
libm = (4.93720465697268180e+64,-5.61754869856313932e+64)
mpfr = (4.93720465697268310e+64,-5.61754869856313987e+64)
Max ULP Im: 2.182779 
   z = (2.50664219501672335e+02,-4.81327697040560906e+02)
libm = (-5.73256778461974670e+108,4.48612518733245315e+108)
mpfr = (-5.73256778461974754e+108,4.48612518733245429e+108)

Certainly, not exhaustive but encouraging.

-- 
Steve

From owner-freebsd-numerics@freebsd.org  Wed Mar  6 19:30:49 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3D1621524E93
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Wed,  6 Mar 2019 19:30:49 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au
 [211.29.132.246])
 by mx1.freebsd.org (Postfix) with ESMTP id 0AB436CF4A
 for <freebsd-numerics@freebsd.org>; Wed,  6 Mar 2019 19:30:47 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au
 [110.21.101.228])
 by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 427B043D302;
 Thu,  7 Mar 2019 06:30:44 +1100 (AEDT)
Date: Thu, 7 Mar 2019 06:30:42 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
In-Reply-To: <20190306184829.GA44023@troutmask.apl.washington.edu>
Message-ID: <20190307061214.R4911@besplex.bde.org>
References: <20190226191825.GA68479@troutmask.apl.washington.edu>
 <20190227145002.P907@besplex.bde.org>
 <20190227074811.GA75972@troutmask.apl.washington.edu>
 <20190227201214.V1823@besplex.bde.org>
 <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=UJetJGXy c=1 sm=1 tr=0
 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17
 a=kj9zAlcOel0A:10 a=JKf714QeBDjTFJuGM2EA:9 a=CjuIK1q_8ugA:10
X-Rspamd-Queue-Id: 0AB436CF4A
X-Spamd-Bar: ------
Authentication-Results: mx1.freebsd.org;
 spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates
 211.29.132.246 as permitted sender) smtp.mailfrom=brde@optusnet.com.au
X-Spamd-Result: default: False [-6.23 / 15.00]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0];
 RCVD_IN_DNSWL_LOW(-0.10)[246.132.29.211.list.dnswl.org : 127.0.5.1];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23];
 FREEMAIL_FROM(0.00)[optusnet.com.au];
 MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au];
 NEURAL_HAM_LONG(-1.00)[-1.000,0];
 TO_MATCH_ENVRCPT_SOME(0.00)[];
 MX_GOOD(-0.01)[cached: extmail.optusnet.com.au];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.94)[-0.943,0];
 IP_SCORE(-2.98)[ip: (-7.65), ipnet: 211.28.0.0/14(-4.00), asn: 4804(-3.19),
 country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 FREEMAIL_ENVFROM(0.00)[optusnet.com.au];
 ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU];
 MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2019 19:30:49 -0000

On Wed, 6 Mar 2019, Steve Kargl wrote:

> On Wed, Mar 06, 2019 at 11:56:23PM +1100, Bruce Evans wrote:
>> On Tue, 5 Mar 2019, Steve Kargl wrote:
>
>>> a similar k_cexpl.c.  Yes, I added the 'c' in the name to avoid
>>> confusion in ld80/.  In particular, I have no idea how he found
>>> his scaling value 'k'.  Any insights?
>>
>> bde already wrote __ldexp_cexpl() in ld*/k_expl.h, and kargl committed
>> it in r260066.  Does it not work? :-).
>>
>> Well, it hasn't been tested, and it indeed cannot work since it spells
>> cosl(y) as cos(y).
>
> Taking long breaks from pecking at libm issues seems to be
> conducive to memory loss.  I'll go review k_expl.h.  I simply
> remember it as having a kernel for expl().

I now see that you implemented 2 more versions of __ldexpl_cexpl() by
cloning the old double precision version.  Apparently the includes
are to unpolluted for the compiler to see the multiple versions :-).

Using the version in k_expl.h almost forces inlining of expl()'s kernel
and its large tables, just like for hyperbolic functions.  This wastes
a lot of space, especially for duplicating the tables.  It is only a
small optimization for time.  It is done for the hyperbolic functions
to get this optimization, and for __ldexpl_cexpl() just for convenience.

>> ...
>> I have rewritten the double and float versions in work related to
>> fixing the accuracy of the double and float versions of hyperbolic
>> functions.  I fixed these before writing the long double hyperbolic
>> ..
>> XX Index: k_exp.c
>
> Any chance you'll get around to committing your WIP?  Yes,
> I know you have a few thousand kernel patches in your
> queue above libm patches. :-)

Probably not soon.

>
> BTW, for the non-exceptional cases with 1M random z values
> where z=x+Iy and -11350 < x,y < 11350 and I only consider
> results that are normal, I find my cexpl() yields
>
> % ./testl -u -X 11350
> Max ULP Re: 1.980535
>   z = (1.22918109220546510585e+03,1.03853909865862542237e+04)
> libm = (5.06780736805320166327e+533,-4.39418989082799451477e+533)
> mpfr = (5.06780736805320166270e+533,-4.39418989082799451456e+533)
> Max ULP Im: 2.022155
>   z = (4.83490728165160559637e+03,1.07778990242305355345e+04)
> libm = (-3.66535128319537945953e+2099,4.67021177841072936494e+2099)
> mpfr = (-3.66535128319537945915e+2099,4.67021177841072936441e+2099)

Check a few denormals, Infs and NaNs.

I can only easily test for coherence with double precision.  That is,
evaluate in both precisions and check that the long double results
rounded down are within a few ulps.

> For comparison, cexp() with -705 < x,y < 705 yields
>
> %  ./testd -u -X 705
> Max ULP Re: 2.215132
>   z = (1.49377521822925502e+02,1.79997882645095970e+01)
> libm = (4.93720465697268180e+64,-5.61754869856313932e+64)
> mpfr = (4.93720465697268310e+64,-5.61754869856313987e+64)
> Max ULP Im: 2.182779
>   z = (2.50664219501672335e+02,-4.81327697040560906e+02)
> libm = (-5.73256778461974670e+108,4.48612518733245315e+108)
> mpfr = (-5.73256778461974754e+108,4.48612518733245429e+108)
>
> Certainly, not exhaustive but encouraging.

We expect long double precision to be slightly more accurate, at
least using the expl kernel, since the kernel accuracy is about
0.51 ulps while the exp accuracy is only about 0.8 ulps.  The
above shows 0.2 ulps better instead of 0.3.  Still more than 2
ulps total from combining several errors of 0.5-1.0 ulps.

Bruce

From owner-freebsd-numerics@freebsd.org  Wed Mar  6 21:42:39 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id C828D1528957
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Wed,  6 Mar 2019 21:42:39 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 7B4517243B
 for <freebsd-numerics@freebsd.org>; Wed,  6 Mar 2019 21:42:37 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x26LgXAS039372
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Wed, 6 Mar 2019 13:42:34 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x26LgXTY039368;
 Wed, 6 Mar 2019 13:42:33 -0800 (PST) (envelope-from sgk)
Date: Wed, 6 Mar 2019 13:42:33 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
Message-ID: <20190306214233.GA23159@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20190227074811.GA75972@troutmask.apl.washington.edu>
 <20190227201214.V1823@besplex.bde.org>
 <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190307061214.R4911@besplex.bde.org>
User-Agent: Mutt/1.11.2 (2019-01-07)
X-Rspamd-Queue-Id: 7B4517243B
X-Spamd-Bar: /
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [0.74 / 15.00];
 HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu];
 TO_DN_SOME(0.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[];
 RCVD_COUNT_THREE(0.00)[3];
 RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2];
 MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu];
 RCPT_COUNT_TWO(0.00)[2]; FREEMAIL_TO(0.00)[optusnet.com.au];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 MIME_TRACE(0.00)[0:+];
 ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US];
 MID_RHS_MATCH_FROM(0.00)[]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-0.37)[-0.371,0]; FROM_HAS_DN(0.00)[];
 NEURAL_SPAM_SHORT(0.78)[0.779,0];
 NEURAL_HAM_LONG(-0.41)[-0.409,0]; MIME_GOOD(-0.10)[text/plain];
 RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu];
 AUTH_NA(1.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 R_SPF_NA(0.00)[];
 IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05),
 country: US(-0.07)]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2019 21:42:40 -0000

On Thu, Mar 07, 2019 at 06:30:42AM +1100, Bruce Evans wrote:
> On Wed, 6 Mar 2019, Steve Kargl wrote:
> 
> > On Wed, Mar 06, 2019 at 11:56:23PM +1100, Bruce Evans wrote:
> >> On Tue, 5 Mar 2019, Steve Kargl wrote:
> >
> >>> a similar k_cexpl.c.  Yes, I added the 'c' in the name to avoid
> >>> confusion in ld80/.  In particular, I have no idea how he found
> >>> his scaling value 'k'.  Any insights?
> >>
> >> bde already wrote __ldexp_cexpl() in ld*/k_expl.h, and kargl committed
> >> it in r260066.  Does it not work? :-).
> >>
> >> Well, it hasn't been tested, and it indeed cannot work since it spells
> >> cosl(y) as cos(y).
> >
> > Taking long breaks from pecking at libm issues seems to be
> > conducive to memory loss.  I'll go review k_expl.h.  I simply
> > remember it as having a kernel for expl().
> 
> I now see that you implemented 2 more versions of __ldexpl_cexpl() by
> cloning the old double precision version.  Apparently the includes
> are to unpolluted for the compiler to see the multiple versions :-).
> 
> Using the version in k_expl.h almost forces inlining of expl()'s kernel
> and its large tables, just like for hyperbolic functions.  This wastes
> a lot of space, especially for duplicating the tables.  It is only a
> small optimization for time.  It is done for the hyperbolic functions
> to get this optimization, and for __ldexpl_cexpl() just for convenience.

The version in k_expl.h has 2 bugs.  You note the first (cos instead
of cosl).  The second is

In file included from /data/kargl/trunk/math/libm/msun/ld80/s_cexpl.c:43:
/data/kargl/trunk/math/libm/msun/ld80/k_expl.h:288:22: error: magnitude of
      floating-point constant too large for type 'double'; maximum is
      1.7976931348623157E+308 [-Werror,-Wliteral-range]
        exp_x = (lo + hi) * 0x1p16382;
                            ^
1 error generated.
*** Error code 1


> >> I have rewritten the double and float versions in work related to
> >> fixing the accuracy of the double and float versions of hyperbolic
> >> functions.  I fixed these before writing the long double hyperbolic
> >> ..
> >> XX Index: k_exp.c
> >
> > Any chance you'll get around to committing your WIP?  Yes,
> > I know you have a few thousand kernel patches in your
> > queue above libm patches. :-)
> 
> Probably not soon.
> 
> >
> > BTW, for the non-exceptional cases with 1M random z values
> > where z=x+Iy and -11350 < x,y < 11350 and I only consider
> > results that are normal, I find my cexpl() yields
> >
> > % ./testl -u -X 11350
> > Max ULP Re: 1.980535
> >   z = (1.22918109220546510585e+03,1.03853909865862542237e+04)
> > libm = (5.06780736805320166327e+533,-4.39418989082799451477e+533)
> > mpfr = (5.06780736805320166270e+533,-4.39418989082799451456e+533)
> > Max ULP Im: 2.022155
> >   z = (4.83490728165160559637e+03,1.07778990242305355345e+04)
> > libm = (-3.66535128319537945953e+2099,4.67021177841072936494e+2099)
> > mpfr = (-3.66535128319537945915e+2099,4.67021177841072936441e+2099)
> 
> Check a few denormals, Infs and NaNs.

Exceptional cases give

% ./testl -e
cexpl(1, inf) = (nan,nan) expecting (nan,nan)
cexpl(1,-inf) = (nan,nan) expecting (nan,nan)
cexpl(nan,-inf) = (nan,nan) expecting (nan,nan)
cexpl(nan, inf) = (nan,nan) expecting (nan,nan)
cexpl(inf, inf) = (inf,nan) expecting (inf,nan)
cexpl(inf,-inf) = (inf,nan) expecting (inf,nan)
cexpl(inf,nan) = (inf,nan) expecting (inf,nan)
cexpl(-inf,nan) = (0,0) expecting (0,0)
cexpl(-inf,-inf) = (0,0) expecting (0,0)
cexpl(-inf, inf) = (0,0) expecting (0,0)

> I can only easily test for coherence with double precision.  That is,
> evaluate in both precisions and check that the long double results
> rounded down are within a few ulps.

I use GNU MPFR and compute exp(x)*cos(y) and exp(x)*sin(y)
with a precision of 4*LDBL_MANT_DIG.  Supposedly, each 
function is correctly rounded to this precision.  I take
the answers as-if it is exact when computing ULP.

> 
> > For comparison, cexp() with -705 < x,y < 705 yields
> >
> > %  ./testd -u -X 705
> > Max ULP Re: 2.215132
> >   z = (1.49377521822925502e+02,1.79997882645095970e+01)
> > libm = (4.93720465697268180e+64,-5.61754869856313932e+64)
> > mpfr = (4.93720465697268310e+64,-5.61754869856313987e+64)
> > Max ULP Im: 2.182779
> >   z = (2.50664219501672335e+02,-4.81327697040560906e+02)
> > libm = (-5.73256778461974670e+108,4.48612518733245315e+108)
> > mpfr = (-5.73256778461974754e+108,4.48612518733245429e+108)
> >
> > Certainly, not exhaustive but encouraging.
> 
> We expect long double precision to be slightly more accurate, at
> least using the expl kernel, since the kernel accuracy is about
> 0.51 ulps while the exp accuracy is only about 0.8 ulps.  The
> above shows 0.2 ulps better instead of 0.3.  Still more than 2
> ulps total from combining several errors of 0.5-1.0 ulps.
> 
> Bruce

-- 
Steve

From owner-freebsd-numerics@freebsd.org  Thu Mar  7 04:09:13 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id AD779150D42F
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Thu,  7 Mar 2019 04:09:13 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au
 [211.29.132.249])
 by mx1.freebsd.org (Postfix) with ESMTP id DE6C18B64F
 for <freebsd-numerics@freebsd.org>; Thu,  7 Mar 2019 04:09:11 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au
 [110.21.101.228])
 by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id 2E5B010598CB;
 Thu,  7 Mar 2019 15:09:08 +1100 (AEDT)
Date: Thu, 7 Mar 2019 15:09:06 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
In-Reply-To: <20190306214233.GA23159@troutmask.apl.washington.edu>
Message-ID: <20190307144315.N932@besplex.bde.org>
References: <20190227074811.GA75972@troutmask.apl.washington.edu>
 <20190227201214.V1823@besplex.bde.org>
 <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=UJetJGXy c=1 sm=1 tr=0
 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17
 a=kj9zAlcOel0A:10 a=Et-GumYriqmTMQ_IUTQA:9 a=CjuIK1q_8ugA:10
X-Rspamd-Queue-Id: DE6C18B64F
X-Spamd-Bar: ------
Authentication-Results: mx1.freebsd.org;
 spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates
 211.29.132.249 as permitted sender) smtp.mailfrom=brde@optusnet.com.au
X-Spamd-Result: default: False [-6.14 / 15.00]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0];
 RCVD_IN_DNSWL_LOW(-0.10)[249.132.29.211.list.dnswl.org : 127.0.5.1];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23];
 FREEMAIL_FROM(0.00)[optusnet.com.au];
 MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au];
 NEURAL_HAM_LONG(-1.00)[-1.000,0];
 TO_MATCH_ENVRCPT_SOME(0.00)[];
 MX_GOOD(-0.01)[cached: extmail.optusnet.com.au];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.80)[-0.799,0];
 IP_SCORE(-3.03)[ip: (-7.92), ipnet: 211.28.0.0/14(-4.00), asn: 4804(-3.19),
 country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 FREEMAIL_ENVFROM(0.00)[optusnet.com.au];
 ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU];
 MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2019 04:09:14 -0000

On Wed, 6 Mar 2019, Steve Kargl wrote:

> On Thu, Mar 07, 2019 at 06:30:42AM +1100, Bruce Evans wrote:
>> ...
>> I now see that you implemented 2 more versions of __ldexpl_cexpl() by
>> cloning the old double precision version.  Apparently the includes
>> are to unpolluted for the compiler to see the multiple versions :-).
>>
>> Using the version in k_expl.h almost forces inlining of expl()'s kernel
>> and its large tables, just like for hyperbolic functions.  This wastes
>> a lot of space, especially for duplicating the tables.  It is only a
>> small optimization for time.  It is done for the hyperbolic functions
>> to get this optimization, and for __ldexpl_cexpl() just for convenience.
>
> The version in k_expl.h has 2 bugs.  You note the first (cos instead
> of cosl).  The second is
>
> In file included from /data/kargl/trunk/math/libm/msun/ld80/s_cexpl.c:43:
> /data/kargl/trunk/math/libm/msun/ld80/k_expl.h:288:22: error: magnitude of
>      floating-point constant too large for type 'double'; maximum is
>      1.7976931348623157E+308 [-Werror,-Wliteral-range]
>        exp_x = (lo + hi) * 0x1p16382;
>                            ^

It is missing an L suffix.  Clearly not tested.

>>> [errors for cexpl()]
>>
>> Check a few denormals, Infs and NaNs.
>
> Exceptional cases give
>
> % ./testl -e
> cexpl(1, inf) = (nan,nan) expecting (nan,nan)
> cexpl(1,-inf) = (nan,nan) expecting (nan,nan)
> cexpl(nan,-inf) = (nan,nan) expecting (nan,nan)
> cexpl(nan, inf) = (nan,nan) expecting (nan,nan)
> cexpl(inf, inf) = (inf,nan) expecting (inf,nan)
> cexpl(inf,-inf) = (inf,nan) expecting (inf,nan)
> cexpl(inf,nan) = (inf,nan) expecting (inf,nan)
> cexpl(-inf,nan) = (0,0) expecting (0,0)
> cexpl(-inf,-inf) = (0,0) expecting (0,0)
> cexpl(-inf, inf) = (0,0) expecting (0,0)

But does it preserve NaN bits as much as possible, and produce the same
NaN bits as double precision after conversion to double precision?  The
report spells NaN and Inf using C99's fuzzy bad names.  Even the C99
specification spells NaN and Inf correctly in Appendix F.

The complex hyperbolic functions have better comments, so are almost
an easier place to start.

Bruce

From owner-freebsd-numerics@freebsd.org  Thu Mar  7 06:36:36 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id CE646151C778
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Thu,  7 Mar 2019 06:36:35 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au
 [211.29.132.42])
 by mx1.freebsd.org (Postfix) with ESMTP id 880526BCF0
 for <freebsd-numerics@freebsd.org>; Thu,  7 Mar 2019 06:36:33 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au
 [110.21.101.228])
 by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id 8BED53D60D8;
 Thu,  7 Mar 2019 17:36:28 +1100 (AEDT)
Date: Thu, 7 Mar 2019 17:36:22 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
In-Reply-To: <20190307044447.GA16298@troutmask.apl.washington.edu>
Message-ID: <20190307163958.V1293@besplex.bde.org>
References: <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190307144315.N932@besplex.bde.org>
 <20190307044447.GA16298@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=FNpr/6gs c=1 sm=1 tr=0
 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17
 a=kj9zAlcOel0A:10 a=NH_DFVgqrTV01CZZK1QA:9 a=CjuIK1q_8ugA:10
X-Rspamd-Queue-Id: 880526BCF0
X-Spamd-Bar: ------
Authentication-Results: mx1.freebsd.org;
 spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates
 211.29.132.42 as permitted sender) smtp.mailfrom=brde@optusnet.com.au
X-Spamd-Result: default: False [-6.42 / 15.00]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0];
 RCVD_IN_DNSWL_LOW(-0.10)[42.132.29.211.list.dnswl.org : 127.0.5.1];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23];
 FREEMAIL_FROM(0.00)[optusnet.com.au];
 MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au];
 NEURAL_HAM_LONG(-1.00)[-1.000,0];
 TO_MATCH_ENVRCPT_SOME(0.00)[];
 MX_GOOD(-0.01)[cached: extmail.optusnet.com.au];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.97)[-0.967,0];
 IP_SCORE(-3.14)[ip: (-8.45), ipnet: 211.28.0.0/14(-4.01), asn: 4804(-3.20),
 country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 FREEMAIL_ENVFROM(0.00)[optusnet.com.au];
 ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU];
 MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2019 06:36:36 -0000

On Wed, 6 Mar 2019, Steve Kargl wrote:

> On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote:
>> On Wed, 6 Mar 2019, Steve Kargl wrote:
>> ...
>> But does it preserve NaN bits as much as possible, and produce the same
>> NaN bits as double precision after conversion to double precision?  The
>> report spells NaN and Inf using C99's fuzzy bad names.  Even the C99
>> specification spells NaN and Inf correctly in Appendix F.
>>
>> The complex hyperbolic functions have better comments, so are almost
>> an easier place to start.
>
> I suppose it preserves the NaN bits as well as cexpf and cexp as
> it uses the same algorthim.

> ...
> long double complex
> cexpl (long double complex z)
> {
> 	long double c, exp_x, s, x, y;
> 	uint64_t lx, ly;
> 	uint16_t hx, hy, ix, iy;
>
> 	ENTERI();
>
> 	x = creall(z);
> 	y = cimagl(z);
>
> 	EXTRACT_LDBL80_WORDS(hx, lx, x);
> 	EXTRACT_LDBL80_WORDS(hy, ly, y);

I think you recently changed to this from your new macro.

>
> 	ix = hx&0x7fff;
> 	iy = hy&0x7fff;

Some style bugs.  s_cexp.c doesn't use the fdlibm'ish style of sometimes
no spaces around '&'.  It puts the ix and iy calculations immediately
after the extractions.  They shouldn't be separated since they are
also extractions.  c_cexp.c extracts in a different order, and doesn't
use iy.  You reordered it a bit to use sincos().  I don't like the
reordering much.  But old s_cexp.c has its own style bug of a separating
the extraction for y but not even following its usual style of a blank
line before comments near the extraction of x.

>
> 	/* cexp(0 + I y) = cos(y) + I sin(y) */
> 	if ((ix | lx) == 0) {
> 		sincosl(y, &s, &c);
> 		RETURNI(CMPLXL(c, s));
> 	}
>
> 	/* cexp(x + I 0) = exp(x) + I 0 */
> 	if ((iy | ly) == 0)
> 		RETURNI(CMPLXL(expl(x), y));
>
> 	if (iy == 0x7fff) {

s_cexp.c actually uses a clobbered hy here.  Perhaps fix this too (compilers
mostly ignore program order for initializing variables like iy and optimize
to an expression involving the unclobbered hy here if that is optimal.  The
extra variables are just for readability or unreadability).

> 		if (ix < 0x7fff ||
> 		    (ix == 0x7fff && (lx & 0x7fffffffffffffffULL) != 0)) {
> 			/* cexp(finite|NaN +- I Inf|NaN) = NaN + I NaN */
> 			RETURNI(CMPLXL(y - y, y - y));

Looks OK.

I think ix and iy are only used here, so it is clearer to not have
variables for them -- just use (hx & 0x7fff), etc.

y - y and similar expressions usually give the right NaN (y = Inf -> dNaN
and y = NaN -> d(y)).

IIRC, the only differences for ld128 is that then normalization bit is
hidden so masking out the top bit in the mantissa is not needed, but the
mantissa is in 2 64-bit words.

> 		} else if (hx & 0x8000) {

This is clearest as it is with an explicit mask.

> 	/*
> 	 *  exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac
> 	 * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c
> 	 */
> 	if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) ||
> 	    (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) {

Don't duplicate the hex constants in the comment.  The limits can be fuzzy
so don't need so many decimal digits or nonzero nonzero bits or LD80C()
to round them (s_cexp.c uses only 1 decimal digit and doesn't test the
lower 32 bits in the mantissa; here we have more bits than we need in lx).

Maybe not use bit-fiddling at all for this (in other precisions too).
The more important exp() functions check thresholds in FP.  Here x can
be NaN so FP comparisons with it don't work right.  That seems to be
the only reason to check in bits here.

> 		/*
> 		 * x is between exp_ovfl and cexp_ovfl, so we must scale to
> 		 * avoid overflow in exp(x).
> 		 */
> 		RETURNI(__ldexp_cexpl(z, 1));

Did your tests cover this case, and does it use the fixed version of the
old __ldexp_cexpl() in k_expl.h?

> 	} else {
> 		/*
> 		 * Cases covered here:
> 		 *  -  x < exp_ovfl and exp(x) won't overflow (common case)
> 		 *  -  x > cexp_ovfl, so exp(x) * s overflows for all s > 0
> 		 *  -  x = +-Inf (generated by exp())
> 		 *  -  x = NaN (spurious inexact exception from y)
> 		 */
> 		exp_x = expl(x);
> 		sincosl(y, &s, &c);
> 		RETURNI(CMPLXL(exp_x * c, exp_x * s));
> 	}
> }

I think the last clause can be simplfied and even optimized by using the
kernel in all cases:
- the kernel produces exp_x in the form kexp_x * 2**k where kexp_x is near
   1.  Just multiply kexp_x by c and s, and later multiply by 2**k.  This
   cost an extra multiplication by 2**k (for the real and imaginary parts
   separately), but avoids other work so may be faster.
- hopefully kexp_x is actually >= 1 (and < 2).  Otherwise we have to
   worry about spurious underflow
- the above method seems to avoid spurious underflow automatically, and
   doesn't even mention this (multiplying by exp_x may underflow, but not
   spuriously since c and s are <= 1).

Bruce

From owner-freebsd-numerics@freebsd.org  Thu Mar  7 04:44:53 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id A1F5E150FABC
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Thu,  7 Mar 2019 04:44:53 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 13C418D285
 for <freebsd-numerics@freebsd.org>; Thu,  7 Mar 2019 04:44:51 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x274ilAZ016396
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Wed, 6 Mar 2019 20:44:47 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x274ilQo016395;
 Wed, 6 Mar 2019 20:44:47 -0800 (PST) (envelope-from sgk)
Date: Wed, 6 Mar 2019 20:44:47 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
Message-ID: <20190307044447.GA16298@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190307144315.N932@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190307144315.N932@besplex.bde.org>
User-Agent: Mutt/1.11.2 (2019-01-07)
X-Rspamd-Queue-Id: 13C418D285
X-Spamd-Bar: +
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [1.05 / 15.00]; ARC_NA(0.00)[];
 HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 NEURAL_SPAM_SHORT(0.96)[0.957,0];
 NEURAL_HAM_LONG(-0.72)[-0.719,0]; MIME_GOOD(-0.10)[text/plain];
 RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu];
 AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[];
 RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2];
 MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.08)[0.079,0];
 R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 MIME_TRACE(0.00)[0:+];
 ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US];
 MID_RHS_MATCH_FROM(0.00)[];
 IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05),
 country: US(-0.07)]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2019 04:44:54 -0000

On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote:
> On Wed, 6 Mar 2019, Steve Kargl wrote:
> 
> > On Thu, Mar 07, 2019 at 06:30:42AM +1100, Bruce Evans wrote:
> >> ...
> >> I now see that you implemented 2 more versions of __ldexpl_cexpl() by
> >> cloning the old double precision version.  Apparently the includes
> >> are to unpolluted for the compiler to see the multiple versions :-).
> >>
> >> Using the version in k_expl.h almost forces inlining of expl()'s kernel
> >> and its large tables, just like for hyperbolic functions.  This wastes
> >> a lot of space, especially for duplicating the tables.  It is only a
> >> small optimization for time.  It is done for the hyperbolic functions
> >> to get this optimization, and for __ldexpl_cexpl() just for convenience.
> >
> > The version in k_expl.h has 2 bugs.  You note the first (cos instead
> > of cosl).  The second is
> >
> > In file included from /data/kargl/trunk/math/libm/msun/ld80/s_cexpl.c:43:
> > /data/kargl/trunk/math/libm/msun/ld80/k_expl.h:288:22: error: magnitude of
> >      floating-point constant too large for type 'double'; maximum is
> >      1.7976931348623157E+308 [-Werror,-Wliteral-range]
> >        exp_x = (lo + hi) * 0x1p16382;
> >                            ^
> 
> It is missing an L suffix.  Clearly not tested.
> 
> >>> [errors for cexpl()]
> >>
> >> Check a few denormals, Infs and NaNs.
> >
> > Exceptional cases give
> >
> > % ./testl -e
> > cexpl(1, inf) = (nan,nan) expecting (nan,nan)
> > cexpl(1,-inf) = (nan,nan) expecting (nan,nan)
> > cexpl(nan,-inf) = (nan,nan) expecting (nan,nan)
> > cexpl(nan, inf) = (nan,nan) expecting (nan,nan)
> > cexpl(inf, inf) = (inf,nan) expecting (inf,nan)
> > cexpl(inf,-inf) = (inf,nan) expecting (inf,nan)
> > cexpl(inf,nan) = (inf,nan) expecting (inf,nan)
> > cexpl(-inf,nan) = (0,0) expecting (0,0)
> > cexpl(-inf,-inf) = (0,0) expecting (0,0)
> > cexpl(-inf, inf) = (0,0) expecting (0,0)
> 
> But does it preserve NaN bits as much as possible, and produce the same
> NaN bits as double precision after conversion to double precision?  The
> report spells NaN and Inf using C99's fuzzy bad names.  Even the C99
> specification spells NaN and Inf correctly in Appendix F.
> 
> The complex hyperbolic functions have better comments, so are almost
> an easier place to start.
> 

I suppose it preserves the NaN bits as well as cexpf and cexp as
it uses the same algorthim.


/*-
 * SPDX-License-Identifier: BSD-2-Clause-FreeBSD
 *
 * Copyright (c) 2011 David Schultz <das@FreeBSD.ORG>
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 * src/s_cexp.c converted to long double complex by Steven G. Kargl
 */

#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");

#include <complex.h>
#include <float.h>
#ifdef __i386__
#include <ieeefp.h>
#endif

#include "fpmath.h"
#include "math.h"
#include "math_private.h"
#include "k_expl.h"

long double complex
cexpl (long double complex z)
{
	long double c, exp_x, s, x, y;
	uint64_t lx, ly;
	uint16_t hx, hy, ix, iy;

	ENTERI();

	x = creall(z);
	y = cimagl(z);

	EXTRACT_LDBL80_WORDS(hx, lx, x);
	EXTRACT_LDBL80_WORDS(hy, ly, y);

	ix = hx&0x7fff;
	iy = hy&0x7fff;

	/* cexp(0 + I y) = cos(y) + I sin(y) */
	if ((ix | lx) == 0) {
		sincosl(y, &s, &c);
		RETURNI(CMPLXL(c, s));
	}

	/* cexp(x + I 0) = exp(x) + I 0 */
	if ((iy | ly) == 0)
		RETURNI(CMPLXL(expl(x), y));

	if (iy == 0x7fff) {
		if (ix < 0x7fff ||
		    (ix == 0x7fff && (lx & 0x7fffffffffffffffULL) != 0)) {
			/* cexp(finite|NaN +- I Inf|NaN) = NaN + I NaN */
			RETURNI(CMPLXL(y - y, y - y));
		} else if (hx & 0x8000) {
			/* cexp(-Inf +- I Inf|NaN) = 0 + I 0 */
			RETURNI(CMPLXL(0.0, 0.0));
		} else {
			/* cexp(+Inf +- I Inf|NaN) = Inf + I NaN */
			RETURNI(CMPLXL(x, y - y));
		}
	}

	/*
	 *  exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac
	 * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c
	 */
	if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) ||
	    (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) {
		/*
		 * x is between exp_ovfl and cexp_ovfl, so we must scale to
		 * avoid overflow in exp(x).
		 */
		RETURNI(__ldexp_cexpl(z, 1));
	} else {
		/*
		 * Cases covered here:
		 *  -  x < exp_ovfl and exp(x) won't overflow (common case)
		 *  -  x > cexp_ovfl, so exp(x) * s overflows for all s > 0
		 *  -  x = +-Inf (generated by exp())
		 *  -  x = NaN (spurious inexact exception from y)
		 */
		exp_x = expl(x);
		sincosl(y, &s, &c);
		RETURNI(CMPLXL(exp_x * c, exp_x * s));
	}
}

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow

From owner-freebsd-numerics@freebsd.org  Thu Mar  7 19:54:46 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6B45615239FD
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Thu,  7 Mar 2019 19:54:46 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 9331F70523
 for <freebsd-numerics@freebsd.org>; Thu,  7 Mar 2019 19:54:44 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x27JsbOC021893
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Thu, 7 Mar 2019 11:54:37 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x27Jsa5Z021892;
 Thu, 7 Mar 2019 11:54:36 -0800 (PST) (envelope-from sgk)
Date: Thu, 7 Mar 2019 11:54:36 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
Message-ID: <20190307195436.GA20856@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190307144315.N932@besplex.bde.org>
 <20190307044447.GA16298@troutmask.apl.washington.edu>
 <20190307163958.V1293@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190307163958.V1293@besplex.bde.org>
User-Agent: Mutt/1.11.2 (2019-01-07)
X-Rspamd-Queue-Id: 9331F70523
X-Spamd-Bar: /
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [0.92 / 15.00]; ARC_NA(0.00)[];
 HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 NEURAL_SPAM_SHORT(0.59)[0.591,0];
 NEURAL_HAM_LONG(-0.58)[-0.582,0]; MIME_GOOD(-0.10)[text/plain];
 RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu];
 AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[];
 RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2];
 RCPT_COUNT_TWO(0.00)[2];
 MX_GOOD(-0.01)[troutmask.apl.washington.edu];
 NEURAL_SPAM_MEDIUM(0.17)[0.169,0]; R_SPF_NA(0.00)[];
 FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[];
 R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+];
 ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US];
 MID_RHS_MATCH_FROM(0.00)[];
 IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05),
 country: US(-0.07)]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2019 19:54:46 -0000

On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote:
> On Wed, 6 Mar 2019, Steve Kargl wrote:
> 
> > On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote:
> >> On Wed, 6 Mar 2019, Steve Kargl wrote:
> >> ...
> >> But does it preserve NaN bits as much as possible, and produce the same
> >> NaN bits as double precision after conversion to double precision?  The
> >> report spells NaN and Inf using C99's fuzzy bad names.  Even the C99
> >> specification spells NaN and Inf correctly in Appendix F.
> >>
> >> The complex hyperbolic functions have better comments, so are almost
> >> an easier place to start.
> >
> > I suppose it preserves the NaN bits as well as cexpf and cexp as
> > it uses the same algorthim.
> 
> > ...
> > long double complex
> > cexpl (long double complex z)
> > {
> > 	long double c, exp_x, s, x, y;
> > 	uint64_t lx, ly;
> > 	uint16_t hx, hy, ix, iy;
> >
> > 	ENTERI();
> >
> > 	x = creall(z);
> > 	y = cimagl(z);
> >
> > 	EXTRACT_LDBL80_WORDS(hx, lx, x);
> > 	EXTRACT_LDBL80_WORDS(hy, ly, y);
> 
> I think you recently changed to this from your new macro.

Yes, the new macro is gone.

> >
> > 	ix = hx&0x7fff;
> > 	iy = hy&0x7fff;
> 
> Some style bugs.  s_cexp.c doesn't use the fdlibm'ish style of sometimes
> no spaces around '&'.  It puts the ix and iy calculations immediately
> after the extractions.  They shouldn't be separated since they are
> also extractions.  c_cexp.c extracts in a different order, and doesn't
> use iy.  You reordered it a bit to use sincos().  I don't like the
> reordering much.  But old s_cexp.c has its own style bug of a separating
> the extraction for y but not even following its usual style of a blank
> line before comments near the extraction of x.

I may unorder the re-ordering to the old order.  If I understand
one of the comments in the GCC bugzilla report, the optimization of 
converting

  a = cos(x);
  b = sin(x):

to 

  sincos(x, &b, &a).

Actually, does "a = cos(x); b = sin(x)" goes to cexp(cmplx(0,x)) in
the middle-end of the compiler.  The backend then can recognize
cexp(cmplx(0,x)) and converts this to sincos(x, &b, &a).  I can't 
verify this for FreeBSD, because libm isn't C99 compliant.  GCC
requires a C99 compliant to activate the optimization. 

I fixed the extraction for hx, and hy.

> > 	/* cexp(0 + I y) = cos(y) + I sin(y) */
> > 	if ((ix | lx) == 0) {
> > 		sincosl(y, &s, &c);
> > 		RETURNI(CMPLXL(c, s));
> > 	}
> >
> > 	/* cexp(x + I 0) = exp(x) + I 0 */
> > 	if ((iy | ly) == 0)
> > 		RETURNI(CMPLXL(expl(x), y));
> >
> > 	if (iy == 0x7fff) {
> 
> s_cexp.c actually uses a clobbered hy here.  Perhaps fix this too (compilers
> mostly ignore program order for initializing variables like iy and optimize
> to an expression involving the unclobbered hy here if that is optimal.  The
> extra variables are just for readability or unreadability).
> 
> > 		if (ix < 0x7fff ||
> > 		    (ix == 0x7fff && (lx & 0x7fffffffffffffffULL) != 0)) {
> > 			/* cexp(finite|NaN +- I Inf|NaN) = NaN + I NaN */
> > 			RETURNI(CMPLXL(y - y, y - y));
> 
> Looks OK.
> 
> I think ix and iy are only used here, so it is clearer to not have
> variables for them -- just use (hx & 0x7fff), etc.
> 
> y - y and similar expressions usually give the right NaN (y = Inf -> dNaN
> and y = NaN -> d(y)).
> 
> IIRC, the only differences for ld128 is that then normalization bit is
> hidden so masking out the top bit in the mantissa is not needed, but the
> mantissa is in 2 64-bit words.
> 
> > 		} else if (hx & 0x8000) {
> 
> This is clearest as it is with an explicit mask.
> 
> > 	/*
> > 	 *  exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac
> > 	 * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c
> > 	 */
> > 	if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) ||
> > 	    (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) {
> 
> Don't duplicate the hex constants in the comment.  The limits can be fuzzy
> so don't need so many decimal digits or nonzero nonzero bits or LD80C()
> to round them (s_cexp.c uses only 1 decimal digit and doesn't test the
> lower 32 bits in the mantissa; here we have more bits than we need in lx).

Being fuzzy was the movitation for the new macro, I was only using
the upper 32 bits of lx in the original cexpl() I posted.  

> Maybe not use bit-fiddling at all for this (in other precisions too).
> The more important exp() functions check thresholds in FP.  Here x can
> be NaN so FP comparisons with it don't work right.  That seems to be
> the only reason to check in bits here.
> 
> > 		/*
> > 		 * x is between exp_ovfl and cexp_ovfl, so we must scale to
> > 		 * avoid overflow in exp(x).
> > 		 */
> > 		RETURNI(__ldexp_cexpl(z, 1));
> 
> Did your tests cover this case, and does it use the fixed version of the
> old __ldexp_cexpl() in k_expl.h?

No.  The test I reported was restricted to -11350 < x < 11350.
Spot checked x outside that range.  I had the conditional wrong
for x < -exp_ovfl.  Also, the comment is actaully a little optimistic.
One cannot avoid overflow.  For example, x = 11357 is just above
exp_ovfl, and exp(x) = inf.  __ldexp_cexpl() does scaling and exploits
|cos(y)| <= 1 and |sin(y)| <= 1 to try to extend the range of cexpl().

 ./testl -a 11357 0.1
libm: Re(cexpf) =  inf
mpfr: Re(cexpf) =  1.90658369228334884925e+4932
libm: Im(cexpf) =  1.91296449568717353572e+4931
mpfr: Im(cexpf) =  1.91296449568717353558e+4931

troutmask:kargl[430] ./testl -a 11357 1.57
libm: Re(cexpf) =  1.52588659766018377153e+4929
mpfr: Re(cexpf) =  1.52588659766018377147e+4929
libm: Im(cexpf) =  inf
mpfr: Im(cexpf) =  1.91615588587371861485e+4932


> 
> > 	} else {
> > 		/*
> > 		 * Cases covered here:
> > 		 *  -  x < exp_ovfl and exp(x) won't overflow (common case)
> > 		 *  -  x > cexp_ovfl, so exp(x) * s overflows for all s > 0
> > 		 *  -  x = +-Inf (generated by exp())
> > 		 *  -  x = NaN (spurious inexact exception from y)
> > 		 */
> > 		exp_x = expl(x);
> > 		sincosl(y, &s, &c);
> > 		RETURNI(CMPLXL(exp_x * c, exp_x * s));
> > 	}
> > }
> 
> I think the last clause can be simplfied and even optimized by using the
> kernel in all cases:
> - the kernel produces exp_x in the form kexp_x * 2**k where kexp_x is near
>    1.  Just multiply kexp_x by c and s, and later multiply by 2**k.  This
>    cost an extra multiplication by 2**k (for the real and imaginary parts
>    separately), but avoids other work so may be faster.
> - hopefully kexp_x is actually >= 1 (and < 2).  Otherwise we have to
>    worry about spurious underflow
> - the above method seems to avoid spurious underflow automatically, and
>    doesn't even mention this (multiplying by exp_x may underflow, but not
>    spuriously since c and s are <= 1).

You might be right about using the kernel.  I'll leave that
to others on freebsd-numerics@; although you and I seem to be
the only subscribers.

-- 
Steve

From owner-freebsd-numerics@freebsd.org  Fri Mar  8 00:22:50 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id F2261152C634
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Fri,  8 Mar 2019 00:22:49 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 386838222D
 for <freebsd-numerics@freebsd.org>; Fri,  8 Mar 2019 00:22:48 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x280Mj1w026366
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Thu, 7 Mar 2019 16:22:45 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x280MjNf026365;
 Thu, 7 Mar 2019 16:22:45 -0800 (PST) (envelope-from sgk)
Date: Thu, 7 Mar 2019 16:22:45 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
Message-ID: <20190308002245.GA26338@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20190227201214.V1823@besplex.bde.org>
 <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190306214233.GA23159@troutmask.apl.washington.edu>
User-Agent: Mutt/1.11.2 (2019-01-07)
X-Rspamd-Queue-Id: 386838222D
X-Spamd-Bar: +
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [1.24 / 15.00]; ARC_NA(0.00)[];
 HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 NEURAL_SPAM_SHORT(0.59)[0.594,0];
 NEURAL_HAM_LONG(-0.73)[-0.732,0]; MIME_GOOD(-0.10)[text/plain];
 RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu];
 AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[];
 RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2];
 MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.64)[0.644,0];
 R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 MIME_TRACE(0.00)[0:+];
 ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US];
 MID_RHS_MATCH_FROM(0.00)[];
 IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05),
 country: US(-0.07)]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2019 00:22:50 -0000

On Wed, Mar 06, 2019 at 01:42:33PM -0800, Steve Kargl wrote:
> On Thu, Mar 07, 2019 at 06:30:42AM +1100, Bruce Evans wrote:
> > On Wed, 6 Mar 2019, Steve Kargl wrote:
> > 
> > > On Wed, Mar 06, 2019 at 11:56:23PM +1100, Bruce Evans wrote:
> > >> On Tue, 5 Mar 2019, Steve Kargl wrote:
> > >
> > >>> a similar k_cexpl.c.  Yes, I added the 'c' in the name to avoid
> > >>> confusion in ld80/.  In particular, I have no idea how he found
> > >>> his scaling value 'k'.  Any insights?
> > >>
> > >> bde already wrote __ldexp_cexpl() in ld*/k_expl.h, and kargl committed
> > >> it in r260066.  Does it not work? :-).
> > >>
> > >> Well, it hasn't been tested, and it indeed cannot work since it spells
> > >> cosl(y) as cos(y).
> > >
> > > Taking long breaks from pecking at libm issues seems to be
> > > conducive to memory loss.  I'll go review k_expl.h.  I simply
> > > remember it as having a kernel for expl().
> > 
> > I now see that you implemented 2 more versions of __ldexpl_cexpl() by
> > cloning the old double precision version.  Apparently the includes
> > are to unpolluted for the compiler to see the multiple versions :-).
> > 
> > Using the version in k_expl.h almost forces inlining of expl()'s kernel
> > and its large tables, just like for hyperbolic functions.  This wastes
> > a lot of space, especially for duplicating the tables.  It is only a
> > small optimization for time.  It is done for the hyperbolic functions
> > to get this optimization, and for __ldexpl_cexpl() just for convenience.
> 
> The version in k_expl.h has 2 bugs.  You note the first (cos instead
> of cosl).  The second is
> 
> In file included from /data/kargl/trunk/math/libm/msun/ld80/s_cexpl.c:43:
> /data/kargl/trunk/math/libm/msun/ld80/k_expl.h:288:22: error: magnitude of
>       floating-point constant too large for type 'double'; maximum is
>       1.7976931348623157E+308 [-Werror,-Wliteral-range]
>         exp_x = (lo + hi) * 0x1p16382;
>                             ^

Make that 3 bugs.  

Index: ld80/k_expl.h
===================================================================
--- ld80/k_expl.h	(revision 344600)
+++ ld80/k_expl.h	(working copy)
@@ -285,7 +285,7 @@
 	y = cimagl(z);
 	__k_expl(x, &hi, &lo, &k);
 
-	exp_x = (lo + hi) * 0x1p16382;
+	exp_x = (lo + hi) * 0x1p16382L;
 	expt += k - 16382;
 
 	scale1 = 1;
@@ -292,9 +292,9 @@
 	half_expt = expt / 2;
 	SET_LDBL_EXPSIGN(scale1, BIAS + half_expt);
 	scale2 = 1;
-	SET_LDBL_EXPSIGN(scale1, BIAS + expt - half_expt);
+	SET_LDBL_EXPSIGN(scale2, BIAS + expt - half_expt);
 
-	return (CMPLXL(cos(y) * exp_x * scale1 * scale2,
+	return (CMPLXL(cosl(y) * exp_x * scale1 * scale2,
 	    sinl(y) * exp_x * scale1 * scale2));
 }
 #endif /* _COMPLEX_H */

-- 
Steve

From owner-freebsd-numerics@freebsd.org  Fri Mar  8 03:04:20 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 96F3F1530091
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Fri,  8 Mar 2019 03:04:20 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 6234287720
 for <freebsd-numerics@freebsd.org>; Fri,  8 Mar 2019 03:04:19 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x2834GIf028810
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Thu, 7 Mar 2019 19:04:16 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x2834FGU028809;
 Thu, 7 Mar 2019 19:04:15 -0800 (PST) (envelope-from sgk)
Date: Thu, 7 Mar 2019 19:04:15 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
Message-ID: <20190308030415.GA28796@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190308002245.GA26338@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190308002245.GA26338@troutmask.apl.washington.edu>
User-Agent: Mutt/1.11.2 (2019-01-07)
X-Rspamd-Queue-Id: 6234287720
X-Spamd-Bar: /
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [0.69 / 15.00]; ARC_NA(0.00)[];
 HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu];
 NEURAL_HAM_MEDIUM(-0.14)[-0.136,0]; FROM_HAS_DN(0.00)[];
 TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.75)[0.751,0];
 NEURAL_HAM_LONG(-0.66)[-0.657,0]; MIME_GOOD(-0.10)[text/plain];
 RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu];
 AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[];
 RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2];
 MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu];
 RCPT_COUNT_TWO(0.00)[2]; R_SPF_NA(0.00)[];
 FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[];
 R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+];
 ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US];
 MID_RHS_MATCH_FROM(0.00)[];
 IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05),
 country: US(-0.07)]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2019 03:04:20 -0000

On Thu, Mar 07, 2019 at 04:22:45PM -0800, Steve Kargl wrote:
> 
> Make that 3 bugs.  
> 
> Index: ld80/k_expl.h
> ===================================================================
> --- ld80/k_expl.h	(revision 344600)
> +++ ld80/k_expl.h	(working copy)
> @@ -285,7 +285,7 @@
>  	y = cimagl(z);
>  	__k_expl(x, &hi, &lo, &k);
>  
> -	exp_x = (lo + hi) * 0x1p16382;
> +	exp_x = (lo + hi) * 0x1p16382L;
>  	expt += k - 16382;
>  
>  	scale1 = 1;
> @@ -292,9 +292,9 @@
>  	half_expt = expt / 2;
>  	SET_LDBL_EXPSIGN(scale1, BIAS + half_expt);
>  	scale2 = 1;
> -	SET_LDBL_EXPSIGN(scale1, BIAS + expt - half_expt);
> +	SET_LDBL_EXPSIGN(scale2, BIAS + expt - half_expt);

This bug appears in ld128/k_expl.h.  The other two bugs
are not present or were fixed at some point.

>  
> -	return (CMPLXL(cos(y) * exp_x * scale1 * scale2,
> +	return (CMPLXL(cosl(y) * exp_x * scale1 * scale2,
>  	    sinl(y) * exp_x * scale1 * scale2));
>  }
>  #endif /* _COMPLEX_H */
> 
> -- 
> Steve
> _______________________________________________
> freebsd-numerics@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-numerics
> To unsubscribe, send any mail to "freebsd-numerics-unsubscribe@freebsd.org"

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow

From owner-freebsd-numerics@freebsd.org  Fri Mar  8 13:51:16 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 41DFC153DB88
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Fri,  8 Mar 2019 13:51:16 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au
 [211.29.132.246])
 by mx1.freebsd.org (Postfix) with ESMTP id CED7974A61
 for <freebsd-numerics@freebsd.org>; Fri,  8 Mar 2019 13:51:13 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from c122-106-155-65.carlnfd1.nsw.optusnet.com.au
 (c122-106-155-65.carlnfd1.nsw.optusnet.com.au [122.106.155.65])
 by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 9EE3B432FF4;
 Sat,  9 Mar 2019 00:51:09 +1100 (AEDT)
Date: Sat, 9 Mar 2019 00:51:07 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: cexpl() (was: Re: Update ENTERI() macro)
In-Reply-To: <20190307195436.GA20856@troutmask.apl.washington.edu>
Message-ID: <20190308225807.D6410@besplex.bde.org>
References: <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190307144315.N932@besplex.bde.org>
 <20190307044447.GA16298@troutmask.apl.washington.edu>
 <20190307163958.V1293@besplex.bde.org>
 <20190307195436.GA20856@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=UJetJGXy c=1 sm=1 tr=0
 a=hrcDgRrNnVuIsdOiM/DCpA==:117 a=kj9zAlcOel0A:10 a=vStcYMboq_yilUtR23IA:9
 a=CjuIK1q_8ugA:10
X-Rspamd-Queue-Id: CED7974A61
X-Spamd-Bar: ------
Authentication-Results: mx1.freebsd.org;
 spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates
 211.29.132.246 as permitted sender) smtp.mailfrom=brde@optusnet.com.au
X-Spamd-Result: default: False [-6.16 / 15.00]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_COUNT_TWO(0.00)[2];
 FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[];
 TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23];
 FREEMAIL_FROM(0.00)[optusnet.com.au];
 MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au];
 NEURAL_HAM_LONG(-1.00)[-1.000,0];
 TO_MATCH_ENVRCPT_SOME(0.00)[];
 MX_GOOD(-0.01)[cached: extmail.optusnet.com.au];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.92)[-0.922,0];
 IP_SCORE(-2.93)[ip: (-7.64), ipnet: 211.28.0.0/14(-3.88), asn: 4804(-3.09),
 country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[];
 RCVD_IN_DNSWL_LOW(-0.10)[246.132.29.211.list.dnswl.org : 127.0.5.1];
 R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au];
 ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU];
 MIME_TRACE(0.00)[0:+];
 RECEIVED_SPAMHAUS_PBL(0.00)[65.155.106.122.zen.spamhaus.org : 127.0.0.11]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2019 13:51:16 -0000

On Thu, 7 Mar 2019, Steve Kargl wrote:

> On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote:
>> On Wed, 6 Mar 2019, Steve Kargl wrote:
>>
>>> On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote:
>* ...
>>> 	/*
>>> 	 *  exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac
>>> 	 * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c
>>> 	 */

I checked that these are correct.  Any rounding should be down for
exp_ovfl and up for cexp_ovfl.

>>> 	if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) ||
>>> 	    (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) {
>>
>> Don't duplicate the hex constants in the comment.  The limits can be fuzzy
>> so don't need so many decimal digits or nonzero nonzero bits or LD80C()
>> to round them (s_cexp.c uses only 1 decimal digit and doesn't test the
>> lower 32 bits in the mantissa; here we have more bits than we need in lx).
>
> Being fuzzy was the movitation for the new macro, I was only using
> the upper 32 bits of lx in the original cexpl() I posted.

IIRC, that was done by extracting into a 32-bit lx.  This could be written
as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better
to do that anyway.

>> Maybe not use bit-fiddling at all for this (in other precisions too).
>> The more important exp() functions check thresholds in FP.  Here x can
>> be NaN so FP comparisons with it don't work right.  That seems to be
>> the only reason to check in bits here.
>>
>>> 		/*
>>> 		 * x is between exp_ovfl and cexp_ovfl, so we must scale to
>>> 		 * avoid overflow in exp(x).
>>> 		 */
>>> 		RETURNI(__ldexp_cexpl(z, 1));
>>
>> Did your tests cover this case, and does it use the fixed version of the
>> old __ldexp_cexpl() in k_expl.h?
>
> No.  The test I reported was restricted to -11350 < x < 11350.
> Spot checked x outside that range.  I had the conditional wrong
> for x < -exp_ovfl.  Also, the comment is actaully a little optimistic.
> One cannot avoid overflow.  For example, x = 11357 is just above
> exp_ovfl, and exp(x) = inf.  __ldexp_cexpl() does scaling and exploits
> |cos(y)| <= 1 and |sin(y)| <= 1 to try to extend the range of cexpl().

I understand this better now.  __ldexp_cexpl() doesn't work for all large
x, so callers must only use it for a range of values above cexp_ovl.
The correctness of this is unobvious.  It depends on sin(y) and cos(y)
never underflowing to 0.  Otherwise, this method would give Inf * 0 = NaN
for finite x and y, and the scaling method would give finite * 0 * 2**k = 0.

>>> 	} else {
>>> 		/*
>>> 		 * Cases covered here:
>>> 		 *  -  x < exp_ovfl and exp(x) won't overflow (common case)
>>> 		 *  -  x > cexp_ovfl, so exp(x) * s overflows for all s > 0
>>> 		 *  -  x = +-Inf (generated by exp())
>>> 		 *  -  x = NaN (spurious inexact exception from y)
>>> 		 */
>>> 		exp_x = expl(x);
>>> 		sincosl(y, &s, &c);
>>> 		RETURNI(CMPLXL(exp_x * c, exp_x * s));
>>> 	}
>>> }
>>
>> I think the last clause can be simplfied and even optimized by using the
>> kernel in all cases:

I was wrong.

>> - the kernel produces exp_x in the form kexp_x * 2**k where kexp_x is near
>>    1.  Just multiply kexp_x by c and s, and later multiply by 2**k.  This
>>    cost an extra multiplication by 2**k (for the real and imaginary parts
>>    separately), but avoids other work so may be faster.
>> - hopefully kexp_x is actually >= 1 (and < 2).  Otherwise we have to
>>    worry about spurious underflow
>> - the above method seems to avoid spurious underflow automatically, and
>>    doesn't even mention this (multiplying by exp_x may underflow, but not
>>    spuriously since c and s are <= 1).
>
> You might be right about using the kernel.  I'll leave that
> to others on freebsd-numerics@; although you and I seem to be
> the only subscribers.

Forget about that.  __ldexp_cexp*() is not really a kernel.  It is used
for ccosh*() and csinh*() too.  Those can only use the kernel for large
positive x.  cexp*() may as well do the same.  Also, the kernel currently
doesn't support very large positive x, so the second overflow threshold
is needed in callers.

Bruce

From owner-freebsd-numerics@freebsd.org  Fri Mar  8 13:58:51 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 60CAE153DD79
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Fri,  8 Mar 2019 13:58:51 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au
 [211.29.132.246])
 by mx1.freebsd.org (Postfix) with ESMTP id 7D5A774C84
 for <freebsd-numerics@freebsd.org>; Fri,  8 Mar 2019 13:58:50 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from c122-106-155-65.carlnfd1.nsw.optusnet.com.au
 (c122-106-155-65.carlnfd1.nsw.optusnet.com.au [122.106.155.65])
 by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 059E1432CD3;
 Sat,  9 Mar 2019 00:58:47 +1100 (AEDT)
Date: Sat, 9 Mar 2019 00:58:47 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
In-Reply-To: <20190308030415.GA28796@troutmask.apl.washington.edu>
Message-ID: <20190309005153.H6410@besplex.bde.org>
References: <20190227161906.GA77785@troutmask.apl.washington.edu>
 <20190228060920.R4413@besplex.bde.org>
 <20190304212159.GA12587@troutmask.apl.washington.edu>
 <20190305153243.Y1349@besplex.bde.org>
 <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190308002245.GA26338@troutmask.apl.washington.edu>
 <20190308030415.GA28796@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=P6RKvmIu c=1 sm=1 tr=0
 a=hrcDgRrNnVuIsdOiM/DCpA==:117 a=kj9zAlcOel0A:10 a=JMbnHHhEAjw9J1_4nJ8A:9
 a=CjuIK1q_8ugA:10
X-Rspamd-Queue-Id: 7D5A774C84
X-Spamd-Bar: ------
Authentication-Results: mx1.freebsd.org;
 spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates
 211.29.132.246 as permitted sender) smtp.mailfrom=brde@optusnet.com.au
X-Spamd-Result: default: False [-6.18 / 15.00]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0];
 RCVD_IN_DNSWL_LOW(-0.10)[246.132.29.211.list.dnswl.org : 127.0.5.1];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23];
 FREEMAIL_FROM(0.00)[optusnet.com.au];
 MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au];
 NEURAL_HAM_LONG(-1.00)[-1.000,0];
 RECEIVED_SPAMHAUS_PBL(0.00)[65.155.106.122.zen.spamhaus.org : 127.0.0.11];
 TO_MATCH_ENVRCPT_SOME(0.00)[];
 MX_GOOD(-0.01)[cached: extmail.optusnet.com.au];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.92)[-0.924,0];
 IP_SCORE(-2.95)[ip: (-7.71), ipnet: 211.28.0.0/14(-3.89), asn: 4804(-3.10),
 country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 FREEMAIL_ENVFROM(0.00)[optusnet.com.au];
 ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU];
 MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2019 13:58:51 -0000

On Thu, 7 Mar 2019, Steve Kargl wrote:

> On Thu, Mar 07, 2019 at 04:22:45PM -0800, Steve Kargl wrote:
>>
>> Make that 3 bugs.
>>
>> Index: ld80/k_expl.h
>> ===================================================================
>> --- ld80/k_expl.h	(revision 344600)
>> +++ ld80/k_expl.h	(working copy)
>> @@ -285,7 +285,7 @@
>>  	y = cimagl(z);
>>  	__k_expl(x, &hi, &lo, &k);
>>
>> -	exp_x = (lo + hi) * 0x1p16382;
>> +	exp_x = (lo + hi) * 0x1p16382L;
>>  	expt += k - 16382;
>>
>>  	scale1 = 1;
>> @@ -292,9 +292,9 @@
>>  	half_expt = expt / 2;
>>  	SET_LDBL_EXPSIGN(scale1, BIAS + half_expt);
>>  	scale2 = 1;
>> -	SET_LDBL_EXPSIGN(scale1, BIAS + expt - half_expt);
>> +	SET_LDBL_EXPSIGN(scale2, BIAS + expt - half_expt);

Oops.

> This bug appears in ld128/k_expl.h.  The other two bugs
> are not present or were fixed at some point.

All 3 are in both ld*/k_expl.h.  This one is is fixed or not
present in my src/k_exp[f].c.

>>
>> -	return (CMPLXL(cos(y) * exp_x * scale1 * scale2,
>> +	return (CMPLXL(cosl(y) * exp_x * scale1 * scale2,

One like this is fixed or not present in src/k_expf.c.

>>  	    sinl(y) * exp_x * scale1 * scale2));
>>  }
>>  #endif /* _COMPLEX_H */

Bruce

From owner-freebsd-numerics@freebsd.org  Fri Mar  8 16:24:37 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 457EC150F484
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Fri,  8 Mar 2019 16:24:37 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 025A6820A8
 for <freebsd-numerics@freebsd.org>; Fri,  8 Mar 2019 16:24:35 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x28GOWLb031509
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Fri, 8 Mar 2019 08:24:32 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x28GOWtr031508;
 Fri, 8 Mar 2019 08:24:32 -0800 (PST) (envelope-from sgk)
Date: Fri, 8 Mar 2019 08:24:32 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: cexpl() (was: Re: Update ENTERI() macro)
Message-ID: <20190308162432.GA31392@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190307144315.N932@besplex.bde.org>
 <20190307044447.GA16298@troutmask.apl.washington.edu>
 <20190307163958.V1293@besplex.bde.org>
 <20190307195436.GA20856@troutmask.apl.washington.edu>
 <20190308225807.D6410@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190308225807.D6410@besplex.bde.org>
User-Agent: Mutt/1.11.2 (2019-01-07)
X-Rspamd-Queue-Id: 025A6820A8
X-Spamd-Bar: +
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [1.13 / 15.00]; ARC_NA(0.00)[];
 HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 NEURAL_SPAM_SHORT(0.64)[0.636,0];
 NEURAL_HAM_LONG(-0.73)[-0.728,0]; MIME_GOOD(-0.10)[text/plain];
 RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu];
 AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[];
 RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2];
 MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.49)[0.490,0];
 R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 MIME_TRACE(0.00)[0:+];
 ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US];
 MID_RHS_MATCH_FROM(0.00)[];
 IP_SCORE(0.05)[ip: (0.10), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05),
 country: US(-0.07)]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2019 16:24:37 -0000

On Sat, Mar 09, 2019 at 12:51:07AM +1100, Bruce Evans wrote:
> On Thu, 7 Mar 2019, Steve Kargl wrote:
> 
> > On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote:
> >> On Wed, 6 Mar 2019, Steve Kargl wrote:
> >>
> >>> On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote:
> >* ...
> >>> 	/*
> >>> 	 *  exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac
> >>> 	 * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c
> >>> 	 */
> 
> I checked that these are correct.  Any rounding should be down for
> exp_ovfl and up for cexp_ovfl.
> 
> >>> 	if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) ||
> >>> 	    (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) {
> >>
> >> Don't duplicate the hex constants in the comment.  The limits can be fuzzy
> >> so don't need so many decimal digits or nonzero nonzero bits or LD80C()
> >> to round them (s_cexp.c uses only 1 decimal digit and doesn't test the
> >> lower 32 bits in the mantissa; here we have more bits than we need in lx).
> >
> > Being fuzzy was the movitation for the new macro, I was only using
> > the upper 32 bits of lx in the original cexpl() I posted.
> 
> IIRC, that was done by extracting into a 32-bit lx.  This could be written
> as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better
> to do that anyway.

This is what I do in my WIP ccosh.  There are 3 thresholds
of ~22.9, ~11356, and ~22755.  I'm casting the result to
uint32_t, so something like (uint32_t)(lx >> 32) > 0xb17000000.
Do I need the cast and/or do you prefer it? 


> >> Maybe not use bit-fiddling at all for this (in other precisions too).
> >> The more important exp() functions check thresholds in FP.  Here x can
> >> be NaN so FP comparisons with it don't work right.  That seems to be
> >> the only reason to check in bits here.
> >>
> >>> 		/*
> >>> 		 * x is between exp_ovfl and cexp_ovfl, so we must scale to
> >>> 		 * avoid overflow in exp(x).
> >>> 		 */
> >>> 		RETURNI(__ldexp_cexpl(z, 1));
> >>
> >> Did your tests cover this case, and does it use the fixed version of the
> >> old __ldexp_cexpl() in k_expl.h?
> >
> > No.  The test I reported was restricted to -11350 < x < 11350.
> > Spot checked x outside that range.  I had the conditional wrong
> > for x < -exp_ovfl.  Also, the comment is actaully a little optimistic.
> > One cannot avoid overflow.  For example, x = 11357 is just above
> > exp_ovfl, and exp(x) = inf.  __ldexp_cexpl() does scaling and exploits
> > |cos(y)| <= 1 and |sin(y)| <= 1 to try to extend the range of cexpl().
> 
> I understand this better now.  __ldexp_cexpl() doesn't work for all large
> x, so callers must only use it for a range of values above cexp_ovl.
> The correctness of this is unobvious.  It depends on sin(y) and cos(y)
> never underflowing to 0.  Otherwise, this method would give Inf * 0 = NaN
> for finite x and y, and the scaling method would give finite * 0 * 2**k = 0.

cos(y) can never be 0 and sin(y) is only 0 at y = 0 (see sinpi, cospi
implementations).  y=0 is a special case and handled separately.  We
can have Inf*subnormal when y is near 0.

-- 
Steve

From owner-freebsd-numerics@freebsd.org  Fri Mar  8 18:03:13 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5BC57152498D
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Fri,  8 Mar 2019 18:03:13 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au
 [211.29.132.42])
 by mx1.freebsd.org (Postfix) with ESMTP id 6A1CD84D6B
 for <freebsd-numerics@freebsd.org>; Fri,  8 Mar 2019 18:03:09 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au
 [110.21.101.228])
 by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id 01FE73D5A3B;
 Sat,  9 Mar 2019 05:02:59 +1100 (AEDT)
Date: Sat, 9 Mar 2019 05:02:57 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: cexpl() (was: Re: Update ENTERI() macro)
In-Reply-To: <20190308162432.GA31392@troutmask.apl.washington.edu>
Message-ID: <20190309033820.J1443@besplex.bde.org>
References: <20190306055201.GA40298@troutmask.apl.washington.edu>
 <20190306225811.P2731@besplex.bde.org>
 <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190307144315.N932@besplex.bde.org>
 <20190307044447.GA16298@troutmask.apl.washington.edu>
 <20190307163958.V1293@besplex.bde.org>
 <20190307195436.GA20856@troutmask.apl.washington.edu>
 <20190308225807.D6410@besplex.bde.org>
 <20190308162432.GA31392@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=P6RKvmIu c=1 sm=1 tr=0
 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17
 a=kj9zAlcOel0A:10 a=hf07pYzDFtiwsiih0sgA:9 a=CjuIK1q_8ugA:10
X-Rspamd-Queue-Id: 6A1CD84D6B
X-Spamd-Bar: ------
Authentication-Results: mx1.freebsd.org;
 spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates
 211.29.132.42 as permitted sender) smtp.mailfrom=brde@optusnet.com.au
X-Spamd-Result: default: False [-6.39 / 15.00]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0];
 RCVD_IN_DNSWL_LOW(-0.10)[42.132.29.211.list.dnswl.org : 127.0.5.1];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23];
 FREEMAIL_FROM(0.00)[optusnet.com.au];
 MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au];
 NEURAL_HAM_LONG(-1.00)[-1.000,0];
 TO_MATCH_ENVRCPT_SOME(0.00)[];
 MX_GOOD(-0.01)[cached: extmail.optusnet.com.au];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.98)[-0.976,0];
 IP_SCORE(-3.10)[ip: (-8.49), ipnet: 211.28.0.0/14(-3.89), asn: 4804(-3.10),
 country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 FREEMAIL_ENVFROM(0.00)[optusnet.com.au];
 ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU];
 MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2019 18:03:13 -0000

On Fri, 8 Mar 2019, Steve Kargl wrote:

> On Sat, Mar 09, 2019 at 12:51:07AM +1100, Bruce Evans wrote:
>> On Thu, 7 Mar 2019, Steve Kargl wrote:
>>
>>> On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote:
>>>> On Wed, 6 Mar 2019, Steve Kargl wrote:
>>>>
>>>>> On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote:
>>> * ...
>>>>> 	/*
>>>>> 	 *  exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac
>>>>> 	 * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c
>>>>> 	 */
>>
>> I checked that these are correct.  Any rounding should be down for
>> exp_ovfl and up for cexp_ovfl.
>>
>>>>> 	if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) ||
>>>>> 	    (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) {
>>>>
>>>> Don't duplicate the hex constants in the comment.  The limits can be fuzzy
>>>> so don't need so many decimal digits or nonzero nonzero bits or LD80C()
>>>> to round them (s_cexp.c uses only 1 decimal digit and doesn't test the
>>>> lower 32 bits in the mantissa; here we have more bits than we need in lx).
>>>
>>> Being fuzzy was the movitation for the new macro, I was only using
>>> the upper 32 bits of lx in the original cexpl() I posted.
>>
>> IIRC, that was done by extracting into a 32-bit lx.  This could be written
>> as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better
>> to do that anyway.
>
> This is what I do in my WIP ccosh.  There are 3 thresholds
> of ~22.9, ~11356, and ~22755.  I'm casting the result to
> uint32_t, so something like (uint32_t)(lx >> 32) > 0xb17000000.
> Do I need the cast and/or do you prefer it?

I think you can even round to a power of 2 and not look at any mantissa
bits to classify this, as for the 64 in coshl().

22.9 here seems wrong.  It is approximately the threshold where cosh()
decays to exp().  The threshold for coshl() delay is much higher.
coshl() uses 64, which is high enough for both ld80 and ld128.  More
like 32 would work for ld80.

>>>> Maybe not use bit-fiddling at all for this (in other precisions too).
>>>> The more important exp() functions check thresholds in FP.  Here x can
>>>> be NaN so FP comparisons with it don't work right.  That seems to be
>>>> the only reason to check in bits here.

coshl() also checks thresholds in only in FP if classifications using the
exponent are too fuzzy, so as to avoid the slow step of extracting the
mantissa bits and the unportable steps of using them.  I think
classification for ld80 has to look at the mantissa to classify UNnormals,
but ld128 doesn't have UnNormals so might not need the mantissa bits.

Look at coshl() for details.  It is simpler.
>>
>> I understand this better now.  __ldexp_cexpl() doesn't work for all large
>> x, so callers must only use it for a range of values above cexp_ovl.
>> The correctness of this is unobvious.  It depends on sin(y) and cos(y)
>> never underflowing to 0.  Otherwise, this method would give Inf * 0 = NaN
>> for finite x and y, and the scaling method would give finite * 0 * 2**k = 0.
>
> cos(y) can never be 0 and sin(y) is only 0 at y = 0 (see sinpi, cospi
> implementations).  y=0 is a special case and handled separately.  We
> can have Inf*subnormal when y is near 0.

cos(y) and sin(y) would be zero if the exact value underflows on
assignment to long double, or the inexact value is miscalculated as
0.  It is difficult to prove that they don't underflow to 0 or to a
denormal when y is not near 0.  That they are not miscalculated so as
to give spurious underflow follows if we can prove that denormals don't
even nearly occur and that our methods have errors less than 1 ulp in
relative terms (only values at least twice as large as the largest
denormal occur, and a 1-ulp relative error for these values can't reach
a denormal value).

Inf*denormal is not very special.  It is +-Inf.  The upper overflow threshold
is chosen so than exp(x) * +-smallest_denormal = +-Inf for x above this
threshold.

It is easier to optimize this for cexp().  cexp() starts with extracting
all bits of y, and then all bits of x unless y = 0, but in most cases the
args are finite, nonzero and not very large, and everything except zero
can be classified by looking at only the exponent.  Zero can be classified
using an FP comparison.

Something like:

 	/* Do this first for gcc's sin+cos pessimization? */
 	if (x == 0)
 		return (cpackl(cosl(y), sinl(y));

 	if (y == 0)
 		return (cpackl(expl(x), y));

 	/*
 	 * Maybe do the following earlier and check (hx & 0x7fff) == 0,
 	 * etc., before the FP classification of 0.
 	 */
 	hx = GET_LDBL_EXPSIGN(x);
 	hy = GET_LDBL_EXPSIGN(y);

 	if (hx & (0x7fff) < MUMBLE && (hy & 0x7fff) != 0x7fff)
 		goto final_clause_now;	/* simple formula */

 	/* Need more bits for a fuller classification (rarely reached). */
 	lx = GET_LDBL_MANTISSA(x);
 	ly = GET_LDBL_MANTISSA(y);

This is supposed to handle UnNormals without explicit checks by letting
other layers and hardware handle them however it wants.  E.g., if x or
y is Pseudo-Zero, then the comparision with 0 either succeeds or fails
depending on whether Pseudo-Zero is treated as 0 or NaN by the hardware
comparison operator, and our treatment is consistent for the early returns.
We only have to worry about falling through to later code which is
inconsistent because it depends on Pseudo-Zeros not getting that far.

Bruce

From owner-freebsd-numerics@freebsd.org  Fri Mar  8 19:11:57 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id DE7E415267C6
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Fri,  8 Mar 2019 19:11:56 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id D74C587C4B
 for <freebsd-numerics@freebsd.org>; Fri,  8 Mar 2019 19:11:54 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x28JBppf033184
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Fri, 8 Mar 2019 11:11:51 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x28JBpax033183;
 Fri, 8 Mar 2019 11:11:51 -0800 (PST) (envelope-from sgk)
Date: Fri, 8 Mar 2019 11:11:50 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: cexpl() (was: Re: Update ENTERI() macro)
Message-ID: <20190308191150.GA32980@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190307144315.N932@besplex.bde.org>
 <20190307044447.GA16298@troutmask.apl.washington.edu>
 <20190307163958.V1293@besplex.bde.org>
 <20190307195436.GA20856@troutmask.apl.washington.edu>
 <20190308225807.D6410@besplex.bde.org>
 <20190308162432.GA31392@troutmask.apl.washington.edu>
 <20190309033820.J1443@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190309033820.J1443@besplex.bde.org>
User-Agent: Mutt/1.11.2 (2019-01-07)
X-Rspamd-Queue-Id: D74C587C4B
X-Spamd-Bar: +
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [1.35 / 15.00]; ARC_NA(0.00)[];
 HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 NEURAL_SPAM_SHORT(0.85)[0.846,0];
 NEURAL_HAM_LONG(-0.73)[-0.733,0]; MIME_GOOD(-0.10)[text/plain];
 RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu];
 AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[];
 RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2];
 MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_SPAM_MEDIUM(0.50)[0.500,0];
 R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[optusnet.com.au];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 MIME_TRACE(0.00)[0:+];
 ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US];
 MID_RHS_MATCH_FROM(0.00)[];
 IP_SCORE(0.05)[ip: (0.09), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05),
 country: US(-0.07)]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2019 19:11:57 -0000

On Sat, Mar 09, 2019 at 05:02:57AM +1100, Bruce Evans wrote:
> On Fri, 8 Mar 2019, Steve Kargl wrote:
> > On Sat, Mar 09, 2019 at 12:51:07AM +1100, Bruce Evans wrote:
> >> On Thu, 7 Mar 2019, Steve Kargl wrote:
> >>> On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote:
> >>>> On Wed, 6 Mar 2019, Steve Kargl wrote:
> >>>>>	/*
> >>>>>	 *  exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac
> >>>>>	 * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c
> >>>>>	 */
> >>
> >> I checked that these are correct.  Any rounding should be down for
> >> exp_ovfl and up for cexp_ovfl.
> >>
> >>>>> 	if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) ||
> >>>>> 	    (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) {
> >>>>
> >>>> Don't duplicate the hex constants in the comment. The limits can be fuzzy
> >>>> so don't need so many decimal digits or nonzero nonzero bits or LD80C()
> >>>> to round them (s_cexp.c uses only 1 decimal digit and doesn't test the
> >>>> lower 32 bits in the mantissa; here we have more bits than we need in lx).
> >>>
> >>> Being fuzzy was the movitation for the new macro, I was only using
> >>> the upper 32 bits of lx in the original cexpl() I posted.
> >>
> >> IIRC, that was done by extracting into a 32-bit lx.  This could be written
> >> as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better
> >> to do that anyway.
> >
> > This is what I do in my WIP ccosh.  There are 3 thresholds
> > of ~22.9, ~11356, and ~22755.  I'm casting the result to
> > uint32_t, so something like (uint32_t)(lx >> 32) > 0xb17000000.
> > Do I need the cast and/or do you prefer it?
> 
> I think you can even round to a power of 2 and not look at any mantissa
> bits to classify this, as for the 64 in coshl().

You used your k_hexpl() and take advantage of the hi+lo splitting
to use x < 64.

> 22.9 here seems wrong.  It is approximately the threshold where cosh()
> decays to exp().  The threshold for coshl() delay is much higher.
> coshl() uses 64, which is high enough for both ld80 and ld128.  More
> like 32 would work for ld80.

I actually round up to 23 for ld80, and haven't thought too much
about ld128.  A value of 22 is used for s_ccosh.c, which isn't
quite large enough for long double. (Yes, I looked at bits)

% ./gen 22
coshl(22)    = 1.79245642306579578097e+09
expl(22) / 2 = 1.79245642306579578086e+09
sinhl(22)    = 1.79245642306579578074e+09

% ./gen 23
coshl(23)    = 4.87240172312445130013e+09
expl(23) / 2 = 4.87240172312445130013e+09
sinhl(23)    = 4.87240172312445130013e+09

in s_ccoshl.c I then have

	if (ix < 0x4003 ||      /* |x| < ~23: normal case */
	    (ix == 0x4003 && (uint32_t)(lx >> 32) < 0xb8000000))
		RETURNI(CMPLXL(coshl(x) * c, sinhl(x) * s));

	/* |x| >= 23, so cosh(x) ~= exp(|x|) / 2 */
	if (ix < 0x400c ||
	    (ix == 0x400c && (uint32_t)(lx >> 32) <= 0xb1700000)) {
		/* x < 11356: exp(|x|) won't overflow */
		h = expl(fabsl(x)) / 2;
		RETURNI(CMPLXL(h * c, copysignl(h, x) * s));

With |x| < 23, we have 2 function calls for coshl() and sinhl()
as opposed to the 1 function call for expl() in 23 <= x <= 32 range.
I can write the first conditional as

	if (ix < 0x4004)     /* |x| < 32: normal case */
		RETURNI(CMPLXL(coshl(x) * c, sinhl(x) * s));

This then raises the question on whether we should change the
limit to 32 in the double complex ccosh()?

-- 
Steve

From owner-freebsd-numerics@freebsd.org  Fri Mar  8 20:53:33 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id D8E011528C0D
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Fri,  8 Mar 2019 20:53:32 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au
 [211.29.132.246])
 by mx1.freebsd.org (Postfix) with ESMTP id E25388B976
 for <freebsd-numerics@freebsd.org>; Fri,  8 Mar 2019 20:53:29 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au
 [110.21.101.228])
 by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 13F37433BA5;
 Sat,  9 Mar 2019 07:53:26 +1100 (AEDT)
Date: Sat, 9 Mar 2019 07:53:24 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: cexpl() (was: Re: Update ENTERI() macro)
In-Reply-To: <20190308191150.GA32980@troutmask.apl.washington.edu>
Message-ID: <20190309070413.G2539@besplex.bde.org>
References: <20190306184829.GA44023@troutmask.apl.washington.edu>
 <20190307061214.R4911@besplex.bde.org>
 <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190307144315.N932@besplex.bde.org>
 <20190307044447.GA16298@troutmask.apl.washington.edu>
 <20190307163958.V1293@besplex.bde.org>
 <20190307195436.GA20856@troutmask.apl.washington.edu>
 <20190308225807.D6410@besplex.bde.org>
 <20190308162432.GA31392@troutmask.apl.washington.edu>
 <20190309033820.J1443@besplex.bde.org>
 <20190308191150.GA32980@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=P6RKvmIu c=1 sm=1 tr=0
 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17
 a=kj9zAlcOel0A:10 a=R02BefTfkBCJ0zfWrscA:9 a=CjuIK1q_8ugA:10
X-Rspamd-Queue-Id: E25388B976
X-Spamd-Bar: ------
Authentication-Results: mx1.freebsd.org;
 spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates
 211.29.132.246 as permitted sender) smtp.mailfrom=brde@optusnet.com.au
X-Spamd-Result: default: False [-6.23 / 15.00]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0];
 RCVD_IN_DNSWL_LOW(-0.10)[246.132.29.211.list.dnswl.org : 127.0.5.1];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23];
 FREEMAIL_FROM(0.00)[optusnet.com.au];
 MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au];
 NEURAL_HAM_LONG(-1.00)[-1.000,0];
 TO_MATCH_ENVRCPT_SOME(0.00)[];
 MX_GOOD(-0.01)[cached: extmail.optusnet.com.au];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.97)[-0.968,0];
 IP_SCORE(-2.95)[ip: (-7.69), ipnet: 211.28.0.0/14(-3.90), asn: 4804(-3.11),
 country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[];
 FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[];
 FREEMAIL_ENVFROM(0.00)[optusnet.com.au];
 ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU];
 MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2019 20:53:33 -0000

On Fri, 8 Mar 2019, Steve Kargl wrote:

> On Sat, Mar 09, 2019 at 05:02:57AM +1100, Bruce Evans wrote:
>> On Fri, 8 Mar 2019, Steve Kargl wrote:
>>> On Sat, Mar 09, 2019 at 12:51:07AM +1100, Bruce Evans wrote:
>>>> On Thu, 7 Mar 2019, Steve Kargl wrote:
>>>> ...
>>>> IIRC, that was done by extracting into a 32-bit lx.  This could be written
>>>> as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better
>>>> to do that anyway.
>>>
>>> This is what I do in my WIP ccosh.  There are 3 thresholds
>>> of ~22.9, ~11356, and ~22755.  I'm casting the result to
>>> uint32_t, so something like (uint32_t)(lx >> 32) > 0xb17000000.
>>> Do I need the cast and/or do you prefer it?
>>
>> I think you can even round to a power of 2 and not look at any mantissa
>> bits to classify this, as for the 64 in coshl().
>
> You used your k_hexpl() and take advantage of the hi+lo splitting
> to use x < 64.

Or I designed my k_hexpl() to support this splitting up to x < 64.  Long
double precision is easier because it has (committed) support functions
like k_hexpl().

>> 22.9 here seems wrong.  It is approximately the threshold where cosh()
>> decays to exp().  The threshold for coshl() delay is much higher.
>> coshl() uses 64, which is high enough for both ld80 and ld128.  More
>> like 32 would work for ld80.
>
> I actually round up to 23 for ld80, and haven't thought too much
> about ld128.  A value of 22 is used for s_ccosh.c, which isn't
> quite large enough for long double. (Yes, I looked at bits)
>
> % ./gen 22
> coshl(22)    = 1.79245642306579578097e+09
> expl(22) / 2 = 1.79245642306579578086e+09
> sinhl(22)    = 1.79245642306579578074e+09
>
> % ./gen 23
> coshl(23)    = 4.87240172312445130013e+09
> expl(23) / 2 = 4.87240172312445130013e+09
> sinhl(23)    = 4.87240172312445130013e+09

Look at more bits :-).  For r + 1/r, ignoring the 1/r term gives an error
of about 1 ulp in ld80 precision when 1/r/r is about 2**-64.  This gives
a threshold of about log(2**(64/2)) = 22.18 for accuracy of about 1 ulp.
But we want more accuracy than that.  For 0.5 ulps, log(2**(65/2) = 22.52+.
We want more than that too.  expl() gives almost 10 extra bits of accuracy,
and the hyperbolic functions use the expl kernel to get all these extra
bits.  expl(x) itself uses a threshold of 2**-75 instead of 2**-65 for 
x being so tiny that the x**2/2 term and higher terms are negligible
compared with the x term.  So we should try for 11 extra bits here.  This
gives a threshold of log(2**(75/2)) = 25.99++.

ld128 expl() is missing the change corresponding to changing 65 to 75 for
the x vs x**2/2 threshold.  It still uses 114.  After fixing this to 124,
the corresponding threshold here is log(2**124/2) = 42.97+.

> in s_ccoshl.c I then have
>
> 	if (ix < 0x4003 ||      /* |x| < ~23: normal case */
> 	    (ix == 0x4003 && (uint32_t)(lx >> 32) < 0xb8000000))
> 		RETURNI(CMPLXL(coshl(x) * c, sinhl(x) * s));

k_hexpl() can't handle this case.  (But if you want efficiency or accuracy,
then you should implement sinhcoshl() or just use the kernel to evaluate
exp(x) only once.)

I think this formula works up to 32 or 64 or even up to the overflow
threshold.  coshl(x) and sinhl() decay to expl(x) starting a bit above
23, but, but these functions can optimize that almost as well as here
(they actually don't bother to until x reaches 64).

You can use this to eliminate the check of lx.

>
> 	/* |x| >= 23, so cosh(x) ~= exp(|x|) / 2 */
> 	if (ix < 0x400c ||
> 	    (ix == 0x400c && (uint32_t)(lx >> 32) <= 0xb1700000)) {
> 		/* x < 11356: exp(|x|) won't overflow */
> 		h = expl(fabsl(x)) / 2;
> 		RETURNI(CMPLXL(h * c, copysignl(h, x) * s));

Use a power of 2 for the upper threshold too if possible.

> With |x| < 23, we have 2 function calls for coshl() and sinhl()
> as opposed to the 1 function call for expl() in 23 <= x <= 32 range.
> I can write the first conditional as
>
> 	if (ix < 0x4004)     /* |x| < 32: normal case */
> 		RETURNI(CMPLXL(coshl(x) * c, sinhl(x) * s));

The general formula is indeed unecessarily slow in that range.  Also
for the more important range [1, 23].  coshl() in the lower range is
lo + 0.25/(hi + lo) + hi after h_expl() to get hi and lo.  Similarly
for sinhl().  These formulas are short enough to repeat in the above,
but I think this optimization is excessive.  It is better for accuracy
after some rearrangement.

> This then raises the question on whether we should change the
> limit to 32 in the double complex ccosh()?

Do you mean from 64 to 32 the non-complex cosh(), or from your current
limit to the above?

Bruce

From owner-freebsd-numerics@freebsd.org  Fri Mar  8 21:48:02 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id CEF2C1529CF9
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Fri,  8 Mar 2019 21:48:02 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 58FD18D717
 for <freebsd-numerics@freebsd.org>; Fri,  8 Mar 2019 21:48:00 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x28Llpq4034409
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Fri, 8 Mar 2019 13:47:51 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x28Llop5034408;
 Fri, 8 Mar 2019 13:47:50 -0800 (PST) (envelope-from sgk)
Date: Fri, 8 Mar 2019 13:47:50 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: cexpl() (was: Re: Update ENTERI() macro)
Message-ID: <20190308214750.GA34090@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190307144315.N932@besplex.bde.org>
 <20190307044447.GA16298@troutmask.apl.washington.edu>
 <20190307163958.V1293@besplex.bde.org>
 <20190307195436.GA20856@troutmask.apl.washington.edu>
 <20190308225807.D6410@besplex.bde.org>
 <20190308162432.GA31392@troutmask.apl.washington.edu>
 <20190309033820.J1443@besplex.bde.org>
 <20190308191150.GA32980@troutmask.apl.washington.edu>
 <20190309070413.G2539@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190309070413.G2539@besplex.bde.org>
User-Agent: Mutt/1.11.2 (2019-01-07)
X-Rspamd-Queue-Id: 58FD18D717
X-Spamd-Bar: /
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [0.92 / 15.00]; ARC_NA(0.00)[];
 HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu];
 NEURAL_HAM_MEDIUM(-0.06)[-0.060,0]; FROM_HAS_DN(0.00)[];
 TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.84)[0.836,0];
 NEURAL_HAM_LONG(-0.59)[-0.592,0]; MIME_GOOD(-0.10)[text/plain];
 RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[washington.edu];
 AUTH_NA(1.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[];
 RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2];
 MX_GOOD(-0.01)[troutmask.apl.washington.edu];
 RCPT_COUNT_TWO(0.00)[2]; R_SPF_NA(0.00)[];
 FREEMAIL_TO(0.00)[optusnet.com.au]; FROM_EQ_ENVFROM(0.00)[];
 R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+];
 ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US];
 MID_RHS_MATCH_FROM(0.00)[];
 IP_SCORE(0.05)[ip: (0.09), ipnet: 128.95.0.0/16(0.15), asn: 73(0.05),
 country: US(-0.07)]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2019 21:48:03 -0000

(reducing quote depth.  I'll need time to digest your bit analysis)

On Sat, Mar 09, 2019 at 07:53:24AM +1100, Bruce Evans wrote:
> On Fri, 8 Mar 2019, Steve Kargl wrote:
> 
> > This then raises the question on whether we should change the
> > limit to 32 in the double complex ccosh()?
> 
> Do you mean from 64 to 32 the non-complex cosh(), or from your current
> limit to the above?
> 

I mean ccosh(double complex).  Copyright date is 2005 for
s_ccosh.c while ld80/*_expl.* has datesi of 2009-2013.  It
seems you and I developed s_ccosh.c much earlier than the
Tang-based expl().  In s_ccosh.c, we have

	if (ix < 0x40360000) /* |x| < 22: normal case */
	    return (CMPLX(cosh(x) * cos(y), sinh(x) * sin(y)));

	/* |x| >= 22, so cosh(x) ~= exp(|x|) */
	if (ix < 0x40862e42) {
		/* x < 710: exp(|x|) won't overflow */
		h = exp(fabs(x)) * 0.5;
		return (CMPLX(h * cos(y), copysign(h, x) * sin(y)));
	} ...

Would it be beneficial to change |x| < 22 to |x| < 32?  

While we have kernels for exp(), I did not commit your
Tang-based exp(). So, exp() has ulp of 0.7 to 0.8 instead
of 0.5xx.  Maybe using 32 won't buy us anything.  

-- 
Steve

From owner-freebsd-numerics@freebsd.org  Sat Mar  9 03:28:38 2019
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id D45551534CD9
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Sat,  9 Mar 2019 03:28:37 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au
 [211.29.132.42])
 by mx1.freebsd.org (Postfix) with ESMTP id 3A3366B4BC
 for <freebsd-numerics@freebsd.org>; Sat,  9 Mar 2019 03:28:32 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au
 [110.21.101.228])
 by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id 072013D57F4;
 Sat,  9 Mar 2019 14:28:29 +1100 (AEDT)
Date: Sat, 9 Mar 2019 14:28:28 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: cexpl() (was: Re: Update ENTERI() macro)
In-Reply-To: <20190308214750.GA34090@troutmask.apl.washington.edu>
Message-ID: <20190309134147.E968@besplex.bde.org>
References: <20190306214233.GA23159@troutmask.apl.washington.edu>
 <20190307144315.N932@besplex.bde.org>
 <20190307044447.GA16298@troutmask.apl.washington.edu>
 <20190307163958.V1293@besplex.bde.org>
 <20190307195436.GA20856@troutmask.apl.washington.edu>
 <20190308225807.D6410@besplex.bde.org>
 <20190308162432.GA31392@troutmask.apl.washington.edu>
 <20190309033820.J1443@besplex.bde.org>
 <20190308191150.GA32980@troutmask.apl.washington.edu>
 <20190309070413.G2539@besplex.bde.org>
 <20190308214750.GA34090@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=P6RKvmIu c=1 sm=1 tr=0
 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17
 a=kj9zAlcOel0A:10 a=Ma-Tbq1rf8peu99H5l8A:9 a=539z-ZukNZkPh3FH:21
 a=-ralKWDhgqI9mZlY:21 a=CjuIK1q_8ugA:10
X-Rspamd-Queue-Id: 3A3366B4BC
X-Spamd-Bar: ------
Authentication-Results: mx1.freebsd.org;
 spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates
 211.29.132.42 as permitted sender) smtp.mailfrom=brde@optusnet.com.au
X-Spamd-Result: default: False [-6.32 / 15.00]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_COUNT_TWO(0.00)[2];
 FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23];
 FREEMAIL_FROM(0.00)[optusnet.com.au];
 MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[optusnet.com.au];
 NEURAL_HAM_LONG(-1.00)[-1.000,0];
 TO_MATCH_ENVRCPT_SOME(0.00)[];
 MX_GOOD(-0.01)[extmail.optusnet.com.au];
 RCPT_COUNT_TWO(0.00)[2]; NEURAL_HAM_SHORT(-0.89)[-0.889,0];
 IP_SCORE(-3.12)[ip: (-8.53), ipnet: 211.28.0.0/14(-3.91), asn: 4804(-3.12),
 country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[];
 RCVD_IN_DNSWL_LOW(-0.10)[42.132.29.211.list.dnswl.org : 127.0.5.1];
 R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au];
 ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU];
 MIME_TRACE(0.00)[0:+]; FROM_EQ_ENVFROM(0.00)[]
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2019 03:28:38 -0000

On Fri, 8 Mar 2019, Steve Kargl wrote:

> (reducing quote depth.  I'll need time to digest your bit analysis)
>
> On Sat, Mar 09, 2019 at 07:53:24AM +1100, Bruce Evans wrote:
>> On Fri, 8 Mar 2019, Steve Kargl wrote:
>>
>>> This then raises the question on whether we should change the
>>> limit to 32 in the double complex ccosh()?
>>
>> Do you mean from 64 to 32 the non-complex cosh(), or from your current
>> limit to the above?
>
> I mean ccosh(double complex).  Copyright date is 2005 for
> s_ccosh.c while ld80/*_expl.* has datesi of 2009-2013.  It
> seems you and I developed s_ccosh.c much earlier than the
> Tang-based expl().  In s_ccosh.c, we have
>
> 	if (ix < 0x40360000) /* |x| < 22: normal case */
> 	    return (CMPLX(cosh(x) * cos(y), sinh(x) * sin(y)));
>
> 	/* |x| >= 22, so cosh(x) ~= exp(|x|) */
> 	if (ix < 0x40862e42) {
> 		/* x < 710: exp(|x|) won't overflow */
> 		h = exp(fabs(x)) * 0.5;
> 		return (CMPLX(h * cos(y), copysign(h, x) * sin(y)));
> 	} ...

I see.

> Would it be beneficial to change |x| < 22 to |x| < 32?

No, 22 is correct.  It is copied from cosh() and sinh().  It is
from allowing for exp() to have 10 or 11 extra bits of accuracy
(which it might actually do for a hardware or other exp() with
internal extra precision not limited by the ABI or broken to C11
spec so that it has to destroy the extra etra precision on return).
This gives a threshold of x = log(2**(52+11 or 53+10)) / 2 = 21.83
for one of exp(+-x) being negligible relative to the other.  Take
the ceiling to get 22.

ccoshl() uses the result of cosh(), so must use a threshold of 22
or more to let cosh() do the best that it can.  Similarly for
sinh() -- use a threshold >= the threshold of each.  Each uses 22
so 22 is already large enough.

32 is >= 22, so we can use that too, but it is slower in the
range [22, 32].

cosh() and sinh() usually don't get near 11 extra bits of accuracy.
The threshold for 53 bits is 18.36+.  They are slower than necessary
between this and 22.

This is more complicated for long double precision.  There coshl()
and sinhl() use a threshold of 64 and are slower than necessary between
this and 22.5 to 26 for ld80 and 39.5 and 43 for ld128.  This slowness
is to simplify the code and thus optimize for more usual cases.  The
gap between the thresholds is much larger than for double precision,
so if we copy the individual functions' threshold of 64 then we get
unnecessary slowness over a larger range.  But we shouldn't know so
much about the functions' implementation that we intentionally optimize
differently for this range.  It is bad enough that we have to know their
internal threshold to ensure getting consistent results.

> While we have kernels for exp(), I did not commit your
> Tang-based exp(). So, exp() has ulp of 0.7 to 0.8 instead
> of 0.5xx.  Maybe using 32 won't buy us anything.

My exp() is based on old fdlibm (Ng?) since the intervals method didn't
work as well as expected and I was able to get enough accuracy for <
1 ulp of error in the hyperbolic functions by exporting hi + lo
decompositions using kernels.  exp() still has a maximum error of about
0.75 ulps on amd64.  But on i386, it has a maximum error of 0.5092
ulps in float precision, by using float_t to get all evaluations in
double precision.  In double precision, double_t defaults to giving
only double precision, so the maximum error is usually the same 0.75
as on amd64.  exp() on i386 now uses the i387 and switches it to 64-bit
precision.  The i387 barely delivers double precision for exp() using
the extra precision.  (We almost knew that it couldn't deliver long
double precision for expl() when we implemented expl() in software).

My error table has grown many special cases to record bad old behaviour
for regression tests.  For exp() it is:

exp:   max_er =      0x642 0.7822, avg_er = 0.007, #>=1:0.5 = 0:2566529
exp:   max_er =      0x5fd 0.7485, avg_er = 0.007, #>=1:0.5 = 0:2733369 (s/w,PD)
exp:   max_er =      0x405 0.5024, avg_er = 0.007, #>=1:0.5 = 0:2253093 (s/w,PE)

0.7822 is for i387 exp.  0.7485 is the error for the fdlibm method
without extra precision in hardware (FP_PD on i386).  0.5024 is the
error for a modified fdlibm method with 11 bits of extra precision
(FP_PE onl i386) and double_t instead of double to use this, and some
reordering.

Bruce