Date: Thu, 28 Feb 2019 07:15:14 +1100 (EST)
From: Bruce Evans
To: Steve Kargl
cc: freebsd-numerics@freebsd.org
Subject: Re: Update ENTERI() macro
In-Reply-To: <20190227161906.GA77785@troutmask.apl.washington.edu>
Message-ID: <20190228060920.R4413@besplex.bde.org>

On Wed, 27 Feb 2019, Steve Kargl wrote:

> On Wed, Feb 27, 2019 at 09:15:52PM +1100, Bruce Evans wrote:
>>
>> ENTERI() hard-codes the long double for simplicity. Remember, it is only
>> needed for long double precision on i386. But I forgot about long double
>> complex types, and didn't dream about indirect long double types in sincosl().
>
> That simplicity does not work for long double complex. We will
> need either ENTERIC as in
>
> #define ENTERIC() ENTERIT(long double complex)
>
> or a direct use of ENTERIT as you have done in s_clogl.c

I wrote ENTERIT() to work around this problem.

>>> I'm fine with making ENTERI() only toggle precision, and adding
>>> a LEAVEI() to reset precision. RETURNI(r) would then be
>>>
>>> #define RETURNI(r) \
>>> do { \
>>>     LEAVEI(); \
>>>     return (r); \
>>> } while (0)
>>
>> No, r may be an expression, so it must be evaluated before LEAVEI(). This
>> is the reason for existence of the variable to hold the result.
>
> So, we'll need RETURNI for long double and one for long double complex.
> Or, we give RETURNI a second parameter, which is the input parameter of
> the function

I said to use your method of __typeof(). I tested this:

XX --- /tmp/math_private.h Sun Nov 27 17:58:57 2005
XX +++ ./math_private.h Thu Feb 28 06:17:26 2019
XX @@ -474,21 +474,22 @@
XX  /* Support switching the mode to FP_PE if necessary. */
XX  #if defined(__i386__) && !defined(NO_FPSETPREC)
XX -#define ENTERI() ENTERIT(long double)
XX -#define ENTERIT(returntype) \
XX -    returntype __retval; \
XX +#define ENTERI() \
XX      fp_prec_t __oprec; \
XX      \
XX      if ((__oprec = fpgetprec()) != FP_PE) \
XX          fpsetprec(FP_PE)
XX -#define RETURNI(x) do { \
XX -    __retval = (x); \
XX -    if (__oprec != FP_PE) \
XX -        fpsetprec(__oprec); \
XX +#define LEAVEI() \
XX +    if ((__oprec = fpgetprec()) != FP_PE) \
XX +        fpsetprec(FP_PE)
XX +#define RETURNI(expr) do { \
XX +    __typeof(expr) __retval = (expr); \
XX +    \
XX +    LEAVEI(); \
XX      RETURNF(__retval); \
XX  } while (0)
XX  #else
XX  #define ENTERI()
XX -#define ENTERIT(x)
XX -#define RETURNI(x) RETURNF(x)
XX +#define LEAVEI()
XX +#define RETURNI(expr) RETURNF(expr)
XX  #endif
XX

This compiles, but has minor problems. Note that the apparent style bug of
initializing __retval in its declaration is needed in cases where __typeof()
gives a const type. This happens in my code that uses RETURNI(1 + tiny) to
set inexact. I think it would also happen for RETURNI(1). The type is then
int instead of floating point, and I need to check that this is harmless.

clogl() is the only user of ENTERIT(). Its size expands from 2302 bytes
text to 2399 when compiled by gcc-3.3.3. I hope that this is just gcc not
doing a very good job optimizing the returns (there are many RETURNI()s for
clogl()). Repeating the return code instead of jumping to it might even be
optimal.

> #define RETURNI(x, r) \
> do { \
>     x = (r); \
>     LEAVEI(); \
>     return (r); \
> } while (0)
>
> This will cause a lot of churn.

Indeed.
My version causes 1 line of churn:

XX --- /tmp/s_clogl.c Fri Jul 20 16:00:11 2018
XX +++ ./s_clogl.c Thu Feb 28 05:58:05 2019
XX @@ -66,5 +66,5 @@
XX      int kx, ky;
XX
XX -    ENTERIT(long double complex);
XX +    ENTERI();
XX
XX      x = creall(z);

>> Combined sin and cos probably does work better outside of benchmarks for
>> sin and cos alone, since it does less work so leaves more resources for
>> the more useful things.
>
> Exactly! I have a significant amount of Fortran code that does
>
> z = cmplx(cos(x), sin(x))
>
> in modern C this is 'z = CMPLX(cos(x), sin(x))'. GCC with optimization
> enabled will convert this to z = cexp(cmplx(0,x)) where it expects cexp
> to optimize this to sincos().

This is a pessimization unless everything is inlined. An optimization
would convert cexp(cmplx(0,x)) to sin(x) and cos(x) or sincos(x).

> GCC on FreeBSD will not do this optimization
> because FreeBSD's libm is not C99 compliant.

It is more conformant than most for cexp(). I think old gcc just doesn't
attempt such optimizations.

> When I worked on sincos() I tried a few variations. This included
> the simplest implementation:
>
> void
> sincos(double x, double *s, double *c)
> {
>     *c = cos(x);
>     *s = sin(x);
> }
>
> I tried argument reduction with kernels.
>
> void
> sincos(double x, double *s, double *c)
> {
>     a = inline argument reduction done to set a.
>     *c = k_cos(x);
>     *s = k_sin(x);
> }

You mean *c = s_cos(x), etc. That was good enough.

> And finally the version that was committed where k_cos and k_sin
> were manually inlined and re-arranged to reduce redundant computations.

That has excessive manual inlining. It should have only inlined s_cos()
and s_sin(), and changed k_cos() and k_sin() from extern to static inline.
Someday the data for these inline functions should be deduplicated, but the
data is small compared with that for the expl kernel.

Bruce
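
A minimal sketch of the __typeof()-based RETURNI() discussed above, not the
libm code. mock_getprec()/mock_setprec(), the MOCK_PD/MOCK_PE modes,
observe(), twice(), rotate() and demo() are all hypothetical names, invented
so that this builds outside i386 FreeBSD. It assumes that LEAVEI() is meant
to restore the mode saved by ENTERI(), and that the compiler supports
__typeof (gcc or clang). The local names drop the leading double underscore
because this is not implementation code.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical stand-ins for fpgetprec()/fpsetprec(), so this builds anywhere. */
typedef enum { MOCK_PD, MOCK_PE } mock_prec_t;
static mock_prec_t mock_prec = MOCK_PD;
static mock_prec_t seen_prec = MOCK_PD; /* mode observed while evaluating expr */

static mock_prec_t mock_getprec(void) { return (mock_prec); }
static void mock_setprec(mock_prec_t p) { mock_prec = p; }

#define ENTERI()                                        \
    mock_prec_t oprec_;                                 \
                                                        \
    if ((oprec_ = mock_getprec()) != MOCK_PE)           \
        mock_setprec(MOCK_PE)

/* Assumption: LEAVEI() puts back whatever mode ENTERI() saved. */
#define LEAVEI() do {                                   \
    if (oprec_ != MOCK_PE)                              \
        mock_setprec(oprec_);                           \
} while (0)

/* __typeof() (a GNU extension) picks the result type, real or complex. */
#define RETURNI(expr) do {                              \
    __typeof(expr) retval_ = (expr);                    \
                                                        \
    LEAVEI();                                           \
    return (retval_);                                   \
} while (0)

/* Records the mode in effect when the returned expression is evaluated. */
static long double
observe(long double x)
{
    seen_prec = mock_getprec();
    return (x);
}

static long double
twice(long double x)
{
    ENTERI();
    RETURNI(2 * observe(x));
}

static long double complex
rotate(long double complex z)
{
    ENTERI();
    RETURNI(z * I);
}

/* Returns 0 if the expression ran in MOCK_PE and the old mode was restored. */
int
demo(void)
{
    long double complex w = rotate(1.0L);

    assert(twice(1.5L) == 3.0L);
    assert(creall(w) == 0.0L && cimagl(w) == 1.0L);
    return (seen_prec == MOCK_PE && mock_getprec() == MOCK_PD ? 0 : 1);
}
```

The same ENTERI()/RETURNI() pair serves both the long double and the long
double complex function, which is the point of letting __typeof() choose the
type of the temporary. The expression is evaluated into the temporary before
LEAVEI() runs, so it still sees the switched mode.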