Date: Tue, 21 Mar 1995 23:49:34 +1000
From: Bruce Evans <bde@zeta.org.au>
To: phk@ref.tfs.com, pst@shockwave.com
Cc: CVS-commiters@time.cdrom.com, bde@zeta.org.au, cvs-etc@time.cdrom.com, jkh@freebsd.org, rgrimes@gndrsh.aac.dev.com
Subject: Re: cvs commit: src/etc make.conf
Message-ID: <199503211349.XAA16990@godzilla.zeta.org.au>

> > We also need dynamic support for the i387 functions.  -DHAVE_FPU is no
> > good because it can't be used for the distribution libraries.  Something
> > like
> >
> >	if (_have_i387)
> >		result = _i387_pow(x, y);
> >	else
> >		result = __ieee754_pow(x, y);
> >
> > would add less time overhead than shared linkage.

>The extra test on every operation is bad.

Let's replace `pow' by `sin'.  pow() isn't an i387 function and is too
complicated to synthesize from a few i387 functions.  To be precise, the
test costs 6 cycles on a 486 for the _i387_sin case and 5 cycles for the
__ieee754_sin case (plus cache misses...).

>Consider the following fragment or high-speed linkages with shared libraries
>instead (I don't know how fast or slow shared linkages are):

Shared linkage costs 4 cycles (1 wasted for a stupidly placed nop and much
more for the first call; plus cache misses...).

>	static vec_pow = pow_init;
>	pow (base, exp)
>	{
>		return (*vec_pow)(base, exp);
>	}

This would only cost 2 cycles (plus cache misses...).

Anyone for self-modifying code? :-)  The shared library already uses it to
avoid these 2 cycles, and it might not be too hard to get the shared library
to patch in the addresses of the i387-specific functions instead of the
generic ones.  Unfortunately, this won't work for statically linked programs.

The hardware sin() takes 193-279 cycles on a 486 and the msun wrappers take
many more (especially for shared libraries; position-independent code costs
about 10 cycles just for loading the global register), so another 5 cycles
would be hardly noticeable.

Bruce
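
For reference, the two dispatch styles compared in this message can be sketched in C roughly as follows. This is only an illustration under stated assumptions: `have_fpu`, `fpu_present`, `sin_hw`, `sin_generic`, `sin_tested` and `sin_vectored` are hypothetical names, not the msun entry points, and the polynomial body is a placeholder rather than a real sine. The call counters exist only so that the chosen path can be observed.

```c
/* Hypothetical stand-ins: a real libm would supply an FPU-based routine
 * and the portable __ieee754_sin here.  The polynomial is a placeholder. */
static int calls_hw, calls_generic;

static double sin_hw(double x)      { calls_hw++;      return x - x * x * x / 6.0; }
static double sin_generic(double x) { calls_generic++; return x - x * x * x / 6.0; }

/* Hypothetical probe; a real one would detect the FPU once at startup. */
static int fpu_present = 0;
static int have_fpu(void) { return fpu_present; }

static double sin_init(double x);

/* Starts out pointing at the resolver, then at the chosen implementation. */
static double (*vec_sin)(double) = sin_init;

/* Runs on the first call only: bind the pointer, then forward the call. */
static double sin_init(double x)
{
    vec_sin = have_fpu() ? sin_hw : sin_generic;
    return vec_sin(x);
}

/* Style 1: test a flag on every call. */
double sin_tested(double x)
{
    return have_fpu() ? sin_hw(x) : sin_generic(x);
}

/* Style 2: one indirect call; no test once the pointer is bound. */
double sin_vectored(double x)
{
    return (*vec_sin)(x);
}
```

After the first call, `sin_vectored` stays bound to whichever routine the resolver picked, so a later change to `fpu_present` has no effect on it. That one-time binding is what removes the per-call test.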