Date:      Sat, 10 Dec 2005 18:18:12 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Mike Silbersack <silby@silby.com>
Cc:        src-committers@freebsd.org, Andre Oppermann <andre@freebsd.org>, cvs-src@freebsd.org, cvs-all@freebsd.org, Steve Kargl <sgk@troutmask.apl.washington.edu>, Andrey Chernov <ache@freebsd.org>
Subject:   Re: cvs commit: src/lib/msun/src e_lgammaf_r.c
Message-ID:  <20051210173621.X71090@delplex.bde.org>
In-Reply-To: <20051208233246.G78724@odysseus.silby.com>
References:  <200511280832.jAS8WGvs059057@repoman.freebsd.org> <438AD8FB.A8B96AB6@freebsd.org> <20051128172718.GA59929@troutmask.apl.washington.edu> <20051129110058.T33820@delplex.bde.org> <20051129012102.GA84108@nagual.pp.ru> <20051129184901.I34802@delplex.bde.org> <20051204231731.N43418@odysseus.silby.com> <20051208141941.L63825@delplex.bde.org> <20051208233246.G78724@odysseus.silby.com>

On Thu, 8 Dec 2005, Mike Silbersack wrote:

> On Thu, 8 Dec 2005, Bruce Evans wrote:
>
>> Whoever makes the changes would write the regression tests :-).  Mine
>> aren't sufficiently general to commit.  In batch mode which takes about
>> 10 hours on a 2GHz Athlon to check all cases for floats, they are
>> currently reporting the following errors on amd64:
>
> The regression tests don't need to be exhaustive, you could just pick a few 
> values throughout the range of each function you test, and make a regression 
> test like:
>
> x = 3.0/2;
> if (x != 1.5)
> 	printf("FP regression!\n");
>
> Heck, you could throw all the calculations into one file.

That wouldn't be very good.

My local tests are:
- ucbtest.  Normally I run it using -DNTESTS=1000000 -DNREGIONS=64.  This
   tests a million cases of some functions (not the more exotic ones like
   gamma or the more standard ones like sqrt()).  I used to think that it
   found most problems; now I'm not so sure.  It somehow only reports
   errors of 2.9 ulps for hardware i387 cos() where my own tests easily
   find errors of several gulps (giga-ulps) in hardware cos() as a side
   effect of testing just 65536 cases for fdlibm cosf().  The difference
   may be that ucbtest mainly tests small args but my tests mainly use
   large args where hardware i387 trig is known to be very bad.  But ucbtest
   somehow doesn't report problems near pi/2 in its test of cos(),
   although it must know that there are problems since it correctly
   determines the (in)accuracy of the i387's internal pi using another
   test.  (The internal pi has only 66 binary digits, so cos(M_PI_2) can have
   only ~66-53 = 13 binary digits of precision, with 40 binary digits wrong,
   and in fact the magic 66 can easily be guessed by looking at the wrong
   digits in cos(M_PI_2) -- the lower 40 digits are all 0.)
- exhaustive testing of float functions relative to double functions.  Now
   has a batch mode and many parameters to change.  One parameter is the
   "stride" -- float args in bits are stepped through using step "stride",
   so stride = 1 gives exhaustive testing and stride = 0x10000 gives very
   fast non-exhaustive testing.
- non-exhaustive testing with stride = any for double complex and float
   complex functions relative to alternative implementations of the same
   function.  stride = 1 would give exhaustive testing but would take too
   long to run (~100000 years for float complex and 2**62 times longer
   than that for double complex).  Must be edited to change parameters.
- hacked versions of the previous tests, reduced to non-complex functions.
- some performance tests.  The best one uses uniform distribution with
   a parametrized number of regions.  It also has some hacks for random
   distribution (this shows that branch misprediction slows down things
   as expected.  Hopefully most data isn't random, so branch misprediction
   is rare).
- missing: uniform and random distributions generally, and special values.
   All are needed for non-float cases.

ucbtest and my tests show:
- how hard it is to find all broken cases without using exhaustive testing.
   A test that finds them is larger than any regression test that I want
   to write.
- that regressions found by existing regression tests don't get fixed for
   12+ years.
- that it's useful to write independent tests.   They find independent bugs.

Bruce


