Date:      Sun, 12 Oct 1997 03:14:32 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        reilly@zeta.org.au (Andrew Reilly)
Cc:        tlambert@primenet.com, reilly@zeta.org.au, freebsd-hackers@FreeBSD.ORG
Subject:   Re: Floating point exceptions
Message-ID:  <199710120314.UAA05148@usr03.primenet.com>
In-Reply-To: <199710120223.MAA00970@gurney.reilly.home> from "Andrew Reilly" at Oct 12, 97 12:23:13 pm

> > To me, this was not a rude response.  An exception where an exception
> > was not an intended result of the calculation is an exception that is
> > not masked, to my mind, and a useful indicator that all is not right with
> > the code.  It was certainly not intended as a rude remark.
> 
> I think that it is my problem that I take exception to some of the
> IEEE floating point semantics: perhaps it is a good thing to try to
> make a countable, auto-scaling (within a limited range) numeric
> representation behave more like the set of reals in some cases.  I
> prefer to think of floats as scaled integers, and I get caught out
> with some of the modern twists.

I was nailed on the same thing converting some code from UniCOS to
Solaris, actually (the original code was not as careful as you).

> >> Is there
> >> a pointer to fpsetmask in any other manual page?
> > 
> > To be honest, I knew the general name of the function off the top of
> > my head (I do a lot of event simulation), and I used man -k setmask to
> > find the specific name.  But it is referenced in floatingpoint.h by
> > prototype... and that header is referenced by most of the FP functions.
> 
> Given that FreeBSD's behaviour is different from other systems in this
> regard, perhaps this warrants a pointer in the handbook or FAQ?

Honestly, it doesn't come up that often; it might be a FAQ, it might not.

Actually, FreeBSD could do with some "porting notes".  It would probably
encourage porting (as well as fixing things like this).



> This is a good strategy.  I just didn't know the mechanism for the
> masking when the error occurred.  I do think that it is unfortunate
> that ignoring the SIGFPE, as described above, does /not/ have the same
> effect as masking the exception.

I agree.  I will look at it, but it has been a while since I've done
anything in that area... there are other people who are much better
suited to the task (Bruce Evans, for one).  But if it hasn't gotten
too complicated since the last time I looked, I will try to get you a
patch some time next week, if no one beats me to it.

The signal code doesn't save the FPU registers; it probably should for
SIGFPE, actually.  I'll have to look at the databook to see what the
difference is between masked and exceptioned results.  Actually, it
may not maintain state internally on exception;  Hmmm.  Ugh.  Maybe
it can be reissued after masking...


> > An interesting application; you'll note it falls into my "signal processing"
> > bucket which I designated as a bad thing to need to fix because of the
> > need for repeatability...
> 
> I'm not sure what you mean by this comment.  Certainly there are audio
> DSP applications where you would hope for complete repeatability, but
> speech recognition with HMMs is a stochastic process, and rounding
> errors in the calculations are not significantly different from noise
> in the input signal.

Yes.  What I meant was that if you aren't operating on a specific data
set so that the errors were repeatable in exactly the same place, then
you can't rigorously test a fix.  I specifically picked signal processing
because you can except out one time and not another based on what comes
in an open mike.  It depends on the architecture as to whether or not
you can rerun the same data; a lot of these cheap sound cards, you can't.


> How about this for an example of non-repeatability: One of the first
> ports of this code was to a DSP card that used AT&T (now Lucent) DSP32C
> processors.  The recogniser ran as a background (non/soft real time)
> process, while the signal was buffered in real time, in response to the
> frame interrupt.  The DSP32C has 40-bit floating point accumulators
> (8 guard bits on the mantissa) and 32-bit memory, and no mechanism to
> save or restore those guard bits in the interrupt service routine... 
> Talk about noise injection.  We couldn't even get the same answer on
> consecutive runs on test files!  Nevertheless, this did not affect
> the measured performance of the recogniser more than a few tenths of a
> percent.

Heh.  But how can you be sure to trigger the same underflow in repeated
testing, if that's what you are looking for?  8-)  You have to operate
on digitally stored ("canned") test data known to trigger the bug.  Even
then, it'd be a bear with the coupling you describe.  It's much harder
to trigger bugs, especially if you are running an intentionally tolerant
system.  8-).


> > Look to the calculation immediately before the compare.
> 
> That would be it.  The previous instruction stored s as a 32-bit float,
> which would generate an underflow exception if not masked.  I guess
> that if the '87 did not have extended precision floating point
> registers, then the exception would have occurred some time earlier,
> when the over precision result was generated.

Whenever it was reported, it would be *after* the fact.  8-(.

> > The FPU registers are not saved (or restored) by signal handlers, which
> > are not expected to execute FPU instructions.  If you will look at the
> > man page, there are actually a *lot* of calls which are not "safe" to use
> > from a signal handler, according to POSIX.
> 
> Which man page?  I just looked at kill(2) signal(3) and
> sigaction(2), and did not see a reference to this, although
> I do not doubt that such restrictions exist.

Oops.  My screw-up.  It's actually in the POSIX standard itself where
the calls not usable from a signal handler are listed.  8-(.  I'll try
to dig up the posting of the POSIX list (someone posted it a while back),
and send a "please add to man page" message...
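
As a rough sketch of a handler that stays inside those restrictions,
the following does nothing in signal context except set a flag: no
floating point and no stdio.  It assumes POSIX sigaction(); the names
on_sigfpe, install_fpe_flag_handler and fpe_was_seen are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* sig_atomic_t is the only object type guaranteed safe to write here. */
static volatile sig_atomic_t fpe_seen = 0;

/* Handler: records that SIGFPE arrived and touches nothing else. */
static void on_sigfpe(int signo)
{
    (void)signo;
    fpe_seen = 1;
}

/* Install the handler.  Returns 0 on success, -1 on failure. */
int install_fpe_flag_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_sigfpe;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGFPE, &sa, NULL);
}

/* Polled from normal (non-signal) code. */
int fpe_was_seen(void)
{
    return fpe_seen != 0;
}
```

This is only safe to exercise with raise(SIGFPE).  Returning normally
from a handler for a SIGFPE that the hardware generated is undefined
behaviour, so a real handler for that case would have to exit or
longjmp out rather than return.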

> Where in SIG_IGN are floating point instructions used?  If there are
> none, why doesn't it work (i.e., why is the floating point state
> changed)?

Say you get a signal immediately before a context switch, and it's delivered
when you are switched back in.  If you do FPU stuff, you don't know what
code caused the exception, and you trash exceptions in progress.  The
real problem is the difference in ending state for exceptional vs.
masked handling (per above).


> On the subject of saving registers on context switches, are there
> really so many Unix applications that do no floating point at all that
> it is worth differentiating them?

Most of them don't do FP.

> Is it a characteristic of the Intel processors that you can set them 
> to trap on the use of _any_ floating point instruction?

I have to hit the databook to answer that one.  Since it's packed, I
won't know until late next week, at the earliest.  8-(.


Out of curiosity, are you unmasking the error condition after the code
where it's expected so that unexpected errors are not also masked?  I
know it's more instructions.  8-(.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


