Date:      Sat, 11 Oct 1997 07:15:39 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        reilly@zeta.org.au (Andrew Reilly)
Cc:        tlambert@primenet.com, gjohnson@nola.srrc.usda.gov, freebsd-hackers@FreeBSD.ORG
Subject:   Re: Floating point exceptions
Message-ID:  <199710110715.AAA17411@usr04.primenet.com>
In-Reply-To: <199710110624.QAA05971@gurney.reilly.home> from "Andrew Reilly" at Oct 11, 97 04:24:20 pm

> > Fix:		Correct the code to not generate exceptions
> 
> This is just plain rude.  There are a bunch of exceptions that are
> enabled by default that are simply irrelevant for any signal processing
> code that I can think of, and were certainly what was causing the
> breakage of a program that was otherwise performing exactly as I
> intended.  In particular, trapping on underflow is, to my mind,
> counter-intuitive.  No, the code is _correct_ in my case.

Actually, I find this strange.  It kind of assumes that all hardware has
equivalent precision.  I can guarantee you that code that works fine
on UniCOS will have problems on an Intel-based PC if it expects 128 bit
precision.  8-(.

To me, this was not a rude response.  An exception that was not an
intended result of the calculation is, to my mind, an exception that
should not be masked, and a useful indicator that all is not right with
the code.  It was certainly not intended as a rude remark.

I think it's better to get an error than to get non-obviously erroneous
results (the alternative).  But I am a physics geek at heart, so maybe
I am biased toward useful answers and ugly exceptions vs. useless answers
and no exceptions... depends on your idea of "useless", I suppose...


> > Bad fix:	fpsetmask( 0);
> 
> This works for me.  Having found this function, and looked it up in the
> manual, I think that the arg of 0 is perhaps a bit harsh.  In my case I
> think ~(FP_X_UFL+FP_X_IMP+FP_X_DNML) would probably do the job.  Is there
> a pointer to fpsetmask in any other manual page?  My search that
> started in math.3m did not find it.

To be honest, I knew the general name of the function off the top of
my head (I do a lot of event simulation), and I used man -k setmask to
find the specific name.  But it is referenced in floatingpoint.h by
prototype... and that header is referenced by most of the FP functions.
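For reference, the narrower mask discussed above might look something
like the sketch below.  It assumes FreeBSD on x86, where <ieeefp.h>
provides fpgetmask() and fpsetmask() along with the FP_X_* bits.  The
helper name relax_benign_traps() and the HAVE_FPSETMASK macro are
hypothetical, and on any other platform the sketch does nothing and
says so.

```c
#if defined(__FreeBSD__) && (defined(__i386__) || defined(__amd64__))
#include <ieeefp.h>
#define HAVE_FPSETMASK 1    /* hypothetical feature macro for this sketch */
#endif

/*
 * Hypothetical helper: stop trapping on underflow, inexact result and
 * denormal operand, and leave every other trap exactly as it was.
 * Returns 0 if the mask was changed, -1 if fpsetmask() is not
 * available on this platform.
 */
int relax_benign_traps(void)
{
#ifdef HAVE_FPSETMASK
    fp_except_t enabled = fpgetmask();      /* traps currently enabled */

    fpsetmask(enabled & ~(FP_X_UFL | FP_X_IMP | FP_X_DNML));
    return 0;
#else
    return -1;                              /* not FreeBSD/x86: no-op */
#endif
}
```

Called once near the top of main(), this leaves whatever other traps
were enabled (invalid operation, divide by zero, overflow) still
enabled, which is the difference from fpsetmask(0).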


> > Worst fix:	signal( SIGFPE, SIG_IGN);
> 
> Very bad fix, because when I tried it, it just didn't work.  I assume
> that the trap handler does not correctly restore the floating point
> state.  The program ran to completion, but IEEE error values
> of some sort propagated from the exception point and ruined the results.

The point of the handler is to localize the errors.  Mask those which are
intentional, and fix those that aren't on a case-by-case basis.  I had a
number of precision fixes to 21 year old FORTRAN code that resulted from
getting exceptions thrown like this.


> For anyone who cares to help with an attempt at Terry's "Fix:", the
> code in my case is part of the HMM code from a speech recogniser.

[ ... ]

An interesting application; you'll note it falls into my "signal processing"
bucket which I designated as a bad thing to need to fix because of the
need for repeatability...

> Program received signal SIGFPE, Arithmetic exception.
> 0x4a8c in CalcU () at state.c:305
> 305                         if (s > *pMax) *pMax = s;
> 
> [ register float s; register float *pMax; ]


Look to the calculation immediately before the compare.  As I stated in
another posting (the one discussing FPU exception reporting, lazy binding
of FPU register switching, and the related SMP issues), FPU error
conditions are signalled on the first FPU instruction *after* the
instruction that caused the error.  This keeps the instructions pipelined
without a single cycle of reporting latency per instruction.

The SMP issue is that the code that generated the error could run on a
different processor before the lazy reporting signals the error on the
original processor.  Because the lazy report can potentially land in
another process, flagging and trapping are required to make lazy
reporting and SMP context switches work for FPU reporting (see the SMP
list archives: FPU code is known to have problems in the SMP case).

Actually, if we used the TSS for task switching, this would be fixed, but
the TSS context switch is significantly higher overhead than the FreeBSD
context switch.


> So: I imagine fcoms is a short floating point compare.  Any
> reason that should generate an exception in this case?

The FPU instruction before it resulted in an exception that wasn't reported
until this FPU instruction triggered the report.


> > Continuing from SIGFPE handlers is much harder than masking FP exceptions,
> > at least on i386's.
> 
> Yes.  I tried doing a signal(SIGFPE,SIG_IGN) at the top of
> main, but that just made it produce totally incorrect results.

The FPU registers are not saved (or restored) by signal handlers, which
are not expected to execute FPU instructions.  If you look at the man
page, there are actually a *lot* of calls which are not "safe" to use
from a signal handler, according to POSIX.
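As a rough illustration of a handler that stays inside those limits,
the sketch below does nothing but set a flag: no FPU instructions and
no library calls.  The names on_fpe(), install_fpe_flag() and
fpe_was_seen() are hypothetical.  Note the assumption it rests on:
POSIX leaves the behavior undefined if a handler returns normally from
a SIGFPE that the hardware generated, so this pattern is only safe as
written for a SIGFPE sent with raise() or kill(); for a real FPU trap
the handler would have to record the event and then bail out rather
than return.

```c
#include <signal.h>

/* volatile sig_atomic_t is the type ISO C guarantees a handler may write. */
static volatile sig_atomic_t fpe_seen = 0;

/* Does no floating point work and calls no library functions. */
static void on_fpe(int signo)
{
    (void)signo;
    fpe_seen = 1;
}

/* Hypothetical helper: install the handler; returns 0 on success. */
int install_fpe_flag(void)
{
    return signal(SIGFPE, on_fpe) == SIG_ERR ? -1 : 0;
}

/* Report whether a SIGFPE has been seen since the last call, and reset. */
int fpe_was_seen(void)
{
    int seen = fpe_seen;

    fpe_seen = 0;
    return seen;
}
```

The main program polls fpe_was_seen() at a point where it is safe to
do real work, instead of trying to repair anything from inside the
handler.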


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


