Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 29 Aug 2005 12:00:40 GMT
From:      Bruce Evans <bde@zeta.org.au>
To:        freebsd-i386@FreeBSD.org
Subject:   Re: i386/85417: Possible bug in ia32 floating-point exception handler
Message-ID:  <200508291200.j7TC0eKe031981@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help
The following reply was made to PR i386/85417; it has been noted by GNATS.

From: Bruce Evans <bde@zeta.org.au>
To: Chuck Ebbert <76306.1226@compuserve.com>
Cc: freebsd-gnats-submit@freebsd.org, freebsd-i386@freebsd.org
Subject: Re: i386/85417: Possible bug in ia32 floating-point exception handler
Date: Mon, 29 Aug 2005 21:58:32 +1000 (EST)

 On Sun, 28 Aug 2005, Chuck Ebbert wrote:
 
 >> Description:
 > src/sys/i386/i386/trap.c 1.277.2.1 lines 303-308:
 > 303:                case T_ARITHTRAP:       /* arithmetic trap */
 > 304: #ifdef DEV_NPX
 > 305:                        ucode = npxtrap();
 > 306:                        if (ucode == -1)
 > 307:                                goto userout;
 > 308: #else
 >
 > src/sys/i386/isa/npx.c:npxtrap()
 > can never return -1, so SIGFPE code 0 will be sent to the
 > user app if no unmasked i387 exception bits are set.  This
 > can happen on some non-Intel processors.
 
 Does it happen often or normally for some non-Intel processors?  I
 don't have any processors with the problem, but noticed it somehow,
 perhaps in connection with the kernel bugs that resulted in the one
 IRQ13 from npx_probe() not being discarded.  This IRQ13 normally
 occurs for all processors and is supposed to be ignored, but was
 counted as a /stray/spurious IRQ13.  I vaguely remember that it didn't
 cause a spurious npxtrap() but I saw the bug by wondering what would
 happen if it did.
 
 >> How-To-Repeat:
 >
 >> Fix:
 >
 > (1) If the current behavior is correct, remove the if statement
 > at line 306 since it has no effect and only confuses reviewers.
 >
 > (2) Otherwise, change fpetable[] in src/sys/i386/isa/npx.c
 > so entry 0 is -1 instead of 0
 
 I have used (2) (and some other related changes) for many years, but
 don't remember having noticed that the check in trap() is so broken.
 I now think it should have been "if (ucode == 0)".  But -1 is better
 for a special value since ucode is not a bitmap.
 
 From my version of npx.c:
 
 % Index: npx.c
 % ===================================================================
 % RCS file: /home/ncvs/src/sys/i386/isa/npx.c,v
 % retrieving revision 1.152
 % diff -u -2 -r1.152 npx.c
 % --- npx.c	19 Jun 2004 22:24:16 -0000	1.152
 % +++ npx.c	15 Oct 2004 15:38:14 -0000
 % @@ -578,5 +612,5 @@
 %   */
 %  static char fpetable[128] = {
 % -	0,
 % +	-1,		/*  0 - no unmasked exception (probably bogus IRQ13) */
 %  	FPE_FLTINV,	/*  1 - INV */
 %  	FPE_FLTUND,	/*  2 - DNML */
 
 This is (2).
 
 % @@ -642,5 +676,5 @@
 %  	FPE_FLTDIV,	/* 3E - DNML | DZ | OFL | UFL | IMP */
 %  	FPE_FLTINV,	/* 3F - INV | DNML | DZ | OFL | UFL | IMP */
 % -	FPE_FLTSUB,	/* 40 - STK */
 % +	-1,		/* 40 - STK, but no unmasked exception so no trap */
 %  	FPE_FLTSUB,	/* 41 - INV | STK */
 %  	FPE_FLTUND,	/* 42 - DNML | STK */
 
 We also shouldn't signal stack-only exceptions to userland, since there
 is no way to mask these exceptions.  Normally there is no problem since
 npx traps shouldn't occur when all maskable exceptions are masked, but
 if an npx trap somehow occurs it may say the STK bit set because there
 is no mask for this bit; thus a spurious npx trap signaled userland
 with a bogus code of FPE_FLTSUB instead a bogus code of 0 depending
 on whether STK is set.
 
 % @@ -751,7 +794,16 @@
 %  	}
 % 
 % +	/* Ignore some spurious traps. */
 % +	if ((status & ~control & 0x3f) == 0) {
 % +		intr_restore(saveintr);
 % +		return (-1);
 % +	}
 
 We also shouldn't do normal trap processing for spurious traps.  This
 change makes the above changes just style fixes since it makes fpetable
 not used for entries 0 and 3F.
 
 % +
 % +#if 0
 % +	/* XXX this clobbers the status. */
 %  	if (PCPU_GET(fpcurthread) == curthread)
 %  		fnclex();
 
 This is an incomplete fix for clobbering of the npx status word for
 normal traps, and happens to make the previous change unnecessary since
 it results in "normal trap processing" not changing anything in the
 npx state.  The exception status word used to be saved separately and
 gdb used to display it, but both the saving and the display have been
 lost.  New signal handlers now get a clean npx state, so the hack of
 clearing the exceptions shouldn't be needed, but when the signal
 handling was fixed, only the infrastructure for the hack (i.e., preserving
 the exception status word) was removed.  Old (FreeBSD-[1-3] and FreeBSD-4
 compatible) signal handlers still need the hack.
 
 % -	intr_restore(savecrit);
 % +#endif
 % +	intr_restore(saveintr);
 
 This churns the name of the "save" variable back to one that matches the
 code.
 
 %  	return (fpetable[status & ((~control & 0x3f) | 0x40)]);
 %  }
 
 Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200508291200.j7TC0eKe031981>