Date: Fri, 2 Mar 2012 15:11:21 +1100 (EST)
From: Bruce Evans
To: Tijl Coosemans
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Bruce Evans
Subject: Re: svn commit: r232275 - in head/sys: amd64/include i386/include pc98/include x86/include
In-Reply-To: <201203012347.32984.tijl@freebsd.org>
Message-ID: <20120302132403.P929@besplex.bde.org>
References: <201202282217.q1SMHrIk094780@svn.freebsd.org> <20120229151223.K2273@besplex.bde.org> <201203012347.32984.tijl@freebsd.org>

On Thu, 1 Mar 2012, Tijl Coosemans wrote:

> On Wednesday 29 February 2012 06:01:36 Bruce Evans wrote:
>> ...
>> Here is what current arches have in their machine/setjmp.h:
>>
>> amd64, i386: not much
>> arm: has lots of comments and register offsets.
>>      These are defined as _JB_REG_* so they aren't pollution, but
>>      there is no reason to export them to the application either.
>>      The actual structs are the usual 2 arrays of ints, with the
>>      extra 1 for both the comment not matching the code, as on i386.
>>      The extra 1 is unused, or at least has no _JB_REG_* for it.
>> ia64: has lots of namespace-pollution definitions under a __BSD_VISIBLE
>>      ifdef. The structs are arrays of long doubles! This defeats
>>      my idea of using a MI array of register_t's. _JBLEN could be
>>      expanded for long doubles, but __align() would be required too,
>>      and it gets messier than a separate file.
>> mips: just the usual extra 1 (now 4 instances for 32/64 doubling) and
>>      the usual comment not matching the code.
>> powerpc: like x86
>> sparc64: just the usual extra 1. The comment is fixed by removing it.
>>
>> So the extra 1 seems to be just a ~20-year old mistake, faithfully
>> propagated to all arches except amd64 i386, with unfaithful propagation
>> just fixed for i386.
>
> If we could add the returns_twice attribute to setjmp() then the
> compiler makes sure all registers are dead before calling it and
> jmp_buf wouldn't have to be that big.

I think compilers already do stuff like that automatically. They have
to for setjmp() to work. Since there was no way to declare such
attributes 20 years ago, compilers had to know that setjmp() was special
and make it work when it only has a Standard C declaration (and some
magic in its implementation).

> Also, from ISO C: "All accessible objects have values, and all other
> components of the abstract machine [249] have state, as of the time the
> longjmp function was called"
>
> "[249] This includes, but is not limited to, the floating-point status
> flags and the state of open files."
>
> So I think storing mxcsr in jmp_buf is incorrect.

This is a well known bug in ISO C. ISO C never even tried to support
longjmp() from signal handlers, but we do.
Supporting them requires restoring significant parts of the FP
environment. I fixed this for the i387 control word in FreeBSD about 20
years ago. This was required even to support float to integer
conversions on i386.

The situation with these has changed a bit. It was:
- there is a default rounding mode. C didn't support changing it. It
  is normally round-to-nearest. But for float to integer conversions,
  it is round-towards-zero. To implement the latter, compilers
  temporarily switch the rounding mode to round-towards-zero. In old
  versions of FreeBSD, and still with COMPAT4 signal handlers, handling
  of the FP state in signal handlers was mostly incorrect. Signal
  handlers were passed the current FP state, except for clobbering the
  exception flags for SIGFPE's for hardware FP exceptions. Thus it was
  normal for signal handlers to see the rounding mode switched to
  round-towards-zero. This is not part of the abstract machine. A
  normal rounding mode must be restored somehow. C90 didn't support
  changing the rounding mode, so it would have been correct for C90 to
  hard-code the rounding mode at the time of main() in signal handlers
  if you knew what that was (it really should be set in crt or
  inherited across exec, instead of being hard-coded in the kernel like
  it is in FreeBSD). But i387 supports changing the rounding mode. It
  is simplest to restore it to that at the time of the setjmp().

Now, things are even more complicated:
- signal handlers are normally passed a clean FP state. (Since C barely
  supports signal handlers, it doesn't say anything about this). Now,
  longjmp() from a signal handler would return this clean state if no
  FP state is restored, unless the signal handler does some FP
  operations that dirty its clean state.
  Returning the clean state has much the same effect in simple cases
  as restoring the state at the time of the setjmp(), because nothing
  except the compiler doing the float to integer conversions changes
  the state from its default, and the longjmp() takes us to a point
  where the compiler is not doing these conversions so it is correct
  for the normal state to be restored. There is now the minor
  simplification that i386 with SSE doesn't need the mode switch for
  floats, and i386 with SSE2 doesn't need it for doubles; but i386
  still needs it for long doubles. There is the minor complication
  that the signal handler may be COMPAT4, in which case its FP state is
  not clean and the old method must be used -- longjmp() can hardly be
  expected to tell which type the signal handler is and adjust its
  behaviour to match.
- C now supports changing the rounding mode. Its requirement that
  longjmp() not restore the previous rounding mode may be correct for
  some cases, but it is broken for longjmp() out of signal handlers:
  - suppose the signal handler gets a clean state, as in FreeBSD. Then
    any longjmp() out of a signal handler that doesn't restore the
    rounding mode (or any other part of the FP env) resets to the clean
    state (which should be the same as the default state); this state
    may differ from the state at the time of the setjmp() and also from
    the state at the time of the signal. This is broken. Perhaps the
  - suppose the signal handler doesn't get a clean state. Who knows
    what it is? Standards don't specify this. Even if the signal
    handler understands everything, it will have a hard time cleaning
    up the state so that it is right at the time of the longjmp().
    Note that it is not just SIGFPE handlers for hardware FP exceptions
    that would need to understand everything about FP to do the right
    thing.
    _All_ signal handlers would need this, since for example a harmless
    SIGINT handler might be interrupting a FP operation that changes
    the FP env in ways outside of the abstract machine.

Next, there are the FP exception flags. C90 doesn't support these, and
I didn't worry about these 20 years ago. I just put the i386 FP control
word in jmp_buf, and used fninit to clean out everything else in the FP
env. Now, C99 supports these. These should not be changed by longjmp().
However, for the case of longjmp() from a signal handler, if nothing is
restored, then all of them will be cleared by the longjmp() in the usual
case where the signal handler doesn't dirty its clean state. Worse, if
the signal handler dirties its state and doesn't do this intentionally
to prepare for the longjmp(), then the main part of the program gets its
exception flags replaced by the signal handler's. Again, it is very
difficult for signal handlers to understand FP well enough to do the
right thing. SIGFPE ones have to understand a little more here. They
have to understand that the kernel doesn't understand this stuff, so it
has destroyed the exception flags in the saved state after only making a
lossy copy of them (in the signal code). Destruction of the exception
flags allows the case of returning from a signal handler to sort of work
(the SIGFPE doesn't repeat). This problem only occurs if signals for FP
exceptions are unmasked. Otherwise, SIGFPE never occurs for FP
exceptions, but only for integer exceptions like division by 0.

The amd64 _setjmp.S and setjmp.S (but not its sigsetjmp.S) save mxcsr in
setjmp but only restore the non-flag bits from it in longjmp; they have
lost the fninit (except in sigsetjmp.S):

% setjmp:
%       fnstcw  64(%rcx)                /* 8; fpu cw */
%       stmxcsr 68(%rcx)                /* and mxcsr */
% longjmp:
%       /* Restore the mxcsr, but leave exception flags intact. */
%       stmxcsr -4(%rsp)
%       movl    68(%rdx),%eax
%       andl    $0xffffffc0,%eax
%       movl    -4(%rsp),%edi
%       andl    $0x3f,%edi
%       xorl    %eax,%edi
%       movl    %edi,-4(%rsp)
%       ldmxcsr -4(%rsp)
%       ...
%       // lost fninit here
%       fldcw   64(%rdx)

For longjmp() from signal handlers, leaving the exception flags intact
is worse than useless, since these are only the signal handler's
exception flags (it would be better to clear them).

On i386, the bugs are similar, except the mxcsr is not touched:
- fninit has been removed from _setjmp.S and setjmp.S
- sigsetjmp.S has not been touched (so it is missing mxcsr handling, but
  still does fninit).

Removing the fninits is very large breakage in the i386 case. amd64
doesn't support COMPAT4 signal handlers. Thus its signal handlers start
with a clean state and fninit in them only cleans up any dirt made by
the signal handler, and there is rarely even minor dirt. But for i386
with a COMPAT4 signal handler, when a signal interrupts an FP operation,
the i387 FP stack always has something on it. This needs to be cleaned
before or by longjmp() if longjmp() is used to quit the signal handler.

Without COMPAT4 signal handlers, the only obvious bug is that leaving
the i387 exception flags intact is worse than useless, the same as for
the sr part of mxcsr.

Other things in the FP env are less portable so they are less used so
they cause fewer problems. Standards don't support many other things,
so the implementation can be correct for them even if the standard
requires brokenness for other parts of the env. i387 precision control
is an example.

Bruce
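
For reference, the bit arithmetic in the amd64 longjmp listing above can
be restated in C. This is only a sketch: the names mxcsr_merge and
MXCSR_FLAGS_MASK are hypothetical (nothing in the tree is called that),
and it assumes the usual x86 layout in which the low 6 bits of mxcsr are
the sticky exception flags and everything above them is control state.

```c
#include <stdint.h>

/* Assumed layout: bits 0-5 of mxcsr are the sticky exception flags. */
#define MXCSR_FLAGS_MASK	0x3fu

/*
 * Hypothetical helper mirroring the longjmp sequence quoted above:
 * take the control bits (exception masks, rounding mode, etc.) from the
 * value saved by setjmp and the exception flags from the value that is
 * current at longjmp time.
 */
uint32_t
mxcsr_merge(uint32_t saved, uint32_t current)
{
	uint32_t control = saved & ~MXCSR_FLAGS_MASK;	/* andl $0xffffffc0 */
	uint32_t flags = current & MXCSR_FLAGS_MASK;	/* andl $0x3f */

	/* The two fields are disjoint, so xor acts as or here. */
	return (control ^ flags);
}
```

With saved = 0x1f80 (the power-on default) and current = 0x7fa1
(round-towards-zero, with the invalid and inexact flags set), the result
is 0x1fa1: the rounding mode and masks from setjmp time plus whatever
flags are set at longjmp time. When the longjmp() is out of a signal
handler that started with a clean state, those flags are the handler's
own, which is the case described above as worse than useless.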