Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 12 May 1995 19:32:58 +1000
From:      Bruce Evans <bde@zeta.org.au>
To:        maddox@CS.Berkeley.EDU, ylo@fx7.cs.hut.fi
Cc:        freebsd-bugs@freefall.cdrom.com, maddox@redwood.CS.Berkeley.EDU
Subject:   Re: i386/395: CRITICAL PROBLEM: spl functions implemented incorrectly
Message-ID:  <199505120932.TAA20816@godzilla.zeta.org.au>

next in thread | raw e-mail | index | archive | help
>You recently reported a bug in "spl.h" :

>>	...
>>	This means that the compiler has full knowledge about the
>>	semantics, side-effects, and dependencies of these functions.
>>	The compiler inlines these functions, and can mix (reorder) the
>>	resulting code with other code in the function that calls the
>>	spl function.

No it can't.  `static inline' functions have the same semantics as
ordinary static functions.  They aren't equivalent to macros.  Reordering
is severely limited by the C Standard.  The ISO version section 5.1.2.3
says:

"Accessing a volatile object, modifying an object, modifying a file, or
calling a function that does any of these operations are all _side effects_.
At ... _sequence points_, all side effects of previous evaluations shall
be complete and no side effects of subsequent evaluations shall have
taken place".

Note that static [inline] functions are not special.  spl*() all modify
the (non-volatile) object `cpl', so the above applies to them.  Making
`cpl' volatile wouldn't affect this and would be wrong (it isn't volatile
as far as C routines can tell since interrupt handlers preserve it).

>	...
>>Fix:
>	...

>>	i386/include/spl.h:
>>	  - Move the GENSPL macro and its uses, and spl0 and splx to
>>	    i386/386/sys_machdep.c.  Edit the macro and functions to
>>	    make them normal non-static, non-inline functions.
>       ....

This makes optimizations harder but doesn't stop them.  The linker is
allowed to do global optimizations.

The standard also allows entire expressions, function calls and even
accesses to volatile objects to be optimized away if it can be shown
that the side effects are not needed.  This is the usual "as if"
rule explictly stated.

The current implementation of spl*() only relies on the compiler not
being able to show if the assignments to cpl are neeeded.  If it can
show that then it is broken :-) (they are needed for interrupt handling).
It would take a global optimizer noticing that there are no Standard C
signal handlers in the kernel to think it can prove that any global
variable is not needed.  When we have a global optimizer then this will
be easy to fix by using `cpl' in a dummy signal handler.

The average device driver probably depends on non-reordering in many other
places, but probably has more bugs from not declaring enough variables as
volatile.

>I think there is another potential problem lurking in spl.h, in the following:

>	/*
>	 * ipending has to be volatile so that it is read every time it is accessed
>	 * in splx() and spl0(), but we don't want it to be read nonatomically when
>	 * it is changed.  Pretending that ipending is a plain int happens to give
>	 * suitable atomic code for "ipending |= constant;".
>	*/
>	#define	setdelayed()	(*(unsigned *)&ipending |= loadandclear(&idelayed))
>	#define	setsoftast()	(*(unsigned *)&ipending |= SWI_AST_PENDING)
>...

>I assume that the remark above regarding "suitable atomic code" means
>that the bits are set in place in memory, using a single instruction.
>Just because GCC happens to do this today doesn't mean it will tomorrow.

Yes.  I'm willing to examine the generated code.  I would be more concerned
about this if all the drivers were careful about it.

>As the internal execution mechanisms of new-generation x86 processors become 
>more heavily parallel and RISC-like, instruction sequences involving 
>architecturally
>unnecessary register temporaries may turn out to be faster.  I wouldn't bet on 

It already has the same speed on a 486 (one cycle to load, 1 to OR, 1 to
store), but it clobbers a register, so ORing directly into memory is the
best possible on i386's and i486's.

>the stability
>of a compiler's choice of instructions in future releases, and I doubt that 
>anyone is
>going to check this each time a new version of GCC is integrated into FreeBSD.

I always read the diffs for i386.md.

>I would suggest defining these functions as in-line assembly-language 
>functions.
>In any case, this is the sort of thing that should be done in assembler, not C.

I agree about this for setting `ipending', but not for spl*() on a
uniprocessor.  spl*() really can be written in C in this case.  On
multiprocessors many other things will break.

The setting of `ipending' is modeled on the setting of `netisr' in
<net/netisr.h>:

#define	schednetisr(anisr)	{ netisr |= 1<<(anisr); setsoftnet(); }

`netisr' is volatile, so this is guaranteed to go through a register and
lose any bits that have been set by interrupt handlers.  This may not
be a problem in practice - it is safe iff schednetisr() is always done
at ipl >= imp so that it can't be interrupted by net interrupt handlers.
More care is required setting `ipending' since interrupts can occur,
e.g., setsoftast() is potentially interruptable by setsofttty().

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199505120932.TAA20816>