Date: Sat, 13 May 1995 20:08:00 +1000
From: Bruce Evans <bde@zeta.org.au>
To: bakul@netcom.com, bugs@FreeBSD.org
Subject: Re: i386/395: CRITICAL PROBLEM: spl functions implemented incorrectly
Message-ID: <199505131008.UAA23813@godzilla.zeta.org.au>

>Making driver globals etc. volatile is the way to go (this
>problem has already been encountered and solved in a similar
>way by a number of commercial systems, especially
>multi-processor ones). This seems to work because a compiler must
>not reorder volatile accesses w.r.t. *other* volatile
>accesses and it must not optimize away such references.
>Though, I don't know if gcc handles them right in all cases.

In the case of the current implementation of spl*(), it works because
the compiler must not reorder volatile accesses w.r.t. the store to cpl.

>If you want to temporarily lose volatility for performance
>(and where you know this is safe), you can manually copy
>such values to registers or perhaps, typecasting will work.

It's probably almost always safe because the volatile variables are
almost always accessed in spl'ed regions where they are nonvolatile.
For volatile hardware registers it is often important to copy the value
for further processing because multiple accesses might have side effects.

>Pete Carah wrote:
>> To make global optimizers safe in the presence of *any* non-memory
>> side effects (I/O) one needs function-attribute flags in the library

>I think GNU extension of volatile attribute for a function
>deals with this well, when used with an asm() `function'.

`volatile' functions are ones that don't return. This meaning of
volatile is deprecated (use the noreturn attribute). `volatile' asms
are ones that can't be moved or deleted.

The following seems to work OK to flush cached values of variables that
should be volatile:

	#define invalidate_previous_loads() __asm __volatile("" : : : "memory")

It pretends to clobber random memory, so it flushes cached values of all
variables. It does the null assembler operation "" so it doesn't
necessarily waste any time. In practice it may save or waste a little
time due to changing the pattern and number of loads.

I added it to spl*() and examined the 14 modules that changed (out of 26).
I didn't find any bugs.
Usually only the caching of local variables changed, as in the following:

	void foo(int x)
	{
		invalidate_previous_loads();
		bar(x);
	}

Gcc copies the value of the arg to a register variable _before_ the
invalidate step and passes the copy to bar(). Without the invalidate
step, it pushes the original arg without loading it. Pushing through a
register happens to be faster on i486's and would be used in all cases
if the -m486 flag was used.

The same sort of thing happens if you attempt to cache a value in a local
variable (preferably a register) before calling spl*(). With the
invalidate step, the caching is honoured. Without it, the caching is
sometimes ignored, e.g., when the cached value has to be spilled to
memory, there is little point in caching it, and gcc sometimes notices
this. It is always(?) wrong to cache volatile variables before calling
spl*(), so I wasn't surprised that there seemed to be no important
changes.

Bruce
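
A minimal sketch of how the barrier described above might sit inside
spl-style helpers, assuming gcc or a compatible compiler. Only the
invalidate_previous_loads() macro comes from the message. The names
splhigh_sketch(), splx_sketch(), consume_event() and pending_events are
hypothetical stand-ins for illustration, and the cpl variable here is a
plain global rather than the kernel's real priority mask.

```c
#include <assert.h>

/* Barrier from the message: emits no instructions, but the "memory"
 * clobber stops the compiler from keeping memory values in registers
 * across this point. */
#define invalidate_previous_loads() __asm __volatile("" : : : "memory")

/* Hypothetical stand-ins for the priority mask and some driver state. */
static volatile unsigned cpl;   /* current priority mask (illustrative) */
static int pending_events;      /* deliberately NOT declared volatile */

/* Hypothetical spl-style helper: raise the mask, return the old value. */
static unsigned splhigh_sketch(void)
{
    unsigned old = cpl;

    cpl = ~0u;
    invalidate_previous_loads();    /* loads below are redone after the store */
    return old;
}

/* Hypothetical counterpart: restore a previously saved mask. */
static void splx_sketch(unsigned old)
{
    invalidate_previous_loads();    /* protected stores are written out first */
    cpl = old;
}

/* Consume one event inside the protected region; return how many remain. */
static int consume_event(void)
{
    unsigned s = splhigh_sketch();
    int left;

    if (pending_events > 0)
        pending_events--;
    left = pending_events;
    splx_sketch(s);
    assert(cpl == s);               /* the mask is back to its saved value */
    return left;
}
```

The barrier only constrains the compiler. It does not emit a CPU fence,
so this sketch says nothing about ordering between processors.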