Date:      Thu, 15 Nov 2001 15:05:49 -0800 (PST)
From:      John Baldwin <jhb@FreeBSD.org>
To:        Daniel Eischen <eischen@pcnet1.pcnet.com>
Cc:        hackers@FreeBSD.ORG, freebsd-ports@FreeBSD.ORG, marcus@marcuscom.com, Maxim Sobolev <sobomax@FreeBSD.ORG>
Subject:   Re: Using bit 21 of EFLAGS in user-mode [was: Re: sigreturn: efl
Message-ID:  <XFMail.011115150549.jhb@FreeBSD.org>
In-Reply-To: <Pine.SUN.3.91.1011115173611.10851A-100000@pcnet1.pcnet.com>

On 15-Nov-01 Daniel Eischen wrote:
> On Thu, 15 Nov 2001, Maxim Sobolev wrote:
>> On Thu, 15 Nov 2001 14:56:31 -0500 (EST), Joe Clarke wrote:
>> > 
>> > I learned about this by reading through some of the -hackers archives.
>> > One person complained of similar errors trying to get xine to work on
>> > FreeBSD.  Removing the MMX detection code fixed it.  I remembered libpng
>> > also used MMX, so I removed the pnggccrd.c source, and voila!
>> > 
>> > Based on core dumps, strace output, and a lot of code surfing, this makes
>> > sense to me.  Basically, any png-dependent app's thread that runs longer
>> > than what ITIMER_PROF is set to gets hit with a SIGPROF.  When that
>> > happens, things context switch.  eflags must have been corrupted by the
>> > MMX code, thus sigreturn() bombs out, and causes uthread_kern to die as
>> > well.  Here's what strace looks like when balsa tries to read a 33 MB
>> > mailbox:
>> > 
>> > 74202 sigreturn(0x81f2c64
>> > 
>> > When this happens, strace politely dies with a bus error.
>> > 
>> > Thanks for testing this, Maxim.  Hopefully someone can find the problem
>> > and fix it for good.
>> 
>> That explains... After a quick glance at png code I found that
>> the only place where EFLAGS is altered is CPUID code, where
>> the library flips bit 21 of EFLAGS in order to ensure that the
>> CPUID instruction is supported (otherwise it will get SIGILL
>> on older processors). Unfortunately, for some reason FreeBSD
> 
> Does it need to keep bit 21 of EFLAGS flipped, or can libpng
> set it back and keep knowledge that CPUID is supported?  Or
> does that bit need to remain set for CPUID to work?

It needs to be able to change it.  If you can change the value of the bit, then
cpuid is supported.  The test is done by: pushf ; pop %eax ; mov %eax,%ebx ;
xor $PSL_ID,%eax ; push %eax ; popf ; pushf ; pop %eax ; then compare bit PSL_ID
of %eax and %ebx to see if they match (if they differ, the bit can be changed).
The problem arises if a signal comes in partway through that bit toggling, for
example due to a profiling timer.  I think the problem may be that it uses a
sequence that leaves the bit set, thus the kernel freaks out thinking that the
user has changed a kernel-only flag.  The solution is Maxim's patch to make the
kernel not care about the flag (which it shouldn't, since cpuid is not a
privileged instruction).
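
For illustration, the toggle test described above could be sketched in C with
GCC-style inline assembly roughly as follows.  This is only a sketch under
stated assumptions, not the code libpng or xine actually uses: the function
name cpuid_supported is hypothetical, PSL_ID is taken to be 0x00200000 (bit 21,
per the subject line), the outer pushf/popf that restores the original flags is
an addition, and so is the red-zone skip needed when the same sequence is built
for x86-64.

```c
#define PSL_ID 0x00200000UL   /* EFLAGS bit 21, the ID flag (assumed value) */

#if defined(__x86_64__)
/* Assumption: step over the x86-64 red zone before pushing; lea leaves flags alone. */
#define SKIP_REDZONE   "lea -128(%%rsp), %%rsp\n\t"
#define UNSKIP_REDZONE "lea 128(%%rsp), %%rsp\n\t"
#else
#define SKIP_REDZONE   ""
#define UNSKIP_REDZONE ""
#endif

/*
 * Hypothetical helper: returns 1 if the ID bit can be toggled (cpuid is
 * available), 0 if it cannot, and -1 when not built for x86 with a
 * GCC-compatible compiler.
 */
static int cpuid_supported(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned long before, after;

    __asm__ volatile (
        SKIP_REDZONE
        "pushf\n\t"          /* save the caller's flags for the final popf */
        "pushf\n\t"
        "pop %0\n\t"         /* before = current flags */
        "mov %0, %1\n\t"
        "xor %2, %1\n\t"     /* flip the ID bit in the copy */
        "push %1\n\t"
        "popf\n\t"           /* try to load the flipped value */
        "pushf\n\t"
        "pop %1\n\t"         /* after = flags as the CPU kept them */
        "popf\n\t"           /* restore the original flags, ID bit included */
        UNSKIP_REDZONE
        : "=&r"(before), "=&r"(after)
        : "i"(PSL_ID)
        : "cc", "memory");

    /* If the bit differs between the two reads, the toggle took effect. */
    return ((before ^ after) & PSL_ID) != 0;
#else
    return -1;
#endif
}
```

Restoring the flags at the end narrows the window in which the bit is flipped,
but a signal can still arrive between the two popf instructions, so this alone
does not avoid the sigreturn() failure.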

> If at all possible, a fix should be committed that wouldn't
> necessitate a new kernel be built for -stable.

Perhaps you could patch libpng to block all signals and change the code to add
a pushf/popf around the entire sequence to preserve the PSL_ID flag.  You
would need to do this for xine and other apps that attempt to use cpuid as well.

Fixing the kernel is much easier as Maxim's patch shows.

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-ports" in the body of the message