Date:      Thu, 15 Nov 2001 14:56:31 -0500 (EST)
From:      Joe Clarke <marcus@marcuscom.com>
To:        Maxim Sobolev <sobomax@FreeBSD.org>
Cc:        freebsd-ports@FreeBSD.org, <freebsd-stable@FreeBSD.org>, <hackers@FreeBSD.org>
Subject:   Re: sigreturn: eflags creash (fixed!)
Message-ID:  <20011115145029.G47613-200000@shumai.marcuscom.com>
In-Reply-To: <3BF41BE2.BD7D8EAA@FreeBSD.org>

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.
  Send mail to mime@docserver.cac.washington.edu for more info.

--0-2136882512-1005854191=:47613
Content-Type: TEXT/PLAIN; charset=US-ASCII



On Thu, 15 Nov 2001, Maxim Sobolev wrote:

> Joe Clarke wrote:
> >
> > Sorry for the wide distribution, but I have sent email to both lists
> > regarding this problem in the past.  It seems that while doing intensive
> > threaded operations in Gnome applications, crashes occur with the
> > following kernel message:
> >
> > sigreturn: eflags 0x280
> >
> > And, in .gnomerc-errors:
> >
> > Fatal error 'Thread has returned from sigreturn or longjump'
> >
> > The problem as I have found it is with libpng.  libpng uses MMX
> > optimizations by default on FreeBSD.  If you apply the attached patch to
> > patch-aa in /usr/ports/graphics/png, the problem goes away.  You only need
> > to recompile and install libpng.  Reinstalling Gnome isn't necessary.  It
> > seems the MMX optimizations are corrupting eflags, and when a thread tries
> > to restore context after a signal, things go really wrong.
> >
> > The true fix will probably come in analyzing the MMX code in libpng.
> > Unfortunately, I don't know enough about x86 assembly to be of much use
> > here.  Hopefully this will help others experiencing the same problem.
>
> Very interesting, and weird if true. I'll test this tomorrow. In the
> meantime, could some CPU guru confirm or reject theoretical
> possibility of MMX user-level code causing problems to the kernel?
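
For reference, here is a small sketch that decodes the value in the kernel
message quoted above (eflags 0x280) into x86 flag names.  The bit positions
are the standard EFLAGS layout; the helper name decode_eflags is made up
for illustration, and nothing here reproduces the kernel's actual check in
sigreturn().

```python
# Standard x86 EFLAGS bit positions (bit 1 is reserved and reads as 1,
# bits 12-13 hold the two-bit IOPL field and are reported separately).
EFLAGS_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF", 8: "TF",
    9: "IF", 10: "DF", 11: "OF", 14: "NT", 16: "RF", 17: "VM",
    18: "AC", 19: "VIF", 20: "VIP", 21: "ID",
}


def decode_eflags(value):
    """Return (list of set flag names, IOPL) for an EFLAGS image.

    Hypothetical helper, for illustration only.
    """
    names = [name for bit, name in sorted(EFLAGS_BITS.items())
             if value & (1 << bit)]
    iopl = (value >> 12) & 0x3
    return names, iopl


if __name__ == "__main__":
    names, iopl = decode_eflags(0x280)
    print(names, iopl)  # ['SF', 'IF'] 0
```

So 0x280 has only the sign flag and the interrupt-enable flag set.  Note
that the reserved bit 1 is clear, which is not something a real EFLAGS
image normally shows.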

I learned about this by reading through some of the -hackers archives.
One person complained of similar errors while trying to get xine to work
on FreeBSD, and removing the MMX detection code fixed it.  I remembered
that libpng also uses MMX, so I took pnggccrd.c out of the build, and
voila!

Based on core dumps, strace output, and a lot of code reading, this makes
sense to me.  Basically, any thread in a png-dependent app that runs
longer than the ITIMER_PROF interval gets hit with a SIGPROF.  When that
happens, the thread library switches context.  eflags must have been
corrupted by the MMX code, so sigreturn() fails, and that takes
uthread_kern down with it.  Here's what strace shows when balsa tries to
read a 33 MB mailbox:

...
74202 --- SIGPROF (Profiling timer expired) ---
74202 --- SIGPROF (Profiling timer expired) ---
74202 gettimeofday({1005789324, 257513}, NULL) = 0
74202 sigprocmask(SIG_SETMASK, [], NULL) = 0
74202 sigaltstack({ss_sp=0x811b000, ss_flags=0, ss_size=40960}, NULL) = 0
74202 poll([{fd=5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10,
events=POLLIN}, {fd=11, events=POLLIN|POLLPRI}, {fd=13, events=POLLIN},
{fd=12, events=POLLIN}], 6, 0) = 0
74202 sigreturn(0x81f2c64

When this happens, strace politely dies with a bus error.
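
A minimal sketch of the timer mechanism described above, for anyone who
wants to watch it in isolation.  It is written in Python rather than C, and
it is not what libc_r does internally: it only shows that a process which
arms ITIMER_PROF keeps receiving SIGPROF while it burns CPU time.  The
function name count_sigprof and its parameters are made up for
illustration, and it must be run from the main thread on a Unix system.

```python
import signal
import time


def count_sigprof(interval=0.01, cpu_seconds=0.2):
    """Burn CPU for cpu_seconds and count SIGPROF deliveries.

    Hypothetical helper: ITIMER_PROF counts CPU time used by the process,
    so a busy loop should see roughly cpu_seconds / interval signals.
    """
    ticks = 0

    def on_sigprof(signum, frame):
        nonlocal ticks
        ticks += 1

    old_handler = signal.signal(signal.SIGPROF, on_sigprof)
    # Arm a repeating profiling timer: first expiry and period are equal.
    signal.setitimer(signal.ITIMER_PROF, interval, interval)
    try:
        deadline = time.process_time() + cpu_seconds
        while time.process_time() < deadline:
            pass  # CPU-bound work stands in for a long-running thread
    finally:
        # Disarm the timer before putting the previous handler back.
        signal.setitimer(signal.ITIMER_PROF, 0, 0)
        if old_handler is None:
            old_handler = signal.SIG_DFL
        signal.signal(signal.SIGPROF, old_handler)
    return ticks


if __name__ == "__main__":
    print("SIGPROF deliveries:", count_sigprof())
```

Each delivery interrupts the running code, runs the handler, and then has
to restore the saved context.  That restore step corresponds to the
sigreturn() call that fails in the trace above.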

Thanks for testing this, Maxim.  Hopefully someone can find the problem
and fix it for good.

Joe

P.S. My original patch sucks; I was in a hurry.  The one attached is
better.


>
> -Maxim
>
>

--0-2136882512-1005854191=:47613
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="patch-aa.diff"
Content-Transfer-Encoding: 7BIT
Content-ID: <20011115145631.X47613@shumai.marcuscom.com>
Content-Description: 
Content-Disposition: attachment; filename="patch-aa.diff"

--- patch-aa.orig	Thu Nov 15 14:55:43 2001
+++ patch-aa	Thu Nov 15 14:55:57 2001
@@ -43,8 +43,8 @@
  
 -all: libpng.a pngtest
 +.if (${ARCH} == "i386")
-+CFLAGS+=-DPNG_USE_PNGGCCRD
-+OBJS+=pnggccrd.o
++#CFLAGS+=-DPNG_USE_PNGGCCRD
++#OBJS+=pnggccrd.o
 +.endif
 +
 +.SUFFIXES: .c .so .o
@@ -91,5 +91,5 @@
  pngwtran.o: png.h pngconf.h
  pngwutil.o: png.h pngconf.h
  pngpread.o: png.h pngconf.h
-+pnggccrd.o: png.h pngconf.h pngasmrd.h
++#pnggccrd.o: png.h pngconf.h pngasmrd.h
  
--0-2136882512-1005854191=:47613--

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message