Date:      Wed, 1 Nov 2000 14:00:35 -0500 (EST)
From:      Daniel Eischen <eischen@vigrid.com>
To:        John Polstra <jdp@polstra.com>
Cc:        current@freebsd.org, sobomax@freebsd.org, obrien@freebsd.org, deischen@freebsd.org
Subject:   Re: ABI is broken??
Message-ID:  <Pine.SUN.3.91.1001101135903.7366A-100000@pcnet1.pcnet.com>
In-Reply-To: <200011011835.eA1IZl207585@vashon.polstra.com>

On Wed, 1 Nov 2000, John Polstra wrote:
> In article <3A005026.47B9978C@FreeBSD.org>,
> Maxim Sobolev  <sobomax@FreeBSD.ORG> wrote:
> > 
> > I'm not sure what exactly caused this behaviour (I can guess two potential
> > culprits: O'Brien's changes in the crt stuff and Polstra's recent changes in
> > libgcc_r), but it seems that some programs built on the previous -current from
> > 27 October immediately segfault when I try to run them on a system installed
> > from today's sources. The segfault disappeared when I recompiled the affected
> > program. I'm attaching a short backtrace to this message.
> [...]
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
> > (gdb) bt
> > #0  0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
> > #1  0x806e782 in __register_frame_info ()
> > #2  0x287a3137 in _init () from /usr/lib/libc_r.so.4
> > #3  0x2879ffe5 in _init () from /usr/lib/libc_r.so.4
> > #4  0x280797fd in _rtld () from /usr/libexec/ld-elf.so.1
> 
> Here are all the random facts which, when put together, explain what
> is going on.
> 
> Your old application was (like all -pthread programs) linked
> with "/usr/lib/libgcc_r.a".  That library contains a function
> "__register_frame_info" which uses some of the facilities of the
> pthreads library "libc_r".
> 
> The pthreads library has to be initialized before it can be used, by
> a call to _thread_init.  If some functions such as pthread_mutex_lock
> are called before the library has been initialized, a segmentation
> violation results.
> 
> _thread_init is called automatically from libc_r's _init function
> when the dynamic linker loads the library.  Unfortunately, that
> isn't early enough.  libgcc_r is the first thing to be initialized,
> and it calls pthread_mutex_lock before _thread_init has been called.
> Or rather I should say that OLD versions of libgcc_r did that --
> because they were buggy.
> 
> In other words, your old application was linked with a buggy version
> of libgcc_r, but it didn't become apparent until now.
> 
> It didn't become apparent until now because our crtbegin.o and
> crtend.o were also buggy.  They failed to call __register_frame_info.
> This was a problem for C++ programs using exceptions, especially when
> the gcc port was used and DWARF2 exception handling was selected.
> 
> Now we have fixed crtbegin.o and crtend.o, and we have fixed
> libgcc_r.a.  But it causes problems for your old application because
> the new crtbegin.o and crtend.o (linked into the new shared libraries
> such as libc_r) call __register_frame_info in your old, buggy,
> statically linked libgcc_r.a.
> 
> Are you dizzy yet?

Yes ;-)

> To sum up, your old executable contains the bug but
> it wasn't triggered until the recent changes.
> 
> Now, what can or should we do about this?  Arguably we should simply
> say in the release notes, "Relink your old multithreaded applications.
> They had a bug which is now fixed."  But if there are binary-only
> commercial apps which exhibit the problem, this solution is useless.
> I don't know whether there are any such apps, but I doubt it.  N.B.,
> Linux apps don't count because they were never linked with our
> libgcc_r in the first place.
> 
> Or we can try to work around it, but there aren't any perfectly nice
> ways to do so.  Here are some possibilities:
> 
> - Put a hack in the threads library so that whenever
>   pthread_mutex_lock is called it checks to make sure that the
>   threads library has been initialized, and if not, it calls
>   _thread_init.  This is a poor solution because it adds overhead to
>   a rather performance-critical function -- though admittedly the
>   overhead is very small.  Another potential problem is that there
>   could be a race condition if several threads all called
>   pthread_mutex_lock at once before the threads library had been
>   initialized.  I don't think the race condition would materialize,
>   though, since the first call would come from libgcc_r, well before
>   the application had gotten control.
> 
> - Put a hack into the dynamic linker to call _thread_init very early
>   if that symbol was defined.  I like this solution even less,
>   because it's too hackish.  The dynamic linker isn't the place for
>   special hooks like that.
> 
> - Put a hack into crtbegin.o or crtend.o.  But we are using the
>   standard GNU versions of these, and I really really don't want to
>   change that.  In any case, it's the wrong place for the
>   work-around.
> 
> Overall I would lean toward putting the hack into pthread_mutex_lock.
> Comments?

If that's the lesser evil, then I guess it's OK with me.
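
For concreteness, the check described in the first option might look
roughly like the sketch below. It is not taken from libc_r: the names
sketch_thread_init, sketch_mutex_lock, sketch_threads_initialized and
sketch_init_calls, and the toy mutex structure, are hypothetical
stand-ins used only to show the shape of the guard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the libc_r sources. */
static int sketch_threads_initialized = 0;  /* set once the library is ready */
static int sketch_init_calls = 0;           /* counts how often init really ran */

struct sketch_mutex {
    int locked;                             /* 0 = free, 1 = held */
};

/* Stand-in for _thread_init: does its work only on the first call. */
static void sketch_thread_init(void)
{
    if (sketch_threads_initialized)
        return;
    sketch_threads_initialized = 1;
    sketch_init_calls++;
}

/*
 * Stand-in for pthread_mutex_lock with the proposed guard: if the
 * library has not been initialized yet, initialize it before touching
 * any thread state.  The added cost on the normal path is one test of
 * a flag.
 */
static int sketch_mutex_lock(struct sketch_mutex *m)
{
    assert(m != NULL);

    if (!sketch_threads_initialized)
        sketch_thread_init();

    if (m->locked)
        return -1;      /* a real implementation would block here */
    m->locked = 1;
    return 0;
}
```

Calling sketch_mutex_lock before anything else has run initializes the
library on that first call, and later calls skip straight past the
check. The flag test is not atomic, so the sketch relies on the
assumption stated above that the first call comes from libgcc_r before
any other thread exists.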

-- 
Dan Eischen


