Date:      Wed, 1 Nov 2000 10:35:47 -0800 (PST)
From:      John Polstra <jdp@polstra.com>
To:        current@freebsd.org
Cc:        sobomax@freebsd.org, obrien@freebsd.org, deischen@freebsd.org
Subject:   Re: ABI is broken??
Message-ID:  <200011011835.eA1IZl207585@vashon.polstra.com>
In-Reply-To: <3A005026.47B9978C@FreeBSD.org>
References:  <3A005026.47B9978C@FreeBSD.org>

In article <3A005026.47B9978C@FreeBSD.org>,
Maxim Sobolev  <sobomax@FreeBSD.ORG> wrote:
> 
> I'm not sure what exactly caused this behaviour (I can guess two potential
> victims: O'Brien's changes in crt stuff and recent Polstra's changes in
> libgcc_r), but it seems that some programs built on the previous -current from
> 27 October immediately segfault when I'm trying to run them on a system installed
> from today's sources. The segfault disappeared when I recompiled affected
> program. With this message I'm attaching short backtrace.
[...]
> Program received signal SIGSEGV, Segmentation fault.
> 0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
> (gdb) bt
> #0  0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
> #1  0x806e782 in __register_frame_info ()
> #2  0x287a3137 in _init () from /usr/lib/libc_r.so.4
> #3  0x2879ffe5 in _init () from /usr/lib/libc_r.so.4
> #4  0x280797fd in _rtld () from /usr/libexec/ld-elf.so.1

Here are all the random facts which, when put together, explain what
is going on.

Your old application was (like all -pthread programs) linked
with "/usr/lib/libgcc_r.a".  That library contains a function
"__register_frame_info" which uses some of the facilities of the
pthreads library "libc_r".

The pthreads library has to be initialized before it can be used, by
a call to _thread_init.  If some functions such as pthread_mutex_lock
are called before the library has been initialized, a segmentation
violation results.

_thread_init is called automatically from libc_r's _init function
when the dynamic linker loads the library.  Unfortunately, that
isn't early enough.  libgcc_r is the first thing to be initialized,
and it calls pthread_mutex_lock before _thread_init has been called.
Or rather I should say that OLD versions of libgcc_r did that --
because they were buggy.
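
To make the ordering problem concrete, here is a toy model of it in C.
It is only a sketch: every name with a demo_ prefix is a hypothetical
stand-in, not code from libc_r or libgcc_r, and the lock stand-in
returns -1 at the point where the real function would fault.

```c
#include <assert.h>
#include <stddef.h>

/* All demo_ names are hypothetical stand-ins, not real library code. */

static int demo_threads_ready = 0;  /* set by the stand-in for _thread_init */
static int demo_early_lock_rc = 0;  /* result of the lock taken at start-up */

/* Stand-in for _thread_init. */
static void demo_thread_init(void)
{
    demo_threads_ready = 1;
}

/* Stand-in for pthread_mutex_lock: -1 marks where the real one would fault. */
static int demo_mutex_lock(void)
{
    return demo_threads_ready ? 0 : -1;
}

/* Stand-in for the old libgcc_r start-up code, which takes a lock. */
static void demo_libgcc_r_init(void)
{
    demo_early_lock_rc = demo_mutex_lock();
}

/* Stand-in for libc_r's _init, which is where _thread_init gets called. */
static void demo_libc_r_init(void)
{
    demo_thread_init();
}

typedef void (*demo_init_fn)(void);

/* Run the initializers strictly in the order given. */
static void demo_run_inits(const demo_init_fn *fns, size_t n)
{
    assert(fns != NULL);
    for (size_t i = 0; i < n; i++)
        fns[i]();
}
```

Passing demo_libgcc_r_init ahead of demo_libc_r_init to demo_run_inits
leaves demo_early_lock_rc at -1, which is the situation described
above.  With the two swapped, the early lock succeeds.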

In other words, your old application was linked with a buggy version
of libgcc_r, but it didn't become apparent until now.

It didn't become apparent until now because our crtbegin.o and
crtend.o were also buggy.  They failed to call __register_frame_info.
This was a problem for C++ programs using exceptions, especially when
the gcc port was used and DWARF2 exception handling was selected.

Now we have fixed crtbegin.o and crtend.o, and we have fixed
libgcc_r.a.  But the fix causes problems for your old application,
because the new crtbegin.o and crtend.o (linked into the new shared
libraries such as libc_r) call __register_frame_info in your old,
buggy, statically linked libgcc_r.a.

Are you dizzy yet?  To sum up, your old executable contains the bug but
it wasn't triggered until the recent changes.

Now, what can or should we do about this?  Arguably we should simply
say in the release notes, "Relink your old multithreaded applications.
They had a bug which is now fixed."  But if there are binary-only
commercial apps which exhibit the problem, that advice is useless,
since their users cannot relink them.  I doubt there are any such
apps, but I don't know for sure.  N.B., Linux apps don't count because
they were never linked with our libgcc_r in the first place.

Or we can try to work around it, but there aren't any perfectly nice
ways to do so.  Here are some possibilities:

- Put a hack in the threads library so that whenever
  pthread_mutex_lock is called it checks to make sure that the
  threads library has been initialized, and if not, it calls
  _thread_init.  This is a poor solution because it adds overhead to
  a rather performance-critical function -- though admittedly the
  overhead is very small.  Another potential problem is that there
  could be a race condition if several threads all called
  pthread_mutex_lock at once before the threads library had been
  initialized.  I don't think the race condition would materialize,
  though, since the first call would come from libgcc_r, well before
  the application had gotten control.

- Put a hack into the dynamic linker to call _thread_init very early
  if that symbol is defined.  I like this solution even less,
  because it's too hackish.  The dynamic linker isn't the place for
  special hooks like that.

- Put a hack into crtbegin.o or crtend.o.  But we are using the
  standard GNU versions of these, and I really really don't want to
  change that.  In any case, it's the wrong place for the
  work-around.
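
For the first possibility, the guard might look roughly like the
following.  This is a minimal sketch under stated assumptions, not the
actual libc_r change: the sketch_ names are hypothetical, the real
_thread_init and pthread_mutex_lock are only imitated, and the race
condition mentioned above is deliberately not handled.

```c
#include <assert.h>
#include <stdbool.h>

/* All sketch_ names are hypothetical; none are real libc_r internals. */

static bool sketch_initialized = false;  /* has the library been set up? */
static int  sketch_init_calls = 0;       /* how many times set-up really ran */

struct sketch_mutex {
    int locked;
};

/* Stand-in for _thread_init: one-time set-up, harmless if repeated. */
static void sketch_thread_init(void)
{
    if (sketch_initialized)
        return;
    sketch_init_calls++;
    sketch_initialized = true;
}

/* Stand-in for pthread_mutex_lock with the proposed guard added. */
static int sketch_mutex_lock(struct sketch_mutex *m)
{
    if (!sketch_initialized)     /* the extra test paid on every call */
        sketch_thread_init();
    assert(sketch_initialized);  /* from here on the library is usable */
    m->locked = 1;               /* real locking logic omitted */
    return 0;
}
```

The cost on the normal path is the single test of sketch_initialized.
A call that arrives before set-up, such as the one from the old
libgcc_r, triggers the set-up itself instead of faulting.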

Overall I would lean toward putting the hack into pthread_mutex_lock.
Comments?

John
-- 
  John Polstra                                               jdp@polstra.com
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa


