Date:      Sat, 20 Feb 2016 00:06:50 +0100
From:      Jean-Sébastien Pédron <dumbbell@FreeBSD.org>
To:        freebsd-arch@FreeBSD.org
Subject:   "initial-exec" TLS model and dlopen(3)
Message-ID:  <56C7A00A.3030802@FreeBSD.org>

Hello!

== Context ==

After the Mesa 11.2.0 branch point (RC1 is scheduled for today), Emil
Velikov, the Mesa release engineer, told us he plans to keep only the
TLS-based GL dispatcher and remove the other code path.

However, even though all Linux distributions have used the TLS-based one
for a long time, it was still turned off by default in the configure
script, and we never field-tested it on FreeBSD.

We enabled the --enable-glx-tls flag in Mesa in our development Ports
tree. Unfortunately, some applications segfault after that. Firefox is
one of them.

Below is what I understand about this issue so far.

== The issue ==

Most applications are linked directly to libGL.so. In this case, there
is no problem. For example, glxgears is linked directly to libGL.so.

Other applications use dlopen(3) to load libGL.so, directly or
indirectly. For example, Firefox dlopens libxul.so, which is linked to
libGL.so.

Mesa uses the "initial-exec" model for at least one TLS variable:
https://cgit.freedesktop.org/mesa/mesa/tree/src/glx/glxcurrent.c#n79

I'm new to TLS implementation details, but if I understand correctly,
this model is a static one, meaning that a variable address is known and
it's accessed directly, like a normal variable, as opposed to dynamic
TLS models where a variable address is first queried with
__tls_get_addr(). This is all transparent to the program because the
compiler is responsible for generating the appropriate code, depending
on the model.
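
To make the distinction above between static and dynamic TLS models
concrete, here is a minimal sketch, assuming GCC or Clang (the
tls_model attribute is a compiler extension). The variable and function
names are made up for illustration and are not taken from Mesa. When
this is compiled with -fPIC into a shared object, the first variable is
typically reached through a fixed offset from the thread pointer, while
the second typically goes through __tls_get_addr():

```c
/* Static model: the offset from the thread pointer is fixed at load time. */
__thread int counter_static __attribute__((tls_model("initial-exec")));

/* Dynamic model: the address is looked up at run time in a shared object. */
__thread int counter_dynamic __attribute__((tls_model("global-dynamic")));

/* Both functions look identical in C; only the generated code differs. */
int bump_static(void)
{
    return ++counter_static;
}

int bump_dynamic(void)
{
    return ++counter_dynamic;
}
```

The source code is the same in both cases, which is why the choice of
model is invisible to the program until the loader fails to provide the
storage a static model assumes.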

In the case of a direct link like glxgears, our rtld (ld-elf.so)
allocates space during startup to copy static TLS variables from the
program and all linked libraries. libGL.so finds its variables where it
expects them to be, and glxgears is happy.

In the case of a dlopen(3) like Firefox, our rtld maps the dlopen'd
object and all its linked libraries but it doesn't look for any static
TLS variables. libGL.so accesses the allocated TLS storage (there is a
small extra chunk of zero'd memory allocated) but its variables were not
copied. So it gets a NULL pointer, dereferences it, End of the World.

Here is a small test program to demonstrate the crash:
https://github.com/dumbbell/test-tls-initial-exec

== Solutions ==

A first workaround is to LD_PRELOAD libGL.so or link the program
directly to libGL.so.

Another solution is the following: glibc (quite popular these days)
allocates extra static TLS space beyond the size of the TLS variables
available at startup (i.e. TLS variables from the program and linked
libraries). Then, when a library with static TLS variables is dlopen'd,
they are copied to this extra space. This space is not dynamically
extended, so it is first loaded, first served. If there is no space
left, I think dlopen(3) fails.
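
The "first loaded, first served" behaviour described above can be
sketched as a simple bump allocator over a fixed-size area. This is only
an illustration of the idea, not glibc's or rtld's actual code; the
struct and function names are hypothetical:

```c
#include <stddef.h>

/* Hypothetical bookkeeping for the static TLS area (not a real rtld type). */
struct static_tls {
    size_t used;  /* bytes already handed out */
    size_t size;  /* total bytes, including the extra space set aside at startup */
};

/*
 * Reserve `size` bytes aligned to `align` for a newly loaded object.
 * Returns the offset of the reservation, or -1 when the area is
 * exhausted (the caller would then fail the dlopen) or align is zero.
 */
long static_tls_reserve(struct static_tls *tls, size_t size, size_t align)
{
    size_t start;

    if (align == 0)
        return -1;
    /* Round the current high-water mark up to the requested alignment. */
    start = (tls->used + align - 1) / align * align;
    if (start > tls->size || size > tls->size - start)
        return -1;
    tls->used = start + size;
    return (long)start;
}
```

Because the area never grows, a failed reservation leaves the
bookkeeping untouched and a later, smaller request may still succeed.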

In FreeBSD's rtld, we already allocate extra space. See for instance the
use of the RTLD_STATIC_TLS_EXTRA constant here:
https://github.com/freebsd/freebsd/blob/master/libexec/rtld-elf/amd64/reloc.c#L453-L464

The comment even says that this extra space is allocated specifically
for dynamic modules. However, I don't see where we use this space.
dlopen_object() doesn't touch TLS at all (or I'm missing something).

FWIW, Mesa's libGL.so is not the only one to do this. NVIDIA's libGL.so
uses the same technique and, apparently, so does AMD Catalyst's on
Linux. I wonder if the following bug is caused by this exact issue:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205149

I would like to modify dlopen_object() to install static TLS variables
in this extra space. Or do you suggest a better alternative?
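
As a rough sketch of what "installing" static TLS variables could mean:
copy the object's initialized TLS image to its reserved offset in a
thread's static TLS block and zero-fill the remainder. Everything below
is hypothetical (the struct, the function name, the flat offset layout);
a real implementation would have to follow each architecture's TLS
variant and repeat this for every existing thread:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one object's TLS segment. */
struct tls_segment {
    const void *init;   /* initialized image to copy */
    size_t init_size;   /* bytes in the initialized image */
    size_t total_size;  /* init_size plus the zero-filled tail */
};

/*
 * Copy `seg` into `block` at `offset`.
 * Returns 0 on success, -1 if the segment is malformed or does not fit.
 */
int install_static_tls(unsigned char *block, size_t block_size,
    size_t offset, const struct tls_segment *seg)
{
    if (seg->init_size > seg->total_size)
        return -1;
    if (offset > block_size || seg->total_size > block_size - offset)
        return -1;
    if (seg->init_size > 0)
        memcpy(block + offset, seg->init, seg->init_size);
    /* The uninitialized part of the segment must read as zero. */
    memset(block + offset + seg->init_size, 0,
        seg->total_size - seg->init_size);
    return 0;
}
```

The zero-fill step matters as much as the copy: without the copy, a
library would only ever see zeroes where it expects initialized data.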

If possible, I would like to get this into FreeBSD 10.3-RELEASE to
avoid future maintenance headaches with Mesa.

Some references about this issue:
https://bugs.freedesktop.org/show_bug.cgi?id=35268
http://www.redhat.com/archives/phil-list/2003-February/msg00077.html
https://gcc.gnu.org/ml/gcc/2015-02/msg00095.html

-- 
Jean-Sébastien Pédron

