Date: Sat, 20 Feb 2016 00:06:50 +0100
From: Jean-Sébastien Pédron <dumbbell@FreeBSD.org>
To: freebsd-arch@FreeBSD.org
Subject: "initial-exec" TLS model and dlopen(3)
Message-ID: <56C7A00A.3030802@FreeBSD.org>

Hello!

== Context ==

After the Mesa 11.2.0 branch point (RC1 is scheduled for today), Emil Velikov, the Mesa release engineer, told us he plans to keep only the TLS-based GL dispatcher and remove the other code path. However, even though all Linux distributions have used the TLS-based one for a long time, it was still turned off by default in configure, and we never field-tested it on FreeBSD.

We enabled the --enable-glx-tls flag in Mesa in our development Ports tree. Unfortunately, some applications segfault after that. Firefox is one of them. Below is what I understand about this issue so far.

== The issue ==

Most applications are linked directly to libGL.so. In this case, there is no problem. For example, glxgears is linked directly to libGL.so.

Some of them use dlopen(3) to load libGL.so, directly or indirectly. For example, Firefox dlopens libxul.so, which is linked to libGL.so.

Mesa uses the "initial-exec" model for at least one TLS variable:
https://cgit.freedesktop.org/mesa/mesa/tree/src/glx/glxcurrent.c#n79

I'm new to TLS implementation details, but if I understand correctly, this model is a static one: the offset of the variable within the thread's TLS block is known at load time and the variable is accessed directly, almost like a normal variable. In the dynamic TLS models, by contrast, the address of a variable is first queried with __tls_get_addr(). This is all transparent to the program because the compiler is responsible for generating the appropriate code, depending on the model.

In the case of a direct link like glxgears, our rtld (ld-elf.so) allocates space during startup to copy static TLS variables from the program and all linked libraries. libGL.so finds its variables where it expects them to be, and glxgears is happy.
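
As an illustration of how a TLS model is selected in source code, here is a minimal sketch using the GCC/Clang `tls_model` variable attribute. The names `current_context`, `bind_context()` and `get_current_context()` are hypothetical and are not Mesa's; the point is only that the accessors look like ordinary variable accesses, and that the compiler emits the model-specific code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical per-thread "current context" pointer (not Mesa's code).
 * The attribute requests the static "initial-exec" model: the variable
 * is reached at an offset from the thread pointer that the runtime
 * linker resolves at load time, with no call to __tls_get_addr().
 */
static __thread void *current_context
    __attribute__((tls_model("initial-exec")));

/* Bind a context to the calling thread. */
void bind_context(void *ctx)
{
    assert(ctx != NULL);
    current_context = ctx;   /* plain store; the compiler handles TLS */
}

/* Return the context bound to the calling thread, or NULL if none. */
void *get_current_context(void)
{
    return current_context;  /* plain load; the compiler handles TLS */
}
```

Replacing "initial-exec" with "global-dynamic" in the attribute leaves the C code unchanged; only the generated access sequence differs. The choice matters mostly when the code is built as a shared object, which is the libGL.so case discussed here.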

In the case of a dlopen(3) like Firefox, our rtld maps the dlopen'd object and all its linked libraries, but it doesn't look for any static TLS variables. libGL.so accesses the allocated TLS storage (there is a small extra chunk of zeroed memory allocated), but its variables were not copied there. So it gets a NULL pointer, dereferences it, and it is the End of the World.

Here is a small test program to demonstrate the crash:
https://github.com/dumbbell/test-tls-initial-exec

== Solutions ==

A first workaround is to LD_PRELOAD libGL.so or to link the program directly to libGL.so.

Another solution is the following: Glibc (quite popular these days) allocates extra static TLS space on top of the size of the TLS variables available at startup (i.e. TLS variables from the program and linked libraries). Then, when a library with static TLS variables is dlopen'd, those variables are copied to this extra space. This space is not dynamically extended, so it is first loaded, first served. If there is no space left, I think dlopen(3) fails.

In FreeBSD's rtld, we already allocate extra space. See for instance the use of the RTLD_STATIC_TLS_EXTRA constant here:
https://github.com/freebsd/freebsd/blob/master/libexec/rtld-elf/amd64/reloc.c#L453-L464

The comment even says that this extra space is allocated specifically for dynamic modules. However, I don't see where we use this space. dlopen_object() doesn't mess with TLS at all (or I'm missing something).

FWIW, Mesa's libGL.so is not the only one to do this. NVIDIA's libGL.so uses the same technique and, apparently, so does AMD Catalyst's on Linux. I wonder if the following bug is caused by this exact issue:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205149

I would like to modify dlopen_object() to install static TLS variables in this extra space. Or do you suggest a better alternative? If possible, I would like to have this in FreeBSD 10.3-RELEASE to avoid future maintenance headaches with Mesa.
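
To make the "first loaded, first served" behaviour concrete, here is a minimal sketch of a fixed surplus handed out to late-loaded objects. It is a model of the idea only, not rtld's or Glibc's actual code: `struct static_tls_pool`, `static_tls_reserve()` and the 128-byte `STATIC_TLS_EXTRA` size are all hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size of the surplus reserved at startup. */
#define STATIC_TLS_EXTRA 128

struct static_tls_pool {
    size_t used;  /* bytes of the surplus already handed out */
    size_t size;  /* total surplus; never grows after startup */
};

/*
 * Reserve `size` bytes, aligned to `align` (a power of two), for the
 * static TLS segment of a newly loaded object. On success, store the
 * offset within the surplus in *offset and return 0. Return -1 when
 * the request is invalid or the surplus is exhausted; in this model
 * that is the point where loading the object would fail.
 */
int static_tls_reserve(struct static_tls_pool *pool, size_t size,
    size_t align, size_t *offset)
{
    size_t start;

    assert(pool != NULL && offset != NULL);
    if (align == 0 || (align & (align - 1)) != 0)
        return -1;

    /* Round the next free offset up to the requested alignment. */
    start = (pool->used + align - 1) & ~(align - 1);
    if (start > pool->size || size > pool->size - start)
        return -1;

    *offset = start;
    pool->used = start + size;
    return 0;
}
```

In this model, the loader would copy the object's TLS initialization image to the returned offset in every existing thread's static TLS block. Objects loaded after the surplus runs out cannot be served, because the block is not extended.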

Some references about this issue:
https://bugs.freedesktop.org/show_bug.cgi?id=35268
http://www.redhat.com/archives/phil-list/2003-February/msg00077.html
https://gcc.gnu.org/ml/gcc/2015-02/msg00095.html

-- 
Jean-Sébastien Pédron