Date:      Mon, 29 Mar 2004 20:00:13 +0100
From:      Doug Rabson <dfr@nlsystems.com>
To:        freebsd-threads@freebsd.org
Subject:   Thread Local Storage
Message-ID:  <200403292000.13794.dfr@nlsystems.com>

I've been spending a bit of time recently familiarising myself with this 
TLS stuff and trying out a few things. I've been playing with rtld and 
I have a prototype patch which implements enough TLS support to let a 
non-threaded program which uses static TLS work. With a tiny bit more 
work I can have limited support for dynamic TLS as well (not for 
dlopen'ed modules yet though). Is there a p4 tree for this stuff yet? 
I'd like to check in what I have sometime.

I've also been looking at libpthread and I can see some potential 
problems with it. Currently libpthread on i386 uses %gs to point at a 
struct kcb which seems to be a per-kse structure. This structure 
contains a pointer to a per-thread struct tcb and this pointer is 
managed by the userland context switch code. Other arches are similar, 
e.g. ia64 uses $tp to point at struct kcb.
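To make the current arrangement concrete, here is a minimal sketch of the indirection described above. The structure layouts and the names kse_register, switch_to and current_thread are hypothetical simplifications for illustration, not the actual libpthread definitions; a plain global stands in for %gs or $tp.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified layouts; the real structures carry more state. */
struct tcb {
    int thread_id;              /* stand-in for per-thread state */
};

struct kcb {
    struct tcb *curtcb;         /* thread currently running on this KSE */
};

/* Stand-in for the value held in %gs (i386) or $tp (ia64): one per KSE. */
static struct kcb *kse_register;

/* Userland context switch: the register stays put, only the pointer moves. */
static void switch_to(struct tcb *next)
{
    assert(next != NULL);
    kse_register->curtcb = next;
}

/* Finding the current thread takes two loads: register -> kcb -> tcb. */
static struct tcb *current_thread(void)
{
    return kse_register->curtcb;
}
```

The point of the sketch is that the register identifies the KSE, not the thread, so anything that expects the register itself to identify the thread will not fit this scheme as it stands.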

The problem with TLS is that the i386 ABI needs %gs to point at the TLS 
storage for the current thread (it's a tiny bit more involved than that 
but that doesn't matter much for the purposes of this discussion). This 
leads to trouble since it looks like we will end up needing to allocate 
an LDT segment per thread, leading to an arbitrary limit on the number 
of threads (~8192).
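The ~8192 figure falls out of the x86 segment selector format: a selector is 16 bits, of which the low two are the requested privilege level and the next one selects GDT or LDT, leaving 13 bits of descriptor index. A small sketch of that arithmetic follows; the macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SEL_INDEX_SHIFT 3       /* bits 0-1: RPL, bit 2: table indicator */
#define SEL_TI_LDT      0x4     /* table indicator set means "use the LDT" */
#define MAX_DESCRIPTORS (1u << (16 - SEL_INDEX_SHIFT))   /* 2^13 = 8192 */

/* Build the selector value that would be loaded into %gs for an LDT slot. */
static uint16_t ldt_selector(unsigned index, unsigned rpl)
{
    assert(index < MAX_DESCRIPTORS && rpl < 4);
    return (uint16_t)((index << SEL_INDEX_SHIFT) | SEL_TI_LDT | rpl);
}

/* Recover the descriptor index from a selector. */
static unsigned selector_index(uint16_t sel)
{
    return sel >> SEL_INDEX_SHIFT;
}
```

With one descriptor per thread, the index field is what caps the thread count; the usable number is somewhat below 8192 because some slots are taken for other purposes.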

I can think of a couple of possible ways to get around this. One easy 
way would be to allocate a segment per KSE and call i386_set_ldt from 
the thread switch. That is pretty ugly and costs a syscall per switch. Another 
slightly better way would be to lazy-allocate segments when we switch 
threads and reclaim segments from threads which haven't run recently. 
This technique would be able to get away with a smaller number of 
segments which tend to be owned by the threads which run most often.
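A rough sketch of the lazy-allocation idea, written as a plain userland simulation. The pool size, the structure layouts and the name segment_for are all assumptions for illustration, and the reloads counter merely marks where real code would have to rewrite a descriptor (for example via i386_set_ldt).

```c
#include <assert.h>
#include <stddef.h>

#define NSEGS  4                /* hypothetical pool size, well below the thread count */
#define NO_SEG (-1)

struct thread {
    int seg;                    /* index into pool[], or NO_SEG */
    void *tls_base;             /* address the descriptor would point at */
};

struct segment {
    struct thread *owner;       /* NULL while the slot is free */
    unsigned long last_used;    /* logical time of the owner's last switch-in */
};

static struct segment pool[NSEGS];
static unsigned long clock_tick;
static unsigned long reloads;   /* how many times a descriptor had to be rewritten */

/* Called on switch-in; returns the segment slot the thread should use. */
static int segment_for(struct thread *td)
{
    assert(td != NULL);
    if (td->seg == NO_SEG) {
        int victim = 0;

        /* Prefer a free slot; otherwise take the least recently used one. */
        for (int i = 0; i < NSEGS; i++) {
            if (pool[i].owner == NULL) {
                victim = i;
                break;
            }
            if (pool[i].last_used < pool[victim].last_used)
                victim = i;
        }
        if (pool[victim].owner != NULL)
            pool[victim].owner->seg = NO_SEG;   /* reclaim from the old owner */
        pool[victim].owner = td;
        td->seg = victim;
        reloads++;              /* real code would rewrite the descriptor here */
    }
    pool[td->seg].last_used = ++clock_tick;
    return td->seg;
}
```

Threads that run often keep their slot and switch without any descriptor update, so the expensive path is only taken by threads that have been idle long enough to be evicted.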

There is a similar issue with libthr but since it already allocates an 
LDT entry per thread there are no new limitations. Linux has an 
interesting wrinkle on the libthr solution - they have a GDT per cpu 
and they pre-allocate three GDT slots for TLS pointers (one for glibc, 
one for Wine and one spare). The kernel thread switching code fills in 
these GDT slots on the current cpu with values stored in the 
pcb-equivalent.
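The Linux approach, as I understand it, can be sketched roughly as below. This is a simulation only: the descriptor format is simplified, and GDT_TLS_FIRST, GDT_SIZE, struct pcb, struct cpu and load_tls are hypothetical names and values rather than the actual kernel definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TLS_SLOTS     3         /* as described: glibc, Wine, one spare */
#define GDT_TLS_FIRST 6         /* hypothetical index of the first TLS slot */
#define GDT_SIZE      32        /* hypothetical GDT size */

/* Simplified descriptor; a real one packs base and limit into 8 bytes. */
struct descriptor {
    uint32_t base;
    uint32_t limit;
};

struct pcb {                    /* per-thread saved state */
    struct descriptor tls[TLS_SLOTS];
};

struct cpu {                    /* one GDT per cpu */
    struct descriptor gdt[GDT_SIZE];
};

/* Kernel thread switch: copy the incoming thread's TLS descriptors into
 * the fixed slots of the GDT belonging to the cpu it will run on. */
static void load_tls(struct cpu *cpu, const struct pcb *next)
{
    assert(GDT_TLS_FIRST + TLS_SLOTS <= GDT_SIZE);
    memcpy(&cpu->gdt[GDT_TLS_FIRST], next->tls, sizeof(next->tls));
}
```

Because every thread uses the same slot numbers, the selector in %gs is the same for all of them and the descriptor table size no longer limits the number of threads; the cost is a small copy on each kernel context switch.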


