Date:      Wed, 18 Jun 2003 07:48:09 +0800
From:      "David Xu" <davidxu@freebsd.org>
To:        "Marcel Moolenaar" <marcel@xcllnt.net>, "Julian Elischer" <julian@elischer.org>
Cc:        aritger@nvidia.com
Subject:   Re: Nvidia, TLS and __thread keyword -- an observation
Message-ID:  <002101c3352a$e931a7f0$0701a8c0@tiger>
References:  <20030617071810.GA2451@dhcp01.pn.xcllnt.net><Pine.BSF.4.21.0306171433060.31025-100000@InterJet.elischer.org> <20030617223910.GB57040@ns1.xcllnt.net>


----- Original Message -----
From: "Marcel Moolenaar" <marcel@xcllnt.net>
To: "Julian Elischer" <julian@elischer.org>
Cc: <threads@FreeBSD.org>; <gareth@nvidia.com>; <aritger@nvidia.com>
Sent: Wednesday, June 18, 2003 6:39 AM
Subject: Re: Nvidia, TLS and __thread keyword -- an observation


> On Tue, Jun 17, 2003 at 03:02:20PM -0700, Julian Elischer wrote:
> >
> >
> > On Tue, 17 Jun 2003, Marcel Moolenaar wrote:
> >
> > > Guys,
> > >
> > > In short: Don't bash Nvidia. What they do is not uncommon. Well,
> > > maybe in Open Source environments. So please end this thread,
> > > unless people get constructive.
> >
> > I think it's already ended..
> >
> > basically:
> > We should always be able to use (on i386) the same methods outlined for
> > Solaris.
> > Not quite as quick as those for Linux but more general.
>
> I'm not sure you understand the issue (I can easily be wrong, I just
> don't see the evidence in your statement). To support the __thread
> keyword, our thread library needs to create the TLS as defined in the
> binary and its dependent shared libraries by virtue of the .tdata and
> .tbss sections/segments, based on the image of the TLS as constructed
> by the RTLD for the initial set of modules (created for the initial
> thread) and amended by TLS space defined in the dynamically loaded
> libraries; and the TLS has to be created for every new thread at the
> time the thread itself is created. This TLS allocation has to be made
> accessible in accordance with runtime specifications for the supported
> architectures (libthr: i386 & ia64; libkse: i386 currently -- more to
> follow) and in line with the access sequences created by the compiler,
> and using the static relocations known to the static linker and dynamic
> relocations of which the support needs to be added to RTLD.
>
> The static TLS model requires the least amount of work: add support
> to allocate the TLS image for every thread creation and point the
> thread pointer to it in a way compatible with the runtime spec.
>
> The dynamic TLS model requires more substantial changes and involves
> RTLD as well. This is the model that requires __tls_get_addr().
>
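
For readers following along, here is a minimal C sketch of what the
__thread keyword gives a program: every thread gets its own copy of the
variable, initialized from the module's .tdata image. It assumes a
GCC-compatible compiler and POSIX threads. The names tls_counter,
bump_counter, value_seen_by_new_thread and value_in_calling_thread are
made up for illustration and do not come from any FreeBSD library.

```c
#include <pthread.h>
#include <stddef.h>

/* GCC/Clang extension: each thread gets its own copy of this variable,
 * initialized to 100 from the TLS image when the thread starts. */
static __thread int tls_counter = 100;

/* Hypothetical worker: bumps its private copy and reports the result. */
static void *bump_counter(void *arg)
{
    int *result = arg;

    tls_counter += 1;          /* touches only this thread's copy */
    *result = tls_counter;
    return NULL;
}

/* Runs bump_counter in a fresh thread and returns the value that thread
 * saw, or -1 if the thread could not be created or joined. */
int value_seen_by_new_thread(void)
{
    pthread_t tid;
    int seen = -1;

    if (pthread_create(&tid, NULL, bump_counter, &seen) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return seen;
}

/* Returns the calling thread's own copy, which other threads never modify. */
int value_in_calling_thread(void)
{
    return tls_counter;
}
```

Each call to value_seen_by_new_thread() should return 101, because every
new thread starts from the initial image, while value_in_calling_thread()
keeps returning 100 in the caller.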

I believe this will add overhead to thread creation and destruction.
How fast can RTLD be in this case? I can create and join 10000
threads in 0.2 seconds here on an old Celeron 500 using libkse with a
hash table added for thread lookup.
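
A rough way to reproduce this kind of measurement is sketched below in C
with POSIX threads. It assumes the threads are created and joined one at
a time, which the figure above does not specify, and the names
empty_thread and create_and_join are hypothetical. The timing depends
entirely on the hardware and thread library, so no expected number is
given.

```c
#define _POSIX_C_SOURCE 200809L

#include <pthread.h>
#include <time.h>

/* Thread body that does no work, so only create/join cost is measured. */
static void *empty_thread(void *arg)
{
    return arg;
}

/* Creates and joins `count` threads one after another. Stores the elapsed
 * wall-clock time in seconds in *elapsed and returns how many threads were
 * successfully created and joined. */
int create_and_join(int count, double *elapsed)
{
    struct timespec start, end;
    int done = 0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < count; i++) {
        pthread_t tid;

        if (pthread_create(&tid, NULL, empty_thread, NULL) != 0)
            break;
        if (pthread_join(tid, NULL) != 0)
            break;
        done++;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    *elapsed = (double)(end.tv_sec - start.tv_sec)
             + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return done;
}
```

Calling create_and_join(10000, &secs) before and after a TLS change would
give a before/after comparison of the create/join cost on one machine.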

I hope that while OpenGL benefits from __thread, it won't punish
another style of program: one that creates and destroys lots of
threads and rarely needs to access thread-local variables. We want to
keep thread creation and destruction as cheap as possible. If we
cannot obey this rule, thread creation and destruction become as
heavy as process creation, and then what benefit do threads offer
over processes?

I am not objecting to the __thread idea; I just want to bring this
problem to your attention when you implement it.
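
To make the per-thread cost concrete, the work the static TLS model adds
at thread creation can be sketched as one allocation, a copy of the
initialized data (.tdata), and zeroing of the uninitialized part (.tbss).
This is only an illustration in C: struct tls_template and
tls_block_create are hypothetical names, and the sketch ignores
alignment, thread pointer setup and dynamically loaded modules, all of
which a real runtime has to handle according to the ABI.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of a TLS initialization image:
 * `tdata_size` bytes of initialized data followed by `tbss_size`
 * bytes that must start out as zero. */
struct tls_template {
    const void *tdata;
    size_t tdata_size;
    size_t tbss_size;
};

/* Builds one thread's private TLS block from the template.
 * Returns NULL on allocation failure; the caller releases it with free(). */
void *tls_block_create(const struct tls_template *tmpl)
{
    size_t total = tmpl->tdata_size + tmpl->tbss_size;
    unsigned char *block = malloc(total ? total : 1);

    if (block == NULL)
        return NULL;
    if (tmpl->tdata_size > 0)
        memcpy(block, tmpl->tdata, tmpl->tdata_size);   /* .tdata copy */
    memset(block + tmpl->tdata_size, 0, tmpl->tbss_size); /* .tbss zeroing */
    return block;
}
```

Under these assumptions the added cost grows with the size of the image,
so a program whose modules declare few __thread variables would pay
little more than one extra allocation per thread.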

> --
>  Marcel Moolenaar   USPA: A-39004 marcel@xcllnt.net
> _______________________________________________
> freebsd-threads@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-threads
> To unsubscribe, send any mail to "freebsd-threads-unsubscribe@freebsd.org"
> 


