Date:      Mon, 19 Aug 2002 13:43:24 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Maxim Sobolev <sobomax@FreeBSD.org>
Cc:        hackers@FreeBSD.org, audit@FreeBSD.org, Alexander Litvin <archer@whichever.org>, Andriy Gapon <agapon@excite.com>
Subject:   Re: Thread-safe resolver [patches for review]
Message-ID:  <3D61586C.5938BF99@mindspring.com>
References:  <3D578A99.F0821712@FreeBSD.org> <3D5792CD.497C80F0@mindspring.com> <3D57A9D4.DAA043EF@FreeBSD.org> <3D57CF6D.2982CE8@mindspring.com> <3D58BFE8.9281433@FreeBSD.org> <3D58C359.A5F7B1AA@mindspring.com> <3D612A34.BCB0B949@FreeBSD.org>

Maxim Sobolev wrote:
> Terry Lambert wrote:
> > [...]
> > > > The assumption (which is potentially wrong) is that the program
> > > > will correctly shut down all its threads, when in fact it was a
> > > > module not under the program's control that created and used the
> > > > threads.
> > >
> > > I do not quite agree. In such case, the module should probably have
> > > destructor function, either placed into the fini section, or to be
> > > explicitly called by the program before dlclose().
> >
> > Uh, that's exactly the argument I was making: use a .fini section
> > to clean up the per thread memory allocations.
> > 8-).
> 
> I am not sure how you can get from a .fini section list of per-thread
> dynamically allocated storages, without resorting to inspecting inner
> implementation details of pthread_{set,get}specific(3). Any ideas?

The easiest approach is explicit registration of the allocations
at the time they occur.  For kernel threads, this would probably
involve mutex protection of the list, but for user space threads,
you could just do unprotected insertion/deletion.
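
A minimal sketch of that registration, assuming kernel threads (hence
the mutex).  The names alloc_rec, res_alloc() and res_free_all() are
made up for illustration and are not from any existing patch:

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical tracking record: one per registered allocation. */
struct alloc_rec {
	void *ptr;			/* the per thread allocation */
	SLIST_ENTRY(alloc_rec) link;	/* registration list linkage */
};

static SLIST_HEAD(, alloc_rec) alloc_list =
    SLIST_HEAD_INITIALIZER(alloc_list);
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Allocate size bytes and register the allocation; NULL on failure. */
void *
res_alloc(size_t size)
{
	struct alloc_rec *rec = malloc(sizeof(*rec));

	if (rec == NULL)
		return (NULL);
	if ((rec->ptr = malloc(size)) == NULL) {
		free(rec);
		return (NULL);
	}
	pthread_mutex_lock(&alloc_lock);
	SLIST_INSERT_HEAD(&alloc_list, rec, link);
	pthread_mutex_unlock(&alloc_lock);
	return (rec->ptr);
}

/* Free everything still registered; returns the number of blocks freed. */
int
res_free_all(void)
{
	struct alloc_rec *rec;
	int n = 0;

	pthread_mutex_lock(&alloc_lock);
	while ((rec = SLIST_FIRST(&alloc_list)) != NULL) {
		SLIST_REMOVE_HEAD(&alloc_list, link);
		free(rec->ptr);
		free(rec);
		n++;
	}
	pthread_mutex_unlock(&alloc_lock);
	return (n);
}
```

For user space threads the two lock/unlock pairs could simply be
dropped, per the above.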

The best example code for this at present is the per thread
exception stack management code for C++.  Unfortunately, the
EGCS implementation is much worse than the GCC 2.95/2.96 patch
that Jeremy Allison did back when we were working on porting
the ACAP code to G++: you have to eat the overhead, even in
non-threaded programs, rather than dynamically attaching the
allocation at thread startup (i.e. unthreaded programs eat threads
overhead in the current scheme of things).

Unless you went to a dynamic registration (with the mutex), yes,
you'd have to modify pthread_getspecific(), or provide a weak
symbol with a per .so strong wrapper to trap and record allocations.

I'm not sure if that's really necessary, though.  What matters is
less that the allocations be invisible to the address space of
other threads than that they simply be discretely per thread
(e.g. you don't have to protect against non-marshalled access by
another thread).

This gets us back to the discussion of whether it's safe to hand
the results of a "resolve this" operation by a resolver worker
thread off to another thread without copying it.  If the result
was local to the address space of the thread, then you'd get
automatically protected (if you handed it off without a copy to
the global heap, when you went to reference it, you'd core dump).

This has some advantages, but I think it's too specific to the
implementation details of threading on a particular OS (i.e.
FreeBSD does not have separate memory mappings per thread, like
Windows does, and Windows only does for objects created *after*
a thread is created, since allocations made since main startup are
not retroactively removed from the copy of the address space, etc.).

Personally, I'd just intern the allocations, e.g.:

	struct foo {	/* object I want to be per thread */
		...
	};

	struct wrap_foo {	/* intern version of object */
		struct foo x;	/* coerce and return this value */
		SLIST_ENTRY(wrap_foo)	res_link;	/* intern list */
	};

And then insert each per thread allocation on a per object type
list that is traversed by the function called by the module .fini.
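
A rough sketch of how that might fit together, assuming user space
threads (so no locking) and a GCC/Clang style destructor attribute
standing in for the module's .fini hook.  foo_alloc(),
foo_release_all(), foo_fini() and the "value" member are hypothetical
names, used only for illustration:

```c
#include <stdlib.h>
#include <sys/queue.h>

struct foo {			/* object I want to be per thread */
	int value;		/* placeholder member, for illustration */
};

struct wrap_foo {		/* intern version of object */
	struct foo x;		/* callers only ever see &x */
	SLIST_ENTRY(wrap_foo) res_link;	/* intern list */
};

static SLIST_HEAD(, wrap_foo) foo_list = SLIST_HEAD_INITIALIZER(foo_list);

/* Allocate a zeroed struct foo and intern its wrapper on the list. */
struct foo *
foo_alloc(void)
{
	struct wrap_foo *w = calloc(1, sizeof(*w));

	if (w == NULL)
		return (NULL);
	SLIST_INSERT_HEAD(&foo_list, w, res_link);
	return (&w->x);
}

/* Release every interned object; returns how many were freed. */
int
foo_release_all(void)
{
	struct wrap_foo *w;
	int n = 0;

	while ((w = SLIST_FIRST(&foo_list)) != NULL) {
		SLIST_REMOVE_HEAD(&foo_list, res_link);
		free(w);
		n++;
	}
	return (n);
}

/* Runs at exit or dlclose() time, alongside the module's .fini code. */
__attribute__((destructor))
static void
foo_fini(void)
{
	(void)foo_release_all();
}
```

With kernel threads, the insert and the traversal would need the
mutex protection mentioned earlier.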


Realize, though, that the whole idea of per thread allocations in
order to get around the "_r" problem is bogus if you get more
than a small number of threads.  If you had (for example) 10,000
threads, all calling the resolver, then you'd end up with 10,000
per thread allocations.  If that happened, you'd probably be a
*lot* better off explicitly using "_r" routines, and forcing the
user to manage the allocations (and deallocations).  This is
probably automatic on the stack, anyway, which means it's memory
that the caller doesn't have to explicitly track.
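
To illustrate the caller-managed pattern: the resolver's reentrant
interfaces differ between platforms, so this sketch substitutes
gmtime_r(3), a POSIX "_r" routine, purely as a stand-in.  The helper
name year_of() is made up:

```c
#include <time.h>

/* Hypothetical helper: calendar year (UTC) for t, or -1 on failure. */
int
year_of(time_t t)
{
	struct tm tmbuf;	/* caller-owned result, lives on the stack */

	/* The "_r" routine writes into our buffer, not static storage. */
	if (gmtime_r(&t, &tmbuf) == NULL)
		return (-1);
	return (tmbuf.tm_year + 1900);
}
```

The result buffer goes away when year_of() returns, so there is
nothing to register and nothing for a .fini to clean up.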

Of course, if the user is writing code like that, it's probably
not an issue, since they will end up leaking the memory somewhere
else, given that they write bad code.  ;^).

Really, this is just to get around legacy interfaces that return
a pointer to static storage, and which can't simply be murdered
because of their presence in standards (same goes for strncpy).
Ideally, the interface would never have done this in the first
place.


-- Terry
