Date:      Wed, 02 Apr 2003 01:17:21 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Stijn Hoop <stijn@win.tue.nl>
Cc:        current@freebsd.org
Subject:   Re: libthr and 1:1 threading.
Message-ID:  <3E8AAAA1.CB4B51B0@mindspring.com>
References:  <20030331225124.W64602-100000@mail.chesapeake.net> <20030402083026.GB83512@pcwin002.win.tue.nl>

Stijn Hoop wrote:
> On Mon, Mar 31, 2003 at 10:54:45PM -0500, Jeff Roberson wrote:
> > I have committed libthr.  To try this out you'll need to do the following
> 
> I know very very little about threads, but I'm interested as to what the
> purpose is of this library. Is there a document available somewhere that
> describes the relationships between this, KSE, libc_r, pthreads, the
> Giant-unwinding-make-SMP-work-better project and some of the other
> threads and SMP related libraries and terminology?

No, not really: the new libthr was pretty much a "Skunk Works"
project, and was not managed through the KSE project; it's really
orthogonal, but builds on some of the KSE work already done in the
kernel so far... most of KSE lives there.

Here's a thumbnail sketch, though (forgive me, KSE folks, if I mung
it too badly):


pthreads:	POSIX Threads is a threads API, which is
		specified by the standard ISO/IEC 9945-1:1996
		(``POSIX.1'').

libc_r:		A user space implementation of the pthreads API;
		this implementation uses a "call conversion"
		scheduler, in user space, to convert blocking
		calls into non-blocking calls, plus a threads
		context switch via the user space scheduler.
		Like all interactive timesharing schedulers, it
		gives the illusion of concurrency.  However, the
		kernel is not thread-reentrant in this model, so
		it does not scale to more than one CPU on an SMP
		system, since there is only a single scheduler
		context.

KSE:		"Kernel Schedulable Entities" is the name of
		the modified scheduler activations framework, as
		well as the user space components, and kernel
		modifications, for an N:M model threading system.
		Its advantage over libc_r is that it makes the
		kernel thread reentrant, and so provides SMP
		scalability.  Because it's N:M, it also has
		advantages over the 1:1 approach: it allows full
		quantum utilization, CPU affinity for individual
		threads, and CPU negaffinity for threads within
		the same process, which in theory gives optimal
		use of CPU and other resources.  It also includes
		a user space library component, which is
		incomplete at present.

libthr:		This is the recently committed 1:1 model threading
		library.  It provides a simpler user space library
		component, which provides the same SMP scalability,
		as far as kernel thread reentrancy is concerned,
		but fails to provide for full quantum utilization,
		and, at present, does not directly address the CPU
		affinity issues itself (no sophisticated use of
		KSEGRP).  It builds on the kernel modifications for
		KSE, and adds a couple of system calls in order
		to manage creation, etc., of threads.  The major
		intent is to provide for SMP scalability; as a side
		effect, it provides a proof-of-concept for the KSE
		code already in the kernel, and as such, has been
		very welcome.

SMPng:		"The Giant-unwinding-make-SMP-work-better project",
		to quote you.  8-).  SMPng has its own project
		page and documentation.  To give another thumbnail
		drawing, it's about improving the granularity of
		locking and the logical separation between kernel
		subsystems, so that stall barriers are eliminated.
		By doing this, inter-CPU contention is reduced,
		and you get better SMP scaling.  Traditionally,
		when you added another CPU, you maybe got a 20%
		performance improvement for a 100% increase in
		cost.  The idea is to get that 20% up to as close
		to 100% as possible (impossible, but you can
		approach that).  SVR4, for example, scales well
		up to 4 processors on Intel.  It scales higher than
		that, but the incremental improvement is about 80%,
		and so at about 4 processors, you hit a point where
		the cost of additional processors is higher than
		the value of the additional compute cycles.
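
To put rough numbers on the SMPng scaling argument, here is a minimal
sketch in Python.  It assumes a deliberately simple model that is not
from any measurement: every CPU after the first adds the same fraction
("gain") of one CPU's worth of work, and cost grows linearly with the
number of CPUs.  The function names are made up for illustration.

```python
def throughput(n_cpus, gain):
    """Relative throughput of n_cpus, where 1.0 is a single CPU.

    Assumption (a simplification, not a measured model): every CPU
    after the first adds the same fraction `gain` of one CPU's work.
    """
    if n_cpus < 1:
        raise ValueError("need at least one CPU")
    return 1.0 + (n_cpus - 1) * gain


def throughput_per_cost(n_cpus, gain):
    """Throughput per unit of cost, assuming cost is linear in CPU count."""
    return throughput(n_cpus, gain) / n_cpus


if __name__ == "__main__":
    # 0.2 is the "traditional" figure, 0.8 the SVR4 figure, 1.0 the ideal.
    for gain in (0.2, 0.8, 1.0):
        row = ", ".join(
            f"{n} CPUs: {throughput(n, gain):.1f}x" for n in (1, 2, 4)
        )
        print(f"gain {gain:.0%} -> {row}")
```

Under this model a gain of 0.2 means the second CPU buys 1.2 times the
throughput for twice the cost; only a gain of 1.0 keeps throughput per
unit of cost flat as CPUs are added.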

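Going back to the libc_r entry: the "call conversion" idea can be shown
with a toy model.  This is only a sketch in Python, not how libc_r is
written (libc_r is C and differs in detail).  Generators stand in for
thread contexts, a socketpair stands in for a descriptor that would
block, and all the function names are hypothetical.

```python
import socket
from collections import deque


def converted_recv(sock, nbytes):
    """Stand-in for a blocking recv(): try a non-blocking call and, if it
    would block, switch back to the user space scheduler and retry later."""
    while True:
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            yield  # "context switch" to the scheduler


def reader(sock, log):
    data = yield from converted_recv(sock, 16)
    log.append(("reader", data))


def writer(sock, log):
    log.append(("writer", b"start"))
    yield  # give up the CPU once, so the reader gets to retry
    sock.send(b"hello")
    log.append(("writer", b"sent"))


def run(tasks):
    """Round-robin user space scheduler: one OS thread, many contexts."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run until the task yields or finishes
        except StopIteration:
            continue            # task finished, drop it
        ready.append(task)      # task yielded, queue it again


def demo():
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    log = []
    try:
        run([reader(a, log), writer(b, log)])
    finally:
        a.close()
        b.close()
    return log


if __name__ == "__main__":
    print(demo())
```

Everything here runs in a single scheduler context, which is the point:
the tasks appear concurrent, but a second CPU would have nothing to do.
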
You may also find these resources useful:

	http://people.freebsd.org/~julian/threads/
	http://www.freebsd.org/smp/index.html
	http://www.freebsd.org/projects/busdma/index.html
	http://www.freebsd.org/projects/projects.html

Most of the documentation lives in mailing list archives, and is
not terribly formal (Software Engineers, not English Majors, and
all that...).

-- Terry


