Date:      Wed, 2 Apr 2003 11:26:32 -0500 (EST)
From:      Robert Watson <rwatson@freebsd.org>
To:        Terry Lambert <tlambert2@mindspring.com>
Cc:        Jeff Roberson <jroberson@chesapeake.net>
Subject:   Re: libthr and 1:1 threading.
Message-ID:  <Pine.NEB.3.96L.1030402111443.35655A-100000@fledge.watson.org>
In-Reply-To: <3E8B03E6.36871704@mindspring.com>


On Wed, 2 Apr 2003, Terry Lambert wrote:

> Is the disk I/O really that big of an issue?  All writes will be on
> underlying non-blocking descriptors; I guess you are saying that the
> interleaved I/O is more important, further down the system call
> interface than the top, and this becomes an issue? 

The I/O issue is a big deal for things like mysql, yes.
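
A minimal sketch of why non-blocking descriptors do not help here, assuming
a POSIX system and Python 3 (neither comes from the original message, and
the helper name is hypothetical).  select() reports a regular file as ready
at all times, and O_NONBLOCK has no effect on regular files, so a
select-driven userspace scheduler cannot tell when a disk read is about to
sleep in the kernel:

```python
import os
import select
import tempfile


def regular_file_always_ready():
    """Return True if select() reports a regular file as readable at once."""
    with tempfile.TemporaryFile() as f:
        f.write(b"data")
        f.flush()
        f.seek(0)
        fd = f.fileno()
        # Setting O_NONBLOCK succeeds, but regular files ignore it: a read
        # that misses the buffer cache still sleeps until the disk answers.
        os.set_blocking(fd, False)
        # Zero timeout: just poll the current readiness state.
        readable, _, _ = select.select([fd], [], [], 0)
        return fd in readable


if __name__ == "__main__":
    print("regular file reported ready:", regular_file_always_ready())
```

Whatever thread then issues the read takes the disk latency, and in a
userspace-only thread library that means every thread in the process waits.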

> It seems to me that maybe the correct fix for this is to use AIO instead
> of non-blocking I/O, then? 

Well, they're both fixes.  Another issue for applications that are
threaded and may be bumping up against the system memory limits is whether
or not the whole process stalls on a page fault or memory mapping fault,
or whether it's just the thread.  If you have an application that is
accessing a large memory mapped file, there may be some long kernel sleeps
as you pull in the pages.  Certainly, you can argue that the application
should be structured to make all I/O explicit and asynchronous, but for
various reasons, that's not the case :-).  Our VM and VFS subsystems may
have limited concurrency from an SMPng perspective, but probably have
enough that a marked benefit should be seen there too (you might have to
wait for another thread to block in the subsystem, but that will be a
short period of time compared to how long it takes to service the page
from disk). 
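
To make the "whole process or just the thread" distinction concrete, here is
a rough sketch, assuming CPython 3 on a POSIX system, where each
threading.Thread is backed by its own kernel thread and the interpreter lock
is released around blocking system calls.  A blocking read on an empty pipe
stands in for a long kernel sleep such as a page fault; the function and
variable names are hypothetical:

```python
import os
import threading
import time


def run_demo(block_seconds=0.2):
    """Count main-thread iterations while a worker sleeps in the kernel.

    Returns (ticks, still_blocked): the number of iterations completed and
    whether the worker was still blocked when the measurement ended.
    """
    r, w = os.pipe()
    done = threading.Event()

    def worker():
        os.read(r, 1)   # sleeps in the kernel until a byte arrives
        done.set()

    t = threading.Thread(target=worker)
    t.start()

    ticks = 0
    deadline = time.monotonic() + block_seconds
    while time.monotonic() < deadline:
        ticks += 1      # the other thread keeps making progress

    still_blocked = not done.is_set()
    os.write(w, b"x")   # wake the worker so it can exit
    t.join()
    os.close(r)
    os.close(w)
    return ticks, still_blocked


if __name__ == "__main__":
    ticks, still_blocked = run_demo()
    print("main thread iterations:", ticks, "worker blocked:", still_blocked)
```

With 1:1 threading only the worker sleeps.  This does not reproduce a real
page fault, it only illustrates the scheduling behavior under discussion.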

> The GUI thread issues are something I hadn't considered; I don't
> generally think of user space CPU intensive operations like that, but I
> guess it has to be rendered some time.  8-). 

One of the problems I've run into is where you lose interactivity during
file saves and other disk-intensive operations in OpenOffice.  Other
windows could in theory still be processing UI events, such as menu
clicks, etc, but since you're dumping several megabytes of data to disk or
doing interactive file operations that require waiting on disk latency,
you end up with a fairly nasty user experience.  One way to explore this
effect is to do a side-by-side comparison of the behavior of OpenOffice
and Mozilla linked against libc_r and linuxthreads.  I haven't actually
instrumented the kernel, but it might be quite interesting to do so in
an attempt to estimate the total impact of disk stalls on libc_r.  From a
purely qualitative perspective, there is quite a noticeable difference.

> Has anyone tried compiling X11 to use libthr?

Not sure.

> Also, any ETA on the per process signal mask handling bug in libthr? 
> Might not be safe to convert everything up front, in a rush of eager
> enthusiasm... 

Can't speculate on that, except one thing that is useful to note is that
many serious threaded applications are already being linked against
linuxthreads on FreeBSD, which arguably has poorer semantics when it comes
to signals, credentials, etc, than libthr already :-).  For example, most
sites I've talked to that deploy mysql do it with linuxthreads rather than
libc_r to avoid the I/O issues, as well as avoid deadlocks.  There are
enough bits of the kernel (for example, POSIX fifos) that don't handle
non-blocking operation that libc_r can stall or get into I/O buffer
deadlocks.  I seem to recall someone mentioning (but can't confirm easily) 
that Netscape at one point relied on using pipes to handle some sorts of
asynchronous events and wakeups within the same process.  If that pipe
filled, the process would block on a pipe write for a pipe that would
never drain.
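
The pipe case is easy to sketch, assuming Python 3 on a POSIX system (this
is an illustration, not the code Netscape used; the function name and the
512-byte chunk size are my own choices).  The write end is set non-blocking
so that the sketch can detect the full pipe instead of hanging on it:

```python
import os


def fill_pipe_nonblocking(chunk=512):
    """Write to a pipe nobody reads; return the bytes accepted before full."""
    r, w = os.pipe()
    os.set_blocking(w, False)   # so a full pipe raises instead of sleeping
    written = 0
    try:
        while True:
            try:
                written += os.write(w, b"x" * chunk)
            except BlockingIOError:
                # The pipe is full.  A blocking write at this point would
                # sleep until a reader drained it; if the only reader is
                # this same thread, that never happens.
                return written
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print("pipe accepted", fill_pipe_nonblocking(), "bytes before filling")
```

The capacity differs between systems, so the number printed is not
something to rely on.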

I can think of a couple of other interesting exercises to explore the
problem -- implementing AIO "better" using the KSE primitives mixed
between userspace and kernel, reimplementing libc_r to attempt to use AIO
rather than a select loop where possible, etc.  It might be quite
interesting to see whether (for a bounded number of threads, due to our
AIO implementation), a libc_r that used AIO rather than select
demonstrated some of the performance improvements we see with 1:1 via
linuxthreads (and now libthr).  I'm not sure if there are any open source
tools available to easily track process and thread scheduling and
blocking, but there have been several pretty useful visual analysis and
tracing tools for realtime systems.  Some basic tools for thread tracing and
visualization exist for Mac OS X, and presumably other COTS platforms.
ktrace on FreeBSD makes some attempt to track context switches, but
without enough context (har har) to be useful for this kind of analysis.
I've been thinking about tweaking the local scheduler to put a bit more
information into ktr and alq about blocking circumstances as well as some
way to constrain the tracing to a particular bit of the process hierarchy
with an inheritance flag of some sort.  It might be quite helpful for
understanding some of the nasty threading blocking/timing issues that we
already run into with libc_r, and will continue to run into as our
threading evolves.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories


