Date:      Tue, 14 Dec 1999 00:33:12 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        nate@mt.sri.com
Cc:        chuckr@picnic.mat.net, adsharma@sharmas.dhs.org, freebsd-arch@freebsd.org
Subject:   Re: Thread scheduling
Message-ID:  <199912140033.RAA27290@usr08.primenet.com>
In-Reply-To: <199912110455.VAA24095@mt.sri.com> from "Nate Williams" at Dec 10, 99 09:55:29 pm

> > OK, let me state it again.  I wasn't asking if it was a good thing to
> > share out threads among multiple processors, because the advantages of
> > using a multiple CPU system *as* a multiple CPU seem obvious enough not to
> > need asking.
> 
> Except that it's possible that you may want to limit multiple-CPU's to
> multiple processes for cache reasons.  One could argue that for certain
> classes of threaded applications, you'd get a better hit rate by
> sticking with a single CPU for *all* of the threads, although this
> wouldn't be true for all threaded applications.

The MESI cache coherency protocol solves most of this; the reload
would only be from L2 to L1 cache (slow enough on most systems, I
grant you).

I don't really know what cache coherency protocol is used by the
4 CPU Alpha box Mike Smith has.  I do know that the dual PPC-603
"BeBox" and several similar boxes didn't have anything like APIC
support, and so they removed the L2 cache entirely, and used the
cache signalling lines to implement MEI coherency.


> It depends on the application.

Right.

I think, though, that for most threaded applications, what you'd
really want is negative affinity, where the quantum reservations for
a process's different threads are biased toward landing on different
CPUs.  This has two effects:

o	It allows multiple CPUs to be executing in user space in
	the same program, simultaneously

o	With a per CPU scheduler, you get rid of The Big Giant
	Lock(tm) in favor of a per CPU scheduler lock that is
	only taken by a different CPU when migrating processes
	for load balancing reasons.  Even then, only the two
	CPUs involved in the migration get stalled, and,
	frankly, they most likely do not get stalled at all, so
	long as you stagger your quantum clocks between the CPUs.

This assumes that each thread is not pounding heavily on globally
shared state with other threads (i.e. each thread has limited
locality, and most locality is not in a contention domain).  For
the rare application that doesn't meet this (most likely, as a
result of a bad design; very few problems would require this as
part of their solution), it might make sense to lockstep all, or
more likely, just the badly behaved threads, onto the same processor.


> > I was asking to see if it would be a good thing to add a
> > strong bias to the system, in such a way as to make the co-scheduling of
> > threads among the different processors so that all processors that are
> > made available to the program's threads are executing in that address
> space at the same moment in time.
> 
> I wouldn't think it would help for cache rate and/or CPU usage, but
> that's just a gut feeling and not based on anything else.  Each CPU in
> an SMP system has its own cache, so what happens on another CPU
> shouldn't affect how the one CPU performs.
> 
> Adding this bias wouldn't help, and may in fact make things worse (see
> above).


I would go further: I believe that it would make the scheduler
significantly more complicated than it has to be, as well as
setting The Big Giant Lock(tm) deeper into cement.

It would also tend to result in starvation of other processes,
unless you delayed scheduling, since accelerating it each time
another thread in the process became ready to run would have the
effect of double-booking quantum for the process.

Finally, it would also tend to leave the processors partially
idle (stalled) as a result of voluntary context switches not
occurring at exactly the same place in all threads (i.e. a cache
miss on a read from disk putting the caller to sleep).


> > Not a guarantee, but would it be a good thing to have them
> > "co-scheduled" (or a bias towards that likelihood).
> 
> But, I can't see any advantage to have them co-scheduled.

Gang scheduling is more appropriate to non-NUMA multiprocessors
working on data flow problems.  8-).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.



