From owner-freebsd-arch  Tue Apr 11 13:13:18 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from ns1.yes.no (ns1.yes.no [195.204.136.10])
	by hub.freebsd.org (Postfix) with ESMTP id B2FD937BB6B
	for <freebsd-arch@freebsd.org>; Tue, 11 Apr 2000 13:12:24 -0700 (PDT)
	(envelope-from eivind@bitbox.follo.net)
Received: from bitbox.follo.net (bitbox.follo.net [195.204.143.218])
	by ns1.yes.no (8.9.3/8.9.3) with ESMTP id WAA18428
	for <freebsd-arch@freebsd.org>; Tue, 11 Apr 2000 22:11:41 +0200 (CEST)
Received: (from eivind@localhost)
	by bitbox.follo.net (8.8.8/8.8.6) id WAA00358
	for freebsd-arch@freebsd.org; Tue, 11 Apr 2000 22:11:30 +0200 (CEST)
Received: from smtp05.primenet.com (smtp05.primenet.com [206.165.6.135])
	by hub.freebsd.org (Postfix) with ESMTP
	id 7552A37B755; Tue, 11 Apr 2000 10:54:02 -0700 (PDT)
	(envelope-from tlambert@usr01.primenet.com)
Received: (from daemon@localhost)
	by smtp05.primenet.com (8.9.3/8.9.3) id KAA20982;
	Tue, 11 Apr 2000 10:53:40 -0700 (MST)
Received: from usr01.primenet.com(206.165.6.201)
 via SMTP by smtp05.primenet.com, id smtpdAAAEPaO2O; Tue Apr 11 10:53:30 2000
Received: (from tlambert@localhost)
	by usr01.primenet.com (8.8.5/8.8.5) id KAA20002;
	Tue, 11 Apr 2000 10:53:44 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200004111753.KAA20002@usr01.primenet.com>
Subject: Re: BUF/BIO roadmap.
To: julian@elischer.org (Julian Elischer)
Date: Tue, 11 Apr 2000 17:53:44 +0000 (GMT)
Cc: tlambert@primenet.com (Terry Lambert),
	mckusick@flamingo.McKusick.COM (Kirk McKusick),
	phk@freebsd.org (Poul-Henning Kamp), arch@freebsd.org
In-Reply-To: <38F35AC2.237C228A@elischer.org> from "Julian Elischer" at Apr 11, 2000 10:02:58 AM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > It seems to me that the interrupt threads are an implementation
> > of this (10 year old) technology.
> 
> As I have mentioned to you before, just because something is a
> 10 year old idea does not make it bad.

Agreed.

> I think that making everything a thread with a blockable context,
> whether initiated from above or below, makes a lot of sense and
> significantly reduces the complexity of the fine-grained-SMP work
> that needs to be done.

It may surprise you that I agree that this statement is correct;
however, I think that given a choice between low complexity vs.
high performance, I will choose high performance.

The Linux benchmarks by ZD Labs (setting aside the Netcraft results
as too controversial) show that low complexity is not always the
best choice.

So it takes smart people to understand the code; there are enough
smart people on this list that I don't see that as a problem.  Put
another way, I prefer hanging out with smart people.


> I also would like to point out that handling interrupts is a
> significant part of hard-realtime, and that if the lowest
> level of the interrupt code is kept 'under control' then
> solutions imposed at this level in the non threaded case are
> also applicable in the threaded case.  The lowest level is
> basically the same until the control is 'morphed' into a thread.

Where is Peter when I need him?  }B-p

The primary attribute of Hard Real Time is determinism.  You
MUST have determinism in scheduling.  If we wish to be technical,
PC hardware is generally incapable of Hard Real Time for more
than a single task, unless we loosen our deadlines significantly
below the capabilities of the hardware (i.e. Soft Real Time).


> > It also seems to me that kernel threads are _still_ a significantly
> > bad idea, since the problems faced in kernel preemption are a subset
> > of the problems faced in Real Time support, and that as a result, it
> > will be significantly harder to support Hard Real Time in the future
> > without significant revisions of the OS architecture.
> 
> I believe this to not be the case in reality. Once control has been 
> handed to an interrupt thread, that thread can be suspended in favour of
> the realtime process. This is not an option with the current system
> and actually gives a lot of possible options not presently available
> in handling RT operations.

Please compare the performance of SMP Linux and SMP Solaris.  SMP
Linux uses the approach you are suggesting, and SMP Solaris uses
finite state automata, implemented via critical object locking (less
efficient than critical sectioning, in the limit, but vastly more
efficient than taking a context switch hit, which on FreeBSD Pentium
class hardware may mean FPU register changes for uiomove support; and
once in, I can't see kernel threads _not_ being used for PIO -- they're
too convenient to not abuse).
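
As a rough illustration of what "finite state automata" means in this
context, here is a minimal sketch. It is not Solaris code; every name
in it (req_step, the states, the events, MAX_RETRIES) is hypothetical.
Each request carries its own state, and the interrupt path advances it
one event at a time and returns, so no thread context is saved or
restored. The per-object lock that "critical object locking" implies is
left out and only marked in a comment.

```c
#include <assert.h>

/* Hypothetical states and events for one I/O request. */
enum req_state { REQ_IDLE, REQ_ISSUED, REQ_DATA_READY, REQ_DONE, REQ_FAILED };
enum req_event { EV_START, EV_INTR_OK, EV_INTR_ERR, EV_COPIED };

#define MAX_RETRIES 2   /* assumed retry budget, for illustration only */

struct request {
    enum req_state state;
    int retries;
};

/*
 * Advance the request by one event.  Runs to completion and never
 * blocks; a real kernel would hold the request's own lock here.
 * Events that make no sense in the current state are ignored.
 */
static enum req_state req_step(struct request *r, enum req_event ev)
{
    assert(r);
    switch (r->state) {
    case REQ_IDLE:
        if (ev == EV_START)
            r->state = REQ_ISSUED;
        break;
    case REQ_ISSUED:
        if (ev == EV_INTR_OK)
            r->state = REQ_DATA_READY;
        else if (ev == EV_INTR_ERR)
            r->state = (++r->retries > MAX_RETRIES) ? REQ_FAILED : REQ_ISSUED;
        break;
    case REQ_DATA_READY:
        if (ev == EV_COPIED)
            r->state = REQ_DONE;
        break;
    case REQ_DONE:
    case REQ_FAILED:
        break;          /* terminal states */
    }
    return r->state;
}
```

The cost per event is a switch and a store; the equivalent threaded
handler would pay for a stack switch each time it blocks and resumes.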


> > I don't think the goal of code integration for the sake of code
> > integration is really worthwhile.  I view the use of kernel thread
> > context switches as an alternative to addressing fine grained
> > parallelism through critical sectioning and object locking as a
> > compromise; perhaps not a good one, since this will obviously
> > result in register window flushing on RISC architectures, such
> > as SPARC.
> 
> The use of "lazy kernel threads" does not significantly alter
> the usage of registers from the current usage.

Unnecessary context switching, even to other contexts within the
same address space, does.  A stack switch requires flushing of
register windows.

The SunOS 4.x "liblwp" (a user space call conversion scheduler,
which did not support SMP scaling) is specifically based on a
University of Washington project, resulting in the paper:

	Register Windows and User-Space Threads on the SPARC
	KEPPEL
	UW-CSE-91-08-01
	U of Washington CS
	ftp://ftp.icsi.Berkeley.edu/pub/techreports/1994/tr-94-027.ps.Z

The system call for the context switch register window flush
described in the paper would have to be done for kernel threads
context switches on SPARC and other RISC architectures which also
implement register windowing.


> > It seems to me that the thing to address first is that which
> > Dynix addressed first, and which was noted in chapter 12 of
> > Uresh Vahalia's _UNIX Internals: The New Frontiers_, which is
> > per processor resource pools with high and low watermarking and
> > free resource revocation in low resource conditions.  This would
> > significantly reduce even the need for bus and lock contention,
> > since contention would only occur at high or low watermark, or
> > as the result of a resource revocation event in an already
> > stressed system.
> 
> There has been no argument that this is probably a good idea in 
> the long run. The use of lazy kernel threads for interrupt
> handling in no way interferes with this as the two features are
> orthogonal.

Agreed.  I was merely expressing a preference for ordering, and
where I thought the lowest hanging fruit resides.  IMO, that's
not in kernel threading.  Kernel threads are merely a poor way
to achieve SMP scaling of user space processes, and to avoid doing
the hard work of building proper automata.  My personal feeling
is that kernel threads will be significantly more prone to the
possibility of revealed race conditions than proper automata.  You
may disagree with this statement, but my personal experience with
both approaches tells me that this is so.
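
For concreteness, the per-processor resource pools with high and low
watermarks from the quoted text above can be sketched roughly as
follows. This is an assumption-laden toy, not Dynix code: the names,
the thresholds, and the use of plain counters in place of real objects
are all hypothetical, and the global lock is modeled only as a counter
(lock_ops) so the amount of contention is visible.

```c
#include <assert.h>
#include <stddef.h>

#define LOW_WATER   2u  /* refill the local pool when it drops below this */
#define HIGH_WATER  8u  /* return surplus when it rises above this */
#define TARGET      5u  /* level restored after a refill or a drain */

struct global_pool {
    size_t nfree;            /* items available to every CPU */
    unsigned long lock_ops;  /* times the (notional) global lock was taken */
};

struct cpu_pool {
    size_t nfree;            /* items cached privately by one CPU */
};

/* Take up to 'want' items from the global pool: the contended path. */
static size_t global_take(struct global_pool *g, size_t want)
{
    size_t n = want < g->nfree ? want : g->nfree;
    g->lock_ops++;           /* a real kernel would lock here */
    g->nfree -= n;
    return n;
}

/* Return 'n' items to the global pool: also a contended path. */
static void global_give(struct global_pool *g, size_t n)
{
    g->lock_ops++;
    g->nfree += n;
}

/* Allocate one item on this CPU; returns 1 on success, 0 if exhausted. */
static int pool_alloc(struct cpu_pool *c, struct global_pool *g)
{
    if (c->nfree < LOW_WATER)
        c->nfree += global_take(g, TARGET - c->nfree);
    if (c->nfree == 0)
        return 0;
    c->nfree--;
    return 1;
}

/* Free one item on this CPU, draining to TARGET above the high mark. */
static void pool_free(struct cpu_pool *c, struct global_pool *g)
{
    c->nfree++;
    if (c->nfree > HIGH_WATER) {
        global_give(g, c->nfree - TARGET);
        c->nfree = TARGET;
    }
    assert(c->nfree <= HIGH_WATER);
}

/* Revocation in low resource conditions: give back everything cached. */
static size_t pool_revoke(struct cpu_pool *c, struct global_pool *g)
{
    size_t n = c->nfree;
    global_give(g, n);
    c->nfree = 0;
    return n;
}
```

Between the watermarks, pool_alloc and pool_free touch only the local
counter; lock_ops moves only at a watermark crossing or a revocation.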


> This is only my opinion and not absolute truth.
> However, it would be good to get people to discuss this topic a
> bit more, specifically those from BSDI who have experience with the
> actual implementation.

You mean BSDI's implementation, of course.  Yes, I would
like to hear more about their implementation; all I have is the
verbal and email descriptions of interrupt threads.

I personally have some applicable experience with Solaris,
UnixWare, and AIX.  Oh yeah, and experience with FreeBSD dating
back to October of 1995 and Jack Vogel's patches.  If we could
drag Bryan Mann into this, he could enlighten us with his
experience on SMP UNIX for ICL.  I kind of doubt that Whistle's
nCube neighbors, or our former CTO, Jim Li (Thinking Machines)
could be dragged in, but it can't hurt to ask.  Jack Vogel would
be a good resource, as well, if we could get him to participate.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

