From owner-freebsd-arch  Sat Apr 28  1:48: 4 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from filk.iinet.net.au (syncopation-dns.iinet.net.au [203.59.24.29])
	by hub.freebsd.org (Postfix) with SMTP id A00E537B43C
	for <arch@FreeBSD.ORG>; Sat, 28 Apr 2001 01:47:57 -0700 (PDT)
	(envelope-from julian@elischer.org)
Received: (qmail 20651 invoked by uid 666); 28 Apr 2001 08:51:10 -0000
Received: from i179-143.nv.iinet.net.au (HELO elischer.org) (203.59.179.143)
  by mail.m.iinet.net.au with SMTP; 28 Apr 2001 08:51:10 -0000
Message-ID: <3AEA837D.C2AE5E8D@elischer.org>
Date: Sat, 28 Apr 2001 01:46:53 -0700
From: Julian Elischer <julian@elischer.org>
X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 5.0-CURRENT i386)
X-Accept-Language: en, hu
MIME-Version: 1.0
To: Terry Lambert <tlambert@primenet.com>
Cc: Alfred Perlstein <bright@wintelcom.net>, arch@FreeBSD.ORG,
	terry@lambert.org
Subject: Re: KSE threading support (first parts)
References: <200104280337.UAA29358@usr08.primenet.com>
Content-Type: text/plain; charset=iso-8859-2
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Terry, what you describe here is so similar to what we are planning 
on doing that the differences could be called "Implementation details"

The only difference is that your syscall returns to where it came from
when it would have blocked, and in that I'm championning, the 
particular thread is left suspended, and control returns to the UTS
via an upcall.

Terry Lambert wrote:
> 
> > The way I envision async call gates is something like each syscall
> > could borrow a spare pcb and do an rfork back into the user
> > application using the borrowed pcb and allowing the syscall to
> > proceed as scheduled as another kernel thread, upon return it would
> > somehow notify the process of completion.
> 
> Close.  Effectively, it uses the minimal amount of call context
> it can get away with, and points the VM space and other stuff
> back to the process control block, which is shared among all
> system calls.
> 
> 
> The context is used by the kernel to continue processing.  It
> contains the address of the user space status block, as well as
> a copy of the stack of the returning program (think of the one
> that continues as a "setmp", with the one doing the return as
> getting a longjmp, where the code it would have run is skipped).

In Tehh proposed scheme, the original context is save in exactly
the same way it would be if the process blocked, but instead of 
scheduling a new process to run, teh same process continues to run, 
in the same KSE, having done a longjmp() to a saved context
that simply returns to the UTS.

The original context is saved into a KSEC structure
and it is hung on the appropriate sleep/wait queue.

The returning new context is very small (a couple of entries on
a small kernel stack) It doesn't need to know anything about the 
syscall that just blocked. (there may not even have been one)

When the KSE was created, one of the argumants was the address of a
mailbox used by that KSE to communicate with the UTS. The status of 
the blocked syscall/thread will be available to the UTS via 
that mailbox.


> 
> The final part is that the context runs to completion at the
> user space boundary; since the call has already returned, it does
> not return to user space, instead it stops at the user/kernel
> boundary, after copying out the completion status into the
> user space status block.

ditto. 

> 
> The status block is a simplified version of the aioread/aiowrite
> status block.
> 
> A program can just use these calls directly.  They can also set a
> flag to make the call synchornous (as in an aiowait).  Finally, a
> user space threads scheduler can use completion notifications to
> make scheduling decisions.

In our case that act of creating a KSE enables 'kse mode' in which case
syscalls always look to the thread concerned, as if they completed
normally, but there may have been intervening upcalls to the UTS
between the time the syscall was dispatched and the time it was completed.

libc MAY not need the syscall stubs changed at all.

> 
> For SMP, you can state that you have the ability to return into
> user space (e.g. similar to vfork/sfork) multiple times.  Each
> of these represents a "scheduler reservation", where you reserve
> the right to compete for a quanta.
> 
> You can also easily implement negafinity for up to 32 processors
> with three 32 bit unsigned int's in the process block: just don't
> reserve on a processor where the bit is already set, until you
> have reserved on all available processors at least once.
> 
> > > My ideal implementation would use async call gates.  In effect,
> > > this would be the same as implementing VMS ASTs in all of FreeBSD.
> >
> > Actually, why not just have a syscall that turns on the async
> > behavior?
> 
> Libc will break.  It does not expect to have to reap completed
> system call status blocks to report completion status to the user
> program.

In the KSE world, you do not reap syscall results. Your reap runnable
threads.
Each thread that is runnable is set up by the kernel to look as though it
did a yield() on the first machine instruction after the syscall.

when you longjmp() to retstart the thread, you have effectlively
just retunred from a perfectly normal 'read()' or whatever call. 
As a thread you cannot tell if you were blocked in that call or not.
(the information ay be available to you if you ask for it but the thread's
behaviour is nto different in any way.


> >
> > Well if we have an implementation where the implementators are
> > unwilling or incapable (because of time constraints, or getting
> > hit by a bus, etc) of doing the more optimized version then what's
> > the point besideds getting more IO concurrancy?  I don't know, it
> > just that if someone has a terrific idea that seems to have astounding
> > complexity and they don't feel like they want to or can take the
> > final step with it, then it really should not be considered.

there are two main reasons for doing the KSE work:
1/ IO concurrancy.. Even with one KSE and one KSEG and one processor
you can still have multithreading, with several IO operations outstanding 
at one time, and still do some processing as well.

You could also implement IO concurrency in 'non threaded' programming
models using KSEs and IO stubs.

2/ Increase processor concurrency.
to be able to run multiple threads in one process context, concurrently
on different processors.


> 
> The point of threads was to reduce context switch overhead, and
> to increase the useful work that actually gets done in any given
> time period, as opposed to spending cycles on system overhead or
> spinning waiting for a call to complete when you have other, better
> work to do.

which is what we are trying to do.

> 
> Somewhere along the way, it became corrupted into a tool to allow
> people without very much clue to write programs one-per-connection,
> instead of building finite state automata, and that corruption has
> proceeded, until now it's a tool to get SMP scalability.

"let me introduce you...,
Mr. Foot, Mr. Bullet   ...    Mr. Bullet  Mr. Foot"

Just because this is true doesn't mean that we should give them tools
to do useful threading.

> 
> > btw, I've read some on scheduler activations, where some references
> > on async call gates?
> 
> You're talking to the originator of the idea.  See the -arch archives.

As far as I can see, the difference to what we are suggesting
is that in you async call infrastructure, the syscall
that has been blocked retunrs through the same code that it waould have 
retunred through had it not blocked, and that the library must
detect this and jump to the UTS inthe 'blocked' case to schedule 
another thread.


-- 
      __--_|\  Julian Elischer
     /       \ julian@elischer.org
    (   OZ    ) World tour 2000-2001
---> X_.---._/  
            v

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message