From owner-freebsd-hackers  Fri Dec  6 13:12:27 1996
Return-Path: <owner-hackers>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id NAA11228
          for hackers-outgoing; Fri, 6 Dec 1996 13:12:27 -0800 (PST)
Received: from chai.plexuscom.com (chai.plexuscom.com [207.87.46.100])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id NAA11220
          for <freebsd-hackers@freebsd.org>; Fri, 6 Dec 1996 13:12:22 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by chai.plexuscom.com (8.7.6/8.6.12) with SMTP id QAA22343; Fri, 6 Dec 1996 16:11:51 -0500 (EST)
Message-Id: <199612062111.QAA22343@chai.plexuscom.com>
X-Authentication-Warning: chai.plexuscom.com: Host localhost [127.0.0.1] didn't use HELO protocol
To: Terry Lambert <terry@lambert.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: clone()/rfork()/threads (Re: Inferno for FreeBSD) 
In-reply-to: Your message of "Wed, 04 Dec 1996 19:16:40 MST."
             <199612050216.TAA18540@phaeton.artisoft.com> 
Date: Fri, 06 Dec 1996 16:11:51 -0500
From: Bakul Shah <bakul@plexuscom.com>
Sender: owner-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> > If, instead of treating a thread *as* a schedulable entity, you
> > allow a set of threads to *belong to* the same schedulable entity,
> > you can be fair and get around the problems you mentined.  The
> > kernel runs threads from the same group as long as their quantum has
> > not expired and atleast one of them is runnable.  When it switches
> > back to the same group, it will use a FIFO scheme: the first
> > runnable thread to give up the processor is the first to run.

> It will, however, be unfair to threads within the group.

I guess we are being `fair' to different things!

If you treat all the threads/processes as equal then you are right.
But is being fair to threads is always the right thing to do?

Also, if you allow different priority levels, realtime/best-effort
scheduling etc. then you are already abandoning the `all processes
are equal' model (or, if you wish, `refining' it).

Equal fairness to all processes made sense in a '70s style
``multi-user'' system where each user ran one process (and to be
fair to your users you had to have fair scheduling of processes).
But now that is no longer the common case.  Processes running on
your PC are all running to serve _you_.  Processes running on a
server are perhaps closer to the multi-user case but even then there
one-to-one correspondence between clients and processes is not a
given.  In fact different people may want to chose different
policies for their FTP/Web/Database servers (if running on the same
machine).  Even for an FTP server, chances are, you want a bigger
slice of processing for your named clients compared to anonymous
clients.  Within a given class all clients are treated equal but
none of the above scheme can be implemented with an equal fairness
to all threads scheduler.

One needs a very different concept of fairness and it even
means different things in different contexts.

An example.  If some application is implemented in two different
ways, once as a single process (with purely user level threads) and
once as a multi-process (using kernel level threads), then with
traditional scheduling the second implementation will get many more
processor time quanta than the first one.  This is not fair to the
users of the first implementation.  You'd likely say: "Tough luck!
That is how things work".  But IMHO an implementation choice should
not have such a drastic side-effect.

One should be able to use threads as just another `software
structuring' choice and have separate control over how the CPU is
used.

At any rate, I don't think one can design a scheduler that will be
fair under all circumstances.  The best one can do is provide
mechanisms or hooks to implement a wide variety of fairness
policies, with some help from the user.  IMHO being fair to threads
does not serve that purpose.

Back to thread groups.

Note that in such a scheme there needs to be a syscall or something
for a thread to become part of a "schedulable" thread group.
Ideally only __cooperating__ threads would want to belong to the
same group.

In the scenarios you presented, the N ftpds are *not* cooperating
with each other so there'd be no point in making them belong to the
same group.

But in general you should be able to use a larger quanta allocation
for process groups you want (so I am not suggesting one quantum per
thread group).  If you are worried about responsiveness, put
interactive processes at a higher priority.

> > The above idea can be extended to multi processors fairly easily.
> > Though multi-processor schedulers that can also do realtime
> > scheduling (and appropriately deal with priority inversion) are not
> > easy.
> 
> Heh.  "locking nodes in a directed acylic graph representing a lock
> heirarchy" will address the priority inversion handily -- assuming

I was talking about scheduling realtime threads on an MP system,
running at different priority levels but which may want to share
objects.  When a high prio. thread has to wait because a low
priority thread holds a lock it wants, we have priority inversion.
I don't think fair scheduling is easy when you consider these
issues.

Anyway, I don't see how computing lock hierarchy helps solve the
priority inversion problem.  Priority inheritance is one known
solution.  `Non-blocking synchronization' (NBS) is another way of
solving the priority inversion (as well as a host of other
problems).

BTW, you may wish to browse the NBS thread (a different kind of
thread) on comp.os.research.  NBS may be something worth using
(instead of mutexs and locks) in the MP version of FreeBSD.  Also
read Greenwald and Cheriton's "The Synergy Between Non-blocking
Synchronization and Operating System Structure" (accessible via
http://www-dsg.stanford.edu/Publications.html)

-- bakul