From owner-freebsd-stable@FreeBSD.ORG  Fri Dec 16 15:07:47 2011
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A8BDE1065675;
	Fri, 16 Dec 2011 15:07:47 +0000 (UTC)
	(envelope-from luigi@onelab2.iet.unipi.it)
Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.59.238])
	by mx1.freebsd.org (Postfix) with ESMTP id 4609C8FC21;
	Fri, 16 Dec 2011 15:07:47 +0000 (UTC)
Received: by onelab2.iet.unipi.it (Postfix, from userid 275)
	id 3CA247300A; Fri, 16 Dec 2011 16:24:09 +0100 (CET)
Date: Fri, 16 Dec 2011 16:24:09 +0100
From: Luigi Rizzo <rizzo@iet.unipi.it>
To: Stefan Esser <se@freebsd.org>
Message-ID: <20111216152409.GA79938@onelab2.iet.unipi.it>
References: <CAJ-FndDniGH8QoT=kUxOQ+zdVhWF0Z0NKLU0PGS-Gt=BK6noWw@mail.gmail.com>
	<4EE2AE64.9060802@m5p.com> <4EE88343.2050302@m5p.com>
	<CAFHbX1+5PttyZuNnYot8emTn_AWkABdJCvnpo5rcRxVXj0ypJA@mail.gmail.com>
	<4EE933C6.4020209@zedat.fu-berlin.de>
	<20111215004205.GA11556@icarus.home.lan>
	<CAFHbX1KdsvKBPvpxQG-dL1szxk4FB86WxQF-Cw1PWLf=7pQg7w@mail.gmail.com>
	<CADGWnjU9Sp6DoGCVQ6X_CNwcgQUDW9YviNr6eENAQmptAmX_wQ@mail.gmail.com>
	<20111216081145.GA76297@onelab2.iet.unipi.it>
	<4EEB218B.1090209@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4EEB218B.1090209@freebsd.org>
User-Agent: Mutt/1.4.2.3i
Cc: Tom Evans <tevans.uk@googlemail.com>, freebsd-stable@freebsd.org,
	"C. P. Ghost" <cpghost@cordula.ws>,
	Jeremy Chadwick <freebsd@jdc.parodius.com>
Subject: Re: switching schedulers (Re: SCHED_ULE should not be the default)
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 16 Dec 2011 15:07:47 -0000

On Fri, Dec 16, 2011 at 11:46:35AM +0100, Stefan Esser wrote:
> On 16.12.2011 09:11, Luigi Rizzo wrote:
> > The interesting part is probably the definition of the methods that
> > schedulers should implement (see struct _sched_interface ).
> > 
> > The switch from one scheduler to another was implemented with a
> > sysctl. This calls the sched_move() method of the current (i.e.
> > old) scheduler, which extracts all ready processes from its own
> > "queues" (however they are implemented) and reinserts them onto the
> > new scheduler's "queues" using its (new) setrunqueue() method.  You
> > don't need to bother with blocked processes, as the scheduler doesn't
> > know much about them.
> > 
> > I am not preserving the thread's dynamic "priority" (think of
> > accumulated work, affinity etc.) when switching
> > schedulers, as that is expected to be an infrequent event, and
> > so in the end it doesn't really matter -- at a switch, threads
> > are inserted in the scheduler as newly created ones, using only
> > the static priority as a parameter.
> 
> I think this is OK for user processes (which will receive reasonable
> relative priorities after running a fraction of a second, anyway).
> 
> But I'm not sure whether it is possible to use static priorities for
> (real-time) kernel threads, where priority inversion may occur, if the
> current dynamic (relative) thread priorities are not preserved.

The word "priority" is too overloaded in this context, as it mixes
configuration information (which I called "static priority", and
which would be better described as the "service parameters"
you specify when you start a new thread) with scheduler state ("dynamic
priority" in a priority-based scheduler, whereas other schedulers keep
different state, such as tickets, virtual times, deadlines,
CPU affinity and so on).

What I meant to say is that, the way I implemented it (and I believe
it is almost the only practical way), on a change of scheduler
all processes are requeued as if they had just started.
It is then up to the active scheduler to update that
initial state as the system evolves
(changing priorities, tickets, virtual times, deadlines, etc.).
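
To make the requeue step concrete, here is a minimal user-space sketch
in C. It is not the actual patch: struct sched, initial_state(),
dequeue() and the two example policies are hypothetical names invented
for illustration, and sched_move() is written as a plain function
rather than as a method of the old scheduler. It only assumes that each
scheduler keeps some ready queue and can compute a start state from a
thread's service parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not FreeBSD kernel code. */
struct thread {
	int static_prio;        /* service parameter, set at creation */
	long sched_state;       /* scheduler-private state */
	struct thread *next;    /* ready-queue link */
};

struct sched {
	struct thread *head, *tail;     /* ready queue (a FIFO here) */
	/* State a freshly created thread gets under this scheduler. */
	long (*initial_state)(const struct thread *);
};

/* Insert a thread as if it had just started: old state is discarded. */
static void
setrunqueue(struct sched *s, struct thread *td)
{
	td->sched_state = s->initial_state(td);
	td->next = NULL;
	if (s->tail != NULL)
		s->tail->next = td;
	else
		s->head = td;
	s->tail = td;
}

/* Remove and return the first ready thread, or NULL if none. */
static struct thread *
dequeue(struct sched *s)
{
	struct thread *td = s->head;

	if (td != NULL) {
		s->head = td->next;
		if (s->head == NULL)
			s->tail = NULL;
		td->next = NULL;
	}
	return (td);
}

/*
 * Drain every ready thread from the old scheduler and hand it to the
 * new one. Blocked threads are never touched because they are not on
 * any ready queue. Returns the number of threads moved.
 */
static int
sched_move(struct sched *from, struct sched *to)
{
	struct thread *td;
	int moved = 0;

	assert(from != to);
	while ((td = dequeue(from)) != NULL) {
		setrunqueue(to, td);
		moved++;
	}
	return (moved);
}

/* Two example policies: priority-based and virtual-time-based. */
static long
prio_initial(const struct thread *td)
{
	return (td->static_prio);
}

static long
vtime_initial(const struct thread *td)
{
	(void)td;
	return (0);
}
```

The point of the sketch is that sched_state never crosses the switch:
only static_prio survives, and the new scheduler rebuilds whatever
state it needs from there.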

> But not only the relative priorities of the existing processes must be
> preserved, new kernel threads must be created with matching (relative)
> priorities. This means that the schedulers may be switched at any time,
> but the priority values should be portable between schedulers to prevent
> deadlock (or illegal order of execution?) of threads (AFAICT).

This issue (I think you have in mind priority inheritance, priority
inversion and related issues) is almost irrelevant in FreeBSD, and
I am really sorry to see that it comes up so frequently in discussions
and sometimes also in documentation related to process schedulers.

Apart from bugs in the implementation (see Bruce Evans' email from
a few days ago), our CPU schedulers are a collection of heuristics
without formally proved properties. So, as much as we can trust
developers to come up with effective solutions:

- we cannot rely on priorities for correctness (mutual exclusion
  or deadlock avoidance);
- we don't have any support for real time guarantees;
- average performance (which is why some of our priority-based schedulers
  may decide to implement priority inheritance) is not affected
  by events as infrequent as changing schedulers.
 
cheers
luigi