Date: Fri, 16 Dec 2011 16:24:09 +0100
From: Luigi Rizzo
To: Stefan Esser
Cc: Tom Evans, freebsd-stable@freebsd.org, "C. P. Ghost", Jeremy Chadwick
Subject: Re: switching schedulers (Re: SCHED_ULE should not be the default)
Message-ID: <20111216152409.GA79938@onelab2.iet.unipi.it>
In-Reply-To: <4EEB218B.1090209@freebsd.org>

On Fri, Dec 16, 2011 at 11:46:35AM +0100, Stefan Esser wrote:
> On 16.12.2011 09:11, Luigi Rizzo wrote:
> > The interesting part is probably the definition of the methods that
> > schedulers should implement (see struct _sched_interface).
> >
> > The switch from one scheduler to another was implemented with a
> > sysctl. This calls the sched_move() method of the current (i.e.
> > old) scheduler, which extracts all ready processes from its own
> > "queues" (however they are implemented) and reinserts them onto the
> > new scheduler's "queues" using its (new) setrunqueue() method. You
> > don't need to bother with blocked processes, as the scheduler
> > doesn't know much about them.
> >
> > I am not preserving the thread's dynamic "priority" (think of
> > accumulated work, affinity etc.) when switching
> > schedulers, as that is expected to be an infrequent event, and
> > so in the end it doesn't really matter -- at a switch, threads
> > are inserted in the scheduler as newly created ones, using only
> > the static priority as a parameter.
>
> I think this is OK for user processes (which will receive reasonable
> relative priorities after running a fraction of a second, anyway).
>
> But I'm not sure whether it is possible to use static priorities for
> (real-time) kernel threads, where priority inversion may occur, if the
> current dynamic (relative) thread priorities are not preserved.

The word "priority" is too overloaded in this context, as it mixes
configuration information (which i called "static priority", and which
would really be better characterized as the "service parameters" you
specify when you start a new thread) and scheduler state ("dynamic
priority" in a priority-based scheduler, but other schedulers have
different state info, such as tickets, virtual times, deadlines, cpu
affinity and so on).

What i meant to say is that, in the way i implemented it (and i
believe it is almost the only practical way), on a change of scheduler
all processes are requeued as if they had just started. It is then up
to the active scheduler to change the initial state according to the
evolution of the system (changing priorities, tickets, virtual times,
deadlines, etc.)

> But not only the relative priorities of the existing processes must be
> preserved, new kernel threads must be created with matching (relative)
> priorities.
> This means, that the schedulers may be switched at any time,
> but the priority values should be portable between schedulers to prevent
> dead-lock (or illegal order of execution?) of threads (AFAICT).

This issue (i think you have in mind priority inheritance, priority
inversion and related issues) is almost irrelevant in FreeBSD, and i
am really sorry to see that it comes up so frequently in discussions
and sometimes also in documentation related to process schedulers.

Apart from bugs in the implementation (see Bruce Evans' email from a
few days ago), our CPU schedulers are a collection of heuristics
without formally proved properties. So, as much as we can trust
developers to come up with effective solutions:

- we cannot rely on priorities for correctness (mutual exclusion or
  deadlock avoidance);

- we don't have any support for real time guarantees;

- average performance (which is why some of our priority-based
  schedulers may decide to implement priority inheritance) is not
  affected by events as infrequent as changing schedulers.

cheers
luigi
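
For readers following the thread, the switch described above can be
illustrated with a small user-space sketch. This is not the code from
the patch under discussion: only the method names sched_move() and
setrunqueue() come from the message; the struct layouts, the
single-list "queue", and the names thread_sketch, sched_sketch,
generic_setrunqueue, generic_sched_move, cur_sched and sched_switch_to
are hypothetical, and locking is ignored entirely. It only shows the
idea that the old scheduler drains its ready threads and the new one
enqueues them as if newly created, looking at the static priority
alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread record (not the kernel's struct thread). */
struct thread_sketch {
    int static_prio;            /* configured "service parameter" */
    int dyn_state;              /* scheduler-private state: dynamic
                                   priority, tickets, virtual time, ... */
    struct thread_sketch *next; /* link in the owner's ready list */
};

/* Hypothetical stand-in for a scheduler method table. */
struct sched_sketch {
    const char *name;
    struct thread_sketch *ready;    /* the scheduler's own "queue" */
    void (*setrunqueue)(struct sched_sketch *self,
                        struct thread_sketch *td);
    void (*sched_move)(struct sched_sketch *self,
                       struct sched_sketch *to);
};

/* Enqueue td as a newly created thread: any previous dynamic state is
 * discarded and re-derived from the static priority only. */
static void generic_setrunqueue(struct sched_sketch *self,
                                struct thread_sketch *td)
{
    td->dyn_state = td->static_prio;
    td->next = self->ready;
    self->ready = td;
}

/* Old scheduler's side of the switch: extract every ready thread and
 * hand it to the new scheduler's setrunqueue(). Blocked threads are
 * not on any ready list, so they need no handling here. */
static void generic_sched_move(struct sched_sketch *self,
                               struct sched_sketch *to)
{
    while (self->ready != NULL) {
        struct thread_sketch *td = self->ready;
        self->ready = td->next;
        td->next = NULL;
        to->setrunqueue(to, td);
    }
}

/* Scheduler currently in use (hypothetical global). */
static struct sched_sketch *cur_sched;

/* What a sysctl handler might do, assuming no locking is needed. */
static void sched_switch_to(struct sched_sketch *new_sched)
{
    assert(new_sched != NULL);
    if (cur_sched != NULL && cur_sched != new_sched)
        cur_sched->sched_move(cur_sched, new_sched);
    cur_sched = new_sched;
}
```

With two sched_sketch instances a and b that both use the generic
methods, calling sched_switch_to(&a), enqueueing a few threads, and then
calling sched_switch_to(&b) leaves a.ready empty and every thread on
b.ready with dyn_state reset to its static_prio.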