Date: Sat, 28 Oct 2006 11:04:48 +0100 (BST)
From: Robert Watson
To: David Xu
Cc: freebsd-current@freebsd.org, Julian Elischer
Subject: Re: Comments on the KSE option
Message-ID: <20061028105454.S69980@fledge.watson.org>
In-Reply-To: <200610281132.21466.davidxu@freebsd.org>
References: <45425D92.8060205@elischer.org> <200610281132.21466.davidxu@freebsd.org>

On Sat, 28 Oct 2006, David Xu wrote:

> 3) Third, it adds overhead to the scheduler (I have already posted a number)
> and might make locking more difficult for a per-CPU-queue-style scheduler,
> since now you always have to contend for the ksegrp runqueue lock between
> many CPUs. Also, because you have built the fairness into the scheduler and
> every scheduler must obey the ksegrp algorithm, it may be more difficult to
> implement another algorithm and replace it, see 4).
This is my single biggest concern: our scheduling, thread/process, and
context management paths in the kernel are currently extremely complex.
This has a number of impacts: the code is extremely hard to read and
understand, it adds significant overhead, and it is quite hard to modify
and optimize for increasing numbers of processors. We need to be planning
for a world of 128 hardware threads per machine on commodity server
hardware in the immediate future, which means that the current "giant
sched_lock" cannot continue much longer. Kip's prototypes of breaking out
sched_lock as part of the sun4v work have benefited significantly from the
reduced complexity of a KSE-free kernel, and it's fairly clear that the
task of improving scheduler scalability is dramatically simpler when the
kernel model for threading is simpler. Regardless of where the specific
NO_KSE option in the kernel goes, reducing the complexity of the kernel
scheduler and related code should be a first order of business, because
effective SMP work really depends on that happening.

Robert N M Watson
Computer Laboratory
University of Cambridge
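
To illustrate the lock-contention point in this thread (one "giant sched_lock" shared by every CPU versus one run-queue lock per CPU), here is a minimal toy model in Python. It is only a sketch under stated assumptions, not FreeBSD scheduler code: the class and function names (CountingLock, GlobalRunQueue, PerCpuRunQueues, busiest_lock) are hypothetical, and the number of acquisitions landing on the busiest lock is used as a rough stand-in for contention.

```python
import threading
from collections import deque


class CountingLock:
    """A mutex that counts its acquisitions (hypothetical helper, illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1  # updated while the lock is held
        return self

    def __exit__(self, *exc):
        self._lock.release()
        return False


class GlobalRunQueue:
    """Toy model: every CPU shares one queue guarded by one lock."""

    def __init__(self, ncpus):
        self.lock = CountingLock()
        self.queue = deque()

    def enqueue(self, cpu, thread_id):
        with self.lock:  # all CPUs serialize on this single lock
            self.queue.append(thread_id)

    def all_locks(self):
        return [self.lock]


class PerCpuRunQueues:
    """Toy model: each CPU has its own queue and its own lock."""

    def __init__(self, ncpus):
        self.cpu_locks = [CountingLock() for _ in range(ncpus)]
        self.queues = [deque() for _ in range(ncpus)]

    def enqueue(self, cpu, thread_id):
        with self.cpu_locks[cpu]:  # only this CPU's lock is touched
            self.queues[cpu].append(thread_id)

    def all_locks(self):
        return self.cpu_locks


def busiest_lock(sched, ncpus, threads_per_cpu):
    """Enqueue threads_per_cpu items from each CPU; return the acquisition
    count of the most heavily used lock."""
    for cpu in range(ncpus):
        for i in range(threads_per_cpu):
            sched.enqueue(cpu, (cpu, i))
    return max(lock.acquisitions for lock in sched.all_locks())
```

With 128 simulated CPUs each enqueueing 10 threads, busiest_lock(GlobalRunQueue(128), 128, 10) returns 1280, while busiest_lock(PerCpuRunQueues(128), 128, 10) returns 10. The model deliberately ignores load balancing and thread migration, which in a real per-CPU design would still require some cross-CPU locking.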