From: Robert Watson
Date: Thu, 21 Feb 2008 09:27:41 +0000 (GMT)
To: Jeff Roberson
Cc: Daniel Eischen, arch@freebsd.org, David Xu, Andrew Gallatin
Subject: Re: getaffinity/setaffinity and cpu sets.
Message-ID: <20080221092011.J52922@fledge.watson.org>
In-Reply-To: <20080220213253.A920@desktop>
List-Id: Discussion related to FreeBSD architecture

On Wed, 20 Feb 2008, Jeff Roberson wrote:

> I also have a 'cpuset' command which can run a new program with a given cpu
> set, view and modify sets of arbitrary pids.
> This is all working and I can
> supply patches if anyone is interested. I have to implement 4BSD support
> before I can commit.
>
> I have a proposal for Solaris style processor sets which I think is simple
> and sufficient for most cases. It involves the following new syscalls:
>
> int cpuset(void);
> int setcpuset(pid_t pid, int setid);
> int getcpuset(pid_t pid);
>
> The notion would be that you can create a new numbered cpuset with cpuset().
> You can modify or inspect its affinity with get/setaffinity above and the
> CPU_WHICH_SET argument. The cpuset exists as long as there are members of
> the set. Sort of like a process group or session. The {get,set}cpuset
> calls can inspect or modify the state.
>
> This set would not be modifiable by user processes or by processes in a
> jail. It would create the restriction that differs between 'avail' and 'sys'
> above. Processes would be able to directly bind to any processor within the
> set. Changing the set would apply to all processes in the set. The cpuset
> would be per-process while the mask is per-thread. Set membership is
> inherited on fork().
>
> In Solaris sets can be named and have a more complete management api. I'm
> not really interested in implementing all of that but I believe what I have
> outlined here would be a subset of this and no code/syscalls would be wasted.
>
> Comments? Objections?

I'm fairly pleased with this arrangement now. Just to put a few notes from
our conversation on IRC in e-mail:

- I think I'd prefer int cpuset(cpuset_t *set), int getcpuset(pid_t,
  cpuset_t *) so that we don't mix up IDs and return values. More recent
  interfaces tend to do this, I believe, and it means that the prototype,
  even if not the ABI, remains the same if the set identifier changes in the
  future.

- You don't mention what happens if a process's cpu set changes to preclude
  a CPU the process has a thread with affinity for.
  Online, you suggested SIGKILL, and I thought maybe a new SIGCPUGONE with a
  default SIGKILL action might be a friendlier model. We should see what
  Solaris and others do here though. I like the idea that the affinity is a
  guarantee in userspace because it means that you can rely on it; I'm OK
  with the idea that your thread always runs on the CPUs you have affinity
  for unless in the SIGCPUGONE handler :-).

- It would be nice to be able to use CPU sets in jail as well, suggesting a
  hierarchical model with some sort of tagging so you know what CPU sets
  were created in a jail such that you know whether they can be changed in a
  jail. While I recognize this makes things a lot more tricky, I think we
  should basically be planning more carefully with respect to virtualization
  when we add new interfaces, since it's a widely used feature, and the
  current set of "stragglers" unsupported in jail is growing rather than
  shrinking.

- There's still no way to specify an affinity policy rather than explicit
  affinity, but if our CPU set model is sufficiently general, that might be
  a vehicle to do that. I.e., cpuset_setpolicy() rather than setting a mask.

- In the interests of boring API changes, recent APIs tend to prefix the
  method with the object name. Have you thought about cpuset_create(),
  cpuset_foo(), etc? That reduces the chances of interfering with
  application namespaces. I think, anyway. :-).

I need to ponder the proposal a little more, ideally over a hot beverage
this morning, and will follow up if I have further thoughts.

Thanks for working on this, BTW -- affinity is well overdue for FreeBSD.

Robert N M Watson
Computer Laboratory
University of Cambridge
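
To make two of the points above concrete (returning the set identifier
through a pointer instead of the return value, and prefixing calls with the
object name), here is a minimal userland mock in C. Everything in it is an
assumption for illustration: the names cpuset_create(), cpuset_setid() and
cpuset_getid(), the cpusetid_t type, the fixed-size tables, and the errno
choices are hypothetical and are not the interface proposed in this thread
or any kernel's actual API.

```c
/*
 * Hypothetical userland mock, not a kernel interface.  All names, types,
 * table sizes and errno values below are assumptions for illustration.
 */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int cpusetid_t;            /* assumed set identifier type */

#define MOCK_MAXSETS  8            /* assumed limit on live sets */
#define MOCK_MAXPROCS 16           /* mock "pids" are 0..15 */

/* Slot 0 is a default set that every mock process starts in. */
static int set_inuse[MOCK_MAXSETS] = { 1 };
static cpusetid_t proc_set[MOCK_MAXPROCS];

/* Create a new set; the ID comes back via *setid, status via return. */
int
cpuset_create(cpusetid_t *setid)
{
	if (setid == NULL) {
		errno = EFAULT;
		return (-1);
	}
	for (cpusetid_t id = 1; id < MOCK_MAXSETS; id++) {
		if (!set_inuse[id]) {
			set_inuse[id] = 1;
			*setid = id;
			return (0);
		}
	}
	errno = ENFILE;            /* mock table is full */
	return (-1);
}

/* Move a mock process into an existing set. */
int
cpuset_setid(int pid, cpusetid_t setid)
{
	if (pid < 0 || pid >= MOCK_MAXPROCS) {
		errno = ESRCH;
		return (-1);
	}
	if (setid < 0 || setid >= MOCK_MAXSETS || !set_inuse[setid]) {
		errno = EINVAL;
		return (-1);
	}
	proc_set[pid] = setid;
	return (0);
}

/* Look up the set a mock process belongs to. */
int
cpuset_getid(int pid, cpusetid_t *setid)
{
	if (pid < 0 || pid >= MOCK_MAXPROCS) {
		errno = ESRCH;
		return (-1);
	}
	if (setid == NULL) {
		errno = EFAULT;
		return (-1);
	}
	*setid = proc_set[pid];
	assert(set_inuse[*setid]); /* a process is never in a freed set */
	return (0);
}
```

Because the return value carries only success or failure, a caller can test
every call the same way (== 0), and the identifier type could later change
from an int to something else without changing the shape of the prototypes.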