From owner-freebsd-arch@FreeBSD.ORG Sun Feb 24 10:39:07 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 51BB316A406; Sun, 24 Feb 2008 10:39:07 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id F29F013C43E; Sun, 24 Feb 2008 10:39:06 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.107] (cpe-24-94-75-93.hawaii.res.rr.com [24.94.75.93]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id m1OAd2qL086981; Sun, 24 Feb 2008 05:39:03 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Sun, 24 Feb 2008 00:40:31 -1000 (HST) From: Jeff Roberson X-X-Sender: jroberson@desktop To: arch@freebsd.org In-Reply-To: <20080223213507.GD39699@lor.one-eyed-alien.net> Message-ID: <20080224001902.J920@desktop> References: <20080220105333.G44565@fledge.watson.org> <47BCEFDB.5040207@freebsd.org> <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Daniel Eischen , Brooks Davis , Robert Watson , David Xu , Andrew Gallatin Subject: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Feb 2008 10:39:07 -0000 Please see: http://people.freebsd.org/~jeff/cpuset.diff This is unfortunately intertwined with ULE's new CPU 
selection algorithm so that code is in the patch as well. Otherwise, this includes a simple, ugly userland tool called cpuset and all of the kernel support required. I have tested this by creating sets and subsets and modifying their cpu masks under load. I'm able to dynamically reprovision without issue.

This doesn't have support for jails but the infrastructure is there. It also fails to modify sets if it would leave threads without a valid cpu to run on. I have not implemented a force option but it will be trivial to do so. The initial cpu set is also created before we know all_cpus so it's faked up with all cpus set for now.

I mostly want people to look at the interface in cpuset.h and make sure they agree with it before I start polishing to commit. I'm fairly happy with the way the syscall api looks now. The code itself ended up being much more complicated than I'd hoped due to locking considerations. Try not to look at cpuset_setproc() ;).

If you want to actually try the patch, here are a couple of neat things to do with cpuset:

cpuset -l 0-4 /bin/sh

This creates a new group with a list (-l) of cpus 0-4 inclusive and runs sh in it.

cpuset -g -p <pid>

This will get (-g) the mask of cpus that pid (-p) is allowed to run on.

cpuset -l 0,2 -p <pid>

This will restrict sh to running on cpus 0 and 2 while its group is still allowed 0-4.

cpuset -l 0,2 -c -p <pid>

This will modify the cpuset (-c) that the sh belongs to.

cpuset -l 0-3 -s 1

This will modify the set (-s) that all threads are in by default to contain the first 4 cpus, leaving the rest idled.

Feedback is appreciated.
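For readers unfamiliar with the list syntax passed to -l above ("0-4", "0,2"), here is a minimal sketch in C of how such a list could be turned into a cpu bitmask. It is not the parser from the patch: the name parse_cpu_list and the 64-cpu limit are assumptions made purely for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not taken from the patch: parse a cpu list such as
 * "0-4" or "0,2" into a 64-bit mask (bit n set means cpu n is allowed).
 * Returns 0 on success, -1 on a malformed list or a cpu number >= 64.
 */
static int
parse_cpu_list(const char *s, uint64_t *mask)
{
    uint64_t m = 0;

    assert(s != NULL && mask != NULL);
    for (;;) {
        char *end;
        unsigned long lo, hi, c;

        /* Every element must start with a digit. */
        if (!isdigit((unsigned char)*s))
            return (-1);
        lo = hi = strtoul(s, &end, 10);
        if (*end == '-') {
            /* Range "lo-hi", both ends inclusive. */
            s = end + 1;
            if (!isdigit((unsigned char)*s))
                return (-1);
            hi = strtoul(s, &end, 10);
        }
        if (lo > hi || hi >= 64)
            return (-1);
        for (c = lo; c <= hi; c++)
            m |= (uint64_t)1 << c;
        if (*end == '\0')
            break;
        if (*end != ',')
            return (-1);
        s = end + 1;
    }
    *mask = m;
    return (0);
}
```

With this sketch, "0-4" yields the mask 0x1f and "0,2" yields 0x5; an empty list, a trailing comma, or a reversed range such as "4-0" is rejected.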
Thanks, Jeff From owner-freebsd-arch@FreeBSD.ORG Mon Feb 25 23:17:47 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 79E1016A407; Mon, 25 Feb 2008 23:17:47 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 5DAE213C4EB; Mon, 25 Feb 2008 23:17:47 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1192) id 31F2B1A4D7E; Mon, 25 Feb 2008 15:17:47 -0800 (PST) Date: Mon, 25 Feb 2008 15:17:47 -0800 From: Alfred Perlstein To: Jeff Roberson Message-ID: <20080225231747.GT99258@elvis.mu.org> References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080224001902.J920@desktop> User-Agent: Mutt/1.4.2.3i Cc: Brooks Davis , Andrew Gallatin , Daniel Eischen , arch@freebsd.org, Robert Watson , David Xu Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Feb 2008 23:17:47 -0000 * Jeff Roberson [080224 02:39] wrote: > Please see: > http://people.freebsd.org/~jeff/cpuset.diff > > This is unfortunately intertwined with ULE's new CPU selection algorithm > so that code is in the patch as well. 
Otherwise, this includes a simple,
> ugly userland tool called cpuset and all of the kernel support required.
> I have tested this by creating sets and subsets and modifying their cpu
> masks under load. I'm able to dynamically reprovision without issue.
>
> This doesn't have support for jails but the infrastructure is there. It
> also fails to modify sets if it would leave threads without a valid cpu
> to run on. I have not implemented a force option but it will be trivial
> to do so. The initial cpu set is also created before we know all_cpus so
> it's faked up with all cpus set for now.
>
> I mostly want people to look at the interface in cpuset.h and make sure
> they agree with it before I start polishing to commit. I'm fairly happy
> with the way the syscall api looks now. The code itself ended up being
> much more complicated than I'd hoped due to locking considerations. Try
> not to look at cpuset_setproc() ;).

Jeff, this is very cool. I do have one issue though:

+ * A thread may not be assigned to a a group seperate from other threads in
+ * the process. This is to remove ambiguity when the setid is queried with
+ * a pid argument. There is no other technical limitation.

Am I understanding things correctly such that within a process there can only be one "set"?

If so, this restricts some of the benefits you get with sets and binding.

An example would be some sort of system with multiple CPUs where some are assigned specifically for pseudo-realtime processing and others are for general control things such as cli, stats, etc.

In our case we would like to be able to run some threads on specific cpu sets, and other threads to be run anywhere on the control CPUs.

Can this be done with this API?
-Alfred From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 00:34:58 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5CB8B16A403; Tue, 26 Feb 2008 00:34:58 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 18FD513C448; Tue, 26 Feb 2008 00:34:58 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.107] (cpe-24-94-75-93.hawaii.res.rr.com [24.94.75.93]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id m1Q0Yq3b086597; Mon, 25 Feb 2008 19:34:53 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Mon, 25 Feb 2008 14:36:30 -1000 (HST) From: Jeff Roberson X-X-Sender: jroberson@desktop To: Alfred Perlstein In-Reply-To: <20080225231747.GT99258@elvis.mu.org> Message-ID: <20080225143222.B920@desktop> References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Brooks Davis , Andrew Gallatin , Daniel Eischen , arch@freebsd.org, Robert Watson , David Xu Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 00:34:58 -0000 On Mon, 25 Feb 2008, Alfred Perlstein wrote: > * Jeff Roberson [080224 02:39] wrote: >> Please 
see: >> http://people.freebsd.org/~jeff/cpuset.diff >> >> This is unfortunately intertwined with ULE's new CPU selection algorithm >> so that code is in the patch as well. Otherwise, this includes a simple, >> ugly userland tool called cpuset and all of the kernel support required. >> I have tested this by creating sets and subsets and modifying their cpu >> masks under load. I'm able to dynamically reprovision without issue. >> >> This doesn't have support for jails but the infrastructure is there. It >> also fails to modify sets if it would leave threads without a valid cpu >> to run on. I have not implemented a force option but it will be trivial >> to do so. The initial cpu set is also created before we know all_cpus so >> it's faked up with all cpus set for now. >> >> I mostly want people to look at the interface in cpuset.h and make sure >> they agree with it before I start polishing to commit. I'm fairly happy >> with the way the syscall api looks now. The code itself ended up being >> much more complicated than I'd hoped due to locking considerations. Try >> not to look at cpuset_setproc() ;). > > > Jeff, this is very cool. I do have one issue though: > > + * A thread may not be assigned to a a group seperate from other threads in > + * the process. This is to remove ambiguity when the setid is queried with > + * a pid argument. There is no other technical limitation. > > Am I understanding things correctly such that within a process there > can only be one "set"? > > If so this restricts some of the benifits you get with sets and > binding. > > An example would be some sort of system with multiple CPUs where some > are assigned specifically for pseudo-realtime processing and others are for > general control things such as cli, stats, etc. > > In our case we would like to be able to run some threads on specific > cpu sets, and other threads to be run anywhere on the control CPUs. > > Can this be done with this API? 
Individual threads can be bound to any cpu or group of cpus within the set. So if you just make a set that includes all cpus in the system you can then bind your realtime threads to specific cpus and the other threads to the remainder. You will have to specifically bind each thread, however.

The reason individual threads can't be assigned to groups is that cpuset_getid() for a pid wouldn't make sense then, and I expect administrators to be mostly interested in managing groups of processes.

It's really two different goals that are being served by this api. You can think of the sets as more of an administrative tool, while the private thread mask is a tool for the application programmer. It just so happens that it was convenient under the hood to have it all managed in the same way.

Thanks, Jeff

>
> -Alfred
>

From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 01:40:56 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 920B616A405; Tue, 26 Feb 2008 01:40:56 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.netplex.net (mail.netplex.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 3080B13C4DB; Tue, 26 Feb 2008 01:40:55 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.netplex.net (8.14.2/8.14.2/NETPLEX) with ESMTP id m1Q1ebFn008212; Mon, 25 Feb 2008 20:40:37 -0500 (EST) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.netplex.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-4.0 (mail.netplex.net [204.213.176.10]); Mon, 25 Feb 2008 20:40:37 -0500 (EST) Date: Mon, 25 Feb 2008 20:40:37 -0500 (EST) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Jeff Roberson In-Reply-To: <20080225143222.B920@desktop> Message-ID: References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> 
<20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> <20080225143222.B920@desktop> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Brooks Davis , Andrew Gallatin , Alfred Perlstein , arch@freebsd.org, Robert Watson , David Xu Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 01:40:56 -0000 On Mon, 25 Feb 2008, Jeff Roberson wrote: > On Mon, 25 Feb 2008, Alfred Perlstein wrote: > >> Jeff, this is very cool. I do have one issue though: >> >> + * A thread may not be assigned to a a group seperate from other threads >> in >> + * the process. This is to remove ambiguity when the setid is queried >> with >> + * a pid argument. There is no other technical limitation. >> >> Am I understanding things correctly such that within a process there >> can only be one "set"? >> >> If so this restricts some of the benifits you get with sets and >> binding. >> >> An example would be some sort of system with multiple CPUs where some >> are assigned specifically for pseudo-realtime processing and others are for >> general control things such as cli, stats, etc. >> >> In our case we would like to be able to run some threads on specific >> cpu sets, and other threads to be run anywhere on the control CPUs. >> >> Can this be done with this API? > > Individual threads can be bound to any cpu or group of cpus within the set. 
> So if you just make a set that includes all cpus in the system you can then > bind your realtime threads to specific cpus and the other threads to the > remainder. You will have to specifically bind each thread however. > > The reason individual threads can't be assigned to groups is because > cpuset_getid() for a pid wouldn't make sense then and I expect administrators > to be mostly interested in managing groups of processes. If the administrator sets up a set of CPUs specifically for real-time and another set for non-real-time, you may want to bind some threads to the real-time set, and leave other threads unbound (or even bound to the non-real-time set). In this case, I think cpuset_getid() should either return the default cpuset of all cpus in the system, or the last cpuset to which the process was bound. But regardless, I think binding a thread to a different processor set should be allowed and should override its inherent binding of the process' processor set. Hmm, I guess in this case, a subsequent binding of the process to a processor set should probably override any per-thread bindings. 
-- DE From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 02:06:19 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4CC3116A400; Tue, 26 Feb 2008 02:06:19 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 07BFF13C43E; Tue, 26 Feb 2008 02:06:18 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.107] (cpe-24-94-75-93.hawaii.res.rr.com [24.94.75.93]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id m1Q26CRF006587; Mon, 25 Feb 2008 21:06:14 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Mon, 25 Feb 2008 16:07:51 -1000 (HST) From: Jeff Roberson X-X-Sender: jroberson@desktop To: Daniel Eischen In-Reply-To: Message-ID: <20080225160433.P920@desktop> References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> <20080225143222.B920@desktop> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Brooks Davis , Andrew Gallatin , Alfred Perlstein , arch@freebsd.org, Robert Watson , David Xu Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 02:06:19 -0000 On Mon, 25 Feb 2008, Daniel Eischen wrote: > On Mon, 25 Feb 2008, Jeff Roberson wrote: > >> On Mon, 25 Feb 
2008, Alfred Perlstein wrote: >> >>> Jeff, this is very cool. I do have one issue though: >>> >>> + * A thread may not be assigned to a a group seperate from other threads >>> in >>> + * the process. This is to remove ambiguity when the setid is queried >>> with >>> + * a pid argument. There is no other technical limitation. >>> >>> Am I understanding things correctly such that within a process there >>> can only be one "set"? >>> >>> If so this restricts some of the benifits you get with sets and >>> binding. >>> >>> An example would be some sort of system with multiple CPUs where some >>> are assigned specifically for pseudo-realtime processing and others are >>> for >>> general control things such as cli, stats, etc. >>> >>> In our case we would like to be able to run some threads on specific >>> cpu sets, and other threads to be run anywhere on the control CPUs. >>> >>> Can this be done with this API? >> >> Individual threads can be bound to any cpu or group of cpus within the set. >> So if you just make a set that includes all cpus in the system you can then >> bind your realtime threads to specific cpus and the other threads to the >> remainder. You will have to specifically bind each thread however. >> >> The reason individual threads can't be assigned to groups is because >> cpuset_getid() for a pid wouldn't make sense then and I expect >> administrators to be mostly interested in managing groups of processes. > > If the administrator sets up a set of CPUs specifically for > real-time and another set for non-real-time, you may want to > bind some threads to the real-time set, and leave other threads > unbound (or even bound to the non-real-time set). In this > case, I think cpuset_getid() should either return the default > cpuset of all cpus in the system, or the last cpuset to > which the process was bound. 
> > But regardless, I think binding a thread to a different > processor set should be allowed and should override its > inherent binding of the process' processor set. > Hmm, I guess in this case, a subsequent binding of the > process to a processor set should probably override any > per-thread bindings. I think we're getting into complex corner cases here which will only confuse the api and administrators. I don't expect administrators will want to set groups to individual threads. How would he even identify the individual thread? And if he did, he could just as easily set masks on that thread along with others in the process. I'm already a little nervous about how complicated this will be for programmers. If we allowed each thread in a pid to be in its own set, we'd have to make cpuset_getid() return an array of ids. I definitely don't want to do that. > > -- > DE > From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 02:11:20 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3D15C16A402; Tue, 26 Feb 2008 02:11:20 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.netplex.net (mail.netplex.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 0678313C468; Tue, 26 Feb 2008 02:11:19 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.netplex.net (8.14.2/8.14.2/NETPLEX) with ESMTP id m1Q2B7Uh023172; Mon, 25 Feb 2008 21:11:07 -0500 (EST) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.netplex.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-4.0 (mail.netplex.net [204.213.176.10]); Mon, 25 Feb 2008 21:11:07 -0500 (EST) Date: Mon, 25 Feb 2008 21:11:07 -0500 (EST) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Jeff Roberson In-Reply-To: <20080225160433.P920@desktop> Message-ID: References: 
<20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> <20080225143222.B920@desktop> <20080225160433.P920@desktop> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Brooks Davis , Andrew Gallatin , Alfred Perlstein , arch@freebsd.org, Robert Watson , David Xu Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 02:11:20 -0000 On Mon, 25 Feb 2008, Jeff Roberson wrote: > On Mon, 25 Feb 2008, Daniel Eischen wrote: > >> On Mon, 25 Feb 2008, Jeff Roberson wrote: >> >>> On Mon, 25 Feb 2008, Alfred Perlstein wrote: >>> >>>> Jeff, this is very cool. I do have one issue though: >>>> >>>> + * A thread may not be assigned to a a group seperate from other threads >>>> in >>>> + * the process. This is to remove ambiguity when the setid is queried >>>> with >>>> + * a pid argument. There is no other technical limitation. >>>> >>>> Am I understanding things correctly such that within a process there >>>> can only be one "set"? >>>> >>>> If so this restricts some of the benifits you get with sets and >>>> binding. >>>> >>>> An example would be some sort of system with multiple CPUs where some >>>> are assigned specifically for pseudo-realtime processing and others are >>>> for >>>> general control things such as cli, stats, etc. 
>>>> >>>> In our case we would like to be able to run some threads on specific >>>> cpu sets, and other threads to be run anywhere on the control CPUs. >>>> >>>> Can this be done with this API? >>> >>> Individual threads can be bound to any cpu or group of cpus within the >>> set. So if you just make a set that includes all cpus in the system you >>> can then bind your realtime threads to specific cpus and the other threads >>> to the remainder. You will have to specifically bind each thread however. >>> >>> The reason individual threads can't be assigned to groups is because >>> cpuset_getid() for a pid wouldn't make sense then and I expect >>> administrators to be mostly interested in managing groups of processes. >> >> If the administrator sets up a set of CPUs specifically for >> real-time and another set for non-real-time, you may want to >> bind some threads to the real-time set, and leave other threads >> unbound (or even bound to the non-real-time set). In this >> case, I think cpuset_getid() should either return the default >> cpuset of all cpus in the system, or the last cpuset to >> which the process was bound. >> >> But regardless, I think binding a thread to a different >> processor set should be allowed and should override its >> inherent binding of the process' processor set. >> Hmm, I guess in this case, a subsequent binding of the >> process to a processor set should probably override any >> per-thread bindings. > > I think we're getting into complex corner cases here which will only confuse > the api and administrators. I don't expect administrators will want to set > groups to individual threads. How would he even identify the individual > thread? And if he did, he could just as easily set masks on that thread > along with others in the process. > > I'm already a little nervous about how complicated this will be for > programmers. If we allowed each thread in a pid to be in its own set, we'd > have to make cpuset_getid() return an array of ids. 
I definitely don't want > to do that. Solaris does seem to allow this BTW. -- DE From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 05:50:54 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 086EC16A47D; Tue, 26 Feb 2008 05:50:54 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 454E113C57C; Tue, 26 Feb 2008 05:50:52 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.107] (cpe-24-94-75-93.hawaii.res.rr.com [24.94.75.93]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id m1Q5ohLB041877; Tue, 26 Feb 2008 00:50:48 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Mon, 25 Feb 2008 19:52:22 -1000 (HST) From: Jeff Roberson X-X-Sender: jroberson@desktop To: Daniel Eischen In-Reply-To: Message-ID: <20080225194320.V920@desktop> References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> <20080225143222.B920@desktop> <20080225160433.P920@desktop> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Brooks Davis , Andrew Gallatin , Alfred Perlstein , arch@freebsd.org, Robert Watson , David Xu Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 05:50:54 -0000 
On Mon, 25 Feb 2008, Daniel Eischen wrote: > On Mon, 25 Feb 2008, Jeff Roberson wrote: > >> On Mon, 25 Feb 2008, Daniel Eischen wrote: >> >>> On Mon, 25 Feb 2008, Jeff Roberson wrote: >>> >>>> On Mon, 25 Feb 2008, Alfred Perlstein wrote: >>>> >>>>> Jeff, this is very cool. I do have one issue though: >>>>> >>>>> + * A thread may not be assigned to a a group seperate from other >>>>> threads in >>>>> + * the process. This is to remove ambiguity when the setid is queried >>>>> with >>>>> + * a pid argument. There is no other technical limitation. >>>>> >>>>> Am I understanding things correctly such that within a process there >>>>> can only be one "set"? >>>>> >>>>> If so this restricts some of the benifits you get with sets and >>>>> binding. >>>>> >>>>> An example would be some sort of system with multiple CPUs where some >>>>> are assigned specifically for pseudo-realtime processing and others are >>>>> for >>>>> general control things such as cli, stats, etc. >>>>> >>>>> In our case we would like to be able to run some threads on specific >>>>> cpu sets, and other threads to be run anywhere on the control CPUs. >>>>> >>>>> Can this be done with this API? >>>> >>>> Individual threads can be bound to any cpu or group of cpus within the >>>> set. So if you just make a set that includes all cpus in the system you >>>> can then bind your realtime threads to specific cpus and the other >>>> threads to the remainder. You will have to specifically bind each thread >>>> however. >>>> >>>> The reason individual threads can't be assigned to groups is because >>>> cpuset_getid() for a pid wouldn't make sense then and I expect >>>> administrators to be mostly interested in managing groups of processes. >>> >>> If the administrator sets up a set of CPUs specifically for >>> real-time and another set for non-real-time, you may want to >>> bind some threads to the real-time set, and leave other threads >>> unbound (or even bound to the non-real-time set). 
In this
>>> case, I think cpuset_getid() should either return the default
>>> cpuset of all cpus in the system, or the last cpuset to
>>> which the process was bound.
>>>
>>> But regardless, I think binding a thread to a different
>>> processor set should be allowed and should override its
>>> inherent binding of the process' processor set.
>>> Hmm, I guess in this case, a subsequent binding of the
>>> process to a processor set should probably override any
>>> per-thread bindings.
>>
>> I think we're getting into complex corner cases here which will only
>> confuse the api and administrators. I don't expect administrators will
>> want to set groups to individual threads. How would he even identify the
>> individual thread? And if he did, he could just as easily set masks on
>> that thread along with others in the process.
>>
>> I'm already a little nervous about how complicated this will be for
>> programmers. If we allowed each thread in a pid to be in its own set,
>> we'd have to make cpuset_getid() return an array of ids. I definitely
>> don't want to do that.
>
> Solaris does seem to allow this BTW.

Solaris also doesn't allow a processor to be in more than one set. It doesn't allow a thread to bind to a processor that's in a processor set. It also doesn't seem to provide a mechanism to query the set that a thread is in, so there is no ambiguity for the querying. However, when you modify the binding you have the option of retrieving the old set. They must simply return the first one discovered. We could do that but it doesn't seem very attractive.

Would people be in favor of binding threads to sets if it meant getting the id from a pid was not always 100% accurate, even though a thread may already restrict its set?

From the pset_assign man page:

"Processors with LWPs bound to them using processor_bind(2) cannot be assigned to a new processor set. If this is attempted, pset_assign() will fail and set errno to EBUSY."

My cpuset design seems to be a lot more flexible.
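To make the two-level arrangement discussed in this thread concrete (one set per process, plus a private per-thread mask restricted to that set), here is a toy model in C. It is only a sketch under stated assumptions: none of the names (toy_set, toy_bind_thread, toy_modify_set) come from cpuset.h, the masks are plain 64-bit integers, and the locking that complicates the real code is ignored. It encodes two behaviours described earlier: a thread may bind only to cpus within its set, and a set modification fails if it would leave any thread without a valid cpu.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAXTHREADS 8

/* Toy model only; every name here is illustrative, not from cpuset.h. */
struct toy_set {
    uint64_t mask;                     /* cpus the set allows */
    int      nthreads;                 /* threads belonging to the set */
    uint64_t tmask[TOY_MAXTHREADS];    /* private per-thread masks */
};

/* Create a set; each thread starts out allowed on every cpu in the set. */
static void
toy_set_init(struct toy_set *s, uint64_t mask, int nthreads)
{
    int i;

    assert(nthreads >= 0 && nthreads <= TOY_MAXTHREADS);
    s->mask = mask;
    s->nthreads = nthreads;
    for (i = 0; i < nthreads; i++)
        s->tmask[i] = mask;
}

/* Cpus thread i may actually run on: its private mask limited by the set. */
static uint64_t
toy_effective(const struct toy_set *s, int i)
{
    assert(i >= 0 && i < s->nthreads);
    return (s->mask & s->tmask[i]);
}

/* Bind thread i to a non-empty group of cpus lying within the set. */
static int
toy_bind_thread(struct toy_set *s, int i, uint64_t mask)
{
    assert(i >= 0 && i < s->nthreads);
    if (mask == 0 || (mask & ~s->mask) != 0)
        return (-1);
    s->tmask[i] = mask;
    return (0);
}

/* Change the set's cpus; refuse if any thread would have no cpu left. */
static int
toy_modify_set(struct toy_set *s, uint64_t newmask)
{
    int i;

    for (i = 0; i < s->nthreads; i++)
        if ((newmask & s->tmask[i]) == 0)
            return (-1);
    s->mask = newmask;
    return (0);
}
```

In this model, a set covering cpus 0-4 with one thread bound to cpus 0 and 2 can be shrunk to cpus 0-3, but an attempt to shrink it to cpus 3-4 is refused because the bound thread would have nowhere to run. A force option, as mentioned at the start of the thread, would presumably relax that last check.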
Cheers, Jeff > > -- > DE > From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 06:39:44 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C274016A404; Tue, 26 Feb 2008 06:39:44 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.netplex.net (mail.netplex.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 7520113C458; Tue, 26 Feb 2008 06:39:44 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.netplex.net (8.14.2/8.14.2/NETPLEX) with ESMTP id m1Q6dYHe002308; Tue, 26 Feb 2008 01:39:35 -0500 (EST) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.netplex.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-4.0 (mail.netplex.net [204.213.176.10]); Tue, 26 Feb 2008 01:39:35 -0500 (EST) Date: Tue, 26 Feb 2008 01:39:34 -0500 (EST) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Jeff Roberson In-Reply-To: <20080225194320.V920@desktop> Message-ID: References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> <20080225143222.B920@desktop> <20080225160433.P920@desktop> <20080225194320.V920@desktop> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Brooks Davis , Andrew Gallatin , Alfred Perlstein , arch@freebsd.org, Robert Watson , David Xu Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 06:39:44 -0000 On Mon, 25 Feb 2008, Jeff Roberson wrote: > On Mon, 25 Feb 2008, Daniel Eischen wrote: > >> On Mon, 25 Feb 2008, Jeff Roberson wrote: >> >>> On Mon, 25 Feb 2008, Daniel Eischen wrote: >>> >>>> On Mon, 25 Feb 2008, Jeff Roberson wrote: >>>> >>>>> On Mon, 25 Feb 2008, Alfred Perlstein wrote: >>>>> >>>>>> Jeff, this is very cool. I do have one issue though: >>>>>> >>>>>> + * A thread may not be assigned to a a group seperate from other >>>>>> threads in >>>>>> + * the process. This is to remove ambiguity when the setid is queried >>>>>> with >>>>>> + * a pid argument. There is no other technical limitation. >>>>>> >>>>>> Am I understanding things correctly such that within a process there >>>>>> can only be one "set"? >>>>>> >>>>>> If so this restricts some of the benifits you get with sets and >>>>>> binding. >>>>>> >>>>>> An example would be some sort of system with multiple CPUs where some >>>>>> are assigned specifically for pseudo-realtime processing and others are >>>>>> for >>>>>> general control things such as cli, stats, etc. >>>>>> >>>>>> In our case we would like to be able to run some threads on specific >>>>>> cpu sets, and other threads to be run anywhere on the control CPUs. >>>>>> >>>>>> Can this be done with this API? >>>>> >>>>> Individual threads can be bound to any cpu or group of cpus within the >>>>> set. So if you just make a set that includes all cpus in the system you >>>>> can then bind your realtime threads to specific cpus and the other >>>>> threads to the remainder. You will have to specifically bind each >>>>> thread however. >>>>> >>>>> The reason individual threads can't be assigned to groups is because >>>>> cpuset_getid() for a pid wouldn't make sense then and I expect >>>>> administrators to be mostly interested in managing groups of processes. 
>>>> >>>> If the administrator sets up a set of CPUs specifically for >>>> real-time and another set for non-real-time, you may want to >>>> bind some threads to the real-time set, and leave other threads >>>> unbound (or even bound to the non-real-time set). In this >>>> case, I think cpuset_getid() should either return the default >>>> cpuset of all cpus in the system, or the last cpuset to >>>> which the process was bound. >>>> >>>> But regardless, I think binding a thread to a different >>>> processor set should be allowed and should override its >>>> inherent binding of the process' processor set. >>>> Hmm, I guess in this case, a subsequent binding of the >>>> process to a processor set should probably override any >>>> per-thread bindings. >>> >>> I think we're getting into complex corner cases here which will only >>> confuse the api and administrators. I don't expect administrators will >>> want to set groups to individual threads. How would he even identify the >>> individual thread? And if he did, he could just as easily set masks on >>> that thread along with others in the process. >>> >>> I'm already a little nervous about how complicated this will be for >>> programmers. If we allowed each thread in a pid to be in its own set, >>> we'd have to make cpuset_getid() return an array of ids. I definitely >>> don't want to do that. >> >> Solaris does seem to allow this BTW. > > Solaris also doesn't allow a processor to be in more than one set. It > doesn't allow a thread to bind to a processor that's in a processor set. It > also doesn't seem provide a mechanism to query the set that a thread is in, > so there is no ambiguity for the querying. However, when you modify you have > the option of retrieving the old set. They must simply return the first one > discovered. We could do that but it doesn't seem very attractive. Probably, we should just return the last processor set that was bound to the process (using the default processor set if there was none). 
I would disregard any LWP/thread-specific bindings when returning the processor set for the process. Everyone should know by now that there are threads to consider, and that if they want more specific information they should query the processor bindings for each thread as well as for the process. The Solaris man page for pset_bind does say that it binds all LWPs of the process when the argument is the PID. That seems to indicate that it will override any LWP-specific bindings. > Would people be in favor of binding threads to sets if it meant getting the > id from a pid was not always 100% accurate? Even though a thread may already > restrict its set? I think it would be accurate if we really returned the set for the process, disregarding thread-specific bindings. As long as it is worded correctly, I don't think it would be wrong. The only ambiguity might be if there was no explicit per-process binding, but there were thread-specific bindings. Even in this case though, if you returned the default cpuset, I think it would still be accurate. > From the pset_assign man page: > > "Processors with LWPs bound to them using processor_bind(2) cannot be > assigned to a new processor set. If this is attempted, pset_assign() will > fail and set errno to EBUSY." > > My cpuset design seems to be a lot more flexible. I think it is because older Solaris had only specific processor bindings. Newer versions of Solaris added processor sets. 
I don't think we would want this restriction :-) -- DE From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 07:37:38 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8E05B16A407; Tue, 26 Feb 2008 07:37:38 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 49B6413C442; Tue, 26 Feb 2008 07:37:38 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.107] (cpe-24-94-75-93.hawaii.res.rr.com [24.94.75.93]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id m1Q7bWBF049477; Tue, 26 Feb 2008 02:37:33 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Mon, 25 Feb 2008 21:39:12 -1000 (HST) From: Jeff Roberson X-X-Sender: jroberson@desktop To: Daniel Eischen In-Reply-To: Message-ID: <20080225213434.L920@desktop> References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> <20080225143222.B920@desktop> <20080225160433.P920@desktop> <20080225194320.V920@desktop> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Brooks Davis , Andrew Gallatin , Alfred Perlstein , arch@freebsd.org, Robert Watson , David Xu Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 07:37:38 
-0000 On Tue, 26 Feb 2008, Daniel Eischen wrote: > On Mon, 25 Feb 2008, Jeff Roberson wrote: > >> On Mon, 25 Feb 2008, Daniel Eischen wrote: >> >>> On Mon, 25 Feb 2008, Jeff Roberson wrote: >>> >>>> On Mon, 25 Feb 2008, Daniel Eischen wrote: >>>> >>>>> On Mon, 25 Feb 2008, Jeff Roberson wrote: >>>>> >>>>>> On Mon, 25 Feb 2008, Alfred Perlstein wrote: >>>>>> >>>>>>> Jeff, this is very cool. I do have one issue though: >>>>>>> >>>>>>> + * A thread may not be assigned to a a group seperate from other >>>>>>> threads in >>>>>>> + * the process. This is to remove ambiguity when the setid is >>>>>>> queried with >>>>>>> + * a pid argument. There is no other technical limitation. >>>>>>> >>>>>>> Am I understanding things correctly such that within a process there >>>>>>> can only be one "set"? >>>>>>> >>>>>>> If so this restricts some of the benifits you get with sets and >>>>>>> binding. >>>>>>> >>>>>>> An example would be some sort of system with multiple CPUs where some >>>>>>> are assigned specifically for pseudo-realtime processing and others >>>>>>> are for >>>>>>> general control things such as cli, stats, etc. >>>>>>> >>>>>>> In our case we would like to be able to run some threads on specific >>>>>>> cpu sets, and other threads to be run anywhere on the control CPUs. >>>>>>> >>>>>>> Can this be done with this API? >>>>>> >>>>>> Individual threads can be bound to any cpu or group of cpus within the >>>>>> set. So if you just make a set that includes all cpus in the system you >>>>>> can then bind your realtime threads to specific cpus and the other >>>>>> threads to the remainder. You will have to specifically bind each >>>>>> thread however. >>>>>> >>>>>> The reason individual threads can't be assigned to groups is because >>>>>> cpuset_getid() for a pid wouldn't make sense then and I expect >>>>>> administrators to be mostly interested in managing groups of processes. 
>>>>> >>>>> If the administrator sets up a set of CPUs specifically for >>>>> real-time and another set for non-real-time, you may want to >>>>> bind some threads to the real-time set, and leave other threads >>>>> unbound (or even bound to the non-real-time set). In this >>>>> case, I think cpuset_getid() should either return the default >>>>> cpuset of all cpus in the system, or the last cpuset to >>>>> which the process was bound. >>>>> >>>>> But regardless, I think binding a thread to a different >>>>> processor set should be allowed and should override its >>>>> inherent binding of the process' processor set. >>>>> Hmm, I guess in this case, a subsequent binding of the >>>>> process to a processor set should probably override any >>>>> per-thread bindings. >>>> >>>> I think we're getting into complex corner cases here which will only >>>> confuse the api and administrators. I don't expect administrators will >>>> want to set groups to individual threads. How would he even identify the >>>> individual thread? And if he did, he could just as easily set masks on >>>> that thread along with others in the process. >>>> >>>> I'm already a little nervous about how complicated this will be for >>>> programmers. If we allowed each thread in a pid to be in its own set, >>>> we'd have to make cpuset_getid() return an array of ids. I definitely >>>> don't want to do that. >>> >>> Solaris does seem to allow this BTW. >> >> Solaris also doesn't allow a processor to be in more than one set. It >> doesn't allow a thread to bind to a processor that's in a processor set. >> It also doesn't seem provide a mechanism to query the set that a thread is >> in, so there is no ambiguity for the querying. However, when you modify >> you have the option of retrieving the old set. They must simply return the >> first one discovered. We could do that but it doesn't seem very >> attractive. 
> > Probably, we should just return the last processor set that > was bound to the process (using the default processor set if > there was none). I would disregard any LWP/thread-specific > bindings when returning the processor set for the process. > Everyone should know by know that there are threads to > consider, and if they want more specific information to > query the processor bindings for each thread as well as > for the process. Binding a processor set to the process simply sets the per-thread binding of each thread in the process. There is otherwise no specific process binding. We could keep a pointer to the last specifically bound set in the process if we wanted, but what would it be used for other than querying the id of the process? What if each thread were separately and specifically bound to a different set? What set should be used on fork? The set of the process, or that of the thread that called fork? What about when creating a new thread? > > The Solaris man page for pset_bind does say that it binds > all LWPs of the process when the argument is the PID. That > seems to indicate that it will override any LWP-specific > bindings. Yes, same with the current cpuset design. > >> Would people be in favor of binding threads to sets if it meant getting the >> id from a pid was not always 100% accurate? Even though a thread may >> already restrict its set? > > I think it would be accurate if we really returned the set > for the process, disregarding thread-specific bindings. As > long as it is worded correctly, I don't think it would be > wrong. The only ambiguity might be if there was no explicit > per-process binding, but there was thread-specific bindings. > Even in this case though, if you returned the default cpuset, > I think it would still be accurate. See above discussion. I'm not sure what you mean by 'default' cpuset here. 
> >> From the pset_assign man page: >> >> "Processors with LWPs bound to them using processor_bind(2) cannot be >> assigned to a new processor set. If this is attempted, pset_assign() will >> fail and set errno to EBUSY." >> >> My cpuset design seems to be a lot more flexible. > > I think it is because that older Solaris had only specific > processor bindings. Newer versions of Solaris added processor > sets. I don't think we would want this restriction :-) Yeah, they started with the simplest interface and then kept adding more complex and incompatible interfaces. They now have pools, sets, and binding, none of which are compatible with each other. In fact, enabling pools disables sets. And specific binding precludes both. I also looked at the Linux implementation. It uses a filesystem to store and manipulate set information. It also seems to allow arbitrary binding and sets as we have; however, the distributed version is said not to allow modifying the set while live. They call this migration. I think the filesystem interface is inelegant, but it's similar in features to our cpuset. 
Jeff > > -- > DE > From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 15:39:06 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EEEF5106568C; Tue, 26 Feb 2008 15:39:06 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.netplex.net (mail.netplex.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 91EB013C448; Tue, 26 Feb 2008 15:39:06 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.netplex.net (8.14.2/8.14.2/NETPLEX) with ESMTP id m1QFcc9A028786; Tue, 26 Feb 2008 10:38:38 -0500 (EST) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.netplex.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-4.0 (mail.netplex.net [204.213.176.10]); Tue, 26 Feb 2008 10:38:38 -0500 (EST) Date: Tue, 26 Feb 2008 10:38:38 -0500 (EST) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Jeff Roberson In-Reply-To: <20080225213434.L920@desktop> Message-ID: References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> <20080225143222.B920@desktop> <20080225160433.P920@desktop> <20080225194320.V920@desktop> <20080225213434.L920@desktop> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Brooks Davis , Andrew Gallatin , Alfred Perlstein , arch@freebsd.org, Robert Watson , David Xu Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related 
to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 15:39:07 -0000 On Mon, 25 Feb 2008, Jeff Roberson wrote: > Binding a processor set to the process simply sets the per-thread binding of > each thread in the process. There is otherwise no specific process binding. > We could keep a pointer to the last specifically bound set in the process if > we wanted, but what would it be used for other than querying the id of the > process? What if each thread was seperately specifically bound to a > different set? What set should be used on fork? The set of the process or > the thread that called fork? What about when creating a new thread? The set used on fork should be the set of the calling thread, same concept as signal masks I would think. Same thing when creating a new thread. I guess I'd check how Linux and Solaris do it, see if they are consistent. I can see how you might _not_ want to inherit bindings in a created thread. For a process with real-time threads, the application might start with superuser privileges, create some threads with real-time priority and set their bindings, then setuid() to remove superuser privileges. Is a privilege check made in a newly created thread when applying inherited bindings? > See above discussion. I'm not sure what you mean by 'default' cpuset here. I imagine the 'default' cpuset as the system's default cpuset, in lieu of any administratively created cpusets and bindings for the process (inherited or explicit). 
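The two rules discussed in this exchange can be modelled in a few lines of C: binding a process rewrites the binding of every thread in it, and a new thread or forked child starts with its creator's binding. This is only a sketch under stated assumptions. The type and function names (model_mask_t, model_thread, model_bind_process, model_inherit) are hypothetical and are not taken from the patch, and a 32-bit word stands in for whatever mask type the kernel actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; these are NOT the structures from the patch. */
typedef uint32_t model_mask_t;		/* bit n set => CPU n is allowed */

struct model_thread {
	model_mask_t mask;		/* per-thread binding */
};

/*
 * Binding a process has no state of its own in this model: it simply
 * overwrites the binding of every thread, replacing any earlier
 * per-thread bindings.
 */
void
model_bind_process(struct model_thread *threads, size_t nthreads,
    model_mask_t mask)
{
	assert(threads != NULL || nthreads == 0);
	for (size_t i = 0; i < nthreads; i++)
		threads[i].mask = mask;
}

/*
 * A thread created by fork or by thread creation starts with a copy of
 * the binding of the thread that created it, much like a signal mask.
 */
struct model_thread
model_inherit(const struct model_thread *creator)
{
	struct model_thread child;

	assert(creator != NULL);
	child.mask = creator->mask;
	return (child);
}
```

In this model a child created from a thread bound to CPUs 2-3 keeps that binding even if the parent process is later rebound to CPUs 0-3, because the copy is taken at creation time.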
-- DE From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 19:24:01 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3B2121065672 for ; Tue, 26 Feb 2008 19:24:01 +0000 (UTC) (envelope-from nevans@talkpoint.com) Received: from mailbox.talkpoint.com (mailbox.talkpoint.com [204.141.15.162]) by mx1.freebsd.org (Postfix) with ESMTP id D7DCA13C4F0 for ; Tue, 26 Feb 2008 19:24:00 +0000 (UTC) (envelope-from nevans@talkpoint.com) Received: from localhost (localhost [127.0.0.1]) by mailbox.talkpoint.com (Postfix) with ESMTP id 087D82C50023; Tue, 26 Feb 2008 13:54:41 -0500 (EST) X-Virus-Scanned: amavisd-new at X-Spam-Flag: NO X-Spam-Score: -3.829 X-Spam-Level: X-Spam-Status: No, score=-3.829 tagged_above=-10 required=3 tests=[ALL_TRUSTED=-1.8, AWL=0.570, BAYES_00=-2.599] Received: from mailbox.talkpoint.com ([127.0.0.1]) by localhost (mailbox.talkpoint.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id SvJqz8WXHnRi; Tue, 26 Feb 2008 13:54:40 -0500 (EST) Received: from pleiades.nextvenue.com (pleiades.nextvenue.com [204.141.15.194]) by mailbox.talkpoint.com (Postfix) with ESMTP id 2B1E22C50022; Tue, 26 Feb 2008 13:54:40 -0500 (EST) Date: Tue, 26 Feb 2008 13:54:39 -0500 From: Nick Evans To: Jeff Roberson Message-ID: <20080226135439.27db7400@pleiades.nextvenue.com> In-Reply-To: <20080224001902.J920@desktop> References: <20080220105333.G44565@fledge.watson.org> <47BCEFDB.5040207@freebsd.org> <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> X-Mailer: Claws Mail 3.3.0 (GTK+ 2.12.3; i386-portbld-freebsd7.0) Mime-Version: 1.0 Content-Type: text/plain; 
charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 19:24:01 -0000 On Sun, 24 Feb 2008 00:40:31 -1000 (HST) Jeff Roberson wrote: > Please see: > http://people.freebsd.org/~jeff/cpuset.diff > > This is unfortunately intertwined with ULE's new CPU selection algorithm > so that code is in the patch as well. Otherwise, this includes a simple, > ugly userland tool called cpuset and all of the kernel support required. > I have tested this by creating sets and subsets and modifying their cpu > masks under load. I'm able to dynamically reprovision without issue. > > This doesn't have support for jails but the infrastructure is there. It > also fails to modify sets if it would leave threads without a valid cpu > to run on. I have not implemented a force option but it will be trivial > to do so. The initial cpu set is also created before we know all_cpus so > it's faked up with all cpus set for now. > > I mostly want people to look at the interface in cpuset.h and make sure > they agree with it before I start polishing to commit. I'm fairly happy > with the way the syscall api looks now. The code itself ended up being > much more complicated than I'd hoped due to locking considerations. Try > not to look at cpuset_setproc() ;). > > If you want to actually try the patch, here's a couple of neat things to > do with cpuset: > > cpuset -l 0-4 /bin/sh > > This creates a new group with a list (-l) of cpus 0-4 inclusive and runs > sh in it. > > cpuset -g -p > > This will get (-g) the mask of cpus pid (-p) is allowed to run on. > > cpuset -l 0,2 -p > > This will restrict sh to running on cpus 0, 2 while its group is still > allowed 0-4. 
> > cpuset -l 0,2 -c -p > > This will modify the cpuset (-c) that the sh belongs to. > > cpuset -l 0-3 -s 1 > > This will modify the set (-s) that all threads are in by default to > contain the first 4 cpus leaving the rest idled. > > Feedback is appreciated. > Jeff, Is it currently, or will it eventually be possible to assign network threads to different cores? Everything appears to be driven by pid, but at least according to top all interrupt "processes" show as pid 12. Also, if kern.sched.topology returns 0 is it safe to assume I'm not getting the benefit of the topology distinction between packages vs cores? Nick From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 20:24:39 2008 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6CF0F1065671 for ; Tue, 26 Feb 2008 20:24:39 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from speedfactory.net (mail.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id E82E713C4D1 for ; Tue, 26 Feb 2008 20:24:38 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8s) with ESMTP id 233492774-1834499 for ; Tue, 26 Feb 2008 15:24:13 -0500 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.14.2/8.14.2) with ESMTP id m1QKOW69022782 for ; Tue, 26 Feb 2008 15:24:32 -0500 (EST) (envelope-from jhb@FreeBSD.org) From: John Baldwin To: arch@FreeBSD.org Date: Tue, 26 Feb 2008 15:24:30 -0500 User-Agent: KMail/1.9.7 MIME-Version: 1.0 Content-Disposition: inline Message-Id: <200802261524.30384.jhb@FreeBSD.org> Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Tue, 26 Feb 2008 15:24:32 -0500 (EST) X-Virus-Scanned: 
ClamAV 0.91.2/6003/Tue Feb 26 06:34:31 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: Subject: Cleaning up FILE in stdio.. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 20:24:39 -0000 Way back in the 4.x days we had a fiasco over changing the size of FILE (struct __sFILE) to add locking for multithreaded apps because the 'stdin', 'stdout', and 'stderr' symbols were direct references to the global array of FILE objects. The first fix was to move the locking fields into a private 'struct __sFILEX' to preserve the size of FILE. Later the stdin/out/err symbols were fixed to reference standalone pointers instead of the global array. Given that, I think at this point we can safely merge __sFILEX back into __sFILE w/o breaking anything. This is assuming that the contents and layout of FILE are not a public ABI (i.e. we malloc the things internally and consumers should just treat the pointer value as a cookie and not grub around in the internals). In addition to removing the __sFILEX stuff, I'd like to change the fd member of FILE to be an int so you can open more than 32k files via fopen(). Otherwise, if fopen() gets an fd that is > SHRT_MAX, it gets sign extended when the fd is passed to read(), close(), etc. and those calls fail with EBADF. Comments? 
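A minimal sketch of the truncation described above, assuming a 16-bit short and a compiler that wraps out-of-range conversions to signed types (the result is implementation-defined in C, but GCC and Clang both wrap). The struct and function names here are hypothetical stand-ins and do not come from libc; only the idea of a short versus an int descriptor slot is taken from the message.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for the descriptor slot inside FILE. */
struct narrow_file {
	short fd;		/* the current, 16-bit wide slot */
};

struct wide_file {
	int fd;			/* the proposed slot */
};

/*
 * Store a descriptor in the narrow slot and read it back the way a
 * later read() or close() call would see it, promoted to int.
 */
int
store_fd_short(int fd)
{
	struct narrow_file f;

	assert(fd >= 0);	/* valid descriptors are non-negative */
	f.fd = (short)fd;	/* values above SHRT_MAX do not survive */
	return (f.fd);		/* sign-extended on promotion to int */
}

/* The same round trip through an int slot is lossless. */
int
store_fd_int(int fd)
{
	struct wide_file f;

	assert(fd >= 0);
	f.fd = fd;
	return (f.fd);
}

/* Nonzero if the descriptor can be held in the narrow slot unchanged. */
int
fd_fits_in_short(int fd)
{
	return (fd >= 0 && fd <= SHRT_MAX);
}
```

With a 16-bit short, a descriptor such as 40000 typically comes back from store_fd_short() as a negative number, which is why the subsequent system call would see an invalid descriptor.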
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 20:32:35 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4A8AD106567B for ; Tue, 26 Feb 2008 20:32:35 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.netplex.net (mail.netplex.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 068F713C45B for ; Tue, 26 Feb 2008 20:32:34 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.netplex.net (8.14.2/8.14.2/NETPLEX) with ESMTP id m1QKWXUr006784; Tue, 26 Feb 2008 15:32:33 -0500 (EST) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.netplex.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-4.0 (mail.netplex.net [204.213.176.10]); Tue, 26 Feb 2008 15:32:33 -0500 (EST) Date: Tue, 26 Feb 2008 15:32:33 -0500 (EST) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: John Baldwin In-Reply-To: <200802261524.30384.jhb@FreeBSD.org> Message-ID: References: <200802261524.30384.jhb@FreeBSD.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 20:32:35 -0000 On Tue, 26 Feb 2008, John Baldwin wrote: > Way back in the 4.x days we had a fiasco over changing the size of FILE > (struct __sFILE) to add locking for multithreaded apps because > the 'stdin', 'stdout', and 'stderr' symbols were direct references to the > global array of FILE objects. The first fix was to move the locking fields > into a private 'struct __sFILEX' to preserve the size of FILE. 
Later the > stdin/out/err symbols were fixed to reference standalone pointers instead of > the global array. Given that, I think at this point we can safely merge > __sFILEX back into __sFILE w/o breaking anything. This is assuming that the > contents and layout of FILE are not a public ABI (i.e. we malloc the things > internally and consumers should just treat the pointer value as a cookie and > not grub around in the internals). In addition to removing the __sFILEX > stuff, I'd like to change the fd member of FILE to be an int so you can open > more than 32k files via fopen(). Otherwise, if fopen() gets an fd that is > > SHORT_MAX, it gets sign extended when the fd is passed to read(), close(), > etc. and those calls fail with EBADF. > > Comments? Try it and see if anything breaks? -- DE From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 22:11:02 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 37A421065671 for ; Tue, 26 Feb 2008 22:11:02 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 07A6D13C4E7 for ; Tue, 26 Feb 2008 22:11:01 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.107] (cpe-24-94-75-93.hawaii.res.rr.com [24.94.75.93]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id m1QMApKV049274; Tue, 26 Feb 2008 17:10:56 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Tue, 26 Feb 2008 12:12:34 -1000 (HST) From: Jeff Roberson X-X-Sender: jroberson@desktop To: Nick Evans In-Reply-To: <20080226135439.27db7400@pleiades.nextvenue.com> Message-ID: <20080226120956.V920@desktop> References: <20080220105333.G44565@fledge.watson.org> <47BCEFDB.5040207@freebsd.org> <20080220175532.Q920@desktop> <20080220213253.A920@desktop> 
<20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080226135439.27db7400@pleiades.nextvenue.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: arch@freebsd.org Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 22:11:02 -0000 On Tue, 26 Feb 2008, Nick Evans wrote: > On Sun, 24 Feb 2008 00:40:31 -1000 (HST) > Jeff Roberson wrote: > >> Please see: >> http://people.freebsd.org/~jeff/cpuset.diff >> >> This is unfortunately intertwined with ULE's new CPU selection algorithm >> so that code is in the patch as well. Otherwise, this includes a simple, >> ugly userland tool called cpuset and all of the kernel support required. >> I have tested this by creating sets and subsets and modifying their cpu >> masks under load. I'm able to dynamically reprovision without issue. >> >> This doesn't have support for jails but the infrastructure is there. It >> also fails to modify sets if it would leave threads without a valid cpu >> to run on. I have not implemented a force option but it will be trivial >> to do so. The initial cpu set is also created before we know all_cpus so >> it's faked up with all cpus set for now. >> >> I mostly want people to look at the interface in cpuset.h and make sure >> they agree with it before I start polishing to commit. I'm fairly happy >> with the way the syscall api looks now. The code itself ended up being >> much more complicated than I'd hoped due to locking considerations. Try >> not to look at cpuset_setproc() ;). 
>> >> If you want to actually try the patch, here's a couple of neat things to >> do with cpuset: >> >> cpuset -l 0-4 /bin/sh >> >> This creates a new group with a list (-l) of cpus 0-4 inclusive and runs >> sh in it. >> >> cpuset -g -p >> >> This will get (-g) the mask of cpus pid (-p) is allowed to run on. >> >> cpuset -l 0,2 -p >> >> This will restrict sh to running on cpus 0, 2 while its group is still >> allowed 0-4. >> >> cpuset -l 0,2 -c -p >> >> This will modify the cpuset (-c) that the sh belongs to. >> >> cpuset -l 0-3 -s 1 >> >> This will modify the set (-s) that all threads are in by default to >> contain the first 4 cpus leaving the rest idled. >> >> Feedback is appreciated. >> > > Jeff, > > Is it currently, or will it eventually be possible to assign network threads > to different cores? Everything appears to be driven by pid, but at least > according to top all interrupt "processes" show as pid 12. Also, if > kern.sched.topology returns 0 is it safe to assume I'm not getting the > benefit of the topology distinction between packages vs cores? I forgot to remove the topology sysctl. It is meaningless now. If your machine prints something like this on boot: Cores per package: 2 then the scheduler is aware of the layout. As for binding interrupt/kernel threads, you need to get the tid. Then you can set a mask. Er, except I guess I didn't make a -t tid argument for cpuset yet. I'll add that to the next patch. 
Thanks, Jeff > > Nick > From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 22:14:26 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AFE9C1065674; Tue, 26 Feb 2008 22:14:26 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 4FA8613C45A; Tue, 26 Feb 2008 22:14:26 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.107] (cpe-24-94-75-93.hawaii.res.rr.com [24.94.75.93]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id m1QMENa0050294; Tue, 26 Feb 2008 17:14:24 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Tue, 26 Feb 2008 12:16:06 -1000 (HST) From: Jeff Roberson X-X-Sender: jroberson@desktop To: Daniel Eischen In-Reply-To: Message-ID: <20080226121251.V920@desktop> References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> <20080225143222.B920@desktop> <20080225160433.P920@desktop> <20080225194320.V920@desktop> <20080225213434.L920@desktop> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Brooks Davis , Andrew Gallatin , Alfred Perlstein , arch@freebsd.org, Robert Watson , David Xu Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 22:14:26 
-0000 On Tue, 26 Feb 2008, Daniel Eischen wrote: > On Mon, 25 Feb 2008, Jeff Roberson wrote: > >> Binding a processor set to the process simply sets the per-thread binding >> of each thread in the process. There is otherwise no specific process >> binding. We could keep a pointer to the last specifically bound set in the >> process if we wanted, but what would it be used for other than querying the >> id of the process? What if each thread was separately specifically bound >> to a different set? What set should be used on fork? The set of the >> process or the thread that called fork? What about when creating a new >> thread? > > The set used on fork should be the set of the calling thread, > same concept as signal masks I would think. Same thing when > creating a new thread. I guess I'd check how Linux and Solaris > do it, see if they are consistent. Yes, that's what I do now. The mask is inherited from the creator. I was just pointing out that it gets a little ambiguous if we were to have some notion of a per-process set. > > I can see how you might _not_ want to inherit bindings in a > created thread. For a process with real-time threads, the > application might start with superuser privileges, create some > threads with real-time priority and set their bindings, then > setuid() to remove superuser privileges. Is a privilege check > made in a newly created thread when applying inherited bindings? There is no privilege check on fork. Adding one would create weird failure modes. > >> See above discussion. I'm not sure what you mean by 'default' cpuset here. > > I imagine the 'default' cpuset as the system's default cpuset, > in lieu of any administratively created cpusets and bindings > for the process (inherited or explicit). My opinion is that if we decide that it's important to assign numbered sets to tids, we then need to allow cpuset_getid to return multiple ids for WHICH_PID.
Jeff > > -- > DE > From owner-freebsd-arch@FreeBSD.ORG Tue Feb 26 23:27:49 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CF3CB1065671; Tue, 26 Feb 2008 23:27:49 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (hergotha.csail.mit.edu [66.92.79.170]) by mx1.freebsd.org (Postfix) with ESMTP id 84DD713C428; Tue, 26 Feb 2008 23:27:49 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.13.8/8.13.8) with ESMTP id m1QMp70l021710; Tue, 26 Feb 2008 17:51:07 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.13.8/8.13.8/Submit) id m1QMp7bV021709; Tue, 26 Feb 2008 17:51:07 -0500 (EST) (envelope-from wollman) Date: Tue, 26 Feb 2008 17:51:07 -0500 (EST) From: Garrett Wollman Message-Id: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> To: jhb@freebsd.org In-Reply-To: <200802261524.30384.jhb@FreeBSD.org> Organization: None X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (hergotha.csail.mit.edu [127.0.0.1]); Tue, 26 Feb 2008 17:51:07 -0500 (EST) X-Spam-Status: No, score=-1.4 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.2.3 X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on hergotha.csail.mit.edu Cc: arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2008 23:27:49 -0000 In article <200802261524.30384.jhb@FreeBSD.org> you write: >This is assuming that the contents and layout of FILE are not a >public ABI (i.e. 
we malloc the things internally and consumers should >just treat the pointer value as a cookie and not grub around in the >internals). Most interpreted languages grub around in the internals, as (historically) do a number of macros. Historically Emacs did so as well (I suppose you can call it an interpreted language). >Comments? I think you have the right idea but this will break the ABI in a way that can't be fudged with symbol versioning. -GAWollman -- Garrett A. Wollman | The real tragedy of human existence is not that we are wollman@csail.mit.edu| nasty by nature, but that a cruel structural asymmetry Opinions not those | grants to rare events of meanness such power to shape of MIT or CSAIL. | our history. - S.J. Gould, Ten Thousand Acts of Kindness From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 04:47:11 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 481771065671; Wed, 27 Feb 2008 04:47:11 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.netplex.net (mail.netplex.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id E8AA113C45D; Wed, 27 Feb 2008 04:47:05 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.netplex.net (8.14.2/8.14.2/NETPLEX) with ESMTP id m1R4kr77019374; Tue, 26 Feb 2008 23:46:53 -0500 (EST) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.netplex.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-4.0 (mail.netplex.net [204.213.176.10]); Tue, 26 Feb 2008 23:46:54 -0500 (EST) Date: Tue, 26 Feb 2008 23:46:53 -0500 (EST) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Jeff Roberson In-Reply-To: <20080226121251.V920@desktop> Message-ID: References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> 
<20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> <20080225143222.B920@desktop> <20080225160433.P920@desktop> <20080225194320.V920@desktop> <20080225213434.L920@desktop> <20080226121251.V920@desktop> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Brooks Davis , Andrew Gallatin , Alfred Perlstein , arch@freebsd.org, Robert Watson , David Xu Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 04:47:11 -0000 On Tue, 26 Feb 2008, Jeff Roberson wrote: > > On Tue, 26 Feb 2008, Daniel Eischen wrote: > >> On Mon, 25 Feb 2008, Jeff Roberson wrote: >> >>> See above discussion. I'm not sure what you mean by 'default' cpuset >>> here. >> >> I imagine the 'default' cpuset as the system's default cpuset, >> in lieu of any administratively created cpusets and bindings >> for the process (inherited or explicit). > > My opinion is that if we decide that it's important to assign numbered sets > to tids we need then to allow cpuset_getid to return multiple ids for > WHICH_PID. Maybe there shouldn't be WHICH_PID. Perhaps it should be called WHICH_ALLTIDS. Then it might appear more expected if cpuset_getid(WHICH_ALLTIDS, ...) returned multiple cpusets. 
I realize this is just playing with words, and I do prefer WHICH_PID :-) -- DE From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 04:58:57 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 61D09106566B; Wed, 27 Feb 2008 04:58:57 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.netplex.net (mail.netplex.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 194F913C458; Wed, 27 Feb 2008 04:58:56 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.netplex.net (8.14.2/8.14.2/NETPLEX) with ESMTP id m1R4wtf5023235; Tue, 26 Feb 2008 23:58:55 -0500 (EST) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.netplex.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-4.0 (mail.netplex.net [204.213.176.10]); Tue, 26 Feb 2008 23:58:55 -0500 (EST) Date: Tue, 26 Feb 2008 23:58:55 -0500 (EST) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Garrett Wollman In-Reply-To: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> Message-ID: References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 04:58:57 -0000 On Tue, 26 Feb 2008, Garrett Wollman wrote: > In article <200802261524.30384.jhb@FreeBSD.org> you write: > >> This is assuming that the contents and layout of FILE are not a >> public ABI (i.e. 
we malloc the things internally and consumers should >> just treat the pointer value as a cookie and not grub around in the >> internals). > > Most interpreted languages grub around in the internals, as > (historically) do a number of macros. Historically Emacs > did so as well (I suppose you can call it an interpreted language). Yech. I also forgot about the macros in <stdio.h>, like __sfeof() and friends. >> Comments? > > I think you have the right idea but this will break the ABI in a way > that can't be fudged with symbol versioning. Well, you can if you add compat symbols for all functions with FILE as an argument. I'd like us to say that the innards of FILE are not part of our ABI, but we would still have to do something to keep at least enough of the ABI for any macros. -- DE From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 05:01:30 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B9D3F1065672 for ; Wed, 27 Feb 2008 05:01:30 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id 4DB8013C447 for ; Wed, 27 Feb 2008 05:01:29 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8s) with ESMTP id 233533631-1834499 for multiple; Tue, 26 Feb 2008 23:59:40 -0500 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.14.2/8.14.2) with ESMTP id m1R51Gq1026417; Wed, 27 Feb 2008 00:01:17 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: Garrett Wollman Date: Tue, 26 Feb 2008 23:55:16 -0500 User-Agent: KMail/1.9.7 References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> In-Reply-To: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> MIME-Version: 1.0 Content-Type: text/plain;
charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200802262355.16519.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Wed, 27 Feb 2008 00:01:17 -0500 (EST) X-Virus-Scanned: ClamAV 0.91.2/6006/Tue Feb 26 20:03:40 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 05:01:30 -0000 On Tuesday 26 February 2008 05:51:07 pm Garrett Wollman wrote: > In article <200802261524.30384.jhb@FreeBSD.org> you write: > > >This is assuming that the contents and layout of FILE are not a > >public ABI (i.e. we malloc the things internally and consumers should > >just treat the pointer value as a cookie and not grub around in the > >internals). > > Most interpreted languages grub around in the internals, as > (historically) do a number of macros. Historically Emacs > did so as well (I suppose you can call it an interpreted language). > > >Comments? > > I think you have the right idea but this will break the ABI in a way > that can't be fudged with symbol versioning. Yes, I discovered the macros today while working on my fd as short problem. In actual fact, I bet the software that does this is rare outside of our stdio.h since glibc doesn't expose its FILE struct and doesn't inline operations like fileno(). 
Axeing __sFILEX should be safe as it doesn't affect any of the members that we access via inline macros and would merely restore one member where 'extra' currently lives and add the locking stuff to the end. However, I can't fix the fact that our stdio can't handle fd's > SHRT_MAX (again, glibc handles this just fine) w/o making a royal mess. We could create a new versioned FILE struct (so long as we can recognize the existing FILE struct somehow) and have new fopen()/fdopen()/freopen() symbols that return the new struct but then all the stdio routines would have to check to see if the structure was an old structure explicitly and handle it appropriately if so. Rather gross. What I've gone with instead to fix the SHRT_MAX problem is to change fopen/fdopen/freopen to fail to use fd's > SHRT_MAX with an error. This fixes the problem we are seeing at work where once an app has 32k open file descriptors, calls to gethostbyname(3) leak a file descriptor of /etc/hosts because the 'files' impl of gethostbyname(3) fopen()s /etc/hosts but the subsequent fclose() (as well as the fread()s to parse it) fail because the fd is sign-extended when the short value is converted to an int to be passed to read(2) and close(2). With this change fopen() now fails instead of fread() and fclose() and it no longer leaks file descriptors. --- //depot/vendor/freebsd_6/src/lib/libc/stdio/fdopen.c 2004/10/22 16:37:59 +++ //depot/yahoo/ybsd_6/src/lib/libc/stdio/fdopen.c 2008/02/26 20:27:21 @@ -61,6 +61,18 @@ if (nofile == 0) nofile = getdtablesize(); + /* + * File descriptors are a full int, but _file is only a short. + * If we get a valid file descriptor that is greater than + * SHRT_MAX, then the fd will get sign-extended into an + * invalid file descriptor. Handle this case by failing the + * open. 
+ */ + if (fd > SHRT_MAX) { + errno = EINVAL; + return (NULL); + } + if ((flags = __sflags(mode, &oflags)) == 0) return (NULL); --- //depot/vendor/freebsd_6/src/lib/libc/stdio/fopen.c 2004/10/22 16:37:59 +++ //depot/yahoo/ybsd_6/src/lib/libc/stdio/fopen.c 2008/02/26 20:33:26 @@ -44,6 +44,7 @@ #include #include #include +#include #include #include #include "un-namespace.h" @@ -67,6 +68,18 @@ fp->_flags = 0; /* release */ return (NULL); } + /* + * File descriptors are a full int, but _file is only a short. + * If we get a valid file descriptor that is greater than + * SHRT_MAX, then the fd will get sign-extended into an + * invalid file descriptor. Handle this case by failing the + * open. + */ + if (f > SHRT_MAX) { + _close(f); + errno = ENOMEM; + return (NULL); + } fp->_file = f; fp->_flags = flags; fp->_cookie = fp; --- //depot/vendor/freebsd_6/src/lib/libc/stdio/freopen.c 2006/11/18 14:28:31 +++ //depot/yahoo/ybsd_6/src/lib/libc/stdio/freopen.c 2008/02/26 20:27:21 @@ -207,6 +207,20 @@ } } + /* + * File descriptors are a full int, but _file is only a short. + * If we get a valid file descriptor that is greater than + * SHRT_MAX, then the fd will get sign-extended into an + * invalid file descriptor. Handle this case by failing the + * open. 
+ */ + if (f > SHRT_MAX) { + fp->_flags = 0; /* set it free */ + FUNLOCKFILE(fp); + errno = ENOMEM; + return (NULL); + } + fp->_flags = flags; fp->_file = f; fp->_cookie = fp; -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 05:15:01 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 18ECE106566B for ; Wed, 27 Feb 2008 05:15:01 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (hergotha.csail.mit.edu [66.92.79.170]) by mx1.freebsd.org (Postfix) with ESMTP id C080E13C448 for ; Wed, 27 Feb 2008 05:15:00 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.13.8/8.13.8) with ESMTP id m1R5ExSa024047; Wed, 27 Feb 2008 00:14:59 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.13.8/8.13.8/Submit) id m1R5ExDe024046; Wed, 27 Feb 2008 00:14:59 -0500 (EST) (envelope-from wollman) Date: Wed, 27 Feb 2008 00:14:59 -0500 (EST) From: Garrett Wollman Message-Id: <200802270514.m1R5ExDe024046@hergotha.csail.mit.edu> To: deischen@freebsd.org In-Reply-To: References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> Organization: None X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (hergotha.csail.mit.edu [127.0.0.1]); Wed, 27 Feb 2008 00:14:59 -0500 (EST) X-Spam-Status: No, score=-1.4 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.2.3 X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on hergotha.csail.mit.edu Cc: arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 05:15:01 -0000 In article , Daniel Eischen writes: >> [I wrote:] >> I think you have the right idea but this will break the ABI in a way >> that can't be fudged with symbol versioning. > >Well, you can if you add compat symbols for all functions with FILE >as an argument. In every library, including hundreds of third-party libraries that pass FILE * arguments? I don't think so. What would work, although it would be extra pain, would be to extend the structure. It would be necessary to keep compatibility members of the structure, in their old locations, and update them to reflect state changes appropriately. If the only thing that will change is the width of _file, then that's probably a workable approach, since it doesn't break anything that wasn't already broken in the presence of FD 65536 anyway. Applications aren't permitted to store objects of type FILE, only FILE *, so this should be safe. 
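A minimal sketch of the structure-extension idea described above, assuming a hypothetical `struct demo_file` rather than the real `struct __sFILE` layout. The names `demo_file`, `demo_set_fd`, `demo_fileno`, and `_nfile` are illustrative only, and storing -1 in the legacy member when the descriptor does not fit is an assumption that the thread does not settle:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for struct __sFILE; NOT the real layout. */
struct demo_file {
	short	_file;		/* legacy 16-bit descriptor, kept at its old offset */
	int	_flags;		/* stand-in for the remaining legacy members */
	int	_nfile;		/* hypothetical full-width descriptor, appended last */
};

/* Store a descriptor and mirror it into the legacy member when it fits. */
static void
demo_set_fd(struct demo_file *fp, int fd)
{
	assert(fp != NULL);
	fp->_nfile = fd;
	/* Assumption: old binaries that read _file see -1 if fd is too wide. */
	fp->_file = (fd >= 0 && fd <= SHRT_MAX) ? (short)fd : -1;
}

/* New code always reads the full-width member. */
static int
demo_fileno(const struct demo_file *fp)
{
	return (fp->_nfile);
}
```

Because existing members keep their offsets and the new member is only appended, code compiled against the old layout would keep working for descriptors that fit in a short, which is the property the paragraph above relies on.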
-GAWollman From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 05:26:27 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 70008106566B; Wed, 27 Feb 2008 05:26:27 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (hergotha.csail.mit.edu [66.92.79.170]) by mx1.freebsd.org (Postfix) with ESMTP id 0752C13C455; Wed, 27 Feb 2008 05:26:26 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.13.8/8.13.8) with ESMTP id m1R5QQtk024164; Wed, 27 Feb 2008 00:26:26 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.13.8/8.13.8/Submit) id m1R5QQT3024163; Wed, 27 Feb 2008 00:26:26 -0500 (EST) (envelope-from wollman) Date: Wed, 27 Feb 2008 00:26:26 -0500 (EST) From: Garrett Wollman Message-Id: <200802270526.m1R5QQT3024163@hergotha.csail.mit.edu> To: jhb@freebsd.org X-Newsgroups: mit.lcs.mail.freebsd-arch In-Reply-To: <200802262355.16519.jhb@freebsd.org> References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> Organization: None X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (hergotha.csail.mit.edu [127.0.0.1]); Wed, 27 Feb 2008 00:26:26 -0500 (EST) X-Spam-Status: No, score=-1.4 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.2.3 X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on hergotha.csail.mit.edu Cc: arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 05:26:27 -0000 In article <200802262355.16519.jhb@freebsd.org>, John Baldwin writes: >On Tuesday 26 February 2008 05:51:07 pm Garrett Wollman wrote: >+ /* >+ * File descriptors are a full int, but _file is only a short. >+ * If we get a valid file descriptor that is greater than >+ * SHRT_MAX, then the fd will get sign-extended into an >+ * invalid file descriptor. Handle this case by failing the >+ * open. >+ */ >+ if (fd > SHRT_MAX) { >+ errno = EINVAL; >+ return (NULL); >+ } >+ Please, please, please, whatever you do, don't add Yet Another Overloaded Meaning for [EINVAL]. Use [EMFILE] instead, which is defined to have the precise meaning desired here. For extra credit, fix the various places {STREAM_MAX} is defined to take this limit into account. 
I think the following may be all that is required (beware xterm cut-and-paste screwage): Index: lib/libc/gen/sysconf.c =================================================================== RCS file: /home/ncvs/src/lib/libc/gen/sysconf.c,v retrieving revision 1.20 diff -u -r1.20 sysconf.c --- lib/libc/gen/sysconf.c 17 Nov 2002 08:54:29 -0000 1.20 +++ lib/libc/gen/sysconf.c 27 Feb 2008 05:23:24 -0000 @@ -105,7 +105,6 @@ mib[1] = KERN_NGROUPS; break; case _SC_OPEN_MAX: - case _SC_STREAM_MAX: /* assume fds run out before memory does */ if (getrlimit(RLIMIT_NOFILE, &rl) != 0) return (-1); if (rl.rlim_cur == RLIM_INFINITY) @@ -115,6 +114,25 @@ return (-1); } return ((long)rl.rlim_cur); + case _SC_STREAM_MAX: + if (getrlimit(RLIMIT_NOFILE, &rl) != 0) + return (-1); + if (rl.rlim_cur == RLIM_INFINITY) + return (-1); + if (rl.rlim_cur > LONG_MAX) { + errno = EOVERFLOW; + return (-1); + } + /* + * struct __sFILE currently has a limitation that + * file descriptors must fit in a signed short. + * This doesn't precisely capture the letter of POSIX + * but approximates the spirit. 
+ */ + if (rl.rlim_cur > SHRT_MAX) + return (SHRT_MAX); + + return ((long)rl.rlim_cur); case _SC_JOB_CONTROL: return (_POSIX_JOB_CONTROL); case _SC_SAVED_IDS: -GAWollman From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 05:33:43 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 76CEF1065673; Wed, 27 Feb 2008 05:33:43 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id A99E313C442; Wed, 27 Feb 2008 05:33:42 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8s) with ESMTP id 233535366-1834499 for multiple; Wed, 27 Feb 2008 00:31:55 -0500 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.14.2/8.14.2) with ESMTP id m1R5XcGp026631; Wed, 27 Feb 2008 00:33:38 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: Daniel Eischen Date: Wed, 27 Feb 2008 00:27:05 -0500 User-Agent: KMail/1.9.7 References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200802270027.05426.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Wed, 27 Feb 2008 00:33:38 -0500 (EST) X-Virus-Scanned: ClamAV 0.91.2/6006/Tue Feb 26 20:03:40 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: arch@freebsd.org, Garrett Wollman Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 05:33:43 -0000 On Tuesday 26 February 2008 11:58:55 pm Daniel Eischen wrote: > On Tue, 26 Feb 2008, Garrett Wollman wrote: > > > In article <200802261524.30384.jhb@FreeBSD.org> you write: > > > >> This is assuming that the contents and layout of FILE are not a > >> public ABI (i.e. we malloc the things internally and consumers should > >> just treat the pointer value as a cookie and not grub around in the > >> internals). > > > > Most interpreted languages grub around in the internals, as > > (historically) do a number of macros. Historically Emacs > > did so as well (I suppose you can call it an interpreted language). > > Yech. I also forgot about the macros in <stdio.h>, like __sfeof() > and friends. > > >> Comments? > > > > I think you have the right idea but this will break the ABI in a way > > that can't be fudged with symbol versioning. > > Well, you can if you add compat symbols for all functions with FILE > as an argument. You have to worry about other libraries (say ncurses) that use fopen@1.0 and then return that FILE * to a user app that calls fclose (but the user app will call fclose@1.1 and it blows up). Hence my other e-mail, where I said all the stdio routines would have to detect the two different versions and handle them. Gross. > I'd like us to say that the innards of FILE are not part of our > ABI, but we would still have to do something to keep at least > enough of the ABI for any macros. The problem is that _file is used by fileno() via __sfileno() and that's the one I want to fix. Changing the __sFILEX stuff is OK since none of it is exposed, but fixing _file is not.
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 05:50:12 2008 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F1AFD1065670 for ; Wed, 27 Feb 2008 05:50:12 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id 702B413C455 for ; Wed, 27 Feb 2008 05:50:12 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8s) with ESMTP id 233535368-1834499 for multiple; Wed, 27 Feb 2008 00:31:57 -0500 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.14.2/8.14.2) with ESMTP id m1R5XcGq026631; Wed, 27 Feb 2008 00:33:40 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: freebsd-arch@freebsd.org Date: Wed, 27 Feb 2008 00:31:59 -0500 User-Agent: KMail/1.9.7 References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> <200802270514.m1R5ExDe024046@hergotha.csail.mit.edu> In-Reply-To: <200802270514.m1R5ExDe024046@hergotha.csail.mit.edu> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200802270031.59797.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Wed, 27 Feb 2008 00:33:41 -0500 (EST) X-Virus-Scanned: ClamAV 0.91.2/6006/Tue Feb 26 20:03:40 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: deischen@freebsd.org, Garrett Wollman Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 05:50:13 -0000 On Wednesday 27 February 2008 12:14:59 am Garrett Wollman wrote: > In article , > Daniel Eischen writes: > >> [I wrote:] > >> I think you have the right idea but this will break the ABI in a way > >> that can't be fudged with symbol versioning. > > > >Well, you can if you add compat symbols for all functions with FILE > >as an argument. > > In every library, including hundreds of third-party libraries that > pass FILE * arguments? I don't think so. > > What would work, although it would be extra pain, would be to extend > the structure. It would be necessary to keep compatibility members of > the structure, in their old locations, and update them to reflect > state changes appropriately. If the only thing that will change is > the width of _file, then that's probably a workable approach, since it > doesn't break anything that wasn't already broken in the presence of > FD 65536 anyway. Applications aren't permitted to store objects of > type FILE, only FILE *, so this should be safe. Actually FD 32768. 32768 gets sign extended when the short is promoted to an int. I guess we could add a _nfile that is an int, and try to keep _file up to date for older apps. Newer apps would just always use _nfile. (Or better, rename _file to _ofile and make the new one _file but at new location). I'll work on that next then. 
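The sign-extension problem can be reproduced in isolation. This is a small sketch, assuming a hypothetical helper `demo_roundtrip` that mimics storing a descriptor in a `short` member and reading it back as an `int`; converting an out-of-range `int` to `short` is implementation-defined in C, so the negative result is what common two's-complement compilers produce rather than a guarantee:

```c
#include <assert.h>
#include <limits.h>

/*
 * Mimic "fp->_file = fd" followed by passing fp->_file to read(2) or
 * close(2): the value is narrowed to short and then promoted back to int.
 */
static int
demo_roundtrip(int fd)
{
	short stored;

	assert(fd >= 0);	/* valid descriptors are non-negative */
	stored = (short)fd;	/* narrowing loses the high bits above SHRT_MAX */
	return (stored);	/* promotion back to int sign-extends */
}
```

With a 16-bit `short`, `demo_roundtrip(SHRT_MAX)` comes back unchanged, while `demo_roundtrip(SHRT_MAX + 1)` typically comes back negative, which is why the kernel then rejects it as an invalid descriptor.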
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 06:59:31 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 439A91065673 for ; Wed, 27 Feb 2008 06:59:31 +0000 (UTC) (envelope-from peterjeremy@optushome.com.au) Received: from mail14.syd.optusnet.com.au (mail14.syd.optusnet.com.au [211.29.132.195]) by mx1.freebsd.org (Postfix) with ESMTP id C58D513C458 for ; Wed, 27 Feb 2008 06:59:30 +0000 (UTC) (envelope-from peterjeremy@optushome.com.au) Received: from server.vk2pj.dyndns.org (c220-239-20-82.belrs4.nsw.optusnet.com.au [220.239.20.82]) by mail14.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id m1R6xQpk020863 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 27 Feb 2008 17:59:27 +1100 Received: from server.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1]) by server.vk2pj.dyndns.org (8.14.2/8.14.1) with ESMTP id m1R6xPkw077309; Wed, 27 Feb 2008 17:59:25 +1100 (EST) (envelope-from peter@server.vk2pj.dyndns.org) Received: (from peter@localhost) by server.vk2pj.dyndns.org (8.14.2/8.14.2/Submit) id m1R6xPql077308; Wed, 27 Feb 2008 17:59:25 +1100 (EST) (envelope-from peter) Date: Wed, 27 Feb 2008 17:59:25 +1100 From: Peter Jeremy To: John Baldwin Message-ID: <20080227065925.GK83599@server.vk2pj.dyndns.org> References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> <200802262355.16519.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="dgjlcl3Tl+kb3YDk" Content-Disposition: inline In-Reply-To: <200802262355.16519.jhb@freebsd.org> X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.17 (2007-11-01) Cc: arch@freebsd.org, Garrett Wollman Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 06:59:31 -0000 --dgjlcl3Tl+kb3YDk Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Feb 26, 2008 at 11:55:16PM -0500, John Baldwin wrote: >Yes, I discovered the macros today while working on my fd as short problem. Macros and __inline functions mean that a significant proportion of software compiled on FreeBSD has the existing definition of FILE compiled into it. >However, I can't fix the fact that our stdio can't handle fd's > SHRT_MAX >(again, glibc handles this just fine) w/o making a royal mess. I don't think a versioned FILE is practical so we are stuck with a 16-bit _file for the immediate future. >What I've gone with instead to fix the SHRT_MAX problem is to change >fopen/fdopen/freopen to fail to use fd's > SHRT_MAX with an error. You could change _file from 'short' to 'unsigned short' without breaking the ABI - this would allow either 65535 or 65536 file descriptors (I'm not sure whether _file == -1 is special or not). This would postpone the problem for some time. My suggestion would be: Now: a) change _file to 'unsigned short' and add checks as proposed b) merge __sFILEX into FILE c) Remove the macros and inlines that poke around inside FILE d) Note that directly accessing FILE innards is deprecated and move the definition of struct __sFILE into libc/stdio/local.h Once RELENG_8 is branched: e) Don asbestos underwear and re-arrange struct __sFILE to grow _file etc. -- Peter Jeremy Please excuse any delays as the result of my ISP's inability to implement an MTA that is either RFC2821-compliant or matches their claimed behaviour. 
--dgjlcl3Tl+kb3YDk Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFHxQpN/opHv/APuIcRAgmnAKC6cMr+GWFK5J985GWWAkWv4CwsowCgjDr6 lpsxwQUgrmyup1dI4KTf5dI= =ei2f -----END PGP SIGNATURE----- --dgjlcl3Tl+kb3YDk-- From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 09:36:18 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C213E106566B; Wed, 27 Feb 2008 09:36:18 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 59A3C13C461; Wed, 27 Feb 2008 09:36:18 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.107] (cpe-24-94-75-93.hawaii.res.rr.com [24.94.75.93]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id m1R9aCsV062865; Wed, 27 Feb 2008 04:36:15 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Tue, 26 Feb 2008 23:37:58 -1000 (HST) From: Jeff Roberson X-X-Sender: jroberson@desktop To: Daniel Eischen In-Reply-To: Message-ID: <20080226233645.D920@desktop> References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> <20080225143222.B920@desktop> <20080225160433.P920@desktop> <20080225194320.V920@desktop> <20080225213434.L920@desktop> <20080226121251.V920@desktop> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Brooks Davis , Andrew Gallatin , Alfred Perlstein , arch@freebsd.org, Robert Watson , David Xu 
Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 09:36:18 -0000 On Tue, 26 Feb 2008, Daniel Eischen wrote: > On Tue, 26 Feb 2008, Jeff Roberson wrote: > >> >> On Tue, 26 Feb 2008, Daniel Eischen wrote: >> >>> On Mon, 25 Feb 2008, Jeff Roberson wrote: >>> >>>> See above discussion. I'm not sure what you mean by 'default' cpuset >>>> here. >>> >>> I imagine the 'default' cpuset as the system's default cpuset, >>> in lieu of any administratively created cpusets and bindings >>> for the process (inherited or explicit). >> >> My opinion is that if we decide that it's important to assign numbered sets >> to tids we need then to allow cpuset_getid to return multiple ids for >> WHICH_PID. > > Maybe there shouldn't be WHICH_PID. Perhaps it should be called > WHICH_ALLTIDS. Then it might appear more expected if > cpuset_getid(WHICH_ALLTIDS, ...) returned multiple cpusets. > I realize this is just playing with words, and I do prefer > WHICH_PID :-) Are there any objections to committing this functionality in its current form? I think there is the possibility for further debate and refinement but I believe the code is stable and simple enough to hit the tree for people to start using it. 
Thanks, Jeff > > -- > DE > From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 14:42:22 2008 Return-Path: Delivered-To: arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D8FC11065676 for ; Wed, 27 Feb 2008 14:42:22 +0000 (UTC) (envelope-from das@FreeBSD.ORG) Received: from zim.MIT.EDU (ZIM.MIT.EDU [18.95.3.101]) by mx1.freebsd.org (Postfix) with ESMTP id AA0AF8FC13 for ; Wed, 27 Feb 2008 14:42:17 +0000 (UTC) (envelope-from das@FreeBSD.ORG) Received: from zim.MIT.EDU (localhost [127.0.0.1]) by zim.MIT.EDU (8.14.2/8.14.2) with ESMTP id m1REh3Sw080116; Wed, 27 Feb 2008 09:43:03 -0500 (EST) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by zim.MIT.EDU (8.14.2/8.14.2/Submit) id m1REh3Oc080115; Wed, 27 Feb 2008 09:43:03 -0500 (EST) (envelope-from das@FreeBSD.ORG) Date: Wed, 27 Feb 2008 09:43:03 -0500 From: David Schultz To: John Baldwin Message-ID: <20080227144303.GA79999@zim.MIT.EDU> Mail-Followup-To: John Baldwin , Garrett Wollman , arch@FreeBSD.ORG References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> <200802262355.16519.jhb@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200802262355.16519.jhb@freebsd.org> Cc: arch@FreeBSD.ORG, Garrett Wollman Subject: Re: Cleaning up FILE in stdio.. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 14:42:22 -0000 On Tue, Feb 26, 2008, John Baldwin wrote: > > I think you have the right idea but this will break the ABI in a way > > that can't be fudged with symbol versioning. [...] > However, I can't fix the fact that our stdio can't handle fd's > SHRT_MAX > (again, glibc handles this just fine) w/o making a royal mess. 
We could > create a new versioned FILE struct (so long as we can recognize the existing > FILE struct somehow) and have new fopen()/fdopen()/freopen() symbols that > return the new struct but then all the stdio routines would have to check to > see if the structure was an old structure explicitly and handle it > appropriately if so. Rather gross. Symbol versioning also doesn't help the case where a FILE * gets passed from an app that's using the new symbol to another library that's using the old symbol, or vice versa. If you do wind up breaking the ABI again, maybe it's worth it to add an explicit version number in the FILE struct itself, so we never have to worry about this again. From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 15:33:56 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8CA5D1065679; Wed, 27 Feb 2008 15:33:56 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.netplex.net (mail.netplex.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 5538C8FC20; Wed, 27 Feb 2008 15:33:55 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.netplex.net (8.14.2/8.14.2/NETPLEX) with ESMTP id m1RFXsKg007174; Wed, 27 Feb 2008 10:33:54 -0500 (EST) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.netplex.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-4.0 (mail.netplex.net [204.213.176.10]); Wed, 27 Feb 2008 10:33:54 -0500 (EST) Date: Wed, 27 Feb 2008 10:33:54 -0500 (EST) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: David Schultz In-Reply-To: <20080227144303.GA79999@zim.MIT.EDU> Message-ID: References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> <200802262355.16519.jhb@freebsd.org> <20080227144303.GA79999@zim.MIT.EDU> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed 
Cc: arch@freebsd.org, Garrett Wollman Subject: Re: Cleaning up FILE in stdio.. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 15:33:56 -0000 On Wed, 27 Feb 2008, David Schultz wrote: > On Tue, Feb 26, 2008, John Baldwin wrote: >>> I think you have the right idea but this will break the ABI in a way >>> that can't be fudged with symbol versioning. > [...] >> However, I can't fix the fact that our stdio can't handle fd's > SHRT_MAX >> (again, glibc handles this just fine) w/o making a royal mess. We could >> create a new versioned FILE struct (so long as we can recognize the existing >> FILE struct somehow) and have new fopen()/fdopen()/freopen() symbols that >> return the new struct but then all the stdio routines would have to check to >> see if the structure was an old structure explicitly and handle it >> appropriately if so. Rather gross. > > Symbol versioning also doesn't help the case where a FILE * gets > passed from an app that's using the new symbol to another library > that's using the old symbol, or vice versa. If you do wind up > breaking the ABI again, maybe it's worth it to add an explicit > version number in the FILE struct itself, so we never have to > worry about this again. If we are doing anything about this and modifying FILE, perhaps we should also think about hiding all the innards of FILE, except for the few fields that are required for the stdio macros. 
-- DE From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 18:30:29 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3C1951065681 for ; Wed, 27 Feb 2008 18:30:29 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id D0BE38FC1B for ; Wed, 27 Feb 2008 18:30:28 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8s) with ESMTP id 233599867-1834499 for multiple; Wed, 27 Feb 2008 13:28:10 -0500 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.14.2/8.14.2) with ESMTP id m1RITrOh035092; Wed, 27 Feb 2008 13:30:04 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: Garrett Wollman Date: Wed, 27 Feb 2008 11:34:04 -0500 User-Agent: KMail/1.9.7 References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> <200802270526.m1R5QQT3024163@hergotha.csail.mit.edu> In-Reply-To: <200802270526.m1R5QQT3024163@hergotha.csail.mit.edu> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200802271134.04166.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Wed, 27 Feb 2008 13:30:05 -0500 (EST) X-Virus-Scanned: ClamAV 0.91.2/6010/Wed Feb 27 07:54:14 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 18:30:29 -0000 On Wednesday 27 February 2008 12:26:26 am Garrett Wollman wrote: > In article <200802262355.16519.jhb@freebsd.org>, > John Baldwin writes: > >On Tuesday 26 February 2008 05:51:07 pm Garrett Wollman wrote: > > >+ /* > >+ * File descriptors are a full int, but _file is only a short. > >+ * If we get a valid file descriptor that is greater than > >+ * SHRT_MAX, then the fd will get sign-extended into an > >+ * invalid file descriptor. Handle this case by failing the > >+ * open. > >+ */ > >+ if (fd > SHRT_MAX) { > >+ errno = EINVAL; > >+ return (NULL); > >+ } > >+ > > Please, please, please, whatever you do, don't add Yet Another > Overloaded Meaning for [EINVAL]. Use [EMFILE] instead, which is > defined to have the precise meaning desired here. For extra credit, > fix the various places {STREAM_MAX} is defined to take this limit into > account. I think the following may be all that is required (beware > xterm cut-and-paste screwage): I used EINVAL rather than ENOMEM for fdopen() since fdopen() is documented to return errors for fcntl(2) and one of the EINVAL's for fcntl(2) is: [EINVAL] The cmd argument is F_DUPFD and arg is negative or greater than the maximum allowable number (see getdtablesize(2)). I avoided EMFILE in all 3 cases as it struck me as being not really true (an app would find the rlimit higher than the current fd for example). Also, EMFILE doesn't really make sense from fdopen() at all. You've already opened the fd, so you know you can't run out of fd's. 
> Index: lib/libc/gen/sysconf.c > =================================================================== > RCS file: /home/ncvs/src/lib/libc/gen/sysconf.c,v > retrieving revision 1.20 > diff -u -r1.20 sysconf.c > --- lib/libc/gen/sysconf.c 17 Nov 2002 08:54:29 -0000 1.20 > +++ lib/libc/gen/sysconf.c 27 Feb 2008 05:23:24 -0000 > @@ -105,7 +105,6 @@ > mib[1] = KERN_NGROUPS; > break; > case _SC_OPEN_MAX: > - case _SC_STREAM_MAX: /* assume fds run out before memory does */ > if (getrlimit(RLIMIT_NOFILE, &rl) != 0) > return (-1); > if (rl.rlim_cur == RLIM_INFINITY) > @@ -115,6 +114,25 @@ > return (-1); > } > return ((long)rl.rlim_cur); > + case _SC_STREAM_MAX: > + if (getrlimit(RLIMIT_NOFILE, &rl) != 0) > + return (-1); > + if (rl.rlim_cur == RLIM_INFINITY) > + return (-1); > + if (rl.rlim_cur > LONG_MAX) { > + errno = EOVERFLOW; > + return (-1); > + } > + /* > + * struct __sFILE currently has a limitation that > + * file descriptors must fit in a signed short. > + * This doesn't precisely capture the letter of POSIX > + * but approximates the spirit. > + */ > + if (rl.rlim_cur > SHRT_MAX) > + return (SHRT_MAX); > + > + return ((long)rl.rlim_cur); > case _SC_JOB_CONTROL: > return (_POSIX_JOB_CONTROL); > case _SC_SAVED_IDS: > > > -GAWollman Ah, thanks. 
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 18:30:39 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D2B3D1065679 for ; Wed, 27 Feb 2008 18:30:39 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id 6E7068FC28 for ; Wed, 27 Feb 2008 18:30:39 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8s) with ESMTP id 233599874-1834499 for multiple; Wed, 27 Feb 2008 13:28:15 -0500 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.14.2/8.14.2) with ESMTP id m1RITrOi035092; Wed, 27 Feb 2008 13:30:10 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: Peter Jeremy Date: Wed, 27 Feb 2008 11:38:33 -0500 User-Agent: KMail/1.9.7 References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> <200802262355.16519.jhb@freebsd.org> <20080227065925.GK83599@server.vk2pj.dyndns.org> In-Reply-To: <20080227065925.GK83599@server.vk2pj.dyndns.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200802271138.33979.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Wed, 27 Feb 2008 13:30:10 -0500 (EST) X-Virus-Scanned: ClamAV 0.91.2/6010/Wed Feb 27 07:54:14 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: arch@freebsd.org, Garrett Wollman Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 18:30:39 -0000 On Wednesday 27 February 2008 01:59:25 am Peter Jeremy wrote: > On Tue, Feb 26, 2008 at 11:55:16PM -0500, John Baldwin wrote: > >Yes, I discovered the macros today while working on my fd as short problem. > > Macros and __inline functions mean that a significant proportion of > software compiled on FreeBSD has the existing definition of FILE > compiled into it. > > >However, I can't fix the fact that our stdio can't handle fd's > SHRT_MAX > >(again, glibc handles this just fine) w/o making a royal mess. > > I don't think a versioned FILE is practical so we are stuck with a > 16-bit _file for the immediate future. > > >What I've gone with instead to fix the SHRT_MAX problem is to change > >fopen/fdopen/freopen to fail to use fd's > SHRT_MAX with an error. > > You could change _file from 'short' to 'unsigned short' without breaking > the ABI - this would allow either 65535 or 65536 file descriptors (I'm > not sure whether _file == -1 is special or not). This would postpone > the problem for some time. -1 is used a lot in the stdio code for file's not backed by an fd. My problem though is that this doesn't help with existing binaries that are already compiled (which is what I have to deal with). Had fileno() not been inlined I would have been ok, but that's pretty much done for me as far as my current problem on 6.x. Had I just been able to change FILE * and not had inlines, then a new fopen would have worked fine in my case. 
> My suggestion would be: > Now: > a) change _file to 'unsigned short' and add checks as proposed > b) merge __sFILEX into FILE > c) Remove the macros and inlines that poke around inside FILE > d) Note that directly accessing FILE innards is deprecated and > move the definition of struct __sFILE into libc/stdio/local.h Yes, but also d2) tag all the fields that were previously exported so they are not changed in the future. > Once RELENG_8 is branched: > e) Don asbestos underwear and re-arrange struct __sFILE to grow _file etc. We can't do e) because thanks to symbol versioning, 8.x and 9.x will have libc.so.7, so a 7.0 binary will still use the brand new libc, so it has to preserve the ABI of the currently exported fields pretty much forever. I do think we can get away with renaming '_file' to '_ofile' and adding a new 'int _file' at the bottom of the struct and making sure '_ofile' is always in sync (when possible, truncated when _file is too big). Also, I think we can do the new _file in HEAD for 8.0 w/o any worries. I don't think waiting until 9.0 buys anything there. Given that, I think I'd rather just patch the current stable branches to handle the edge case better and work on making _file an int in HEAD (with the ABI compat _ofile). 
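A rough sketch of the _ofile/_file idea above, under loud assumptions: struct demo_file and demo_setfd are hypothetical stand-ins, not the real struct __sFILE or any libc function, and the elided members are only hinted at by a comment:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily trimmed stand-in for struct __sFILE. */
struct demo_file {
	short	_ofile;		/* legacy 16-bit fd, kept at its old offset */
	/* the other existing members would stay where they are today */
	int	_file;		/* new full-width fd, appended at the end */
};

/*
 * Set both members together so that old binaries, which only ever
 * read the 16-bit member, keep seeing a value for small descriptors.
 */
void
demo_setfd(struct demo_file *fp, int fd)
{
	assert(fp != NULL);
	fp->_file = fd;
	/*
	 * Truncates when fd does not fit in a short; this sketch does not
	 * pick a policy for what old binaries should see in that case.
	 */
	fp->_ofile = (short)fd;
}
```

New code would read only _file; the compat member exists purely so that already-compiled macros and inlines keep finding a short at the old location.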
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 18:54:48 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 90B05106566B; Wed, 27 Feb 2008 18:54:48 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (hergotha.csail.mit.edu [66.92.79.170]) by mx1.freebsd.org (Postfix) with ESMTP id 307F88FC12; Wed, 27 Feb 2008 18:54:47 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.13.8/8.13.8) with ESMTP id m1RIskGY031077; Wed, 27 Feb 2008 13:54:46 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.13.8/8.13.8/Submit) id m1RIskZ9031074; Wed, 27 Feb 2008 13:54:46 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <18373.45558.444085.196189@hergotha.csail.mit.edu> Date: Wed, 27 Feb 2008 13:54:46 -0500 From: Garrett Wollman To: John Baldwin In-Reply-To: <200802271134.04166.jhb@freebsd.org> References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> <200802270526.m1R5QQT3024163@hergotha.csail.mit.edu> <200802271134.04166.jhb@freebsd.org> X-Mailer: VM 7.17 under 21.4 (patch 21) "Educational Television" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (hergotha.csail.mit.edu [127.0.0.1]); Wed, 27 Feb 2008 13:54:46 -0500 (EST) X-Spam-Status: No, score=-1.4 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.2.3 X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on hergotha.csail.mit.edu Cc: arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 18:54:48 -0000 < said: > I avoided EMFILE in all 3 cases as it struck me as being not really true (an > app would find the rlimit higher than the current fd for example). Also, > EMFILE doesn't really make sense from fdopen() at all. You've already opened > the fd, so you know you can't run out of fd's. [EMFILE] does not imply that you have run out of fds. POSIX says (for fdopen()): The fdopen( ) function may fail if: [EBADF] The fildes argument is not a valid file descriptor. [EINVAL] The mode argument is not a valid mode. [EMFILE] {FOPEN_MAX} streams are currently open in the calling process. [EMFILE] {STREAM_MAX} streams are currently open in the calling process. [ENOMEM] Insufficient space to allocate a buffer. My change to sysconf() causes {STREAM_MAX} to be clamped at {SHRT_MAX}, so a user calling sysconf(_SC_STREAM_MAX) or $(getconf STREAM_MAX) will see a different value from the resource limit and understand that there is a limit (even if it's not quite on the number of streams). For fopen(), the errors are defined as follows: "shall fail": [EMFILE] {OPEN_MAX} file descriptors are currently open in the calling process. [ENFILE] The maximum allowable number of files is currently open in the system. "may fail": [EINVAL] The value of the mode argument is not valid. [EMFILE] {FOPEN_MAX} streams are currently open in the calling process. [EMFILE] {STREAM_MAX} streams are currently open in the calling process. The other possibility would be [EOVERFLOW], which is defined as: [EOVERFLOW] The named file is a regular file and the size of the file cannot be represented correctly in an object of type off_t. But I truly believe that [EMFILE] is the best option. 
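For illustration only, the check under discussion could look roughly like this with [EMFILE] substituted for [EINVAL]. The helper name fd_fits_file is hypothetical; the patch quoted earlier in the thread does the comparison inline in fopen/fdopen/freopen:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical helper: returns 0 if fd can be stored in a short _file
 * member, otherwise sets errno to EMFILE and returns -1.
 */
int
fd_fits_file(int fd)
{
	assert(fd >= 0);
	if (fd > SHRT_MAX) {
		errno = EMFILE;
		return (-1);
	}
	return (0);
}
```

A caller would fail the open (returning NULL) whenever this returns -1, leaving errno as set here.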
-GAWollman From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 19:02:14 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C3DD61065670 for ; Wed, 27 Feb 2008 19:02:14 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id 453638FC18 for ; Wed, 27 Feb 2008 19:02:09 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8s) with ESMTP id 233603361-1834499 for multiple; Wed, 27 Feb 2008 13:59:59 -0500 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.14.2/8.14.2) with ESMTP id m1RJ1vaB035481; Wed, 27 Feb 2008 14:01:57 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: Garrett Wollman Date: Wed, 27 Feb 2008 13:57:29 -0500 User-Agent: KMail/1.9.7 References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> <200802271134.04166.jhb@freebsd.org> <18373.45558.444085.196189@hergotha.csail.mit.edu> In-Reply-To: <18373.45558.444085.196189@hergotha.csail.mit.edu> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200802271357.29188.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Wed, 27 Feb 2008 14:01:57 -0500 (EST) X-Virus-Scanned: ClamAV 0.91.2/6010/Wed Feb 27 07:54:14 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 19:02:14 -0000 On Wednesday 27 February 2008 01:54:46 pm Garrett Wollman wrote: > < said: > > > I avoided EMFILE in all 3 cases as it struck me as being not really true (an > > app would find the rlimit higher than the current fd for example). Also, > > EMFILE doesn't really make sense from fdopen() at all. You've already opened > > the fd, so you know you can't run out of fd's. > > [EMFILE] does not imply that you have run out of fds. POSIX > says (for fdopen()): > > The fdopen( ) function may fail if: > [EBADF] The fildes argument is not a valid file descriptor. > [EINVAL] The mode argument is not a valid mode. > [EMFILE] {FOPEN_MAX} streams are currently open in the > calling process. > [EMFILE] {STREAM_MAX} streams are currently open in the > calling process. > [ENOMEM] Insufficient space to allocate a buffer. > > My change to sysconf() causes {STREAM_MAX} to be clamped at > {SHRT_MAX}, so a user calling sysconf(_SC_STREAM_MAX) or > $(getconf STREAM_MAX) will see a different value from the resource > limit and understand that there is a limit (even if it's not quite on > the number of streams). > > For fopen(), the errors are defined as follows: > > "shall fail": > [EMFILE] {OPEN_MAX} file descriptors are currently open in the calling > process. > [ENFILE] The maximum allowable number of files is currently open in > the system. > > "may fail": > [EINVAL] The value of the mode argument is not valid. > [EMFILE] {FOPEN_MAX} streams are currently open in the calling > process. > [EMFILE] {STREAM_MAX} streams are currently open in the calling > process. 
> > The other possibility would be [EOVERFLOW], which is defined as: > > [EOVERFLOW] The named file is a regular file and the size of the file > cannot be represented correctly in an object of type off_t. > > But I truly believe that [EMFILE] is the best option. Ok. I was going based on our manpages, but I will happily use EMFILE instead given this. I will commit the temp fix today so we can MFC it. -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 20:39:12 2008 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DF9E4106569B for ; Wed, 27 Feb 2008 20:39:12 +0000 (UTC) (envelope-from xcllnt@mac.com) Received: from smtpoutm.mac.com (smtpoutm.mac.com [17.148.16.78]) by mx1.freebsd.org (Postfix) with ESMTP id DBE858FC1D for ; Wed, 27 Feb 2008 20:39:12 +0000 (UTC) (envelope-from xcllnt@mac.com) Received: from mac.com (asmtp006-s [10.150.69.69]) by smtpoutm.mac.com (Xserve/smtpout015/MantshX 4.0) with ESMTP id m1RKdCpq004754 for ; Wed, 27 Feb 2008 12:39:12 -0800 (PST) Received: from mini-g4.jnpr.net (natint3.juniper.net [66.129.224.36]) (authenticated bits=0) by mac.com (Xserve/asmtp006/MantshX 4.0) with ESMTP id m1RKd8fb027906 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO) for ; Wed, 27 Feb 2008 12:39:10 -0800 (PST) Message-Id: <8D7D4892-A2E4-42BE-B856-E868E358E5CD@mac.com> From: Marcel Moolenaar To: FreeBSD Arch Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v919.2) Date: Wed, 27 Feb 2008 12:39:07 -0800 X-Mailer: Apple Mail (2.919.2) Subject: Moving vm_pmap in struct vmspace last X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 20:39:13 
-0000

All,

On PowerPC we'll support Book E alongside the AIM processor
and there's one area where this results in an ABI problem:
for Book E the struct pmap is larger than for AIM. As such,
struct vmspace will have a different layout depending on
the configured CPU. This only affects libkvm, but is enough of a
hassle that I want to apply the following patch to address
that:

Index: sys/vm/vm_map.h
===================================================================
--- sys/vm/vm_map.h     2008/02/27 20:14:01     #56
+++ sys/vm/vm_map.h     2008/02/27 20:14:01
@@ -233,7 +233,6 @@
  */
 struct vmspace {
        struct vm_map vm_map;   /* VM address map */
-       struct pmap vm_pmap;    /* private physical map */
        struct shmmap_state *vm_shm;    /* SYS5 shared memory private data XXX */
        segsz_t vm_swrss;       /* resident set size before last swap */
        segsz_t vm_tsize;       /* text size (pages) XXX */
@@ -243,6 +242,12 @@
        caddr_t vm_daddr;       /* (c) user virtual address of data */
        caddr_t vm_maxsaddr;    /* user VA at max stack growth */
        int     vm_refcnt;      /* number of references */
+       /*
+        * Keep the PMAP last, so that per-CPU variations within a
+        * single architecture can be handled by the same toolchain
+        * without having to worry about the MI fields.
+        */
+       struct pmap vm_pmap;    /* private physical map */
 };

 #ifdef _KERNEL

The consequence of this patch is that the ABI will be
broken for all architectures (once). Again, this only
affects libkvm. Do people see a problem with this change?
Thanks, -- Marcel Moolenaar xcllnt@mac.com From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 20:56:31 2008 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AF70B106566B for ; Wed, 27 Feb 2008 20:56:31 +0000 (UTC) (envelope-from xcllnt@mac.com) Received: from smtpoutm.mac.com (smtpoutm.mac.com [17.148.16.69]) by mx1.freebsd.org (Postfix) with ESMTP id A303B8FC19 for ; Wed, 27 Feb 2008 20:56:31 +0000 (UTC) (envelope-from xcllnt@mac.com) Received: from mac.com (asmtp006-s [10.150.69.69]) by smtpoutm.mac.com (Xserve/smtpout006/MantshX 4.0) with ESMTP id m1RKuVE8011062; Wed, 27 Feb 2008 12:56:31 -0800 (PST) Received: from mini-g4.jnpr.net (natint3.juniper.net [66.129.224.36]) (authenticated bits=0) by mac.com (Xserve/asmtp006/MantshX 4.0) with ESMTP id m1RKuT87012772 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Wed, 27 Feb 2008 12:56:29 -0800 (PST) Message-Id: <4C36D6F7-F329-4C08-91A5-F89FE3C2D811@mac.com> From: Marcel Moolenaar To: Rink Springer In-Reply-To: <20080227204922.GA19591@rink.nu> Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v919.2) Date: Wed, 27 Feb 2008 12:56:28 -0800 References: <8D7D4892-A2E4-42BE-B856-E868E358E5CD@mac.com> <20080227204922.GA19591@rink.nu> X-Mailer: Apple Mail (2.919.2) Cc: FreeBSD Arch Subject: Re: Moving vm_pmap in struct vmspace last X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 20:56:31 -0000 On Feb 27, 2008, at 12:49 PM, Rink Springer wrote: > On Wed, Feb 27, 2008 at 12:39:07PM -0800, Marcel Moolenaar wrote: >> The consequence of this patch is that the ABI will be >> broken for all arhcitectures (once). 
Again, this only >> affects libkvm. Do people see a problem with this change? > > I think this is OK, as long as it's in adequately documented in > UPDATING. The other structs in there won't cause problems, I assume? Correct. -- Marcel Moolenaar xcllnt@mac.com From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 21:06:01 2008 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B60E41065673 for ; Wed, 27 Feb 2008 21:06:01 +0000 (UTC) (envelope-from rink@tragedy.rink.nu) Received: from mx1.rink.nu (alastor.rink.nu [213.34.49.5]) by mx1.freebsd.org (Postfix) with ESMTP id 7A30C8FC29 for ; Wed, 27 Feb 2008 21:06:01 +0000 (UTC) (envelope-from rink@tragedy.rink.nu) Received: from localhost (alastor.rink.nu [213.34.49.5]) by mx1.rink.nu (Postfix) with ESMTP id 291C5BFEC52; Wed, 27 Feb 2008 20:49:31 +0000 (UTC) X-Virus-Scanned: amavisd-new at rink.nu Received: from mx1.rink.nu ([213.34.49.5]) by localhost (alastor.rink.nu [213.34.49.5]) (amavisd-new, port 10024) with ESMTP id 0a2I7momnH3N; Wed, 27 Feb 2008 20:49:23 +0000 (UTC) Received: from tragedy.rink.nu (tragedy.rink.nu [213.34.49.3]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.rink.nu (Postfix) with ESMTP id 48F47BFEB79; Wed, 27 Feb 2008 20:49:23 +0000 (UTC) Received: from tragedy.rink.nu (tragedy.rink.nu [213.34.49.3]) by tragedy.rink.nu (8.13.8/8.13.8) with ESMTP id m1RKnMiH021996; Wed, 27 Feb 2008 21:49:22 +0100 (CET) (envelope-from rink@tragedy.rink.nu) Received: (from rink@localhost) by tragedy.rink.nu (8.13.8/8.13.8/Submit) id m1RKnMEB021995; Wed, 27 Feb 2008 21:49:22 +0100 (CET) (envelope-from rink) Date: Wed, 27 Feb 2008 21:49:22 +0100 From: Rink Springer To: Marcel Moolenaar Message-ID: <20080227204922.GA19591@rink.nu> References: <8D7D4892-A2E4-42BE-B856-E868E358E5CD@mac.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline In-Reply-To: <8D7D4892-A2E4-42BE-B856-E868E358E5CD@mac.com> User-Agent: Mutt/1.5.17 (2007-11-01) Cc: FreeBSD Arch Subject: Re: Moving vm_pmap in struct vmspace last X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 21:06:01 -0000

On Wed, Feb 27, 2008 at 12:39:07PM -0800, Marcel Moolenaar wrote:
> The consequence of this patch is that the ABI will be
> broken for all architectures (once). Again, this only
> affects libkvm. Do people see a problem with this change?

I think this is OK, as long as it's adequately documented in
UPDATING. The other structs in there won't cause problems, I assume?

--
Rink P.W. Springer - http://rink.nu
"Anyway boys, this is America. Just because you get more votes doesn't
mean you win." - Fox Mulder

From owner-freebsd-arch@FreeBSD.ORG Wed Feb 27 21:19:25 2008 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7C2221065677 for ; Wed, 27 Feb 2008 21:19:25 +0000 (UTC) (envelope-from julian@elischer.org) Received: from outR.internet-mail-service.net (outR.internet-mail-service.net [216.240.47.241]) by mx1.freebsd.org (Postfix) with ESMTP id 549DE8FC22 for ; Wed, 27 Feb 2008 21:19:25 +0000 (UTC) (envelope-from julian@elischer.org) Received: from mx0.idiom.com (HELO idiom.com) (216.240.32.160) by out.internet-mail-service.net (qpsmtpd/0.40) with ESMTP; Wed, 27 Feb 2008 13:19:24 -0800 Received: from julian-mac.elischer.org (localhost [127.0.0.1]) by idiom.com (Postfix) with ESMTP id B542F127392; Wed, 27 Feb 2008 13:19:23 -0800 (PST) Message-ID: <47C5D3EC.4010604@elischer.org> Date: Wed, 27 Feb 2008 13:19:40 -0800 From: Julian Elischer User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To:
Marcel Moolenaar References: <8D7D4892-A2E4-42BE-B856-E868E358E5CD@mac.com> In-Reply-To: <8D7D4892-A2E4-42BE-B856-E868E358E5CD@mac.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD Arch Subject: Re: Moving vm_pmap in struct vmspace last X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2008 21:19:25 -0000

Marcel Moolenaar wrote:
> All,
>
> On PowerPC we'll support Book E alongside the AIM processor
> and there's one area where this results in an ABI problem:
> For Book E the struct pmap is larger than for AIM. As such,
> struct vmspace will have a different layout depending on
> the configured CPU. This only affects libkvm, but is enough of a
> hassle that I want to apply the following patch to address
> that:

Just for fun I'm also supporting a ppc variant called gxemul, which
uses the 'test' devices in gxemul (not emulated hardware).
I have the loader working, and Warner had the machine booting to almost
single user when Cisco did their Linux thing.

So that's not proep and not Book E.

And to make things easier, it would work for mips under gxemul as well.
I haven't figured out how to fit it as an option that can run across
several architectures.
>
> Index: sys/vm/vm_map.h
> ===================================================================
> --- sys/vm/vm_map.h     2008/02/27 20:14:01     #56
> +++ sys/vm/vm_map.h     2008/02/27 20:14:01
> @@ -233,7 +233,6 @@
>   */
>  struct vmspace {
>         struct vm_map vm_map;   /* VM address map */
> -       struct pmap vm_pmap;    /* private physical map */
>         struct shmmap_state *vm_shm;    /* SYS5 shared memory private data XXX */
>         segsz_t vm_swrss;       /* resident set size before last swap */
>         segsz_t vm_tsize;       /* text size (pages) XXX */
> @@ -243,6 +242,12 @@
>         caddr_t vm_daddr;       /* (c) user virtual address of data */
>         caddr_t vm_maxsaddr;    /* user VA at max stack growth */
>         int     vm_refcnt;      /* number of references */
> +       /*
> +        * Keep the PMAP last, so that per-CPU variations within a
> +        * single architecture can be handled by the same toolchain
> +        * without having to worry about the MI fields.
> +        */
> +       struct pmap vm_pmap;    /* private physical map */
>  };
>
>  #ifdef _KERNEL
>
>
> The consequence of this patch is that the ABI will be
> broken for all architectures (once). Again, this only
> affects libkvm. Do people see a problem with this change?
> > Thanks, > From owner-freebsd-arch@FreeBSD.ORG Thu Feb 28 08:36:07 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5DBA9106566B; Thu, 28 Feb 2008 08:36:07 +0000 (UTC) (envelope-from peterjeremy@optushome.com.au) Received: from mail17.syd.optusnet.com.au (mail17.syd.optusnet.com.au [211.29.132.198]) by mx1.freebsd.org (Postfix) with ESMTP id E10468FC1A; Thu, 28 Feb 2008 08:36:06 +0000 (UTC) (envelope-from peterjeremy@optushome.com.au) Received: from server.vk2pj.dyndns.org (c220-239-20-82.belrs4.nsw.optusnet.com.au [220.239.20.82]) by mail17.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id m1S8ZtAk018630 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 28 Feb 2008 19:36:04 +1100 Received: from server.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1]) by server.vk2pj.dyndns.org (8.14.2/8.14.1) with ESMTP id m1S8ZtfL075765; Thu, 28 Feb 2008 19:35:55 +1100 (EST) (envelope-from peter@server.vk2pj.dyndns.org) Received: (from peter@localhost) by server.vk2pj.dyndns.org (8.14.2/8.14.2/Submit) id m1S8Zta7075764; Thu, 28 Feb 2008 19:35:55 +1100 (EST) (envelope-from peter) Date: Thu, 28 Feb 2008 19:35:55 +1100 From: Peter Jeremy To: John Baldwin Message-ID: <20080228083555.GX83599@server.vk2pj.dyndns.org> References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> <200802262355.16519.jhb@freebsd.org> <20080227065925.GK83599@server.vk2pj.dyndns.org> <200802271138.33979.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="gwtGiOGliFx8mAnm" Content-Disposition: inline In-Reply-To: <200802271138.33979.jhb@freebsd.org> X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.17 (2007-11-01) Cc: arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2008 08:36:07 -0000

--gwtGiOGliFx8mAnm
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Feb 27, 2008 at 11:38:33AM -0500, John Baldwin wrote:
>> You could change _file from 'short' to 'unsigned short' without breaking
>> the ABI - this would allow either 65535 or 65536 file descriptors (I'm
>> not sure whether _file == -1 is special or not).  This would postpone
>> the problem for some time.
>
>-1 is used a lot in the stdio code for files not backed by an fd.  My problem
>though is that this doesn't help with existing binaries that are already
>compiled (which is what I have to deal with).  Had fileno() not been inlined
>I would have been ok, but that's pretty much done for me as far as my current
>problem on 6.x.  Had I just been able to change FILE * and not had inlines,
>then a new fopen would have worked fine in my case.

My suggestion was based on short and ushort having the same size and
(short)-1 and (ushort)65535 having the same bit pattern.  Any code
accessing fileno() should not be checking or validating the result
but just passing it to low-level I/O routines.  This would provide the
following:

                Existing code    New code
                short            ushort
FD              sign-extended    zero-extended
-1              -1               65535
0..32767        0..32767         0..32767
32768..65534    -32768..-2 [*]   32768..65534
>65534          EMFILE           EMFILE

[*] This could potentially be fixed using libc or kernel shims.

>> e) Don asbestos underwear and re-arrange struct __sFILE to grow _file etc.
>
>We can't do e) because thanks to symbol versioning, 8.x and 9.x will have
>libc.so.7, so a 7.0 binary will still use the brand new libc, so it has to
>preserve the ABI of the currently exported fields pretty much forever.

Erk.  I forgot about that.

>think we can get away with renaming '_file' to '_ofile' and adding a new 'int
>_file' at the bottom of the struct and making sure '_ofile' is always in sync
>(when possible, truncated when _file is too big).

Truncation opens up the possibility that old executables could fopen()
lots of files (without getting any indication of a problem) and then
use fileno() to reference a truncated _ofile - causing it to access
some totally unrelated file.  Admittedly, that is no worse than would
happen today.

>Also, I think we can do the new _file in HEAD for 8.0 w/o any worries.  I
>don't think waiting until 9.0 buys anything there.

I was thinking in terms of changing _file to int without backward
compatibility.  I'm not sure that that could be done for 8.0 (though,
as you have pointed out, it can't be done at all).

> Given that, I think I'd
>rather just patch the current stable branches to handle the edge case better
>and work on making _file an int in HEAD (with the ABI compat _ofile).

The EMFILE patch is definitely a good interim step and I support your
efforts in removing the limit on the number of open files.

--
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.
--gwtGiOGliFx8mAnm Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFHxnJr/opHv/APuIcRApt/AJ0Qt+qUxxS2/4DlxAtKZhmoDec/iACfbDW5 5lxxTVS2ZkAeLNjdZ2Pm0QI= =QU7W -----END PGP SIGNATURE----- --gwtGiOGliFx8mAnm-- From owner-freebsd-arch@FreeBSD.ORG Thu Feb 28 10:15:32 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 20A8E1065672 for ; Thu, 28 Feb 2008 10:15:32 +0000 (UTC) (envelope-from frank@pinky.sax.de) Received: from post.frank-behrens.de (post.frank-behrens.de [82.139.255.138]) by mx1.freebsd.org (Postfix) with ESMTP id 767C78FC2E for ; Thu, 28 Feb 2008 10:15:31 +0000 (UTC) (envelope-from frank@pinky.sax.de) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pinky.sax.de; h=from:to:date:mime-version:subject:in-reply-to:references:content-type:content-transfer-encoding:content-description; q=dns/txt; s=pinky1; t=1204193126; i=frank@pinky.sax.de; bh=pkThCzbjZWWKTnb/NNMf0gWNcDg2Xb0SdoXRu7IKaoc=; b=u+qnvOwJHcLrwyqUjVjodENW7NOxOXtpezKBnd72N9bKWKf0Sola2DSEq/BgYFhgWcceYVDMRTfMaeKjvSR+3w== Received: from [192.168.20.32] (sun.behrens [192.168.20.32]) by post.frank-behrens.de (8.14.2/8.14.2) with ESMTP-MSA id m1SA5Neb027352 for ; Thu, 28 Feb 2008 11:05:23 +0100 (CET) (envelope-from frank@pinky.sax.de) Message-Id: <200802281005.m1SA5Neb027352@post.frank-behrens.de> From: "Frank Behrens" To: arch@freebsd.org Date: Thu, 28 Feb 2008 11:05:23 +0100 MIME-Version: 1.0 Priority: normal In-reply-to: <20080228083555.GX83599@server.vk2pj.dyndns.org> References: <200802271138.33979.jhb@freebsd.org> X-mailer: Pegasus Mail for Windows (4.31, DE v4.31 R1) Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7BIT Content-description: Mail message body X-Hashcash: 1:24:080228:arch@freebsd.org::B2cy8Dai19qayeA7:SQkB Cc: Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2008 10:15:32 -0000

Peter Jeremy wrote on 28 Feb 2008 19:35:
> On Wed, Feb 27, 2008 at 11:38:33AM -0500, John Baldwin wrote:
>...
> >think we can get away with renaming '_file' to '_ofile' and adding a new 'int
> >_file' at the bottom of the struct and making sure '_ofile' is always in sync
> >(when possible, truncated when _file is too big).
>
> Truncation opens up the possibility that old executables could fopen()
> lots of files (without getting any indication of a problem) and then
> use fileno() to reference a truncated _ofile - causing it to access
> some totally unrelated file.  Admittedly, that is no worse than would
> happen today.

Is it possible and useful to read the compiled-in version information
for executables, as is done in kernel signal handling? Old
executables could receive an error in case of truncation, and newer
executables would always access the extended field.

Just an idea, not sure if it works.

Regards,
   Frank
--
Frank Behrens, Osterwieck, Germany
PGP-key 0x5B7C47ED on public servers available.
From owner-freebsd-arch@FreeBSD.ORG Thu Feb 28 13:56:51 2008 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 312B7106566C for ; Thu, 28 Feb 2008 13:56:51 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id B55B88FC1A for ; Thu, 28 Feb 2008 13:56:50 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8s) with ESMTP id 233689955-1834499 for multiple; Thu, 28 Feb 2008 08:54:48 -0500 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.14.2/8.14.2) with ESMTP id m1SDuYQa046128; Thu, 28 Feb 2008 08:56:40 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: Peter Jeremy Date: Thu, 28 Feb 2008 08:08:35 -0500 User-Agent: KMail/1.9.7 References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> <200802271138.33979.jhb@freebsd.org> <20080228083555.GX83599@server.vk2pj.dyndns.org> In-Reply-To: <20080228083555.GX83599@server.vk2pj.dyndns.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200802280808.35678.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Thu, 28 Feb 2008 08:56:40 -0500 (EST) X-Virus-Scanned: ClamAV 0.91.2/6023/Thu Feb 28 07:50:11 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2008 13:56:51 -0000

On Thursday 28 February 2008 03:35:55 am Peter Jeremy wrote:
> On Wed, Feb 27, 2008 at 11:38:33AM -0500, John Baldwin wrote:
> >> You could change _file from 'short' to 'unsigned short' without breaking
> >> the ABI - this would allow either 65535 or 65536 file descriptors (I'm
> >> not sure whether _file == -1 is special or not).  This would postpone
> >> the problem for some time.
> >
> >-1 is used a lot in the stdio code for files not backed by an fd.  My problem
> >though is that this doesn't help with existing binaries that are already
> >compiled (which is what I have to deal with).  Had fileno() not been inlined
> >I would have been ok, but that's pretty much done for me as far as my current
> >problem on 6.x.  Had I just been able to change FILE * and not had inlines,
> >then a new fopen would have worked fine in my case.
>
> My suggestion was based on short and ushort having the same size and
> (short)-1 and (ushort)65535 having the same bit pattern.  Any code
> accessing fileno() should not be checking or validating the result
> but just passing it to low-level I/O routines.  This would provide the
> following:
>
>                 Existing code    New code
>                 short            ushort
> FD              sign-extended    zero-extended
> -1              -1               65535
> 0..32767        0..32767         0..32767
> 32768..65534    -32768..-2 [*]   32768..65534
> >65534          EMFILE           EMFILE
>
> [*] This could potentially be fixed using libc or kernel shims.

Actually, a correction on my part: ushort would double the range ok in my case
as gethostbyname() doesn't use fileno(), and the new fread()/fclose() would
work ok in this case.
> >think we can get away with renaming '_file' to '_ofile' and adding a new 'int
> >_file' at the bottom of the struct and making sure '_ofile' is always in sync
> >(when possible, truncated when _file is too big).
>
> Truncation opens up the possibility that old executables could fopen()
> lots of files (without getting any indication of a problem) and then
> use fileno() to reference a truncated _ofile - causing it to access
> some totally unrelated file.  Admittedly, that is no worse than would
> happen today.

I had considered that, but I think it is an acceptable tradeoff considering
that 1) the apps would already be broken today in this instance, and 2) you
have to use fileno(3) to get a problem.  If you just use
fopen()/fread()/fclose() then it will work fine (even on old apps that aren't
recompiled).

> >Also, I think we can do the new _file in HEAD for 8.0 w/o any worries.  I
> >don't think waiting until 9.0 buys anything there.
>
> I was thinking in terms of changing _file to int without backward
> compatibility.  I'm not sure that that could be done for 8.0 (though,
> as you have pointed out, it can't be done at all).

It can if we accept the truncation issue.
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Thu Feb 28 19:16:33 2008 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8F82B1065670; Thu, 28 Feb 2008 19:16:33 +0000 (UTC) (envelope-from obrien@NUXI.org) Received: from dragon.nuxi.org (trang.nuxi.org [74.95.12.85]) by mx1.freebsd.org (Postfix) with ESMTP id 63D068FC1F; Thu, 28 Feb 2008 19:16:33 +0000 (UTC) (envelope-from obrien@NUXI.org) Received: from dragon.nuxi.org (obrien@localhost [127.0.0.1]) by dragon.nuxi.org (8.14.1/8.14.1) with ESMTP id m1SJGTW4018612; Thu, 28 Feb 2008 11:16:29 -0800 (PST) (envelope-from obrien@dragon.nuxi.org) Received: (from obrien@localhost) by dragon.nuxi.org (8.14.2/8.14.1/Submit) id m1SJGSFH018611; Thu, 28 Feb 2008 11:16:28 -0800 (PST) (envelope-from obrien) Date: Thu, 28 Feb 2008 11:16:28 -0800 From: "David O'Brien" To: Peter Jeremy Message-ID: <20080228191628.GA17957@dragon.NUXI.org> Mail-Followup-To: freebsd-alpha@freebsd.org, Peter Jeremy , John Baldwin , freebsd-arch@freebsd.org, Garrett Wollman References: <200802262251.m1QMp7bV021709@hergotha.csail.mit.edu> <200802262355.16519.jhb@freebsd.org> <20080227065925.GK83599@server.vk2pj.dyndns.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080227065925.GK83599@server.vk2pj.dyndns.org> X-Operating-System: FreeBSD 8.0-CURRENT User-Agent: Mutt/1.5.16 (2007-06-09) Cc: Garrett Wollman , freebsd-arch@freebsd.org Subject: Re: Cleaning up FILE in stdio.. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: freebsd-alpha@freebsd.org List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2008 19:16:33 -0000 On Wed, Feb 27, 2008 at 05:59:25PM +1100, Peter Jeremy wrote: > Once RELENG_8 is branched: > e) Don asbestos underwear and re-arrange struct __sFILE to grow _file etc. Why do we have to wait until FreeBSD 9.0? What's wrong with doing this in FreeBSD 8.0? -- -- David (obrien@FreeBSD.org) From owner-freebsd-arch@FreeBSD.ORG Thu Feb 28 22:16:13 2008 Return-Path: Delivered-To: freebsd-arch@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 66AD7106566B; Thu, 28 Feb 2008 22:16:13 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 31EA08FC14; Thu, 28 Feb 2008 22:16:13 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m1SMGDMd089715; Thu, 28 Feb 2008 22:16:13 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m1SMGDer089711; Thu, 28 Feb 2008 22:16:13 GMT (envelope-from linimon) Date: Thu, 28 Feb 2008 22:16:13 GMT Message-Id: <200802282216.m1SMGDer089711@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-arch@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/120749: [request] Suggest upping the default kern.ps_arg_cache_limit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Thu, 28 Feb 2008 22:16:13 -0000 Synopsis: [request] Suggest upping the default kern.ps_arg_cache_limit Responsible-Changed-From-To: freebsd-bugs->freebsd-arch Responsible-Changed-By: linimon Responsible-Changed-When: Thu Feb 28 22:15:30 UTC 2008 Responsible-Changed-Why: Anyone on the arch@ list want to weigh in on this one? http://www.freebsd.org/cgi/query-pr.cgi?pr=120749 From owner-freebsd-arch@FreeBSD.ORG Fri Feb 29 05:19:47 2008 Return-Path: Delivered-To: arch@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CE9BB106566C; Fri, 29 Feb 2008 05:19:47 +0000 (UTC) (envelope-from davidxu@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id B0DAF8FC21; Fri, 29 Feb 2008 05:19:47 +0000 (UTC) (envelope-from davidxu@FreeBSD.org) Received: from apple.my.domain (root@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m1T5JghM033159; Fri, 29 Feb 2008 05:19:43 GMT (envelope-from davidxu@freebsd.org) Message-ID: <47C7963B.2080509@freebsd.org> Date: Fri, 29 Feb 2008 13:20:59 +0800 From: David Xu User-Agent: Thunderbird 2.0.0.9 (X11/20071211) MIME-Version: 1.0 To: Jeff Roberson References: <20080220175532.Q920@desktop> <20080220213253.A920@desktop> <20080221092011.J52922@fledge.watson.org> <20080222121253.N920@desktop> <20080222231245.GA28788@lor.one-eyed-alien.net> <20080222134923.M920@desktop> <20080223194047.GB38485@lor.one-eyed-alien.net> <20080223111659.K920@desktop> <20080223213507.GD39699@lor.one-eyed-alien.net> <20080224001902.J920@desktop> <20080225231747.GT99258@elvis.mu.org> <20080225143222.B920@desktop> <20080225160433.P920@desktop> <20080225194320.V920@desktop> <20080225213434.L920@desktop> <20080226121251.V920@desktop> <20080226233645.D920@desktop> In-Reply-To: <20080226233645.D920@desktop> Content-Type: text/plain; charset=ISO-8859-1; format=flowed 
Content-Transfer-Encoding: 7bit Cc: Brooks Davis , Andrew Gallatin , Alfred Perlstein , Daniel Eischen , arch@FreeBSD.org, Robert Watson Subject: Re: cpuset and affinity implementation X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 29 Feb 2008 05:19:48 -0000 Jeff Roberson wrote: > > Are there any objections to commiting this functionality in its current > form? > > I think there is the possibility for further debate and refinement but I > believe the code is stable and simple enough to hit the tree for people > to start using it. > > Thanks, > Jeff I have no objection, it can be refined later if there is any issue. Thanks, David Xu From owner-freebsd-arch@FreeBSD.ORG Fri Feb 29 07:12:33 2008 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 72CFF1065674 for ; Fri, 29 Feb 2008 07:12:33 +0000 (UTC) (envelope-from ndenev@gmail.com) Received: from el-out-1112.google.com (el-out-1112.google.com [209.85.162.178]) by mx1.freebsd.org (Postfix) with ESMTP id 282F28FC2A for ; Fri, 29 Feb 2008 07:12:32 +0000 (UTC) (envelope-from ndenev@gmail.com) Received: by el-out-1112.google.com with SMTP id z25so3267776ele.8 for ; Thu, 28 Feb 2008 23:12:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=siDTEdr4TlNfSUUHXVvZwVDPAQm6N+QyzVZhDdiwum0=; b=OaKGkO6aaQb9ejtELiZ42L/hs458gFoIBj2SET2oF6LRKEHCahroIhYpli6W3fjzvP3BmeAvLJsYwF05QsEmQOw5vMMORMLdBHJFdJ6oemeyPF5BJxcbsSUMEKt7z2zADWV4wIUw+MNGLtkljZhxepeVplwWmh8AhK9K5SnWoYU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; 
s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=ANWr+y2kH0gq4Wny+FzICpFtdlkwprsLU7EPYMDRlnmTmGueWW51zXe3zGfU8EblKbVRoQ6uu9I8WbIvlmdKXSMjdcR9WdRcB4bGF+zD/dqaOGia/GiajtiACLrRLYOK8pI6qxgzZW0AFNQwfXh7j/hGV8Ewa8k11kgndv4Zwy8= Received: by 10.141.129.14 with SMTP id g14mr6147632rvn.274.1204267703670; Thu, 28 Feb 2008 22:48:23 -0800 (PST) Received: by 10.140.207.1 with HTTP; Thu, 28 Feb 2008 22:48:23 -0800 (PST) Message-ID: <2e77fc10802282248n58fdf51dt587954cf0ce61d72@mail.gmail.com> Date: Fri, 29 Feb 2008 08:48:23 +0200 From: "Niki Denev" To: linimon@freebsd.org In-Reply-To: <200802282216.m1SMGDer089711@freefall.freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <200802282216.m1SMGDer089711@freefall.freebsd.org> Cc: freebsd-bugs@freebsd.org, freebsd-arch@freebsd.org Subject: Re: kern/120749: [request] Suggest upping the default kern.ps_arg_cache_limit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 29 Feb 2008 07:12:33 -0000 On Fri, Feb 29, 2008 at 12:16 AM, wrote: > Synopsis: [request] Suggest upping the default kern.ps_arg_cache_limit > > Responsible-Changed-From-To: freebsd-bugs->freebsd-arch > Responsible-Changed-By: linimon > Responsible-Changed-When: Thu Feb 28 22:15:30 UTC 2008 > Responsible-Changed-Why: > Anyone on the arch@ list want to weigh in on this one? > > http://www.freebsd.org/cgi/query-pr.cgi?pr=120749 I'm not sure, but i think they removed this limit in Linux, and now it's probably dynamicaly resized. Maybe we want something similar? 
Of course this is not a change that will happen right away, so upping
the default seems reasonable to me.

Just my 2 cents.

--Niki

From owner-freebsd-arch@FreeBSD.ORG Sat Mar 1 09:33:43 2008
Message-ID: <47C922F1.2050307@icyb.net.ua>
Date: Sat, 01 Mar 2008 11:33:37 +0200
From: Andriy Gapon
To: freebsd-arch@freebsd.org
Subject: multiple filesystems sharing/clobbering device vnode

First, a little demonstration suggested by Bruce
Evans: [I hope you will continue reading after the reboot]

1. mount_cd9660 /dev/acd0 /mnt1
2. mount -r /dev/acd0 /mnt2	# -r is important
3. ls -l /mnt1

The issue can be laconically described as follows:

1. We do not disallow multiple RO mounts of the same device (which could
   be done either on purpose or by accident).
2. All popular (on-disk) filesystems use/clobber the bufobj of the device's
   vnode, even for RO mounts; some (ufs) do that even if the mount fails.
3. There is no provision for such shared access; every filesystem acts as
   if it were the exclusive owner of the vnode and its bufobj.

A small snippet of code that speaks for itself (the most interesting lines
are marked with XXX at the beginning):

int
g_vfs_open(struct vnode *vp, struct g_consumer **cpp, const char *fsname, int wr)
{
	struct g_geom *gp;
	struct g_provider *pp;
	struct g_consumer *cp;
	struct bufobj *bo;
	int vfslocked;
	int error;

	g_topology_assert();

	*cpp = NULL;
	pp = g_dev_getprovider(vp->v_rdev);
	if (pp == NULL)
		return (ENOENT);
	gp = g_new_geomf(&g_vfs_class, "%s.%s", fsname, pp->name);
	cp = g_new_consumer(gp);
	g_attach(cp, pp);
	error = g_access(cp, 1, wr, 1);
	if (error) {
		g_wither_geom(gp, ENXIO);
		return (error);
	}
	vfslocked = VFS_LOCK_GIANT(vp->v_mount);
	vnode_create_vobject(vp, pp->mediasize, curthread);
	VFS_UNLOCK_GIANT(vfslocked);
	*cpp = cp;
XXX	bo = &vp->v_bufobj;
XXX	bo->bo_ops = g_vfs_bufops;
XXX	bo->bo_private = cp;
XXX	bo->bo_bsize = pp->sectorsize;
	gp->softc = bo;

	return (error);
}

In addition to this, some filesystems (ufs) directly modify v_bufobj.

I've been pondering this issue for over a month now. I have some ideas, but
each is wanting in one aspect or another. I would like to hear the ideas and
opinions of the people on this list.

P.S. For those who didn't actually run the test, here's a hand-copied
excerpt from the stack trace:

g_io_request
g_vfs_strategy
ffs_geom_strategy
cd9660_strategy
VOP_STRATEGY_APV
bufstrategy
breadn
bread
cd9660_readdir

-- 
Andriy Gapon
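
The clobbering described in the message above can be illustrated with a
small user-space model. This is only a sketch under stated assumptions, not
kernel code: the structures are simplified stand-ins that borrow the kernel's
field names, and model_vfs_open(), model_io_target() and model_double_mount()
are hypothetical helpers invented for the illustration. It models just the
unconditional overwrite of bo_private marked XXX in g_vfs_open().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures; not the real definitions. */
struct consumer {
	const char *fsname;		/* filesystem that opened the device */
};

struct bufobj {
	const struct consumer *bo_private;	/* where buffer I/O is sent */
};

struct vnode {
	struct bufobj v_bufobj;		/* a single bufobj per device vnode */
};

/*
 * Hypothetical model of the tail of g_vfs_open(): the caller's consumer
 * is stored unconditionally, with no check for an earlier owner.
 */
void
model_vfs_open(struct vnode *devvp, const struct consumer *cp)
{
	assert(devvp != NULL && cp != NULL);
	devvp->v_bufobj.bo_private = cp;
}

/* Name of the filesystem whose consumer would receive I/O on devvp. */
const char *
model_io_target(const struct vnode *devvp)
{
	return (devvp->v_bufobj.bo_private->fsname);
}

/*
 * Replays the three-step demonstration against one shared device vnode:
 * cd9660 mounts first, a second read-only mount follows, and then the
 * cd9660 mount issues a read.
 */
const char *
model_double_mount(void)
{
	static const struct consumer cd9660 = { "cd9660" };
	static const struct consumer ffs = { "ffs" };
	struct vnode devvp;

	memset(&devvp, 0, sizeof(devvp));
	model_vfs_open(&devvp, &cd9660);	/* 1. mount_cd9660 /dev/acd0 /mnt1 */
	model_vfs_open(&devvp, &ffs);		/* 2. mount -r /dev/acd0 /mnt2 */
	return (model_io_target(&devvp));	/* 3. ls -l /mnt1 */
}
```

In this model, model_double_mount() returns "ffs": the read issued on behalf
of the cd9660 mount is routed to whatever the later opener installed, which
is consistent with ffs_geom_strategy appearing beneath cd9660_strategy in the
stack trace.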