From owner-freebsd-arch@FreeBSD.ORG Mon Jan 23 03:53:59 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3ACA416A41F; Mon, 23 Jan 2006 03:53:59 +0000 (GMT) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.FreeBSD.org (Postfix) with ESMTP id CEC3C43D45; Mon, 23 Jan 2006 03:53:58 +0000 (GMT) (envelope-from jroberson@chesapeake.net) Received: from [10.0.0.1] (67-40-203-22.tukw.qwest.net [67.40.203.22]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.1/8.13.1) with ESMTP id k0N3roTn003673; Sun, 22 Jan 2006 22:53:52 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Sun, 22 Jan 2006 19:52:01 -0800 (PST) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: Julian Elischer In-Reply-To: <43D18816.3010909@elischer.org> Message-ID: <20060122194739.O602@10.0.0.1> References: <43CD612E.2060002@elischer.org> <63333.1137536336@critter.freebsd.dk> <20060121003908.GD6017@cs.rice.edu> <43D18816.3010909@elischer.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.52 on 216.240.101.25 Cc: alc@freebsd.org, arch@freebsd.org, Poul-Henning Kamp , Alan Cox Subject: Re: Large virtual page size support. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Jan 2006 03:53:59 -0000

On Fri, 20 Jan 2006, Julian Elischer wrote:

> Alan Cox wrote:
>
>> On Tue, Jan 17, 2006 at 11:18:56PM +0100, Poul-Henning Kamp wrote:
>>
>>> In message <43CD612E.2060002@elischer.org>, Julian Elischer writes:
>>>
>>>> Jeff Roberson wrote:
>>>>
>>>>
>>>>> I have implemented support in the vm for PAGE_SIZE values which are a
>>>>> multiple of the hardware page size. This is primarily useful for two
>>>>> things:
>>>>>
>>>> Mach (and the VM system we inherited from it) had this. I believe it was
>>>> removed with the comment
>>>> "If we need this and someone is willing to support it it can be added
>>>> back".
>>>>
>>> It was a VAX artifact and not very usable. I believe we have a couple
>>> of comments and macros which still talk about "clicks".
>>>
>>
>> Like Jeff's patch, Mach's VM design allowed for two distinct page
>> sizes, one being the native, hardware page size and the other being a
>> larger, abstract page size. The essential difference between Jeff's
>> patch and what Mach did on the VAX is that Mach's use of the native,
>> hardware page size was entirely within the pmap and locore-level code.
>> For example, the hardware-supported page size on the VAX was 512
>> bytes. However, as far as the machine-independent layer of the Mach
>> kernel was concerned the page size was 4K bytes. This included the
>> machine-independent part of the virtual memory system; it too believed
>> that the page size was 4K bytes. As a consequence, the granularity
>> of mappings and protection was 4K bytes. Finally, there was nothing
>> VAX-specific about the design and implementation of this feature.
>> However, I don't recall any other pmap implementations having
>> different native and abstract page sizes.
>> Today, I speculate that you
>> could implement a distinct native and abstract page size on the sparc
>> because different versions of the processor have had different page sizes.
>> Consequently, the ABI documents that I've seen don't specify a
>> particular page size, only that 64K bytes is the largest that a page
>> will ever be; to learn the precise page size, they say that you must
>> call the OS at run time. So, you could use a larger abstract page
>> without breaking the ABI.
>>
>> In contrast, Jeff's patch has both the machine-dependent and
>> machine-independent layers knowing about both page sizes. Moreover,
>> the granularity of mappings and protection is still the native,
>> hardware page size. In other words, within the vm_map the page size
>> is the native, hardware page size, but over in the vm_object the page
>> size is the larger, abstract size. (Reread the last sentence again
>> before continuing.) As you can imagine, this is a lot trickier to get
>> right in the first place and maintain in the long run than what Mach
>> did. This is why Jeff is being so circumspect about committing this
>> work. On the other hand, it offers essentially the same benefits as what
>> Mach did without breaking the i386 ABI.
>>
>
> Was this the reason that it was done in a different way?
> What was the reason to not do it entirely in the pmap layer (e.g. Mach)?
> I know the Mach people were very proud of their implementation. It
> always appeared in their technical descriptions.
>
> The phrase "this is a lot trickier to [...] maintain in the long run"
> worries me.. There must be a reason to not go with the simpler approach..
> What was it?

It doesn't maintain backwards compatibility. I originally implemented
it in the Mach way, but you have to recompile the entire system with
the larger page size. This patch grew the MI parts to support existing
binaries.

It is complex.
I was hoping for someone to chime in and say "That's great, we
need that" or "No, that's not useful at all". Unfortunately, the
response is somewhere in the middle. I guess the best course is to
port it forward and test it on some x86 machines and see if it makes a
big difference.

Cheers,
Jeff

>
>> Alan
>>
>

From owner-freebsd-arch@FreeBSD.ORG Mon Jan 23 05:01:28 2006
Message-ID: <43D4630E.70201@samsco.org>
Date: Sun, 22 Jan 2006 22:01:02 -0700
From: Scott Long
To: Jeff Roberson
References: <43CD612E.2060002@elischer.org> <63333.1137536336@critter.freebsd.dk> <20060121003908.GD6017@cs.rice.edu> <43D18816.3010909@elischer.org> <20060122194739.O602@10.0.0.1>
In-Reply-To: <20060122194739.O602@10.0.0.1>
Cc: alc@freebsd.org, arch@freebsd.org, Poul-Henning Kamp, Alan Cox, Julian Elischer
Subject: Re: Large virtual page size support.

Jeff Roberson wrote:
> On Fri, 20 Jan 2006, Julian Elischer wrote:
>
>> Alan Cox wrote:
>>
>>> [...]
>>>
>>
>> Was this the reason that it was done in a different way?
>> What was the reason to not do it entirely in the pmap layer (e.g. Mach)?
>> I know the Mach people were very proud of their implementation. It
>> always appeared in their technical descriptions.
>>
>> The phrase "this is a lot trickier to [...] maintain in the long run"
>> worries me.. There must be a reason to not go with the simpler
>> approach..
>> What was it?
>
>
> It doesn't maintain backwards compatibility. I originally implemented
> it in the Mach way, but you have to recompile the entire system with the
> larger page size.
> This patch grew the MI parts to support existing
> binaries.
>
> It is complex. I was hoping for someone to chime in and say "That's
> great, we need that" or "No, that's not useful at all". Unfortunately,
> the response is somewhere in the middle. I guess the best course is to
> port it forward and test it on some x86 machines and see if it makes a
> big difference.
>
> Cheers,
> Jeff
>

Yes, we need that. Please commit =-)

Scott

From owner-freebsd-arch@FreeBSD.ORG Tue Jan 24 21:10:49 2006
To: arch@freebsd.org, current@freebsd.org
From: Poul-Henning Kamp
Date: Tue, 24 Jan 2006 22:10:45 +0100
Message-ID: <23570.1138137045@critter.freebsd.dk>
Subject: [TEST/REVIEW] CPU accounting patches

Here is a new version of my cpu accounting change patch.

http://phk.freebsd.dk/patch/cpu_acct_1.patch

This patch is supposedly harmless (or at least mostly harmless)
and I'd appreciate it getting a solid trashing.

This patch changes cpu accounting from accumulating charges in
real-time units to accumulating them in units of some per-arch,
possibly per-cpu counter.
When the accumulated charge is read by times(2) or getrusage(2) or
similar, the frequency of the counter is interrogated and the charge
normalized to microseconds.

With this patch, the counter is always the timecounter and the only
real difference is therefore a minor performance change (because
we save the normalizing multiplications for each context switch).

On my AMD Athlon 700 and my Sun Ultra 60 the performance difference
is barely 1% and of doubtful statistical quality. On my Opteron
machine I get a 2.7+/-.6% boost on unixbench's context1 test.

Of course, the scheme used in this patch suffers a bit if the
hardware counter changes to other hardware of a different rate or
simply changes rate. This has been discussed at length in a previous
thread already, and I'll simply refer to it, rather than rehash here:

http://lists.freebsd.org/pipermail/freebsd-net/2005-October/008637.html

The other half of this work is in this separate patch, and this is
not yet complete. You are welcome to test it however, as long as
you are aware of the problems it may hold:

http://phk.freebsd.dk/patch/cpu_acct_2.patch

It makes i386 and amd64 use the TSC and sparc64 use the "tick"
counter for CPU accounting.

On a sparc64 it gives a 3.2+/-.3% speedup on unixbench/context1.
On an Athlon 700 with the i8254 timecounter it gives a 95+/-.8% speedup.
On an Opteron with the ACPI-fast timecounter it gives a 36+/-.6% speedup.

The downside is that unless your cpu clock is correctly probed at
boot and stays constant, your cpu accounting numbers will have a
bogus scaling factor.

I believe all the sparc64s we support have constant CPU rates, so
they should be safe.

For i386 and amd64 things are more tricky. Laptops doing power
saving tricks will probably give bogus cpu accounting values, but
as such the patch should do no other harm than screw up those values.
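
The accounting scheme described above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not
code from the patch: the struct and the function names (cpu_acct,
acct_switch_in, acct_switch_out, acct_usec) are hypothetical. The
context-switch path only adds raw counter ticks, and the conversion to
microseconds happens when the value is read, using whatever counter
frequency is reported at that moment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread accounting record; not the kernel's struct. */
struct cpu_acct {
	uint64_t ticks;		/* accumulated charge in raw counter ticks */
	uint64_t switched_in;	/* counter value when last scheduled */
};

/* Switch in: remember the raw counter value, nothing else. */
void
acct_switch_in(struct cpu_acct *a, uint64_t now)
{
	a->switched_in = now;
}

/* Switch out: one subtraction and one addition, no scaling. */
void
acct_switch_out(struct cpu_acct *a, uint64_t now)
{
	a->ticks += now - a->switched_in;
}

/*
 * Read side: normalize ticks to microseconds with the counter
 * frequency sampled at read time.  Whole seconds and the remainder
 * are converted separately to avoid overflowing 64 bits.
 */
uint64_t
acct_usec(const struct cpu_acct *a, uint64_t freq_hz)
{
	return ((a->ticks / freq_hz) * 1000000 +
	    (a->ticks % freq_hz) * 1000000 / freq_hz);
}
```

Reading the same accumulated tick count with a different frequency
changes the result proportionally, which is the "bogus scaling factor"
mentioned above for counters whose rate is misprobed or not constant.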

If you benchmark this patch, please understand that it is vitally
important that you benchmark relative to the real-time scale (i.e.
wall-clock time); the "user" and "system" fields from time(1) are
not usable.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-arch@FreeBSD.ORG Tue Jan 24 22:08:15 2006
From: John Baldwin
To: freebsd-arch@freebsd.org
Date: Tue, 24 Jan 2006 17:09:01 -0500
References: <23570.1138137045@critter.freebsd.dk>
In-Reply-To: <23570.1138137045@critter.freebsd.dk>
Message-Id: <200601241709.03450.jhb@freebsd.org>
Cc: Poul-Henning Kamp, current@freebsd.org
Subject: Re: [TEST/REVIEW] CPU accounting patches

On Tuesday 24 January 2006 16:10, Poul-Henning Kamp wrote:
> Here is a new version of my cpu accounting change patch.
>
> http://phk.freebsd.dk/patch/cpu_acct_1.patch
>
> This patch is supposedly harmless (or at least mostly harmless)
> and I'd appreciate it getting a solid trashing.

The XXX in calcru1() you can remove. The rux you are adding the current
time to is a local rusage_ext on the stack. However, your changes
probably make that calculation bogus: the current code assumes it can
subtract the start time of another CPU (for a thread running on another
CPU) from the current time on this CPU to get the amount of time the
other thread has been running on the other CPU since it last updated
p->p_rux.rux_runtime. With the CPUs having disparate timings this would
break, as curthread's CPU's notion of now will be unrelated to the other
thread's CPU's notion of now.

Other than that this patch looks fine to me. FYI, Alpha also has a
per-cpu counter (RPCC) that is used for the timecounter on UP Alphas.

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

From owner-freebsd-arch@FreeBSD.ORG Tue Jan 24 22:16:29 2006
To: John Baldwin
From: "Poul-Henning Kamp"
In-Reply-To: Your message of "Tue, 24 Jan 2006 17:09:01 EST." <200601241709.03450.jhb@freebsd.org>
Date: Tue, 24 Jan 2006 23:16:23 +0100
Message-ID: <24201.1138140983@critter.freebsd.dk>
Cc: current@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: [TEST/REVIEW] CPU accounting patches

In message <200601241709.03450.jhb@freebsd.org>, John Baldwin writes:
>On Tuesday 24 January 2006 16:10, Poul-Henning Kamp wrote:
>> Here is a new version of my cpu accounting change patch.
>>

>The XXX in calcru1() you can remove. The rux you are adding the current time
>to is a local rusage_ext on the stack.

I found that out myself and forgot to remove the XXX :-)

>However, your changes probably make
>it bogus in that the current code assumes it can subtract the start time of
>another CPU (for a thread running on another CPU) from the current time on
>this CPU to get the amount of time the other thread has been running on the
>other CPU since it last updated p->p_rux.rux_runtime.

This is when we call calcru on a running process?

Yeah, that's a problem.

I'd tend to say we should just forget about accounting for the
current quantum in that case.

This is a valid handling IMO because the result can never be used
in any final or definitive kind of way anyway. When the process
finishes or deschedules the numbers will get updated correctly.

Doing this may also simplify the locking of calcru?

>Other than that this patch looks fine to me. FYI, Alpha also has a per-cpu
>counter (RPCC) that is used for the timecounter on UP Alphas.

My Alpha is hosed right now and doesn't want to boot 6.0-R. I haven't had
time to boot my 5.0-R on it and do the upgrade to -current the long way.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
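
The cross-CPU problem raised in this exchange, and the suggestion to
leave out the current quantum, can be illustrated with a small sketch.
It is not taken from either patch: the struct thread_acct and the
function acct_read_ticks are hypothetical, and adding the in-progress
quantum only when the reader is on the same CPU is one possible reading
of the proposal, chosen here for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of a thread's accounting state. */
struct thread_acct {
	uint64_t runtime_ticks;	/* charge recorded at past context switches */
	uint64_t switched_in;	/* raw counter of the CPU it was scheduled on */
	int	 cpu;		/* CPU the thread was last scheduled on */
	bool	 running;
};

/*
 * Charge visible to a reader on CPU reader_cpu whose counter reads now.
 * The in-progress quantum is added only when the thread runs on the
 * reader's own CPU, where now and switched_in share a time base.
 * For a thread on another CPU the quantum is skipped, because the two
 * counters are unrelated and (now - switched_in) would be meaningless.
 */
uint64_t
acct_read_ticks(const struct thread_acct *t, int reader_cpu, uint64_t now)
{
	uint64_t total = t->runtime_ticks;

	if (t->running && t->cpu == reader_cpu)
		total += now - t->switched_in;
	return (total);
}
```

With unsigned arithmetic, a naive subtraction across CPUs whose
counters are far apart would wrap around to a huge value instead of
failing visibly, which is why the sketch refuses to compute it at all.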

From owner-freebsd-arch@FreeBSD.ORG Tue Jan 24 22:38:51 2006
From: John Baldwin
To: arch@freebsd.org
Date: Tue, 24 Jan 2006 17:39:39 -0500
Message-Id: <200601241739.41027.jhb@freebsd.org>
Cc: current@freebsd.org
Subject: [PATCH] Initial (working!) version of rwlocks

If you've been following p4 submits to //depot/user/jhb/lock/...
recently you've noticed that I've been hacking on an implementation of
reader/writer locks. I've just finished doing an initial set of tests
on a 4-cpu box and am confident that at least the really big races
should all be handled.

First, rwlocks are basically read/write mutexes. They cannot be held
while sleeping, similar to mutexes (and thus different from sx and
lockmgr), but this means that they can be used in ithreads, etc. Also,
they can do some form of priority propagation. To achieve this latter
part, I've patched the turnstile code to grow the notion of having
multiple queues of waiters on a given turnstile: a queue of shared
(read) waiters, and a queue of exclusive (write) waiters. The modified
turnstile code (with suitable updates to the mutex code) ran without
incident for several days on alpha, amd64, i386, and sparc64 machines
running buildworld -j XX in an infinite loop.

Now for the limitations, etc. of this reader/writer lock
implementation. This implementation is not meant to be the end-all
be-all holy grail of reader/writer lock implementations. The goals of
this version are to have a _stable_, _working_ implementation that can
be used in the tree as well as to provide a base implementation that
people can hack on to try out other algorithms and ideas. This means
that folks like the networking stack guys can go ahead and have working
rwlocks now (though perhaps not 100% optimal or perfect, but allowing
more parallelism than mutexes) and that independently other folks can
play with other ideas for rwlocks. In other words, I don't want to
bikeshed forever about missing features or theoretical changes to the
wakeup algorithm, etc. I'd rather commit this now as a starting point. :)

That said, here are the known limitations, etc.:

- Currently no attempt is made to propagate priority to threads that
  hold read locks. Not even a Solaris-style "owner of record" type deal.
  For one, it would require some more hacking on the turnstile code.
  Secondly, it would add some sort of expense to read-lock operations
  and I'm unsure if the extra atomic ops (and thus penalty on the more
  common read lock operations) would be worth it.
- Currently we allow read locks to recurse. For one thing, w/o some way
  of tracking which threads hold read locks (which would be expensive)
  you can't verify that code doesn't violate a no-recursion rule unless
  you use WITNESS. Most of our other simple lock assertions can be
  verified w/o needing WITNESS, which is why I allowed this.
- Because of the previous point, readers don't block if the lock is
  read-locked but there are writers waiting. Otherwise you end up in a
  trivial deadlock as expounded upon further in the comments.
- Because of the previous point, the read unlock algorithm is quite
  simple: it wakes up all write waiters because that's all the waiters
  there are to wake up.
- The algorithm for write unlock is simplistic and matches sx for now in
  that it prefers readers to writers.
- There are no explicit lock handoffs, but our mutexes don't do that
  either.
- Read lock operations are not currently inlined. I think this could be
  done now, though. At first I was worried that read lock operations
  would be too complex to inline and so I've only inlined write lock
  operations. Now that the implementation is in a working state though,
  it seems that there are some simple easy cases that could possibly be
  inlined.
- Probably more I haven't thought of, etc.

The changes are available as a patch and the two new files
(sys/rwlock.h and kern/kern_rwlock.c) here:

http://www.FreeBSD.org/~jhb/patches/rwlock/

One suggestion I have had and haven't acted on yet is to change the
turnstile code to track an internal state (share locked, exclusive
locked, unlocked) to make some of the assertions possibly saner. Also,
there is some debugging stuff in kern_rwlock.c to map KTR_LOCK to
KTR_SUBSYS so I could get ktr traces of just rwlocks w/o the traces
muddied by mutex and sx lock traces.
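
The admission policy in the list of limitations above can be modelled
with a small single-threaded sketch. This is a toy model only, not the
code in kern_rwlock.c: the names rw_model, rw_model_try_rlock,
rw_model_wlock_or_wait and rw_model_runlock are hypothetical, there are
no atomics or turnstiles, and waking the writers only when the last
read hold is dropped is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock state; not the layout used by the patch. */
struct rw_model {
	int	readers;	/* read holds, recursive holds included */
	bool	write_locked;
	int	write_waiters;
};

/*
 * Read lock attempt.  Readers are admitted whenever no writer holds
 * the lock, even if writers are waiting; this is what keeps a
 * recursing reader from deadlocking against a queued writer.
 */
bool
rw_model_try_rlock(struct rw_model *rw)
{
	if (rw->write_locked)
		return (false);
	rw->readers++;
	return (true);
}

/* Write lock attempt: acquire if idle, otherwise record a waiter. */
bool
rw_model_wlock_or_wait(struct rw_model *rw)
{
	if (rw->write_locked || rw->readers > 0) {
		rw->write_waiters++;
		return (false);
	}
	rw->write_locked = true;
	return (true);
}

/*
 * Read unlock.  Returns the number of write waiters woken: all of
 * them once the last read hold is dropped, since only writers can be
 * queued behind readers under this policy.
 */
int
rw_model_runlock(struct rw_model *rw)
{
	int woken = 0;

	if (--rw->readers == 0) {
		woken = rw->write_waiters;
		rw->write_waiters = 0;
	}
	return (woken);
}
```

In this model a reader that recurses while a writer is queued is still
admitted, and the woken writers simply retry the write lock attempt.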

I would appreciate people looking at the turnstile changes and the
implementation of the rwlocks to see if there are races I've missed,
etc.

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

From owner-freebsd-arch@FreeBSD.ORG Tue Jan 24 22:45:16 2006
From: John Baldwin
To: "Poul-Henning Kamp"
Date: Tue, 24 Jan 2006 17:45:36 -0500
References: <24201.1138140983@critter.freebsd.dk>
In-Reply-To: <24201.1138140983@critter.freebsd.dk>
Message-Id: <200601241745.38514.jhb@freebsd.org>
Cc: current@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: [TEST/REVIEW] CPU accounting patches

On Tuesday 24 January 2006 17:16, Poul-Henning Kamp wrote:
> In message <200601241709.03450.jhb@freebsd.org>, John Baldwin writes:
> >On Tuesday 24 January 2006 16:10, Poul-Henning Kamp wrote:
> >> Here is a new version of my cpu accounting change patch.
> >
> >The XXX in calcru1() you can remove. The rux you are adding the current
> > time to is a local rusage_ext on the stack.
>
> I found that out myself and forgot to remove the XXX :-)
>
> >However, your changes probably make
> >it bogus in that the current code assumes it can subtract the start time
> > of another CPU (for a thread running on another CPU) from the current
> > time on this CPU to get the amount of time the other thread has been
> > running on the other CPU since it last updated p->p_rux.rux_runtime.
>
> This is when we call calcru on a running process?
>
> Yeah, that's a problem.

Yeah, SIGINFO is an example, and getrusage() of another process might
run into this, too.

> I'd tend to say we should just forget about accounting for the
> current quantum in that case.

It was added to fix some of the "time going backwards" and "negative
uptime" stuff I think. Ask Bruce.

> This is a valid handling IMO because the result can never be used
> in any final or definitive kind of way anyway. When the process
> finishes or deschedules the numbers will get updated correctly.

Unfortunately calcru() updates the microsecond counts (uu, su, iu) for
running processes and I think having a bogus runtime gets that very
confused.

> Doing this may also simplify the locking of calcru?

Nah, it already runs lockless in some cases I think.

> >Other than that this patch looks fine to me.
> >FYI, Alpha also has a per-cpu counter (RPCC) that is used for the
> >timecounter on UP Alphas.
>
> My Alpha is hosed right now and doesn't want to boot 6.0-R. I haven't had
> time to boot my 5.0-R on it and do the upgrade to -current the long way.

Heh, I have one running head atm. However, its second CPU is having
issues (I think the CPU fan has died) so it's less useful than in the
past. :(

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 00:17:18 2006
Message-ID: <43D6C3A5.4060100@freebsd.org>
Date: Wed, 25 Jan 2006 08:17:41 +0800
From: David Xu
To: arch@freebsd.org
Subject: vfs_aio.c is still not safe

Even with the recent change to vfs_aio.c, the kernel AIO code is still
not safe to use. The problem is that an AIO daemon thread may block on
a socket, pipe, or fifo if the peer does not transfer any data; the
problem accumulates, and all daemon threads will be blocked if the
number of such user processes increases.
I don't know who hacked the socket code to support some level of callback. It seems to work, but in fact it may only work for the first AIO request; if the user queues multiple requests, the same problem will happen. I tried to work around this problem by using non-blocking I/O, but the current file ops have no such support. I cannot change O_NONBLOCK on the fly because userland and the kernel would race; PR: kernel/41331 explains the problem well, and userland will hit the race. So possible solutions could be: 1) disable AIO support for non-disk files. 2) someone implements callbacks for pipe and fifo, and adds a non-blocking feature to fo_read/fo_write. The former is simple, the latter needs some effort. However, with superio kqueue, AIO support for sockets and pipes is less important, so I prefer 1) to make the AIO code usable. David Xu From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 01:26:55 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0AF1D16A41F for ; Wed, 25 Jan 2006 01:26:55 +0000 (GMT) (envelope-from max@love2party.net) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.187]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6ABED43D49 for ; Wed, 25 Jan 2006 01:26:54 +0000 (GMT) (envelope-from max@love2party.net) Received: from [84.163.226.78] (helo=amd64.laiers.local) by mrelayeu.kundenserver.de (node=mrelayeu3) with ESMTP (Nemesis), id 0MKxQS-1F1ZQx1BRE-00009y; Wed, 25 Jan 2006 02:26:52 +0100 From: Max Laier Organization: FreeBSD To: freebsd-arch@freebsd.org Date: Wed, 25 Jan 2006 02:27:29 +0100 User-Agent: KMail/1.9.1 References: <43A8EE23.3010202@errno.com> In-Reply-To: <43A8EE23.3010202@errno.com> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart3564629.9jZ3dugsRC"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id:
<200601250227.35868.max@love2party.net> X-Provags-ID: kundenserver.de abuse@kundenserver.de login:61c499deaeeba3ba5be80f48ecc83056 Subject: Re: firmware loading X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 01:26:55 -0000 --nextPart3564629.9jZ3dugsRC Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline On Wednesday 21 December 2005 06:54, Sam Leffler wrote: > Florent Thoumie and I have been working on some generic support for > loading firmware using kld's. You can find a proof of concept at: > > http://www.freebsd.org/~sam/firmware.tgz > > It has code to manage firmware images and load them on demand by > requesting a kld through standard facilities. Firmware is packaged > using a genfw program that's included. You can package one or more > firmware images in a single kld. I've packaged the iwi firmware as a > boot image in a single kld + kld's for each operating mode that have two > firmware images. The tarball also includes modified versions of the iwi > and ipw drivers to use the code. I tested iwi, Florent is working on ipw. > > If you're interested in this stuff feel free to pick it up; I've run out > of time to work on it. There are some potential issues with holding > locks over the linker calls and the genfw program could use some TLC and > probably a new name (plus the man page needs to be completed). > > It appears ispfw can be reworked to use this code. iwi and ipw > definitely can use it. Not sure who else can/should use it. An updated version of this work is here: http://people.freebsd.org/~mlaier/firmware-20060125.tgz It includes the following changes: - Firmware module generation with awk and ld (no special tool required).
The kmod.mk Makefile has been changed to support this and it's now very easy to build a firmware module. - Versioning - firmware_put() safe from any context - digi(4) converted. ATTENTION: this was done blindly - if you have digi(4) hardware I'd appreciate reports! - Plenty of iwi(4) changes which make it work much better for me - though there are some rough edges still. - firmware(9) manpage To try it, just copy the contents of the tarball over src and apply the two patches in sys/conf to the respective files. The firmware support can be loaded as a module itself, so testing is really easy. My plan is to import the basic firmware support on the weekend (if no objections are raised). Drivers would be converted later after some more testing. The aim of this is to avoid more and more handrolled solutions. I didn't yet have time to look at ispfw, but will do that as well. So, any objections? Comments? Feedback? Thanks! -- /"\ Best regards, | mlaier@freebsd.org \ / Max Laier | ICQ #67774661 X http://pf4freebsd.love2party.net/ | mlaier@EFnet / \ ASCII Ribbon Campaign | Against HTML Mail and News --nextPart3564629.9jZ3dugsRC Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2 (FreeBSD) iD8DBQBD1tQHXyyEoT62BG0RAlrcAJwOpsC++iOVgDh85UR0BB99gyDRyACeLBkq s2eHnWGtvhAmF8MqsYQ71vM= =uQgn -----END PGP SIGNATURE----- --nextPart3564629.9jZ3dugsRC-- From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 05:19:31 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0C42416A41F; Wed, 25 Jan 2006 05:19:31 +0000 (GMT) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9CCA843D45; Wed, 25 Jan 2006 05:19:30 +0000 (GMT) (envelope-from
jroberson@chesapeake.net) Received: from [10.0.0.1] (67-40-203-22.tukw.qwest.net [67.40.203.22]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.1/8.13.1) with ESMTP id k0P5JFC7014832; Wed, 25 Jan 2006 00:19:19 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Tue, 24 Jan 2006 21:17:21 -0800 (PST) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: Scott Long In-Reply-To: <43D4630E.70201@samsco.org> Message-ID: <20060124211522.D602@10.0.0.1> References: <43CD612E.2060002@elischer.org> <63333.1137536336@critter.freebsd.dk> <20060121003908.GD6017@cs.rice.edu> <43D18816.3010909@elischer.org> <20060122194739.O602@10.0.0.1> <43D4630E.70201@samsco.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.52 on 216.240.101.25 Cc: alc@freebsd.org, arch@freebsd.org, Poul-Henning Kamp , Alan Cox , Julian Elischer Subject: Re: Large virtual page size support. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 05:19:31 -0000 On Sun, 22 Jan 2006, Scott Long wrote: > Jeff Roberson wrote: >> On Fri, 20 Jan 2006, Julian Elischer wrote: >> >>> Alan Cox wrote: >>> >>>> On Tue, Jan 17, 2006 at 11:18:56PM +0100, Poul-Henning Kamp wrote: >>>> >>>>> In message <43CD612E.2060002@elischer.org>, Julian Elischer writes: >>>>> >>>>>> Jeff Roberson wrote: >>>>>> >>>>>> >>>>>>> I have implemented support in the vm for PAGE_SIZE values which are a >>>>>>> multiple of the hardware page size. This is primarily useful for two >>>>>>> things: >>>>>>> >>>>>> Mach (and the VM system we inherrited from it) had this. I beieve it >>>>>> was removed with teh comment >>>>>> "If we need this and someone is willing to support it it can be added >>>>>> back" . >>>>>> >>>>> It was a VAX artifact and not very usable. 
I belive we have a couple >>>>> of comments and macros which still talk about "clicks". >>>>> >>>> >>>> Like Jeff's patch, Mach's VM design allowed for two distinct page >>>> sizes, one being the native, hardware page size and the other being a >>>> larger, abstract page size. The essential difference between Jeff's >>>> patch and what Mach did on the VAX is that Mach's use of the native, >>>> hardware page size was entirely within the pmap and locore-level code. >>>> For example, the hardware-supported page size on the VAX was 512 >>>> bytes. However, as far as the machine-independent layer of the Mach >>>> kernel was concerned the page size was 4K bytes. This included the >>>> machine-independent part of the virtual memory system; it too believed >>>> that the page size was 4K bytes. As a consequences, the granularity >>>> of mappings and protection was 4K bytes. Finally, there was nothing >>>> VAX-specific about the design and implementation of this feature. >>>> However, I don't recall any other pmap implementations having >>>> different native and abstract page sizes. Today, I speculate that you >>>> could implement a distinct native and abstract page size on the sparc >>>> because different versions of processor have had different page sizes. >>>> Consequently, the ABI documents that I've seen don't specify a >>>> particular page size only that 64K bytes is the largest that a page >>>> will ever be; to learn the precise page size, they say that you must >>>> call the OS at run time. So, you could use a larger abstract page >>>> without breaking the ABI. >>>> >>>> In constrast, Jeff's patch has both the machine-dependent and >>>> machine-independent layers knowing about both page sizes. Moreover, >>>> the granularity of mappings and protection is still the native, >>>> hardware page size. In other words, within the vm_map the page size >>>> is the native, hardware page size, but over in the vm_object the page >>>> size is the larger, abstract size. 
(Reread the last sentence again >>>> before continuing.) As you can imagine, this is a lot trickier to get >>>> right in the first place and maintain in the long run than what Mach >>>> did. This is why Jeff is being so circumspect about committing this >>>> work. Other the hand, it offers essentially the same benefits as what >>>> Mach did without breaking the i386 ABI. >>>> >>> >>> was this the reason that it was done in a different way? >>> What was the reason to not do it entirely in the pmap layer (e.g. Mach). >>> I know hte Maxh people were very proud of their implementation. It >>> always appeared in their technical descriptions. >>> >>> The phrase "this is a lot trickier to [...] maintain in the long run" >>> worries me.. There must be a reason to not go with the simpler >>> approach.. >>> What was it? >> >> >> It doesn't maintain backwards compatibility. I originally implemented it >> in the mach way, but you have to recompile the entire system with the >> larger page size. This patch grew the MI parts to support existing >> binaries. >> >> It is complex. I was hoping for someone to chime in and say "That's great, >> we need that" or "No, that's not useful at all". Unfortunately, the >> response is somewhere in the middle. I guess the best course is to port it >> forward and test it on some x86 machines and see if it makes a big >> difference. >> >> Cheers, >> Jeff >> > > Yes, we need that. Please commit =-) Thanks for the encouragement. There are a few unresolved issues. Most importantly, how are we presently dealing with config options that break modules? If the option is to stay as it is, modules will have to be aware of the page size that is agreed upon by the rest of the kernel. The other issues are mostly about finding ways to reduce the impact of the patch on the rest of the system, and how to tidy it up a bit more, if it can be.
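For readers following the thread, the split between the two page sizes described in the quoted text can be illustrated with a minimal C sketch. It is not taken from the patch: the names HW_PAGE_SIZE, ABS_PAGE_SIZE, abs_page_index(), hw_subpage() and round_hw() are hypothetical, and the 4K/16K sizes are assumed purely for illustration. It shows one byte offset being located at the larger, abstract granularity (as the vm_object would see it) and at the native granularity (as mappings and protection would see it).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for illustration only; not taken from the patch. */
#define HW_PAGE_SIZE   4096u                /* native hardware page */
#define ABS_PAGE_SIZE  (4u * HW_PAGE_SIZE)  /* hypothetical larger PAGE_SIZE */

/* Index of the abstract (object-level) page that holds a byte offset. */
static inline uint64_t
abs_page_index(uint64_t off)
{
	/* The abstract size must be a whole multiple of the hardware size. */
	assert(ABS_PAGE_SIZE % HW_PAGE_SIZE == 0);
	return (off / ABS_PAGE_SIZE);
}

/* Which hardware page inside that abstract page holds the offset. */
static inline unsigned
hw_subpage(uint64_t off)
{
	return ((unsigned)((off % ABS_PAGE_SIZE) / HW_PAGE_SIZE));
}

/* Round a length up to hardware granularity (mappings, protection). */
static inline uint64_t
round_hw(uint64_t len)
{
	return ((len + HW_PAGE_SIZE - 1) & ~(uint64_t)(HW_PAGE_SIZE - 1));
}
```

With these assumed sizes, offset 0x5000 falls in abstract page 1, hardware sub-page 1, while a mapping length of 4097 bytes still rounds up to only two hardware pages rather than a whole abstract page.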
> > Scott > From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 05:54:15 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9F53F16A41F; Wed, 25 Jan 2006 05:54:15 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1F5C043D46; Wed, 25 Jan 2006 05:54:14 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from [192.168.254.11] (junior.samsco.home [192.168.254.11]) (authenticated bits=0) by pooker.samsco.org (8.13.4/8.13.4) with ESMTP id k0P5rseK024655; Tue, 24 Jan 2006 22:53:54 -0700 (MST) (envelope-from scottl@samsco.org) Message-ID: <43D71277.3020407@samsco.org> Date: Tue, 24 Jan 2006 22:53:59 -0700 From: Scott Long User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.12) Gecko/20051230 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Roberson References: <43CD612E.2060002@elischer.org> <63333.1137536336@critter.freebsd.dk> <20060121003908.GD6017@cs.rice.edu> <43D18816.3010909@elischer.org> <20060122194739.O602@10.0.0.1> <43D4630E.70201@samsco.org> <20060124211522.D602@10.0.0.1> In-Reply-To: <20060124211522.D602@10.0.0.1> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-1.4 required=3.8 tests=ALL_TRUSTED autolearn=failed version=3.1.0 X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on pooker.samsco.org Cc: alc@freebsd.org, arch@freebsd.org, Poul-Henning Kamp , Alan Cox , Julian Elischer Subject: Re: Large virtual page size support. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 05:54:15 -0000 Jeff Roberson wrote: > On Sun, 22 Jan 2006, Scott Long wrote: > >> Jeff Roberson wrote: >> >>> On Fri, 20 Jan 2006, Julian Elischer wrote: >>> >>>> Alan Cox wrote: >>>> >>>>> On Tue, Jan 17, 2006 at 11:18:56PM +0100, Poul-Henning Kamp wrote: >>>>> >>>>>> In message <43CD612E.2060002@elischer.org>, Julian Elischer writes: >>>>>> >>>>>>> Jeff Roberson wrote: >>>>>>> >>>>>>> >>>>>>>> I have implemented support in the vm for PAGE_SIZE values which >>>>>>>> are a multiple of the hardware page size. This is primarily >>>>>>>> useful for two things: >>>>>>>> >>>>>>> Mach (and the VM system we inherrited from it) had this. I beieve >>>>>>> it was removed with teh comment >>>>>>> "If we need this and someone is willing to support it it can be >>>>>>> added back" . >>>>>>> >>>>>> It was a VAX artifact and not very usable. I belive we have a couple >>>>>> of comments and macros which still talk about "clicks". >>>>>> >>>>> >>>>> Like Jeff's patch, Mach's VM design allowed for two distinct page >>>>> sizes, one being the native, hardware page size and the other being a >>>>> larger, abstract page size. The essential difference between Jeff's >>>>> patch and what Mach did on the VAX is that Mach's use of the native, >>>>> hardware page size was entirely within the pmap and locore-level code. >>>>> For example, the hardware-supported page size on the VAX was 512 >>>>> bytes. However, as far as the machine-independent layer of the Mach >>>>> kernel was concerned the page size was 4K bytes. This included the >>>>> machine-independent part of the virtual memory system; it too believed >>>>> that the page size was 4K bytes. As a consequences, the granularity >>>>> of mappings and protection was 4K bytes. 
Finally, there was nothing >>>>> VAX-specific about the design and implementation of this feature. >>>>> However, I don't recall any other pmap implementations having >>>>> different native and abstract page sizes. Today, I speculate that you >>>>> could implement a distinct native and abstract page size on the sparc >>>>> because different versions of processor have had different page sizes. >>>>> Consequently, the ABI documents that I've seen don't specify a >>>>> particular page size only that 64K bytes is the largest that a page >>>>> will ever be; to learn the precise page size, they say that you must >>>>> call the OS at run time. So, you could use a larger abstract page >>>>> without breaking the ABI. >>>>> >>>>> In constrast, Jeff's patch has both the machine-dependent and >>>>> machine-independent layers knowing about both page sizes. Moreover, >>>>> the granularity of mappings and protection is still the native, >>>>> hardware page size. In other words, within the vm_map the page size >>>>> is the native, hardware page size, but over in the vm_object the page >>>>> size is the larger, abstract size. (Reread the last sentence again >>>>> before continuing.) As you can imagine, this is a lot trickier to get >>>>> right in the first place and maintain in the long run than what Mach >>>>> did. This is why Jeff is being so circumspect about committing this >>>>> work. Other the hand, it offers essentially the same benefits as what >>>>> Mach did without breaking the i386 ABI. >>>>> >>>> >>>> was this the reason that it was done in a different way? >>>> What was the reason to not do it entirely in the pmap layer (e.g. >>>> Mach). >>>> I know hte Maxh people were very proud of their implementation. It >>>> always appeared in their technical descriptions. >>>> >>>> The phrase "this is a lot trickier to [...] maintain in the long run" >>>> worries me.. There must be a reason to not go with the simpler >>>> approach.. >>>> What was it? 
>>> >>> >>> >>> It doesn't maintain backwards compatibility. I originally >>> implemented it in the mach way, but you have to recompile the entire >>> system with the larger page size. This patch grew the MI parts to >>> support existing binaries. >>> >>> It is complex. I was hoping for someone to chime in and say "That's >>> great, we need that" or "No, that's not useful at all". >>> Unfortunately, the response is somewhere in the middle. I guess the >>> best course is to port it forward and test it on some x86 machines >>> and see if it makes a big difference. >>> >>> Cheers, >>> Jeff >>> >> >> Yes, we need that. Please commit =-) > > > Thanks for the encouragement. There are a few unresolved issues. Most > importantly, how are we presently dealing with config options that break > modules? If the option is to stay as it is, modules will have to be > aware of the page size that is agreed upon by the rest of the kernel. > > There other issues are mostly considering ways to reduce the impact of > the patch on the rest of the system. How to tidy it up a bit more, if > it can be. > >> >> Scott >> PAE and MAC are two options that break the ABI for modules. As long as modules are compiled as part of the 'makekernel' target, they will get the correct ABI. 
Scott From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 06:45:05 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4C90616A41F; Wed, 25 Jan 2006 06:45:05 +0000 (GMT) (envelope-from julian@elischer.org) Received: from a50.ironport.com (a50.ironport.com [63.251.108.112]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0BDFD43D48; Wed, 25 Jan 2006 06:45:04 +0000 (GMT) (envelope-from julian@elischer.org) Received: from unknown (HELO [192.168.2.4]) ([10.251.60.107]) by a50.ironport.com with ESMTP; 24 Jan 2006 22:45:03 -0800 Message-ID: <43D71E6F.1010008@elischer.org> Date: Tue, 24 Jan 2006 22:45:03 -0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.11) Gecko/20050727 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Roberson References: <43CD612E.2060002@elischer.org> <63333.1137536336@critter.freebsd.dk> <20060121003908.GD6017@cs.rice.edu> <43D18816.3010909@elischer.org> <20060122194739.O602@10.0.0.1> <43D4630E.70201@samsco.org> <20060124211522.D602@10.0.0.1> In-Reply-To: <20060124211522.D602@10.0.0.1> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: alc@freebsd.org, arch@freebsd.org, Alan Cox , Poul-Henning Kamp Subject: Re: Large virtual page size support. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 06:45:05 -0000 Jeff Roberson wrote: > > Thanks for the encouragement. There are a few unresolved issues. > Most importantly, how are we presently dealing with config options > that break modules? If the option is to stay as it is, modules will > have to be aware of the page size that is agreed upon by the rest of > the kernel. 
Well, we could as a matter of principle make the page size a variable and not a constant. We did the same with HZ, which was always a #define. > > There other issues are mostly considering ways to reduce the impact of > the patch on the rest of the system. How to tidy it up a bit more, if > it can be. > >> >> Scott >> From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 09:40:04 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 856CC16A420; Wed, 25 Jan 2006 09:40:04 +0000 (GMT) (envelope-from des@des.no) Received: from tim.des.no (tim.des.no [194.63.250.121]) by mx1.FreeBSD.org (Postfix) with ESMTP id D4F1E43D46; Wed, 25 Jan 2006 09:40:03 +0000 (GMT) (envelope-from des@des.no) Received: from tim.des.no (localhost [127.0.0.1]) by spam.des.no (Postfix) with ESMTP id AACD22082; Wed, 25 Jan 2006 10:39:58 +0100 (CET) X-Spam-Tests: AWL,BAYES_00,FORGED_RCVD_HELO X-Spam-Learn: ham X-Spam-Score: -3.1/3.0 X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on tim.des.no Received: from xps.des.no (des.no [80.203.243.180]) by tim.des.no (Postfix) with ESMTP id 32F132081; Wed, 25 Jan 2006 10:39:58 +0100 (CET) Received: by xps.des.no (Postfix, from userid 1001) id 1A92233C1D; Wed, 25 Jan 2006 10:39:58 +0100 (CET) To: David Xu References: <43D6C3A5.4060100@freebsd.org> From: des@des.no (Dag-Erling Smørgrav) Date: Wed, 25 Jan 2006 10:39:58 +0100 In-Reply-To: <43D6C3A5.4060100@freebsd.org> (David Xu's message of "Wed, 25 Jan 2006 08:17:41 +0800") Message-ID: <86k6coodch.fsf@xps.des.no> User-Agent: Gnus/5.110002 (No Gnus v0.2) Emacs/21.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Cc: arch@freebsd.org Subject: Re: vfs_aio.c is still not safe X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion
related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 09:40:04 -0000 David Xu writes: > Even with recently change to vfs_aio.c, the kernel AIO code is > still not safe to be used. The problem is a AIO daemon thread > may be blocked on sockets, pipe, and fifo if peer does not > transfer any data, the problem can be accumulated and all > daemon threads will be blocked if such user process increases. > [...] > So possible solution could be: > 1) disable AIO support for none disk file. > 2) someone implement callbacks for pipe, fifo, and add > non-blocking feature to fo_read/fo_write. 3) Rewrite the aio code to use kthreads attached to each process, so problems with one process's aio do not propagate to other processes. > The former is simple, the later needs some effort, however > with superio kqueue, the AIO support for socket and pipe is > less important, I prefer 1) to make the AIO code usable. DES -- Dag-Erling Smørgrav - des@des.no From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 10:09:30 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EA0FA16A41F; Wed, 25 Jan 2006 10:09:30 +0000 (GMT) (envelope-from Alexander@Leidinger.net) Received: from www.ebusiness-leidinger.de (jojo.ms-net.de [84.16.236.246]) by mx1.FreeBSD.org (Postfix) with ESMTP id 27ADB43D55; Wed, 25 Jan 2006 10:09:29 +0000 (GMT) (envelope-from Alexander@Leidinger.net) Received: from Andro-Beta.Leidinger.net (p54A5E159.dip.t-dialin.net [84.165.225.89]) (authenticated bits=0) by www.ebusiness-leidinger.de (8.13.1/8.13.1) with ESMTP id k0PA1LgC020943; Wed, 25 Jan 2006 11:01:22 +0100 (CET) (envelope-from Alexander@Leidinger.net) Received: from localhost (localhost [127.0.0.1]) by Andro-Beta.Leidinger.net (8.13.3/8.13.3) with ESMTP id k0PA9ROE042636; Wed, 25 Jan
2006 11:09:27 +0100 (CET) (envelope-from Alexander@Leidinger.net) Received: from pslux.cec.eu.int (pslux.cec.eu.int [158.169.9.14]) by webmail.leidinger.net (Horde MIME library) with HTTP; Wed, 25 Jan 2006 11:09:27 +0100 Message-ID: <20060125110927.dabpg50ls8o8gg4k@netchild.homeip.net> X-Priority: 3 (Normal) Date: Wed, 25 Jan 2006 11:09:27 +0100 From: Alexander Leidinger To: Poul-Henning Kamp References: <23570.1138137045@critter.freebsd.dk> In-Reply-To: <23570.1138137045@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.0.3) / FreeBSD-4.11 X-Virus-Scanned: by amavisd-new Cc: arch@freebsd.org, current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 10:09:31 -0000 Poul-Henning Kamp wrote: [first patch] > Of course, the scheme used in this patch suffers a bit if the > hardware counter changes to other hardware of a different rate > or simply changes rate. [second patch] > The downside is, that unless your cpu clock is correctly probed > at boot and stays constant, your cpu accounting numbers will have > a bogus scaling factor. > For i386 and amd64 things are more tricky. Laptops doing power > saving tricks will probably give bogus cpu accounting values, > but as such the patch should do no other harm than screw up > those values. Are you going to fix those issues for machines which do power saving tricks (which may even be useful on servers for some people, not only on laptops), or do you not intend to further work on this besides the patches you present here? Bye, Alexander. 
-- http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 One man tells a falsehood, a hundred repeat it as true. From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 10:15:33 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2DF8B16A49C; Wed, 25 Jan 2006 10:15:33 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id C862543D49; Wed, 25 Jan 2006 10:15:32 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (unknown [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 314DABC7C; Wed, 25 Jan 2006 10:15:01 +0000 (UTC) To: Alexander Leidinger From: "Poul-Henning Kamp" In-Reply-To: Your message of "Wed, 25 Jan 2006 11:09:27 +0100." <20060125110927.dabpg50ls8o8gg4k@netchild.homeip.net> Date: Wed, 25 Jan 2006 11:15:01 +0100 Message-ID: <28985.1138184101@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: arch@freebsd.org, current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 10:15:33 -0000 In message <20060125110927.dabpg50ls8o8gg4k@netchild.homeip.net>, Alexander Lei dinger writes: >Are you going to fix those issues for machines which do power saving tricks >(which may even be useful on servers for some people, not only on laptops), >or do you not intend to further work on this besides the patches you present >here? My plan is to add some code to measure and record the maximum "cpu_tick" frequency we see, and use that to normalize the cpu accounting. 
That way, the user/system time reported will get units of "cpu seconds if the cpu ran full speed". -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 10:52:48 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B960516A424; Wed, 25 Jan 2006 10:52:48 +0000 (GMT) (envelope-from Alexander@Leidinger.net) Received: from www.ebusiness-leidinger.de (jojo.ms-net.de [84.16.236.246]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2E43843DB9; Wed, 25 Jan 2006 10:46:28 +0000 (GMT) (envelope-from Alexander@Leidinger.net) Received: from Andro-Beta.Leidinger.net (p54A5E159.dip.t-dialin.net [84.165.225.89]) (authenticated bits=0) by www.ebusiness-leidinger.de (8.13.1/8.13.1) with ESMTP id k0PAbchB021043; Wed, 25 Jan 2006 11:37:38 +0100 (CET) (envelope-from Alexander@Leidinger.net) Received: from localhost (localhost [127.0.0.1]) by Andro-Beta.Leidinger.net (8.13.3/8.13.3) with ESMTP id k0PAjieB049446; Wed, 25 Jan 2006 11:45:44 +0100 (CET) (envelope-from Alexander@Leidinger.net) Received: from pslux.cec.eu.int (pslux.cec.eu.int [158.169.9.14]) by webmail.leidinger.net (Horde MIME library) with HTTP; Wed, 25 Jan 2006 11:45:44 +0100 Message-ID: <20060125114544.edawx42obkkos0ck@netchild.homeip.net> X-Priority: 3 (Normal) Date: Wed, 25 Jan 2006 11:45:44 +0100 From: Alexander Leidinger To: Poul-Henning Kamp References: <28985.1138184101@critter.freebsd.dk> In-Reply-To: <28985.1138184101@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.0.3) / FreeBSD-4.11 X-Virus-Scanned: by amavisd-new Cc: 
arch@freebsd.org, current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 10:52:48 -0000 Poul-Henning Kamp wrote: > In message <20060125110927.dabpg50ls8o8gg4k@netchild.homeip.net>, > Alexander Lei > dinger writes: > >> Are you going to fix those issues for machines which do power saving tricks >> (which may even be useful on servers for some people, not only on laptops), >> or do you not intend to further work on this besides the patches you present >> here? > > My plan is to add some code to measure and record the maximum "cpu_tick" > frequency we see, and use that to normalize the cpu accounting. > > That way, the user/system time reported will get units of "cpu seconds > if the cpu ran full speed". How large do you expect the error will be? Bye, Alexander. -- http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 Eternity is a terrible thought. I mean, where's it going to end? 
-- Tom Stoppard From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 11:02:49 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 83C4316A41F; Wed, 25 Jan 2006 11:02:49 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id 694FC43DB8; Wed, 25 Jan 2006 10:58:39 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (unknown [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 3B3F2BC74; Wed, 25 Jan 2006 10:58:08 +0000 (UTC) To: Alexander Leidinger From: "Poul-Henning Kamp" In-Reply-To: Your message of "Wed, 25 Jan 2006 11:45:44 +0100." <20060125114544.edawx42obkkos0ck@netchild.homeip.net> Date: Wed, 25 Jan 2006 11:58:07 +0100 Message-ID: <29245.1138186687@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: arch@freebsd.org, current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 11:02:49 -0000 In message <20060125114544.edawx42obkkos0ck@netchild.homeip.net>, Alexander Lei dinger writes: >> That way, the user/system time reported will get units of "cpu seconds >> if the cpu ran full speed". > >How large do you expect the error will be? I don't consider it an error, I consider it increasing precision. If you run time mycommand on your laptop, and along the way the CPU clock ramps up from 75 MHz to 600 MHz before it reports user 2.01 sys 0.30 real 4.00 What exactly have you learned from the first two numbers with the current definition of "cpu second" ? 
With my definition you would be more likely to see lower numbers maybe user 0.20 sys 0.03 real 4.00 And they would have meaning, they should be pretty much the same no matter what speed your CPU runs at any instant in time. In theory, it should be possible to compare user/sys numbers you collect while running at 75 MHz with the ones you got under full steam at 1600 MHz. In practice however, things that run on the real time, HZ interrupting to run hardclock() for instance, will still make comparison of such numbers quite shaky. But at least they will not be random as they are now. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 13:09:25 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4236716A41F; Wed, 25 Jan 2006 13:09:25 +0000 (GMT) (envelope-from ianf@hetzner.co.za) Received: from mail1a.your-server.co.za (mail1a.your-server.co.za [196.7.18.227]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9972043D45; Wed, 25 Jan 2006 13:09:24 +0000 (GMT) (envelope-from ianf@hetzner.co.za) Received: from [196.7.18.226] (helo=hetzner.co.za) by mail1a.your-server.co.za with esmtp (Exim 4.54) id 1F1kOm-0003mL-Ce; Wed, 25 Jan 2006 15:09:20 +0200 Received: from localhost ([127.0.0.1]) by hetzner.co.za with esmtp (Exim 4.51 (FreeBSD)) id 1F1kOm-000FY2-8Z; Wed, 25 Jan 2006 15:09:20 +0200 To: "Poul-Henning Kamp" From: Ian FREISLICH In-Reply-To: Message from "Poul-Henning Kamp" of "Wed, 25 Jan 2006 11:58:07 +0100." 
<29245.1138186687@critter.freebsd.dk> X-Attribution: BOFH Date: Wed, 25 Jan 2006 15:09:20 +0200 Sender: ianf@hetzner.co.za Message-Id: X-Virus-Scanned: Clear (ClamAV 0.88/1248/Tue Jan 24 12:54:38 2006) Cc: Alexander Leidinger , current@freebsd.org, arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 13:09:25 -0000 "Poul-Henning Kamp" wrote: > In message <20060125114544.edawx42obkkos0ck@netchild.homeip.net>, Alexander L ei > dinger writes: > > > >> That way, the user/system time reported will get units of "cpu seconds > >> if the cpu ran full speed". > > > >How large do you expect the error will be? > > I don't consider it an error, I consider it increasing precision. > > > If you run > > time mycommand > > on your laptop, and along the way the CPU clock ramps up from > 75 MHz to 600 MHz before it reports > > user 2.01 sys 0.30 real 4.00 > > What exactly have you learned from the first two numbers with the > current definition of "cpu second" ? "One second's worth of the computer's processing time, which is based on actual machine cycles used, not calendar time." ? Is the getrusage() manual page out of date? It claims that user and system time is is "the total amount of time spent executing in user mode" and "the total amount of time spent in the system executing on behalf of the process(es)". > With my definition you would be more likely to see lower numbers > maybe > user 0.20 sys 0.03 real 4.00 > > And they would have meaning, they should be pretty much the same > no matter what speed your CPU runs at any instant in time. For how much of those 4 real seconds was the computer doing something else using your definition? It's certainly not 3.77. It's probably closer to 1.69. 
> In theory, it should be possible to compare user/sys numbers > you collect while running at 75 MHz with the ones you got > under full steam at 1600 MHz. If my CPU clock runs slower for a period of time, processes remain on the CPU for longer. I don't really see how 0.23 [wallclock] seconds _if_ the cpu ran [at] full speed is different to 2.31 wallclock seconds in this context. One is scaled to maximum CPU clock frequency and the other is scaled to wallclock time. I find the wallclock scale a bit less confusing because I normally exist in that scale[1]: on my two hypothetical identical servers, one clocked down to 50% for some reason, the same job takes twice the wallclock time but identical CPU time? Ian -- Ian Freislich 1. It would be nice to say to my boss that this project would have taken a week if I'd worked faster and get a fat bonus because I could have done it faster.
From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 19:09:59 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F1EC316A41F; Wed, 25 Jan 2006 19:09:58 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id 93D0043D46; Wed, 25 Jan 2006 19:09:57 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (unknown [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id B95B5BC74; Wed, 25 Jan 2006 19:09:54 +0000 (UTC) To: Ian FREISLICH From: "Poul-Henning Kamp" In-Reply-To: Your message of "Wed, 25 Jan 2006 15:09:20 +0200." Date: Wed, 25 Jan 2006 20:09:54 +0100 Message-ID: <19559.1138216194@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: Alexander Leidinger , current@freebsd.org, arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 19:09:59 -0000 In message , Ian FREISLICH writes: >"One second's worth of the computer's processing time, which is >based on actual machine cycles used, not calendar time." ? > >Is the getrusage() manual page out of date? Yes.
It was written before anybody had gotten the rather weird idea to have a CPU change frequency. Back then it was all about running as fast as possible all the time. We are therefore forced to try to divine the intent behind the text, and as somebody who was around back in the eighties I can testify that the intent was to be able to bill computer users for CPU instructions. Since the clock rate was constant, CPU seconds were a usable approximation. These days, with a variable clock rate, the CPU second is a bad approximation. If my CPU runs at 600 MHz, even if used 100%, it could still do three times as much work at full speed, so the fact that my process takes 3 seconds to complete does not mean that I have used (in the sense of denying other users the ability to use) all of the CPU for three seconds. If I had monopolized the entire CPU to its fullest potential, it would have taken only one second. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 21:20:42 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9A9F916A45C; Wed, 25 Jan 2006 21:20:40 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1310A4487A; Wed, 25 Jan 2006 20:28:17 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (unknown [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 2DE1DBC74; Wed, 25 Jan 2006 20:28:16 +0000 (UTC) To: Peter Jeremy From: "Poul-Henning Kamp" In-Reply-To: Your message of "Thu, 26 Jan 2006 07:14:50 +1100."
<20060125201450.GE25397@cirb503493.alcatel.com.au> Date: Wed, 25 Jan 2006 21:28:16 +0100 Message-ID: <56988.1138220896@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: arch@freebsd.org, current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 21:20:42 -0000 In message <20060125201450.GE25397@cirb503493.alcatel.com.au>, Peter Jeremy writes: >On Wed, 2006-Jan-25 20:09:54 +0100, Poul-Henning Kamp wrote: >>We are therefore forced to try to divine the intent behind the text, >>and as somebody who were around back in the eighties I can testify >>that the intent was to be able to bill computer users for CPU >>instructions. > >This implies that RDTSC (and equivalents) would be the best source of >accounting information, with CPU usage billed in CPU cycles used. >It's just users who expect to be billed in seconds. Right, so we bill users in "full speed CPU second equivalents". -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
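The normalization discussed in this thread (record the highest "cpu_tick" frequency seen, then report CPU usage as seconds at that frequency) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch under review: struct cpu_acct, acct_observe_freq() and acct_ticks_to_usec() are hypothetical names, and the tick counts are assumed to come from a cycle counter such as the TSC.

```c
#include <stdint.h>

/*
 * Hypothetical accounting state. The names here are illustrative
 * only and are not taken from the FreeBSD sources or the patch.
 */
struct cpu_acct {
	uint64_t max_tick_hz;	/* highest cpu_tick frequency observed so far */
};

/* Remember the highest tick frequency we have ever measured. */
static void
acct_observe_freq(struct cpu_acct *a, uint64_t hz)
{
	if (hz > a->max_tick_hz)
		a->max_tick_hz = hz;
}

/*
 * Convert a raw tick count into microseconds "as if the CPU had run
 * at full speed", i.e. scaled by the maximum observed frequency.
 * The division is split into whole seconds and a remainder so that
 * the intermediate products stay well inside 64 bits for realistic
 * frequencies (a few GHz).
 */
static uint64_t
acct_ticks_to_usec(const struct cpu_acct *a, uint64_t ticks)
{
	uint64_t hz = a->max_tick_hz;

	if (hz == 0)
		return (0);	/* no frequency measured yet */
	return ((ticks / hz) * 1000000 + ((ticks % hz) * 1000000) / hz);
}
```

For example, after observing 75 MHz and then 600 MHz, a process charged 120,000,000 ticks would be reported as 200000 microseconds (0.20 s), regardless of the frequency at which those ticks were actually executed.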
From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 21:37:56 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F1D3A16A42D; Wed, 25 Jan 2006 21:37:55 +0000 (GMT) (envelope-from PeterJeremy@optushome.com.au) Received: from mail23.syd.optusnet.com.au (mail23.syd.optusnet.com.au [211.29.133.164]) by mx1.FreeBSD.org (Postfix) with ESMTP id AEA1043F0A; Wed, 25 Jan 2006 20:14:56 +0000 (GMT) (envelope-from PeterJeremy@optushome.com.au) Received: from cirb503493.alcatel.com.au (c220-239-19-236.belrs4.nsw.optusnet.com.au [220.239.19.236]) by mail23.syd.optusnet.com.au (8.12.11/8.12.11) with ESMTP id k0PKEqJa025350 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 26 Jan 2006 07:14:53 +1100 Received: from cirb503493.alcatel.com.au (localhost.alcatel.com.au [127.0.0.1]) by cirb503493.alcatel.com.au (8.12.10/8.12.10) with ESMTP id k0PKEpHh040777; Thu, 26 Jan 2006 07:14:51 +1100 (EST) (envelope-from pjeremy@cirb503493.alcatel.com.au) Received: (from pjeremy@localhost) by cirb503493.alcatel.com.au (8.12.10/8.12.9/Submit) id k0PKEor7040776; Thu, 26 Jan 2006 07:14:50 +1100 (EST) (envelope-from pjeremy) Date: Thu, 26 Jan 2006 07:14:50 +1100 From: Peter Jeremy To: Poul-Henning Kamp Message-ID: <20060125201450.GE25397@cirb503493.alcatel.com.au> References: <19559.1138216194@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <19559.1138216194@critter.freebsd.dk> X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.11 Cc: arch@freebsd.org, current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Wed, 25 Jan 2006 21:37:56 -0000 On Wed, 2006-Jan-25 20:09:54 +0100, Poul-Henning Kamp wrote: >We are therefore forced to try to divine the intent behind the text, >and as somebody who were around back in the eighties I can testify >that the intent was to be able to bill computer users for CPU >instructions. This implies that RDTSC (and equivalents) would be the best source of accounting information, with CPU usage billed in CPU cycles used. It's just users who expect to be billed in seconds. >These days with variable clockrate, the cpu second is a bad approximation. Agreed. >If my CPU runs at 600MHz, even if used 100%, it can still do three times >as much work, so the fact that my process takes 3 seconds to complete >does not mean that I have used (in the sense of denying other users the >ability to use) all of the CPU for three seconds. This depends on why the CPU was running at 600MHz instead of 1800MHz. If the user requested that speed (for whatever reason), then that user _was_ denying other users the ability to use the CPU. 
-- Peter Jeremy From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 21:38:59 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D47CF16A420; Wed, 25 Jan 2006 21:38:59 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.FreeBSD.org (Postfix) with ESMTP id 837AC43E91; Wed, 25 Jan 2006 21:37:47 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from [192.168.254.14] (imini.samsco.home [192.168.254.14]) (authenticated bits=0) by pooker.samsco.org (8.13.4/8.13.4) with ESMTP id k0PLbiOX029638; Wed, 25 Jan 2006 14:37:44 -0700 (MST) (envelope-from scottl@samsco.org) Message-ID: <43D7EFA7.2060309@samsco.org> Date: Wed, 25 Jan 2006 14:37:43 -0700 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.7) Gecko/20050416 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Poul-Henning Kamp References: <56988.1138220896@critter.freebsd.dk> In-Reply-To: <56988.1138220896@critter.freebsd.dk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-1.4 required=3.8 tests=ALL_TRUSTED autolearn=failed version=3.1.0 X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on pooker.samsco.org Cc: Peter Jeremy , current@freebsd.org, arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 21:39:00 -0000 Poul-Henning Kamp wrote: > In message <20060125201450.GE25397@cirb503493.alcatel.com.au>, Peter Jeremy wri > tes: > >>On Wed, 2006-Jan-25 20:09:54 +0100, Poul-Henning Kamp wrote: >> >>>We are therefore forced to try to divine the 
intent behind the text, >>>and as somebody who were around back in the eighties I can testify >>>that the intent was to be able to bill computer users for CPU >>>instructions. >> >>This implies that RDTSC (and equivalents) would be the best source of >>accounting information, with CPU usage billed in CPU cycles used. >>It's just users who expect to be billed in seconds. > > > Right, so we bill users in "full speed CPU second equvivalents" > Regardless of the technical merits of one accounting method or another, changing the results of rusage is going to result in many years of questions to the mailing lists and grumbling from uneducated sysadmins that FreeBSD is somehow inferior because of this one detail. I know that's an emotional argument and not a technical one, but it's also important to consider. Scott From owner-freebsd-arch@FreeBSD.ORG Wed Jan 25 22:09:45 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DF9CA16A420; Wed, 25 Jan 2006 22:09:45 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id D668F43D5A; Wed, 25 Jan 2006 22:09:41 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (unknown [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 0BF74BC74; Wed, 25 Jan 2006 22:09:29 +0000 (UTC) To: Scott Long From: "Poul-Henning Kamp" In-Reply-To: Your message of "Wed, 25 Jan 2006 14:37:43 MST." 
<43D7EFA7.2060309@samsco.org> Date: Wed, 25 Jan 2006 23:09:28 +0100 Message-ID: <41487.1138226968@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: Peter Jeremy , current@freebsd.org, arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jan 2006 22:09:46 -0000 In message <43D7EFA7.2060309@samsco.org>, Scott Long writes: >Regardless of the technical merits of one accounting method or another, >changing the results of rusage is going to result in many years of >questions to the mailing lists and grumbling from uneducated sysadmins >that FreeBSD is somehow inferior because of this one detail. I know >that's an emotional argument and not a technical one, but it's also >important to consider. Well, there is up to a 30% improvement in context switches to pay for the grumbling. I think more people care about context switches than CPU accounting, but I also think they may not know this. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
From owner-freebsd-arch@FreeBSD.ORG Thu Jan 26 00:08:00 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from [127.0.0.1] (localhost [127.0.0.1]) by hub.freebsd.org (Postfix) with ESMTP id 457C316A420; Thu, 26 Jan 2006 00:07:58 +0000 (GMT) (envelope-from davidxu@freebsd.org) Message-ID: <43D812F7.50808@freebsd.org> Date: Thu, 26 Jan 2006 08:08:23 +0800 From: David Xu User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.7.12) Gecko/20060117 X-Accept-Language: en-us, en MIME-Version: 1.0 To: =?ISO-8859-1?Q?Dag-Erling_Sm=F8rgrav?= References: <43D6C3A5.4060100@freebsd.org> <86k6coodch.fsf@xps.des.no> In-Reply-To: <86k6coodch.fsf@xps.des.no> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: arch@freebsd.org Subject: Re: vfs_aio.c is still not safe X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Jan 2006 00:08:00 -0000 Dag-Erling Smørgrav wrote: >3) Rewrite the aio code to use kthreads attached to each process, so > problems with one process's aio does not propagate to other > processes. > > > This is not a complete solution, it shifts the problem to another side, allow user to kill process. 
>DES > > From owner-freebsd-arch@FreeBSD.ORG Thu Jan 26 00:38:43 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4632E16A422; Thu, 26 Jan 2006 00:38:43 +0000 (GMT) (envelope-from kris@obsecurity.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0BF6243D45; Thu, 26 Jan 2006 00:38:43 +0000 (GMT) (envelope-from kris@obsecurity.org) Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196]) by elvis.mu.org (Postfix) with ESMTP id E66881A3C30; Wed, 25 Jan 2006 16:38:42 -0800 (PST) Received: by obsecurity.dyndns.org (Postfix, from userid 1000) id 211A55122E; Wed, 25 Jan 2006 19:38:42 -0500 (EST) Date: Wed, 25 Jan 2006 19:38:42 -0500 From: Kris Kennaway To: Poul-Henning Kamp Message-ID: <20060126003841.GA56514@xor.obsecurity.org> References: <23570.1138137045@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="MGYHOYXEY6WxJCY8" Content-Disposition: inline In-Reply-To: <23570.1138137045@critter.freebsd.dk> User-Agent: Mutt/1.4.2.1i Cc: arch@freebsd.org, current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Jan 2006 00:38:43 -0000 --MGYHOYXEY6WxJCY8 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Tue, Jan 24, 2006 at 10:10:45PM +0100, Poul-Henning Kamp wrote: > On a sparc64 it gives 3.2+/-.3% speedup on unixbench/context1 For me on an e4500 it gives
> ministat old new
x old
+ new
[ministat distribution plot of the old (x) and new (+) samples]
    N           Min           Max        Median           Avg        Stddev
x   5        5006.9        5042.7        5009.3       5016.96     14.894563
+   5        5019.4        5062.8        5039.4       5039.98     15.787717
Difference at 95.0% confidence
	23.02 +/- 22.3836
	0.458844% +/- 0.44616%
	(Student's t, pooled s = 15.3476)
Kris --MGYHOYXEY6WxJCY8 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2 (FreeBSD) iD8DBQFD2BoRWry0BWjoQKURAs6qAJ4gahdyUN5+w4129FbOmcejmROGPACeL7fb WrQJw+55t/Fwc3S17/K+onA= =He5z -----END PGP SIGNATURE----- --MGYHOYXEY6WxJCY8-- From owner-freebsd-arch@FreeBSD.ORG Thu Jan 26 01:28:36 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 921E416A422; Thu, 26 Jan 2006 01:28:36 +0000 (GMT) (envelope-from julian@elischer.org) Received: from a50.ironport.com (a50.ironport.com [63.251.108.112]) by mx1.FreeBSD.org (Postfix) with ESMTP id 55E9343D45; Thu, 26 Jan 2006 01:28:35 +0000 (GMT) (envelope-from julian@elischer.org) Received: from unknown (HELO [10.251.17.229]) ([10.251.17.229]) by a50.ironport.com with ESMTP; 25 Jan 2006 17:28:34 -0800 Message-ID: <43D825C2.4000004@elischer.org> Date: Wed, 25 Jan 2006 17:28:34 -0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.11) Gecko/20050727 X-Accept-Language: en-us, en MIME-Version: 1.0 To: David Xu References: <43D6C3A5.4060100@freebsd.org> <86k6coodch.fsf@xps.des.no> <43D812F7.50808@freebsd.org> In-Reply-To: <43D812F7.50808@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: =?ISO-8859-1?Q?Dag-Erling_Sm=F8rgrav?= , arch@freebsd.org Subject: Re: vfs_aio.c is still not safe X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence:
list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Jan 2006 01:28:36 -0000 David Xu wrote: > Dag-Erling Smørgrav wrote: > >> 3) Rewrite the aio code to use kthreads attached to each process, so >> problems with one process's aio does not propagate to other >> processes. >> >> >> > This is not a complete solution, it shifts the problem to another side, > allow user to kill process. A kernel thread cannot be "killed" by the user. it can only agree to kill itself. > >> DES >> >> > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Thu Jan 26 02:24:41 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8237916A420 for ; Thu, 26 Jan 2006 02:24:41 +0000 (GMT) (envelope-from davidxu@freebsd.org) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1FD9243D46; Thu, 26 Jan 2006 02:24:41 +0000 (GMT) (envelope-from davidxu@freebsd.org) Received: from [127.0.0.1] (root@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id k0Q2Ocge058958; Thu, 26 Jan 2006 02:24:39 GMT (envelope-from davidxu@freebsd.org) Message-ID: <43D832F1.6040602@freebsd.org> Date: Thu, 26 Jan 2006 10:24:49 +0800 From: David Xu User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.12) Gecko/20050928 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Julian Elischer References: <43D6C3A5.4060100@freebsd.org> <86k6coodch.fsf@xps.des.no> <43D812F7.50808@freebsd.org> <43D825C2.4000004@elischer.org> In-Reply-To: <43D825C2.4000004@elischer.org> Content-Type: 
text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: =?ISO-8859-1?Q?Dag-Erling_Sm=F8rgrav?= , arch@freebsd.org Subject: Re: vfs_aio.c is still not safe X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Jan 2006 02:24:41 -0000 Julian Elischer wrote: > > David Xu wrote: > >> Dag-Erling Smørgrav wrote: >> >>> 3) Rewrite the aio code to use kthreads attached to each process, so >>> problems with one process's aio does not propagate to other >>> processes. >>> >>> >>> >> This is not a complete solution, it shifts the problem to another side, >> allow user to kill process. > > > A kernel thread cannot be "killed" by the user. it can only > agree to kill itself. > Attaching kthreads to each process still leaves a serious issue. If I allow the maximum number of queued AIO requests to be 1000 (a sysctl can adjust this), then I must allow up to 1000 aio threads to be created and blocked for each process; otherwise it defeats the purpose of aio. This is why aio threads should not block on sockets, fifos, or pipes, so that they can be reused. Using a fixed number of aio threads for disk files is OK, since disk data will be available within a foreseeable period. If a user process gets a signal that will bring it down, the thread_single() call in exit1() should cause the kthreads to exit; this can be done. This also reminds me of another problem: if all user threads exit by calling e.g. thr_exit or kse_exit, these kthreads should exit too, so that code needs to be adjusted as well, not only the ptrace code.
From owner-freebsd-arch@FreeBSD.ORG Thu Jan 26 02:38:47 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B1EEF16A422; Thu, 26 Jan 2006 02:38:47 +0000 (GMT) (envelope-from davidxu@freebsd.org) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7304C43D46; Thu, 26 Jan 2006 02:38:47 +0000 (GMT) (envelope-from davidxu@freebsd.org) Received: from [127.0.0.1] (root@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id k0Q2ci6l061298; Thu, 26 Jan 2006 02:38:45 GMT (envelope-from davidxu@freebsd.org) Message-ID: <43D8363F.3090104@freebsd.org> Date: Thu, 26 Jan 2006 10:38:55 +0800 From: David Xu User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.12) Gecko/20050928 X-Accept-Language: en-us, en MIME-Version: 1.0 To: David Xu References: <43D6C3A5.4060100@freebsd.org> <86k6coodch.fsf@xps.des.no> <43D812F7.50808@freebsd.org> <43D825C2.4000004@elischer.org> <43D832F1.6040602@freebsd.org> In-Reply-To: <43D832F1.6040602@freebsd.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Cc: =?ISO-8859-1?Q?Dag-Erling_Sm?=, =?ISO-8859-1?Q?=F8rgrav?= , Julian Elischer , arch@freebsd.org Subject: Re: vfs_aio.c is still not safe X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Jan 2006 02:38:47 -0000 David Xu wrote: > If a user process got a signal which will bring it down, thread_single() > call in exit1() should cause the kthreads to exit, this can be done. 
> This also reminds me another problem, if all user threads exited by > calling e.g: thr_exit or kse_exit, now, these kthreads should exit too, > so the code should be adjusted,not only ptrace code. > > I have rechecked thr_exit(): there is a band-aid in place, so the last thread cannot exit via thr_exit(); thr_exit and kse_exit are therefore not a concern. From owner-freebsd-arch@FreeBSD.ORG Thu Jan 26 06:06:45 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 66A6E16A420; Thu, 26 Jan 2006 06:06:45 +0000 (GMT) (envelope-from ianf@hetzner.co.za) Received: from mail1a.your-server.co.za (mail1a.your-server.co.za [196.7.18.227]) by mx1.FreeBSD.org (Postfix) with ESMTP id CC0AE43D46; Thu, 26 Jan 2006 06:06:44 +0000 (GMT) (envelope-from ianf@hetzner.co.za) Received: from [196.7.18.226] (helo=hetzner.co.za) by mail1a.your-server.co.za with esmtp (Exim 4.54) id 1F20HI-0000P9-31; Thu, 26 Jan 2006 08:06:40 +0200 Received: from localhost ([127.0.0.1]) by hetzner.co.za with esmtp (Exim 4.51 (FreeBSD)) id 1F20HI-000IRb-0x; Thu, 26 Jan 2006 08:06:40 +0200 To: "Poul-Henning Kamp" From: Ian FREISLICH In-reply-to: Your message of "Wed, 25 Jan 2006 20:09:54 +0100."
<19559.1138216194@critter.freebsd.dk> X-Attribution: BOFH Date: Thu, 26 Jan 2006 08:06:40 +0200 Sender: ianf@hetzner.co.za Message-Id: X-Virus-Scanned: Clear (ClamAV 0.88/1251/Thu Jan 26 02:25:09 2006) Cc: Alexander Leidinger , current@freebsd.org, arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Jan 2006 06:06:45 -0000 "Poul-Henning Kamp" wrote: > In message , Ian FREISLICH writes: > > >"One second's worth of the computer's processing time, which is > >based on actual machine cycles used, not calendar time." ? > > > >Is the getrusage() manual page out of date? > > Yes. > > It was written before anybody had gotten the rather weird idea to > have a CPU change frequency. Back then it was all about running > as fast as possible all the time. > > We are therefore forced to try to divine the intent behind the text, > and as somebody who were around back in the eighties I can testify > that the intent was to be able to bill computer users for CPU > instructions. I wonder how many people still bill for CPU time? I'd go for the faster context switches. 
Ian -- Ian Freislich From owner-freebsd-arch@FreeBSD.ORG Thu Jan 26 10:11:57 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 10AC916A420; Thu, 26 Jan 2006 10:11:57 +0000 (GMT) (envelope-from b.candler@pobox.com) Received: from thorn.pobox.com (thorn.pobox.com [208.210.124.75]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7D7CF43D64; Thu, 26 Jan 2006 10:11:44 +0000 (GMT) (envelope-from b.candler@pobox.com) Received: from thorn (localhost [127.0.0.1]) by thorn.pobox.com (Postfix) with ESMTP id 6C79AC0; Thu, 26 Jan 2006 05:12:05 -0500 (EST) Received: from mappit.local.linnet.org (212-74-113-67.static.dsl.as9105.com [212.74.113.67]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by thorn.sasl.smtp.pobox.com (Postfix) with ESMTP id E70B67FA9; Thu, 26 Jan 2006 05:12:01 -0500 (EST) Received: from lists by mappit.local.linnet.org with local (Exim 4.60 (FreeBSD)) (envelope-from ) id 1F246M-000AgD-Oz; Thu, 26 Jan 2006 10:11:38 +0000 Date: Thu, 26 Jan 2006 10:11:38 +0000 From: Brian Candler To: Poul-Henning Kamp Message-ID: <20060126101138.GA40773@uk.tiscali.com> References: <20060125201450.GE25397@cirb503493.alcatel.com.au> <56988.1138220896@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56988.1138220896@critter.freebsd.dk> User-Agent: Mutt/1.4.2.1i Cc: Peter Jeremy , current@freebsd.org, arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Jan 2006 10:11:57 -0000 On Wed, Jan 25, 2006 at 09:28:16PM +0100, Poul-Henning Kamp wrote: > Right, so we bill users in 
"full speed CPU second equvivalents" How about "BogoMIPS-seconds"? Seriously... don't forget that the *other* usage of CPU-second accounting is for system administrators to assess the amount of CPU resource used by a particular task, in order to plan when the machine is going to need upgrading. In this case, the administrator is not so much interested in the absolute amount of work done, as the amount of work done as a proportion of total work capacity on a particular machine. That is, if task X uses 1200 CPU-seconds over a period of one hour, that's a third of the total available capacity on that machine [1]. If the CPU were then cranked down to 1/3rd of its clock speed, this task would be using the full CPU capacity - and observing that this process is now using 3600 CPU-seconds in an hour is a useful view of the real situation, rather than some mythical 1200 CPU-seconds which it *would have* used *if* it had been running on a different machine (i.e. a machine similar to this one, but running at a faster clock speed). The machine is maxed out on CPU, and that's what matters. Another way of looking at this is that if the CPU is running at 1/3rd speed then CPU cycles are three times as rare, and therefore three times as expensive. That's not good from the point of view of a timeshare user who pays for CPU seconds, as they end up paying three times as much for the same amount of work [2][3]. But it's realistic, especially if the end user owns, runs and pays for the whole asset (which I suggest is more common than the timeshare user these days) Regards, Brian. [1] Of course a dual-CPU box has a capacity of 7200 CPU-seconds per hour, so 1200 CPU-seconds would be one sixth. I don't see a need to normalise that, even if that means I'm taking a slightly inconsistent position :-) Admins are used to thinking of a 4-CPU box as a kind-of cluster of 4 machines. 
[2] If today CPU cycles are three times as expensive as normal, because the sysadmin needed to reduce the clock speed (e.g. air conditioning failure?) then the user can always choose to run their application on a different day instead. [3] On a multi-CPU machine, bottlenecks such as RAM I/O may mean that the same sequence of instructions takes more cycles (and hence time) to execute than on a single CPU machine, even at the same clock speed. The timeshare user may also feel unfairly penalised for this - but I don't see there's much that can be done about it. That is, it's very difficult to charge the timeshare user for absolute work done, completely independent of the platform their application runs on. I think it's reasonable to charge them based on the proportion of resource they've used on the actual machine they've chosen to run it on, at the time they've chosen to run it. From owner-freebsd-arch@FreeBSD.ORG Thu Jan 26 15:20:42 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id AD43516A420; Thu, 26 Jan 2006 15:20:42 +0000 (GMT) (envelope-from marius@newtrinity.zeist.de) Received: from newtrinity.zeist.de (newtrinity.zeist.de [217.24.217.8]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0238543D5A; Thu, 26 Jan 2006 15:20:38 +0000 (GMT) (envelope-from marius@newtrinity.zeist.de) Received: from newtrinity.zeist.de (localhost [127.0.0.1]) by newtrinity.zeist.de (8.12.11/8.12.11/ZEIST.DE) with ESMTP id k0QFKbXM061334; Thu, 26 Jan 2006 16:20:37 +0100 (CET) (envelope-from marius@newtrinity.zeist.de) Received: (from marius@localhost) by newtrinity.zeist.de (8.12.11/8.12.10/Submit) id k0QFKWrI061333; Thu, 26 Jan 2006 16:20:32 +0100 (CET) (envelope-from marius) Date: Thu, 26 Jan 2006 16:20:32 +0100 From: Marius Strobl To: Poul-Henning Kamp Message-ID: <20060126162032.A16564@newtrinity.zeist.de> References: 
<23570.1138137045@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <23570.1138137045@critter.freebsd.dk>; from phk@phk.freebsd.dk on Tue, Jan 24, 2006 at 10:10:45PM +0100 X-AntiVirus-modified: yes X-AntiVirus: checked by AntiVir Milter (version: 1.1.2-1; AVE: 6.33.0.27; VDF: 6.33.0.162; host: newtrinity.zeist.de) Cc: arch@freebsd.org, current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Jan 2006 15:20:42 -0000 On Tue, Jan 24, 2006 at 10:10:45PM +0100, Poul-Henning Kamp wrote: > > Here is a new version of my cpu accounting change patch. > > http://phk.freebsd.dk/patch/cpu_acct_1.patch > > This patch is supposedly harmless (or at least mostly harmless) > and I'd appreciate it getting a solid trashing. > > > This patchs changes cpu accounting from accumulating charges > in real-time units and instead accumulates in units of some > per-arch, possibly per-cpu counter. > > When the accumulated charge is read by times(2) or getrusage(2) or > similar, the frequency of the counter is interrogated and the charge > normalized to microseconds. > > With this patch, the counter is always the timecounter and the only > real difference is therefore a minor performance change (because we > save the normalizing multiplications for each context switch). > > On my AMD Athlon 700 and my Sun Ultra 60 the performance difference > is barely 1% and of doubtful statistical quality. > > On my Opteron machine I get a 2.7+/-.6% boost on unixbench's > context1 test. > > Of course, the scheme used in this patch suffers a bit if the > hardware counter changes to other hardware of a different rate > or simply changes rate. 
This has been discussed at length in > a previous thread already, and I'll simply refer to it, rather > than rehash here: > > http://lists.freebsd.org/pipermail/freebsd-net/2005-October/008637.html > > > > The other half of this work is in this separate patch, and this is > not yet complete. You are welcome to test it however, as long as > you are aware of the problems it may hold: > > http://phk.freebsd.dk/patch/cpu_acct_2.patch > > It makes i386 and amd64 use the TSC and sparc64 use the "tick" > counter for CPU accounting. > > On a sparc64 it gives 3.2+/-.3% speedup on unixbench/context1 > > On a Athlon700 with i8254 timecounter it gives a 95+/-.8% speedup > > On a Opteron with ACPI-fast timecounter it gives a 36+/-.6% speedup. > > The downside is, that unless your cpu clock is correctly probed > at boot and stays constant, your cpu accounting numbers will have > a bogus scaling factor. > > I belive all the sparc64s we support have constant CPU rates, > so they should be safe. > USIIe and greater CPUs support changing the CPU frequency which also affects the tick counter. Regarding CPUs currently supported by FreeBSD/sparc64 this translates to the 550MHz and 650MHz USIIi besides the USIIe CPUs. We currently don't support changing their frequency though but I think that would be easy to do. One other thing that might cause problems regarding your work is that the tick counters are not in sync across the CPUs of a MP system. We try to sync the tick counter of APs with the tick counter of the BSP when attaching the APs but that's already not perfect and they are not that constant over time. That's why we currently use the timecounter of a host-PCI or a host-SBus bridge respectively instead of the tick counters on USI/II MP systems. With USIII CPUs a MP system additionally can consist of different CPU models running at different clock speeds (and having different caches). Marius -- This mail was scanned by AntiVir Milter. 
This product is licensed for non-commercial use. See www.antivir.de for details. From owner-freebsd-arch@FreeBSD.ORG Thu Jan 26 16:08:07 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9B62416A420; Thu, 26 Jan 2006 16:08:07 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4288F43D48; Thu, 26 Jan 2006 16:08:07 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (unknown [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 36B31BC74; Thu, 26 Jan 2006 16:08:04 +0000 (UTC) To: Marius Strobl From: "Poul-Henning Kamp" In-Reply-To: Your message of "Thu, 26 Jan 2006 16:20:32 +0100." <20060126162032.A16564@newtrinity.zeist.de> Date: Thu, 26 Jan 2006 17:08:04 +0100 Message-ID: <55785.1138291684@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: arch@freebsd.org, current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Jan 2006 16:08:07 -0000 In message <20060126162032.A16564@newtrinity.zeist.de>, Marius Strobl writes: >One other thing that might cause problems regarding your work is >that the tick counters are not in sync across the CPUs of a MP >system. Yes, this is outstanding, see my discussion with jhb about this. Synchronizing the counters will not be necessary. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
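The accounting scheme quoted in the exchange above (charge raw counter ticks at each context switch, and normalize to microseconds only when times(2) or getrusage(2) reads the value) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the struct and function names are hypothetical and are not taken from cpu_acct_1.patch, and the counter value and frequency are passed in as plain arguments rather than read from a timecounter, TSC or tick register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread accounting record (not the patch's structure). */
struct cpu_acct {
	uint64_t ticks;     /* raw counter ticks charged so far */
	uint64_t switch_in; /* counter value at the last switch-in */
};

/* At switch-in, only remember the current counter value. */
static void
acct_switch_in(struct cpu_acct *a, uint64_t now)
{
	a->switch_in = now;
}

/* At switch-out, charge the elapsed ticks; no scaling is done here. */
static void
acct_switch_out(struct cpu_acct *a, uint64_t now)
{
	a->ticks += now - a->switch_in;
}

/* On read, normalize the accumulated ticks to microseconds. */
static uint64_t
acct_read_usec(const struct cpu_acct *a, uint64_t freq_hz)
{
	assert(freq_hz > 0);
	/* Whole seconds and remainder are converted separately so that
	 * ticks * 1000000 cannot overflow for realistic frequencies. */
	return ((a->ticks / freq_hz) * 1000000 +
	    ((a->ticks % freq_hz) * 1000000) / freq_hz);
}
```

With 700,000,000 ticks charged, reading at a frequency of 700 MHz yields 1,000,000 microseconds, while reading the same charge at 1.4 GHz yields 500,000. That difference is the "bogus scaling factor" concern from the thread: the result is only as good as the frequency used at read time.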
From owner-freebsd-arch@FreeBSD.ORG Thu Jan 26 20:40:45 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9872816A422; Thu, 26 Jan 2006 20:40:45 +0000 (GMT) (envelope-from julian@elischer.org) Received: from a50.ironport.com (a50.ironport.com [63.251.108.112]) by mx1.FreeBSD.org (Postfix) with ESMTP id E850743D4C; Thu, 26 Jan 2006 20:40:44 +0000 (GMT) (envelope-from julian@elischer.org) Received: from unknown (HELO [10.251.17.229]) ([10.251.17.229]) by a50.ironport.com with ESMTP; 26 Jan 2006 12:40:44 -0800 Message-ID: <43D933CC.4090505@elischer.org> Date: Thu, 26 Jan 2006 12:40:44 -0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.11) Gecko/20050727 X-Accept-Language: en-us, en MIME-Version: 1.0 To: David Xu References: <43D6C3A5.4060100@freebsd.org> <86k6coodch.fsf@xps.des.no> <43D812F7.50808@freebsd.org> <43D825C2.4000004@elischer.org> <43D832F1.6040602@freebsd.org> In-Reply-To: <43D832F1.6040602@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: =?ISO-8859-1?Q?Dag-Erling_Sm=F8rgrav?= , arch@freebsd.org Subject: Re: vfs_aio.c is still not safe X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Jan 2006 20:40:45 -0000 David Xu wrote: > Julian Elischer wrote: > >> >> David Xu wrote: >> >>> Dag-Erling Smørgrav wrote: >>> >>>> 3) Rewrite the aio code to use kthreads attached to each process, so >>>> problems with one process's aio does not propagate to other >>>> processes. >>>> >>>> >>>> >>> This is not a complete solution, it shifts the problem to another side, >>> allow user to kill process. 
>> >> >> >> A kernel thread cannot be "killed" by the user. it can only >> agree to kill itself. >> > > By attaching kthreads to each process, there still has an serious issue, > if I allow max queued AIO requests to be 1000 (sysctl can adjust this), > then I will allow 1000 aio threads to be created and be blocked for each > process, otherwise, it defeats the purpose of aio, this is why aio > thread should not be blocked on sockets, fifo, pipe, and then they can > be reused. > Using fixed number of aio threads for disk file is ok, since disk data > will be availble in foreseeable period. I was thinking that we would use the same scheme and code used for the current AIO just that the threads would be attached to the requesting process instead of the AIO process. i.e. not just one thread per request. > > If a user process got a signal which will bring it down, thread_single() > call in exit1() should cause the kthreads to exit, this can be done. > This also reminds me another problem, if all user threads exited by > calling e.g: thr_exit or kse_exit, now, these kthreads should exit too, > so the code should be adjusted,not only ptrace code. 
From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 02:32:30 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 92E2916A422 for ; Fri, 27 Jan 2006 02:32:29 +0000 (GMT) (envelope-from Thomas.Sparrevohn@btinternet.com) Received: from smtp804.mail.ukl.yahoo.com (smtp804.mail.ukl.yahoo.com [217.12.12.141]) by mx1.FreeBSD.org (Postfix) with SMTP id DFAE543D55 for ; Fri, 27 Jan 2006 02:32:27 +0000 (GMT) (envelope-from Thomas.Sparrevohn@btinternet.com) Received: (qmail 88319 invoked from network); 27 Jan 2006 02:32:26 -0000 Received: from unknown (HELO w2fzz0vc01.aah-go-on.com) (thomas.sparrevohn@btinternet.com@86.133.244.63 with plain) by smtp804.mail.ukl.yahoo.com with SMTP; 27 Jan 2006 02:32:26 -0000 From: Thomas Sparrevohn To: freebsd-current@freebsd.org Date: Fri, 27 Jan 2006 02:32:10 +0000 User-Agent: KMail/1.9.1 References: In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200601270232.12528.Thomas.Sparrevohn@btinternet.com> Cc: arch@freebsd.org, Alexander Leidinger , Ian FREISLICH , Poul-Henning Kamp , current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Thomas.Sparrevohn@btinternet.com List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 02:32:30 -0000 On Thursday 26 January 2006 06:06, Ian FREISLICH wrote: > > I wonder how many people still bill for CPU time? I'd go for the > faster context switches. > Almost all major ITO's providers - From SUN, HP, IBM, EDS etc. has offerings that in some shape or other uses a "Utility model" based upon some sort of financial model based upon actual CPU/IO etc. 
usage - It is a major area now and provides one of the cornerstones in the movement towards "Public Utility models". So it is very relevant as an area for general improvement, and the "historical" models are not really good enough. For further information, take a look at products such as MicroMeasure etc. Regards Thomas From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 02:45:48 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3892F16A420; Fri, 27 Jan 2006 02:45:48 +0000 (GMT) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (gate.funkthat.com [69.17.45.168]) by mx1.FreeBSD.org (Postfix) with ESMTP id ADF6A43D55; Fri, 27 Jan 2006 02:45:46 +0000 (GMT) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (tkenasmjkflixzrr@localhost.funkthat.com [127.0.0.1]) by hydrogen.funkthat.com (8.13.3/8.13.3) with ESMTP id k0R2iXXu087549; Thu, 26 Jan 2006 18:44:33 -0800 (PST) (envelope-from jmg@hydrogen.funkthat.com) Received: (from jmg@localhost) by hydrogen.funkthat.com (8.13.3/8.13.3/Submit) id k0R2iWfe087548; Thu, 26 Jan 2006 18:44:32 -0800 (PST) (envelope-from jmg) Date: Thu, 26 Jan 2006 18:44:32 -0800 From: John-Mark Gurney To: Brian Candler Message-ID: <20060127024432.GT69162@funkthat.com> Mail-Followup-To: Brian Candler , Poul-Henning Kamp , Peter Jeremy , current@freebsd.org, arch@freebsd.org References: <20060125201450.GE25397@cirb503493.alcatel.com.au> <56988.1138220896@critter.freebsd.dk> <20060126101138.GA40773@uk.tiscali.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20060126101138.GA40773@uk.tiscali.com> User-Agent: Mutt/1.4.2.1i X-Operating-System: FreeBSD 5.4-RELEASE-p6 i386 X-PGP-Fingerprint: B7 EC EF F8 AE ED A7 31 96 7A 22 B3 D8 56 36 F4 X-Files: The truth is out there X-URL:
http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html Cc: Peter Jeremy , Poul-Henning Kamp , current@freebsd.org, arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: John-Mark Gurney List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 02:45:48 -0000 Brian Candler wrote this message on Thu, Jan 26, 2006 at 10:11 +0000: > If the CPU were then cranked down to 1/3rd of its clock speed, this task Who manually cranks it down? And if it is manually cranked down, then shouldn't we use that as the "maximum clock speed" for the computation? I don't think an admin is going to sit around changing the system's clock speed on a second-by-second basis... The whole point of this discussion concerns systems that scale back their CPU clock when not in use, and speed up when the system is heavily used... In your example of 1/3rd of the clock speed, the auto-scaling daemon would magically make the other 2/3rds of the CPU cycles available... If it's hard set, then the algorithm used to calculate seconds will use the new hard-set speed, and not the fastest dynamic speed... -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not."
From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 04:55:57 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3100516A420; Fri, 27 Jan 2006 04:55:57 +0000 (GMT) (envelope-from oberman@es.net) Received: from postal2.es.net (postal2.es.net [198.128.3.206]) by mx1.FreeBSD.org (Postfix) with ESMTP id A4D1143D4C; Fri, 27 Jan 2006 04:55:56 +0000 (GMT) (envelope-from oberman@es.net) Received: from ptavv.es.net ([198.128.4.29]) by postal2.es.net (Postal Node 2) with ESMTP (SSL) id IBA74465; Thu, 26 Jan 2006 20:55:53 -0800 Received: from ptavv.es.net (localhost [127.0.0.1]) by ptavv.es.net (Tachyon Server) with ESMTP id F36B34503E; Thu, 26 Jan 2006 20:55:53 -0800 (PST) X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.0.4 To: Thomas.Sparrevohn@btinternet.com In-reply-to: Your message of "Fri, 27 Jan 2006 02:32:10 GMT." <200601270232.12528.Thomas.Sparrevohn@btinternet.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 26 Jan 2006 20:55:53 -0800 From: "Kevin Oberman" Message-Id: <20060127045553.F36B34503E@ptavv.es.net> Cc: current@freebsd.org, Poul-Henning Kamp , arch@freebsd.org, Ian FREISLICH , freebsd-current@freebsd.org, Alexander Leidinger Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 04:55:57 -0000 > On Thursday 26 January 2006 06:06, Ian FREISLICH wrote: > > > > > I wonder how many people still bill for CPU time? I'd go for the > > faster context switches. > > > > Almost all major ITO's providers - From SUN, HP, IBM, EDS etc. 
has offerings > that in some shape or other uses a "Utility model" based upon some sort of > financial model based upon actual CPU/IO etc. usage - It is a major area now > and provides one of the corner stones in the movement towards "Public Utility > models" > > So it is very relevant as an area for general improvement and the "historical" > models are not really good enough, for further information take a look a > products as MicroMeasure etc. Good accounting is very important to some, but the issue of dealing with reduced clock speed is almost certainly of no issue when it comes to charging for computer use. I can't imagine any reason someone would be paying for CPU time on a processor not running "full out". The only time that this might be an issue is when thermal management takes over. I'd hope that thermal management would never kick in on a commercial compute server, but, if it did, the customer should, at least, only pay for the number of seconds the job would have run had it been properly cooled. (Actually, he should probably pay less as his time is also being wasted.) -- R. Kevin Oberman, Network Engineer Energy Sciences Network (ESnet) Ernest O. 
Lawrence Berkeley National Laboratory (Berkeley Lab) E-mail: oberman@es.net Phone: +1 510 486-8634 From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 08:43:44 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B1C6B16A420; Fri, 27 Jan 2006 08:43:44 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id EFE4743D5E; Fri, 27 Jan 2006 08:43:43 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (unknown [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 34031BC7A; Fri, 27 Jan 2006 08:43:40 +0000 (UTC) To: Thomas.Sparrevohn@btinternet.com From: "Poul-Henning Kamp" In-Reply-To: Your message of "Fri, 27 Jan 2006 02:32:10 GMT." <200601270232.12528.Thomas.Sparrevohn@btinternet.com> Date: Fri, 27 Jan 2006 09:43:40 +0100 Message-ID: <84017.1138351420@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: Alexander Leidinger , Ian FREISLICH , freebsd-current@freebsd.org, current@freebsd.org, arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 08:43:45 -0000 In message <200601270232.12528.Thomas.Sparrevohn@btinternet.com>, Thomas Sparre vohn writes: >On Thursday 26 January 2006 06:06, Ian FREISLICH wrote: > >> >> I wonder how many people still bill for CPU time? I'd go for the >> faster context switches. >> > >Almost all major ITO's providers - From SUN, HP, IBM, EDS etc. has offerings >that in some shape or other uses a "Utility model" based upon some sort of >financial model based upon actual CPU/IO etc. 
usage - It is a major area now >and provides one of the corner stones in the movement towards "Public Utility >models" Should we also add that all these initiatives are spectacular commercial failures because users hate to buy rubberband by the inch ? -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 08:50:22 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3FC3916A424 for ; Fri, 27 Jan 2006 08:50:22 +0000 (GMT) (envelope-from mikej@rogers.com) Received: from smtp103.rog.mail.re2.yahoo.com (smtp103.rog.mail.re2.yahoo.com [206.190.36.81]) by mx1.FreeBSD.org (Postfix) with SMTP id 05EAF43D4C for ; Fri, 27 Jan 2006 08:50:20 +0000 (GMT) (envelope-from mikej@rogers.com) Received: (qmail 9782 invoked from network); 27 Jan 2006 08:50:20 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=rogers.com; h=Received:Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject:References:In-Reply-To:Content-Type:Content-Transfer-Encoding; b=Bj6O74DH+kpwjUepO2n3KAjCHsacxCKMLVIouQw4ItAjq12C8tqRAmtzhsSdMo6rk9HGIaeEWMHpQPljLxwvz3yCAqjAllE4thud4rf32dDEY3UXR+RjdY165P8x25ZJyDn0HLCdcsB7npgdcfPuXnCtRZRpCB0uEN7gIBi44k4= ; Received: from unknown (HELO ?70.30.133.184?) 
(mikej@rogers.com@70.30.133.184 with plain) by smtp103.rog.mail.re2.yahoo.com with SMTP; 27 Jan 2006 08:50:20 -0000 Message-ID: <43D9DECF.2060101@rogers.com> Date: Fri, 27 Jan 2006 03:50:23 -0500 From: Mike Jakubik User-Agent: Thunderbird 1.5 (Windows/20051201) MIME-Version: 1.0 To: Kevin Oberman References: <20060127045553.F36B34503E@ptavv.es.net> In-Reply-To: <20060127045553.F36B34503E@ptavv.es.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: current@freebsd.org, Thomas.Sparrevohn@btinternet.com, Ian FREISLICH , arch@freebsd.org, Poul-Henning Kamp , freebsd-current@freebsd.org, Alexander Leidinger Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 08:50:22 -0000 Kevin Oberman wrote: > Good accounting is very important to some, but the issue of dealing with > reduced clock speed is almost certainly of no issue when it comes to charging > for computer use. I can't imagine any reason someone would be paying for CPU > time on a processor not running "full out". > > The only time that this might be an issue is when thermal management takes > over. I'd hope that thermal management would never kick in on a commercial > compute server, but, if it did, the customer should, at least, only pay for > the number of seconds the job would have run had it been properly cooled. > (Actually, he should probably pay less as his time is also being wasted.) > As a user from the 2.x days, i would much rather have the great increase of context switching performance than super accurate cpu accounting that i will never use. FreeBSD needs to focus on performance now. 
From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 08:57:02 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EB4F516A420; Fri, 27 Jan 2006 08:57:02 +0000 (GMT) (envelope-from b.candler@pobox.com) Received: from thorn.pobox.com (thorn.pobox.com [208.210.124.75]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7961943D48; Fri, 27 Jan 2006 08:57:02 +0000 (GMT) (envelope-from b.candler@pobox.com) Received: from thorn (localhost [127.0.0.1]) by thorn.pobox.com (Postfix) with ESMTP id 69800E4; Fri, 27 Jan 2006 03:57:23 -0500 (EST) Received: from mappit.local.linnet.org (212-74-113-67.static.dsl.as9105.com [212.74.113.67]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by thorn.sasl.smtp.pobox.com (Postfix) with ESMTP id 94CB88134; Fri, 27 Jan 2006 03:57:18 -0500 (EST) Received: from lists by mappit.local.linnet.org with local (Exim 4.60 (FreeBSD)) (envelope-from ) id 1F2PPa-000DQf-8x; Fri, 27 Jan 2006 08:56:54 +0000 Date: Fri, 27 Jan 2006 08:56:54 +0000 From: Brian Candler To: John-Mark Gurney Message-ID: <20060127085653.GA51554@uk.tiscali.com> References: <20060125201450.GE25397@cirb503493.alcatel.com.au> <56988.1138220896@critter.freebsd.dk> <20060126101138.GA40773@uk.tiscali.com> <20060127024432.GT69162@funkthat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20060127024432.GT69162@funkthat.com> User-Agent: Mutt/1.4.2.1i Cc: Peter Jeremy , Poul-Henning Kamp , current@freebsd.org, arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 08:57:03 -0000 On Thu, Jan 26, 2006 at 
06:44:32PM -0800, John-Mark Gurney wrote: > The whole point of this discussion is regarding systems that scale > back their cpu clock when not in use, and speed up when the system > is heavily used... in your example of a 1/3rd of the clock speed, > the system would magically make the other 2/3rds of the cpu cycles > available by the auto scaling daemon... Hmm. How is that decision made - based on the amount of time spent in HLT state because there is no work to be done? My initial feeling was that if something was using 100 CPU seconds per hour, and then the clock speed is reduced to 1/3 (for any reason) I wanted to see it using 300 CPU seconds per hour, because that is an accurate representation of the usage. Also, noticing the process using (proportionately) more of the CPU resource would make me investigate why, and possibly lead me to adjust system clock settings to meet my needs better. I think what you're saying is: I'm at no risk of my CPU becoming maxed out when speed has been automatically reduced by a power-saving daemon, because it will only stay there if there is still some spare capacity (i.e. some time is regularly spent in the HLT state). If not, the daemon will keep cranking up the clock speed until there *is* some spare capacity, or until max clock speed is reached. I guess this is OK, *if* you trust the power management system to do its job properly. Unfortunately I have very bad experiences of such things. In many cases I've ended up turning off power management completely and locking everything at max clock speed. Mind you, if I do that, anything you do with scaling factors isn't going to affect me, so actually I don't really care. I'll shut up now :-) Regards, Brian. 
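The arithmetic in the message above can be sketched in C. This is only a minimal illustration: the function names are hypothetical, and it assumes the work is purely CPU-bound, so that run time scales inversely with clock speed.

```c
/*
 * Hypothetical helper: CPU seconds that a fixed amount of work takes
 * when the clock runs at speed_fraction of its maximum
 * (0 < speed_fraction <= 1). Assumes purely CPU-bound work, so the
 * time needed scales inversely with the clock speed.
 */
double
cpu_seconds_at_speed(double seconds_at_full_speed, double speed_fraction)
{
	return (seconds_at_full_speed / speed_fraction);
}

/* Share of one hour (3600 seconds) of a single CPU, in percent. */
double
percent_of_hour(double cpu_seconds)
{
	return (100.0 * cpu_seconds / 3600.0);
}
```

Under these assumptions, work that needs 100 CPU seconds at full speed needs roughly 300 CPU seconds at one third of full speed, which is about 8.3% of an hour rather than about 2.8%.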
From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 09:03:13 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E213016A422 for ; Fri, 27 Jan 2006 09:03:13 +0000 (GMT) (envelope-from mikej@rogers.com) Received: from smtp105.rog.mail.re2.yahoo.com (smtp105.rog.mail.re2.yahoo.com [206.190.36.83]) by mx1.FreeBSD.org (Postfix) with SMTP id 9F8C943D4C for ; Fri, 27 Jan 2006 09:03:11 +0000 (GMT) (envelope-from mikej@rogers.com) Received: (qmail 5507 invoked from network); 27 Jan 2006 09:03:10 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=rogers.com; h=Received:Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject:References:In-Reply-To:Content-Type:Content-Transfer-Encoding; b=bIvJkK1f7nV7mzTrqdJ+AthU4Hba+qSGRbWgdPUqpWvDPDAEyDROzpLFem3MzMkTN/frn6nJmc40Ru6EgS1nBYNmykok5enDbuk6pA9/KBCbPVeiuaXWcxtpqB+rSiN/UJkywc2O+aTI0sE+Kx0qOJfE14btE/Kfu18C2Z6mKVo= ; Received: from unknown (HELO ?70.30.133.184?) 
(mikej@rogers.com@70.30.133.184 with plain) by smtp105.rog.mail.re2.yahoo.com with SMTP; 27 Jan 2006 09:03:10 -0000 Message-ID: <43D9E1D2.6060207@rogers.com> Date: Fri, 27 Jan 2006 04:03:14 -0500 From: Mike Jakubik User-Agent: Thunderbird 1.5 (Windows/20051201) MIME-Version: 1.0 To: Brian Candler References: <20060125201450.GE25397@cirb503493.alcatel.com.au> <56988.1138220896@critter.freebsd.dk> <20060126101138.GA40773@uk.tiscali.com> <20060127024432.GT69162@funkthat.com> <20060127085653.GA51554@uk.tiscali.com> In-Reply-To: <20060127085653.GA51554@uk.tiscali.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org, Peter Jeremy , John-Mark Gurney , Poul-Henning Kamp , current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 09:03:14 -0000 Brian Candler wrote: > I guess this is OK, *if* you trust the power management system to do its job > properly. Unfortunately I have very bad experiences of such things. In many > cases I've ended up turning off power management completely and locking > everything at max clock speed. Mind you, if I do that, anything you do with > scaling factors isn't going to affect me, so actually I don't really care. > I'll shut up now :-) > Let's not forget, FreeBSD is really a server OS. Who in their right mind uses power saving features on a server? It sounds nice in theory, but doesn't work as well. 
From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 09:13:08 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C801E16A420; Fri, 27 Jan 2006 09:13:08 +0000 (GMT) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (gate.funkthat.com [69.17.45.168]) by mx1.FreeBSD.org (Postfix) with ESMTP id 42C3B43D45; Fri, 27 Jan 2006 09:13:08 +0000 (GMT) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (qk6klvavxg22eiji@localhost.funkthat.com [127.0.0.1]) by hydrogen.funkthat.com (8.13.3/8.13.3) with ESMTP id k0R9Btxn097246; Fri, 27 Jan 2006 01:11:55 -0800 (PST) (envelope-from jmg@hydrogen.funkthat.com) Received: (from jmg@localhost) by hydrogen.funkthat.com (8.13.3/8.13.3/Submit) id k0R9BsYS097245; Fri, 27 Jan 2006 01:11:54 -0800 (PST) (envelope-from jmg) Date: Fri, 27 Jan 2006 01:11:54 -0800 From: John-Mark Gurney To: Brian Candler Message-ID: <20060127091153.GU69162@funkthat.com> Mail-Followup-To: Brian Candler , Peter Jeremy , Poul-Henning Kamp , current@freebsd.org, arch@freebsd.org References: <20060125201450.GE25397@cirb503493.alcatel.com.au> <56988.1138220896@critter.freebsd.dk> <20060126101138.GA40773@uk.tiscali.com> <20060127024432.GT69162@funkthat.com> <20060127085653.GA51554@uk.tiscali.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20060127085653.GA51554@uk.tiscali.com> User-Agent: Mutt/1.4.2.1i X-Operating-System: FreeBSD 5.4-RELEASE-p6 i386 X-PGP-Fingerprint: B7 EC EF F8 AE ED A7 31 96 7A 22 B3 D8 56 36 F4 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html Cc: Peter Jeremy , Poul-Henning Kamp , current@freebsd.org, arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org 
X-Mailman-Version: 2.1.5 Precedence: list Reply-To: John-Mark Gurney List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 09:13:08 -0000 Brian Candler wrote this message on Fri, Jan 27, 2006 at 08:56 +0000: > I think what you're saying is: I'm at no risk of my CPU becoming maxed out > when speed has been automatically reduced by a power-saving daemon, because > it will only stay there if there is still some spare capacity (i.e. some > time is regularly spent in the HLT state). If not, the daemon will keep > cranking up the clock speed until there *is* some spare capacity, or until > max clock speed is reached. > > I guess this is OK, *if* you trust the power management system to do its job > properly. Unfortunately I have very bad experiences of such things. In many > cases I've ended up turning off power management completely and locking > everything at max clock speed. Mind you, if I do that, anything you do with > scaling factors isn't going to affect me, so actually I don't really care. > I'll shut up now :-) powerd(8): http://www.freebsd.org/cgi/man.cgi?query=powerd&apropos=0&sektion=0&manpath=FreeBSD+6.0-RELEASE+and+Ports&format=html DESCRIPTION The powerd utility monitors the system state and sets various power control options accordingly. It offers three modes (maximum, minimum, and adaptive) that can be individually selected while on AC power or batteries. -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." 
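The adaptive behaviour described in the quoted text above (raise the clock until some idle time reappears, otherwise back off) might look roughly like the following. This is a hedged sketch and not powerd(8)'s actual algorithm: the function name, the level numbering and both thresholds are assumptions made for illustration.

```c
/*
 * Hypothetical sketch of an adaptive clock policy; NOT powerd(8)'s
 * real algorithm. Frequency levels are numbered 0 (slowest) through
 * max_level (fastest). idle_pct is the share of the last sampling
 * interval that the CPU spent idle (for example in the HLT state).
 */
#define	IDLE_LOW_PCT	10	/* assumed: less idle than this, speed up */
#define	IDLE_HIGH_PCT	50	/* assumed: more idle than this, slow down */

int
next_freq_level(int cur_level, int max_level, int idle_pct)
{
	/* No spare capacity left: step the clock up if we still can. */
	if (idle_pct < IDLE_LOW_PCT && cur_level < max_level)
		return (cur_level + 1);
	/* Plenty of idle time: step the clock down to save power. */
	if (idle_pct > IDLE_HIGH_PCT && cur_level > 0)
		return (cur_level - 1);
	/* Otherwise keep the current level. */
	return (cur_level);
}
```

Calling this once per sampling interval keeps stepping the level up while the CPU stays busy, and stops either when idle time reappears or when max_level is reached, which is the behaviour Brian summarises. The gap between the two thresholds is a design choice that avoids flapping between adjacent levels.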
From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 09:36:05 2006 Return-Path: X-Original-To: arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 13E7916A422; Fri, 27 Jan 2006 09:36:05 +0000 (GMT) (envelope-from glebius@FreeBSD.org) Received: from cell.sick.ru (cell.sick.ru [217.72.144.68]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4A3DB43D46; Fri, 27 Jan 2006 09:36:04 +0000 (GMT) (envelope-from glebius@FreeBSD.org) Received: from cell.sick.ru (glebius@localhost [127.0.0.1]) by cell.sick.ru (8.13.3/8.13.3) with ESMTP id k0R9a2CW049685 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 27 Jan 2006 12:36:03 +0300 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.sick.ru (8.13.3/8.13.1/Submit) id k0R9a2DS049684; Fri, 27 Jan 2006 12:36:02 +0300 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.sick.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Fri, 27 Jan 2006 12:36:02 +0300 From: Gleb Smirnoff To: arch@FreeBSD.org Message-ID: <20060127093602.GO83922@cell.sick.ru> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="O8/n5iBOhiUtMkxf" Content-Disposition: inline User-Agent: Mutt/1.5.6i Cc: alfred@FreeBSD.org Subject: fix return code for pipe(2) syscall X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 09:36:05 -0000 --O8/n5iBOhiUtMkxf Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Colleagues, sometimes pipe(2) can return ENFILE, which has nothing to do with number of open files. It happens when we run out of kva. However, pipe(2) can also return correct ENFILE, when falloc() fails. 
The only way to distinguish between the latter error and the former is to look at the 'dmesg' output right after the failure. If your dmesg is later overwritten with some other logging, you will lose this information, and you won't be able to tell why the pipe(2) syscall returned ENFILE yesterday - did it hit the descriptor limit or run out of kva? Recently I have been in contact with people who have problems with the mpd port. They experienced ENFILE from pipe(2). They spent some time tuning descriptor limits, and I spent some time trying to help them, until we looked into dmesg, and then into the source, and discovered the alternative meaning of ENFILE. Any objections to the attached change? It should make pipe(2) return ENOMEM when kva runs out. Yes, according to SUSv3 the only errors from pipe(2) are ENFILE and EMFILE. I think that blindly following the standard in this case will lead to confusion (see above). And we already return EFAULT from pipe(2), which is not described in the standard. -- Totus tuus, Glebius. 
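From the caller's side, the point of the proposal can be illustrated with a small userland sketch. The helper names are hypothetical, and the ENOMEM case assumes the behaviour proposed in this message rather than what any particular kernel does today.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: map an errno value from a failed pipe(2) to a
 * hint for the operator. The ENOMEM case assumes the proposed change.
 */
const char *
pipe_failure_hint(int err)
{
	switch (err) {
	case EMFILE:
		return ("per-process descriptor limit reached");
	case ENFILE:
		return ("system-wide file table full");
	case ENOMEM:
		return ("kernel memory (e.g. pipe kva) exhausted");
	default:
		return (strerror(err));
	}
}

/* Create a pipe; on failure, report the likely cause and return -1. */
int
make_pipe(int fds[2])
{
	if (pipe(fds) == -1) {
		fprintf(stderr, "pipe: %s\n", pipe_failure_hint(errno));
		return (-1);
	}
	return (0);
}
```

With distinct error codes, a program can log the right hint at the moment of failure, so nobody has to catch the dmesg output before it scrolls away or tune descriptor limits that were never the problem.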
GLEBIUS-RIPN GLEB-RIPE --O8/n5iBOhiUtMkxf Content-Type: text/plain; charset=koi8-r Content-Disposition: attachment; filename="sys_pipe.c.diff"

Index: sys_pipe.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/sys_pipe.c,v
retrieving revision 1.185
diff -u -r1.185 sys_pipe.c
--- sys_pipe.c	16 Dec 2005 18:32:39 -0000	1.185
+++ sys_pipe.c	27 Jan 2006 09:15:56 -0000
@@ -357,10 +357,11 @@
 	    NULL);
 
 	/* Only the forward direction pipe is backed by default */
-	if (pipe_create(rpipe, 1) || pipe_create(wpipe, 0)) {
+	if ((error = pipe_create(rpipe, 1)) != 0 ||
+	    (error = pipe_create(wpipe, 0)) != 0) {
 		pipeclose(rpipe);
 		pipeclose(wpipe);
-		return (ENFILE);
+		return (error);
 	}
 
 	rpipe->pipe_state |= PIPE_DIRECTOK;

--O8/n5iBOhiUtMkxf-- From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 16:56:48 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5897A16A420; Fri, 27 Jan 2006 16:56:48 +0000 (GMT) (envelope-from oberman@es.net) Received: from postal4.es.net (postal4.es.net [198.124.252.66]) by mx1.FreeBSD.org (Postfix) with ESMTP id CD2F343D5A; Fri, 27 Jan 2006 16:56:47 +0000 (GMT) (envelope-from oberman@es.net) Received: from ptavv.es.net ([198.128.4.29]) by postal4.es.net (Postal Node 4) with ESMTP (SSL) id IBA74465; Fri, 27 Jan 2006 08:56:45 -0800 Received: from ptavv.es.net (localhost [127.0.0.1]) by ptavv.es.net (Tachyon Server) with ESMTP id 4F9F245083; Fri, 27 Jan 2006 08:56:44 -0800 (PST) To: Mike Jakubik In-reply-to: Your message of "Fri, 27 Jan 2006 04:03:14 EST." 
<43D9E1D2.6060207@rogers.com> Date: Fri, 27 Jan 2006 08:56:44 -0800 From: "Kevin Oberman" Message-Id: <20060127165644.4F9F245083@ptavv.es.net> Cc: John-Mark Gurney , current@freebsd.org, Brian Candler , Peter Jeremy , arch@freebsd.org, Poul-Henning Kamp Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 16:56:48 -0000 > Date: Fri, 27 Jan 2006 04:03:14 -0500 > From: Mike Jakubik > Sender: owner-freebsd-current@freebsd.org > > Brian Candler wrote: > > I guess this is OK, *if* you trust the power management system to do its job > > properly. Unfortunately I have very bad experiences of such things. In many > > cases I've ended up turning off power management completely and locking > > everything at max clock speed. Mind you, if I do that, anything you do with > > scaling factors isn't going to affect me, so actually I don't really care. > > I'll shut up now :-) > > > > Let's not forget, FreeBSD is really a server OS. Who in their right mind > uses power saving features on a server? It sounds nice in theory, but > doesn't work as well. Playing devil's advocate a bit, don't forget that thermal management will throttle performance even if it is set to maximum (as it should be on a compute server). This should never happen, but fans and air coolers do fail. -- R. Kevin Oberman, Network Engineer Energy Sciences Network (ESnet) Ernest O. 
Lawrence Berkeley National Laboratory (Berkeley Lab) E-mail: oberman@es.net Phone: +1 510 486-8634 From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 17:47:50 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DF9D016A5FE; Fri, 27 Jan 2006 17:47:35 +0000 (GMT) (envelope-from wollman@khavrinen.csail.mit.edu) Received: from khavrinen.csail.mit.edu (khavrinen.csail.mit.edu [128.30.28.20]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3690344426; Fri, 27 Jan 2006 17:24:34 +0000 (GMT) (envelope-from wollman@khavrinen.csail.mit.edu) Received: from khavrinen.csail.mit.edu (localhost.csail.mit.edu [127.0.0.1]) by khavrinen.csail.mit.edu (8.13.1/8.13.4) with ESMTP id k0RHOURf034302 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK CN=khavrinen.csail.mit.edu issuer=Client+20CA); Fri, 27 Jan 2006 12:24:32 -0500 (EST) (envelope-from wollman@khavrinen.csail.mit.edu) Received: (from wollman@localhost) by khavrinen.csail.mit.edu (8.13.1/8.13.4/Submit) id k0RHOUat034301; Fri, 27 Jan 2006 12:24:30 -0500 (EST) (envelope-from wollman) Date: Fri, 27 Jan 2006 12:24:30 -0500 (EST) From: Garrett Wollman Message-Id: <200601271724.k0RHOUat034301@khavrinen.csail.mit.edu> To: glebius@freebsd.org X-Newsgroups: mit.lcs.mail.freebsd-arch In-Reply-To: <20060127093602.GO83922@cell.sick.ru> Organization: MIT Computer Science & Artificial Intelligence Lab X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-2.0.2 (khavrinen.csail.mit.edu [127.0.0.1]); Fri, 27 Jan 2006 12:24:33 -0500 (EST) X-Spam-Status: No, score=-0.0 required=5.0 tests=SPF_HELO_PASS,SPF_PASS version=3.1.0 X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on khavrinen.csail.mit.edu Cc: arch@freebsd.org Subject: Re: fix return code for pipe(2) syscall X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion 
related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 17:47:50 -0000 In article <20060127093602.GO83922@cell.sick.ru> you write: >Yes, according to SUSv3 the only errors from pipe(2) are ENFILE >and EMFILE. POSIX does not define an exhaustive enumeration of error conditions. *Any* error return is permissible, provided only that *for those conditions noted in the ERRORS section* the code identified for that condition is returned. It is perfectly permissible for every system call to fail with [ENOTADUCK] unless the first five bytes of the caller's address space contain the word "quack". -GAWollman From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 18:21:50 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 507CF16A420; Fri, 27 Jan 2006 18:21:50 +0000 (GMT) (envelope-from julian@elischer.org) Received: from a50.ironport.com (a50.ironport.com [63.251.108.112]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1571843D5A; Fri, 27 Jan 2006 18:21:50 +0000 (GMT) (envelope-from julian@elischer.org) Received: from unknown (HELO [10.251.17.229]) ([10.251.17.229]) by a50.ironport.com with ESMTP; 27 Jan 2006 10:21:49 -0800 Message-ID: <43DA64BD.2070805@elischer.org> Date: Fri, 27 Jan 2006 10:21:49 -0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.11) Gecko/20050727 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Garrett Wollman References: <200601271724.k0RHOUat034301@khavrinen.csail.mit.edu> In-Reply-To: <200601271724.k0RHOUat034301@khavrinen.csail.mit.edu> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org, glebius@freebsd.org Subject: Re: fix return code for pipe(2) syscall X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 
Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 18:21:50 -0000 Garrett Wollman wrote: >In article <20060127093602.GO83922@cell.sick.ru> you write: > > > >>Yes, according to SUSv3 the only errors from pipe(2) are ENFILE >>and EMFILE. >> >> > >POSIX does not define an exhaustive enumeration of error conditions. >*Any* error return is permissible, provided only that *for those >conditions noted in the ERRORS section* the code identified for that >condition is returned. It is perfectly permissible for every system >call to fail with [ENOTADUCK] unless the first five bytes of the >caller's address space contain the word "quack". > > > I like it. We should implement this asap. >-GAWollman > >_______________________________________________ >freebsd-arch@freebsd.org mailing list >http://lists.freebsd.org/mailman/listinfo/freebsd-arch >To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > > From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 18:30:11 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EFE0C16A427; Fri, 27 Jan 2006 18:30:11 +0000 (GMT) (envelope-from drosih@rpi.edu) Received: from smtp3.server.rpi.edu (smtp3.server.rpi.edu [128.113.2.3]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8CA1B43D6A; Fri, 27 Jan 2006 18:30:08 +0000 (GMT) (envelope-from drosih@rpi.edu) Received: from [128.113.24.47] (gilead.netel.rpi.edu [128.113.24.47]) by smtp3.server.rpi.edu (8.13.0/8.13.0) with ESMTP id k0RIU49a027772; Fri, 27 Jan 2006 13:30:05 -0500 Mime-Version: 1.0 Message-Id: In-Reply-To: <43D9E1D2.6060207@rogers.com> References: <20060125201450.GE25397@cirb503493.alcatel.com.au> <56988.1138220896@critter.freebsd.dk> <20060126101138.GA40773@uk.tiscali.com> 
<20060127024432.GT69162@funkthat.com> <20060127085653.GA51554@uk.tiscali.com> <43D9E1D2.6060207@rogers.com> Date: Fri, 27 Jan 2006 13:30:03 -0500 To: Mike Jakubik From: Garance A Drosihn Content-Type: text/plain; charset="us-ascii" ; format="flowed" X-CanItPRO-Stream: default X-RPI-SA-Score: undef - spam-scanning disabled X-Scanned-By: CanIt (www . canit . ca) on 128.113.2.3 Cc: arch@freebsd.org, current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 18:30:12 -0000 At 4:03 AM -0500 1/27/06, Mike Jakubik wrote: > >Let's not forget, FreeBSD is really a server OS. Who in their right >mind uses power saving features on a server? It sounds nice in >theory, but doesn't work as well. Apparently your power and cooling bills are much lower than ours. We would very much love it if the computers will use only the energy they need to get the job done. Yes, that means a big bill when some simulation is running on a 100-node beowulf cluster. But it also means we don't want to be paying the bill to run that cluster at full-throttle when there's no work for those CPU's to do. 
-- Garance Alistair Drosehn = gad@gilead.netel.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 18:37:18 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2D02D16A425; Fri, 27 Jan 2006 18:37:18 +0000 (GMT) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail6.speedfactory.net [66.23.216.219]) by mx1.FreeBSD.org (Postfix) with ESMTP id DBD2744003; Fri, 27 Jan 2006 14:27:11 +0000 (GMT) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.5b3) with ESMTP id 7114863 for multiple; Fri, 27 Jan 2006 09:26:02 -0500 Received: from localhost (john@localhost [127.0.0.1]) by server.baldwin.cx (8.13.4/8.13.4) with ESMTP id k0RER9YS051089; Fri, 27 Jan 2006 09:27:09 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: freebsd-arch@freebsd.org Date: Fri, 27 Jan 2006 09:27:30 -0500 User-Agent: KMail/1.9.1 References: <20060125201450.GE25397@cirb503493.alcatel.com.au> <20060127085653.GA51554@uk.tiscali.com> <43D9E1D2.6060207@rogers.com> In-Reply-To: <43D9E1D2.6060207@rogers.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200601270927.33772.jhb@freebsd.org> X-Virus-Scanned: ClamAV 0.87.1/1253/Fri Jan 27 05:10:20 2006 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-1.4 required=4.2 tests=ALL_TRUSTED autolearn=failed version=3.1.0 X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on server.baldwin.cx X-Server: High Performance Mail Server - http://surgemail.com r=1653887525 Cc: John-Mark Gurney , current@freebsd.org, Brian Candler , Mike Jakubik , arch@freebsd.org, Poul-Henning Kamp , Peter Jeremy Subject: 
Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 18:37:19 -0000 On Friday 27 January 2006 04:03, Mike Jakubik wrote: > Brian Candler wrote: > > I guess this is OK, *if* you trust the power management system to do its > > job properly. Unfortunately I have very bad experiences of such things. > > In many cases I've ended up turning off power management completely and > > locking everything at max clock speed. Mind you, if I do that, anything > > you do with scaling factors isn't going to affect me, so actually I don't > > really care. I'll shut up now :-) > > Let's not forget, FreeBSD is really a server OS. Who in their right mind > uses power saving features on a server? It sounds nice in theory, but > doesn't work as well. People with lots and lots of servers who need to keep the entire power load down to avoid overloading their power source. Also, said people might care about the cost of their power bills. 
-- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve" = http://www.FreeBSD.org From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 18:39:46 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9368116A420; Fri, 27 Jan 2006 18:39:46 +0000 (GMT) (envelope-from Adam.Mullen@c-b.com) Received: from FW1-AP69.c-b.com (translation.c-b.com [216.141.109.4]) by mx1.FreeBSD.org (Postfix) with SMTP id CF0AC43D46; Fri, 27 Jan 2006 18:39:45 +0000 (GMT) (envelope-from Adam.Mullen@c-b.com) Received: from (172.16.5.23) by FW1-AP69.c-b.com via smtp id 616b_99e86ff8_8f63_11da_8287_001143cddd1e; Fri, 27 Jan 2006 12:34:50 -0600 Received: from fw1-ex03.c-b.net ([172.16.5.21]) by fw1-ex06.c-b.net with Microsoft SMTPSVC(6.0.3790.211); Fri, 27 Jan 2006 12:39:44 -0600 X-MimeOLE: Produced By Microsoft Exchange V6.5.6944.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Date: Fri, 27 Jan 2006 12:39:44 -0600 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [TEST/REVIEW] CPU accounting patches Thread-Index: AcYjcDkP3iWwnS52S4iM7aejlhgt9gAAFEZQ From: "Mullen, Adam" To: "Garance A Drosihn" , "Mike Jakubik" X-OriginalArrivalTime: 27 Jan 2006 18:39:44.0885 (UTC) FILETIME=[0AF6A250:01C62371] X-NAIMIME-Disclaimer: 1 X-NAIMIME-Modified: 1 Cc: arch@freebsd.org, current@freebsd.org Subject: RE: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 18:39:46 -0000 Lets see here though, how many of us have a 100 node beowulf cluster who use FreeBSD daily? I know I don't! 
I've been following this conversation and my 2 cents are this, taking into consideration cpu time accounting for machines that use power management is trivial. I view those that use power management are a lot less likely to be in an environment where they are either A) billing for utility computing, B) care about 99.9999% accurate cpu time accounting, C) have a need for this precision. My thought is that a blurb be submitted with this patch specifying the caveats if powersaved is also used. Just my humble opinion. -----Original Message----- From: owner-freebsd-arch@freebsd.org [mailto:owner-freebsd-arch@freebsd.org] On Behalf Of Garance A Drosihn Sent: Friday, January 27, 2006 12:30 PM To: Mike Jakubik Cc: arch@freebsd.org; current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches At 4:03 AM -0500 1/27/06, Mike Jakubik wrote: > >Let's not forget, FreeBSD is really a server OS. Who in their right >mind uses power saving features on a server? It sounds nice in theory, >but doesn't work as well. Apparently your power and cooling bills are much lower than ours. We would very much love it if the computers will use only the energy they need to get the job done. Yes, that means a big bill when some simulation is running on a 100-node beowulf cluster. But it also means we don't want to be paying the bill to run that cluster at full-throttle when there's no work for those CPU's to do. -- Garance Alistair Drosehn = gad@gilead.netel.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu _______________________________________________ freebsd-arch@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-arch To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. 
Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secured or error-free as information could be intercepted, corrupted, lost, destroyed, received late or incomplete, or could contain viruses. The sender therefore does not accept liability for any error or omission in the contents of this message, which arises as a result of e-mail transmission. If verification is required, please request a hard-copy version from the sender. From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 18:54:30 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E516816A420; Fri, 27 Jan 2006 18:54:30 +0000 (GMT) (envelope-from frank@exit.com) Received: from tinker.exit.com (tinker.exit.com [206.223.0.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3C4B943D53; Fri, 27 Jan 2006 18:54:25 +0000 (GMT) (envelope-from frank@exit.com) Received: from realtime.exit.com (realtime [206.223.0.5]) by tinker.exit.com (8.13.4/8.13.4) with ESMTP id k0RIsMdP065621; Fri, 27 Jan 2006 10:54:22 -0800 (PST) (envelope-from frank@exit.com) Received: from realtime.exit.com (localhost [127.0.0.1]) by realtime.exit.com (8.13.4/8.13.4) with ESMTP id k0RIsKNu056808; Fri, 27 Jan 2006 10:54:20 -0800 (PST) (envelope-from frank@exit.com) Received: (from frank@localhost) by realtime.exit.com (8.13.4/8.13.4/Submit) id k0RIsKnC056807; Fri, 27 Jan 2006 10:54:20 -0800 (PST) (envelope-from frank@exit.com) X-Authentication-Warning: realtime.exit.com: frank set sender to frank@exit.com using -f From: Frank Mayhar To: "Mullen, Adam" In-Reply-To: References: Content-Type: text/plain Content-Transfer-Encoding: 7bit Organization: Exit Consulting Date: Fri, 27 Jan 2006 10:54:19 -0800 Message-Id: <1138388059.56322.9.camel@realtime.exit.com> 
Mime-Version: 1.0 X-Mailer: Evolution 2.4.1 FreeBSD GNOME Team Port X-Virus-Scanned: ClamAV 0.87.1/1253/Fri Jan 27 02:10:20 2006 on tinker.exit.com X-Virus-Status: Clean Cc: Mike Jakubik , Garance A Drosihn , current@freebsd.org, arch@freebsd.org Subject: RE: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: frank@exit.com List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 18:54:31 -0000 On Fri, 2006-01-27 at 12:39 -0600, Mullen, Adam wrote: > Lets see here though, how many of us have a 100 node beowulf cluster who > use FreeBSD daily? I know I don't! I've been following this > conversation and my 2 cents are this, taking into consideration cpu time > accounting for machines that use power management is trivial. I view > those that use power management are a lot less likely to be in an > environment where they are either A) billing for utility computing, B) > care about 99.9999% accurate cpu time accounting, C) have a need for > this precision. My thought is that a blurb be submitted with this patch > specifying the caveats if powersaved is also used. Just my humble > opinion. Hey, you don't have to run a 100-node Beowulf cluster to benefit from this. It would be nice if my systems in my home office would throttle down when they're relatively idle; that might defray some of the several-hundred-dollar electric bill I pay monthly, and during the summer it might even keep the room cooler. 
-- Frank Mayhar frank@exit.com http://www.exit.com/ Exit Consulting http://www.gpsclock.com/ http://www.exit.com/blog/frank/ From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 19:01:47 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6FEF916A420; Fri, 27 Jan 2006 19:01:47 +0000 (GMT) (envelope-from drosih@rpi.edu) Received: from smtp3.server.rpi.edu (smtp3.server.rpi.edu [128.113.2.3]) by mx1.FreeBSD.org (Postfix) with ESMTP id EDC3343D49; Fri, 27 Jan 2006 19:01:46 +0000 (GMT) (envelope-from drosih@rpi.edu) Received: from [128.113.24.47] (gilead.netel.rpi.edu [128.113.24.47]) by smtp3.server.rpi.edu (8.13.0/8.13.0) with ESMTP id k0RJ1eC6009616; Fri, 27 Jan 2006 14:01:41 -0500 Mime-Version: 1.0 Message-Id: In-Reply-To: References: Date: Fri, 27 Jan 2006 14:01:40 -0500 To: "Mullen, Adam" From: Garance A Drosihn Content-Type: text/plain; charset="us-ascii" ; format="flowed" X-CanItPRO-Stream: default X-RPI-SA-Score: undef - spam-scanning disabled X-Scanned-By: CanIt (www . canit . ca) on 128.113.2.3 Cc: arch@freebsd.org, freebsd-current@freebsd.org Subject: RE: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 19:01:47 -0000 At 12:39 PM -0600 1/27/06, Mullen, Adam wrote: >Lets see here though, how many of us have a 100 node beowulf >cluster who use FreeBSD daily? I know I don't! That isn't the question I am answering. The question was "Who in their right mind users power saving features on a server OS?". And the answer is "People who run LOTS of servers". That is all. I'm not saying *you* have to run power-saving features. 
I am saying that there are people who use a SERVER operating system, and who have very good ($$$$$$$$$$$$$$$$) reasons to care about power consumption. People who on the one hand do want infinite CPU power, but still have to balance that desire with the monthly bill for energy and cooling. -- Garance Alistair Drosehn = gad@gilead.netel.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 19:06:39 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5908116A420; Fri, 27 Jan 2006 19:06:39 +0000 (GMT) (envelope-from Adam.Mullen@c-b.com) Received: from FW1-AP69.c-b.com (translation.c-b.com [216.141.109.4]) by mx1.FreeBSD.org (Postfix) with SMTP id 7607B43D55; Fri, 27 Jan 2006 19:06:37 +0000 (GMT) (envelope-from Adam.Mullen@c-b.com) Received: from (172.16.5.23) by FW1-AP69.c-b.com via smtp id 20c2_5a93d46a_8f67_11da_9615_001143cddd1e; Fri, 27 Jan 2006 13:01:42 -0600 Received: from fw1-ex03.c-b.net ([172.16.5.21]) by fw1-ex06.c-b.net with Microsoft SMTPSVC(6.0.3790.211); Fri, 27 Jan 2006 13:06:27 -0600 X-MimeOLE: Produced By Microsoft Exchange V6.5.6944.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Date: Fri, 27 Jan 2006 13:06:26 -0600 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [TEST/REVIEW] CPU accounting patches Thread-Index: AcYjdCi5B1dAU6HrRySqZTXCigQmUgAAD3awAAASvEA= From: "Mullen, Adam" To: , X-OriginalArrivalTime: 27 Jan 2006 19:06:27.0200 (UTC) FILETIME=[C6048000:01C62374] X-NAIMIME-Disclaimer: 1 X-NAIMIME-Modified: 1 Cc: Subject: FW: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD 
architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 19:06:39 -0000 Yes, I clearly understand. My point still stands though, if an admin wishes to use power saving techniques for whatever reason, that choice shouldn't deter the rest of us from gaining a more precise accountability for our CPU cycles. Thus why I suggested a blurb be added with this patch stating the ramifications of using power saving techniques with this method of cycle sampling. -----Original Message----- From: Garance A Drosihn [mailto:drosih@rpi.edu] Sent: Friday, January 27, 2006 1:02 PM To: Mullen, Adam Cc: arch@freebsd.org; freebsd-current@freebsd.org Subject: RE: [TEST/REVIEW] CPU accounting patches At 12:39 PM -0600 1/27/06, Mullen, Adam wrote: >Lets see here though, how many of us have a 100 node beowulf cluster >who use FreeBSD daily? I know I don't! That isn't the question I am answering. The question was "Who in their right mind users power saving features on a server OS?". And the answer is "People who run LOTS of servers". That is all. I'm not saying *you* have to run power-saving features. I am saying that there are people who use a SERVER operating system, and who have very good ($$$$$$$$$$$$$$$$) reasons to care about power consumption. People who on the one hand do want infinite CPU power, but still have to balance that desire with the monthly bill for energy and cooling. -- Garance Alistair Drosehn = gad@gilead.netel.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu
From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 19:57:10 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DB5EA16A420 for ; Fri, 27 Jan 2006 19:57:10 +0000 (GMT) (envelope-from mikej@rogers.com) Received: from smtp109.rog.mail.re2.yahoo.com (smtp109.rog.mail.re2.yahoo.com [68.142.225.207]) by mx1.FreeBSD.org (Postfix) with SMTP id 5C92543D55 for ; Fri, 27 Jan 2006 19:57:09 +0000 (GMT) (envelope-from mikej@rogers.com) Received: (qmail 62239 invoked from network); 27 Jan 2006 19:57:08 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=rogers.com; h=Received:Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject:References:In-Reply-To:Content-Type:Content-Transfer-Encoding; b=OAoEDrBJ2pjXNYAAF7sKgxb5T9NZowwiYbbcOsEKEpSZFIgds0Ymh27dKtD51QM+viDCXq8nqZMMIbg5zQIl8X5QqmQ+w7XF5/llBOSFyD3w7uT+H/OXEdxQnom19F5bFEmIsjVk88G//+UkbDflU5gPZom7ee7ATawvvqk8x+Q= ; Received: from unknown (HELO ?70.30.133.184?)
(mikej@rogers.com@70.30.133.184 with plain) by smtp109.rog.mail.re2.yahoo.com with SMTP; 27 Jan 2006 19:57:08 -0000 Message-ID: <43DA7B1B.4090704@rogers.com> Date: Fri, 27 Jan 2006 14:57:15 -0500 From: Mike Jakubik User-Agent: Thunderbird 1.5 (Windows/20051201) MIME-Version: 1.0 To: John Baldwin References: <20060125201450.GE25397@cirb503493.alcatel.com.au> <20060127085653.GA51554@uk.tiscali.com> <43D9E1D2.6060207@rogers.com> <200601270927.33772.jhb@freebsd.org> In-Reply-To: <200601270927.33772.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: John-Mark Gurney , current@freebsd.org, Brian Candler , Peter Jeremy , arch@freebsd.org, Poul-Henning Kamp , freebsd-arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 19:57:11 -0000 John Baldwin wrote: > People with lots and lots of servers who need to keep the entire power load > down to avoid overloading their power source. Also, said people might care > about the cost of their power bills. > > Well thats fine and all, its not like the patch will prevent powerd from working, right? P.S. I don't know of any colo that charges for power usage. 
From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 20:16:02 2006 Return-Path: X-Original-To: freebsd-arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CE92516A42A; Fri, 27 Jan 2006 20:16:02 +0000 (GMT) (envelope-from gad@FreeBSD.org) Received: from smtp3.server.rpi.edu (smtp3.server.rpi.edu [128.113.2.3]) by mx1.FreeBSD.org (Postfix) with ESMTP id D8E7343D53; Fri, 27 Jan 2006 20:15:57 +0000 (GMT) (envelope-from gad@FreeBSD.org) Received: from [128.113.24.47] (gilead.netel.rpi.edu [128.113.24.47]) by smtp3.server.rpi.edu (8.13.0/8.13.0) with ESMTP id k0RKFsgT009301; Fri, 27 Jan 2006 15:15:55 -0500 Mime-Version: 1.0 Message-Id: In-Reply-To: <29245.1138186687@critter.freebsd.dk> References: <29245.1138186687@critter.freebsd.dk> Date: Fri, 27 Jan 2006 15:15:53 -0500 To: "Poul-Henning Kamp" From: Garance A Drosehn Content-Type: text/plain; charset="us-ascii" ; format="flowed" X-CanItPRO-Stream: default X-RPI-SA-Score: undef - spam-scanning disabled X-Scanned-By: CanIt (www . canit . ca) on 128.113.2.3 Cc: freebsd-current@FreeBSD.org, freebsd-arch@FreeBSD.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 20:16:03 -0000 At 11:58 AM +0100 1/25/06, Poul-Henning Kamp wrote: > >With my definition you would be more likely to see lower numbers >maybe > user 0.20 sys 0.03 real 4.00 > >And they would have meaning, they should be pretty much the same >no matter what speed your CPU runs at any instant in time. > >In theory, it should be possible to compare user/sys numbers >you collect while running at 75 MHz with the ones you got >under full steam at 1600 MHz. 
> >In practice however, things that run on the real time, HZ >interrupting to run hardclock() for instance, will still make >comparison of such numbers quite shaky. > >But at least they will not be random as they are now. Here at RPI we used to have a mainframe, and we used to charge by the CPU second, so I am familiar with that side of the question. However, I am not too concerned by it for my own interests. For one, we don't charge by CPU second any more. For two, even if we did start charging again, we would just come up with some other metric, or simply pick a different rate for charging. The other big usage for timing programs is to compare the performance of various algorithms. We have always had users who cared very much about the accuracy and consistency of such measurements, whether or not we were charging people by the "CPU second". Based on the above description, the new CPU accounting patches will make those comparisons more meaningful, since the values measured will be the same no matter what speed the CPU is running at. As such, I think it's a good idea, even if we ignore the performance improvement. Rambling part: Getting back to the question of charging, I can almost convince myself that these changes are also a good idea for when those values are used for charging. When we (RPI) charged for CPU time, we weren't really charging "for CPU seconds". We were charging to say "when we are forced to buy a new computer because this computer is maxed out, then how much of that load (and thus the expense for the new computer) is the fault of any given user?". Thus, if we had a computer which could vary it's speed, we don't really care about "running out of CPU seconds" if the CPU is in fact running at half-speed. We only incur the cost of a new machine once we "run out of CPU seconds" when it is running at *maximum* speed. 
Furthermore, if we had a load which was low enough that we *could* get it all done by running the CPU at half-speed, then we (as the computer center) would *prefer* to run it at half- speed. That way, we reduce power and cooling costs. However, we will create extreme hostility in our users if we save that money, only to charge them twice as much because they are now forced to use "twice as many CPU seconds" when we run the CPU at half-speed. Oddly enough, consistency is also the big issue when it comes to charging. The user expects to see the exact same charge every time they run a specific job, and not see their charges vary by a dramatic amount due issues they have no control over. Poul-Henning says the values will "not be as random as they are now". If someone *is* charging by the CPU second, then they don't want those values to be "random". People who receive random bills can get really really hostile, perhaps to the point of bringing in lawyers. (and I have seen that happen). So, it seems to me that this change is *always* the behavior that everyone would prefer. Yes, we have to describe it. And maybe we should call the value something other than "CPU seconds" to make that clear, although I don't know what would be a better name. But I think I have convinced myself that there is no downside to these proposed changes. ...Assuming the changes work, of course! 
-- Garance Alistair Drosehn = gad@gilead.netel.rpi.edu Senior Systems Programmer or gad@FreeBSD.org Rensselaer Polytechnic Institute; Troy, NY; USA From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 20:28:10 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3007916A420; Fri, 27 Jan 2006 20:28:10 +0000 (GMT) (envelope-from fullermd@over-yonder.net) Received: from mail.localelinks.com (web.localelinks.com [64.39.75.54]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4393A43D6B; Fri, 27 Jan 2006 20:28:02 +0000 (GMT) (envelope-from fullermd@over-yonder.net) Received: from draco.over-yonder.net (adsl-072-148-013-213.sip.jan.bellsouth.net [72.148.13.213]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.localelinks.com (Postfix) with ESMTP id 3815FAD; Fri, 27 Jan 2006 14:28:02 -0600 (CST) Received: by draco.over-yonder.net (Postfix, from userid 100) id 1743061C38; Fri, 27 Jan 2006 14:28:01 -0600 (CST) Date: Fri, 27 Jan 2006 14:28:00 -0600 From: "Matthew D. Fuller" To: Garrett Wollman Message-ID: <20060127202800.GB1388@over-yonder.net> References: <20060127093602.GO83922@cell.sick.ru> <200601271724.k0RHOUat034301@khavrinen.csail.mit.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200601271724.k0RHOUat034301@khavrinen.csail.mit.edu> X-Editor: vi X-OS: FreeBSD User-Agent: Mutt/1.5.11-fullermd.2 Cc: arch@freebsd.org, glebius@freebsd.org Subject: Re: fix return code for pipe(2) syscall X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 20:28:10 -0000 On Fri, Jan 27, 2006 at 12:24:30PM -0500 I heard the voice of Garrett Wollman, and lo! 
it spake thus: > > It is perfectly permissible for every system call to fail with > [ENOTADUCK] unless the first five bytes of the caller's address > space contain the word "quack". Ducks don't byte, they just nybble a lyttle. -- Matthew Fuller (MF4839) | fullermd@over-yonder.net Systems/Network Administrator | http://www.over-yonder.net/~fullermd/ On the Internet, nobody can hear you scream. From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 20:48:34 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9D1E316A420; Fri, 27 Jan 2006 20:48:34 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0509643D48; Fri, 27 Jan 2006 20:48:33 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from [192.168.254.14] (imini.samsco.home [192.168.254.14]) (authenticated bits=0) by pooker.samsco.org (8.13.4/8.13.4) with ESMTP id k0RKmVUW047825; Fri, 27 Jan 2006 13:48:31 -0700 (MST) (envelope-from scottl@samsco.org) Message-ID: <43DA871F.8020707@samsco.org> Date: Fri, 27 Jan 2006 13:48:31 -0700 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.7) Gecko/20050416 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Garance A Drosehn References: <29245.1138186687@critter.freebsd.dk> In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-1.4 required=3.8 tests=ALL_TRUSTED autolearn=failed version=3.1.0 X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on pooker.samsco.org Cc: Poul-Henning Kamp , freebsd-current@freebsd.org, freebsd-arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD 
architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 20:48:34 -0000 Garance A Drosehn wrote: > At 11:58 AM +0100 1/25/06, Poul-Henning Kamp wrote: > >> >> With my definition you would be more likely to see lower numbers >> maybe >> user 0.20 sys 0.03 real 4.00 >> >> And they would have meaning, they should be pretty much the same >> no matter what speed your CPU runs at any instant in time. >> >> In theory, it should be possible to compare user/sys numbers >> you collect while running at 75 MHz with the ones you got >> under full steam at 1600 MHz. >> >> In practice however, things that run on the real time, HZ >> interrupting to run hardclock() for instance, will still make >> comparison of such numbers quite shaky. >> >> But at least they will not be random as they are now. > > > Here at RPI we used to have a mainframe, and we used to charge > by the CPU second, so I am familiar with that side of the > question. However, I am not too concerned by it for my own > interests. For one, we don't charge by CPU second any more. For > two, even if we did start charging again, we would just come up > with some other metric, or simply pick a different rate for > charging. > > The other big usage for timing programs is to compare the > performance of various algorithms. We have always had users who > cared very much about the accuracy and consistency of such > measurements, whether or not we were charging people by the "CPU > second". Based on the above description, the new CPU accounting > patches will make those comparisons more meaningful, since the > values measured will be the same no matter what speed the CPU is > running at. As such, I think it's a good idea, even if we ignore > the performance improvement. > > Rambling part: > > Getting back to the question of charging, I can almost convince > myself that these changes are also a good idea for when those > values are used for charging. 
When we (RPI) charged for CPU time, > we weren't really charging "for CPU seconds". We were charging > to say "when we are forced to buy a new computer because this > computer is maxed out, then how much of that load (and thus the > expense for the new computer) is the fault of any given user?". > > Thus, if we had a computer which could vary it's speed, we don't > really care about "running out of CPU seconds" if the CPU is in > fact running at half-speed. We only incur the cost of a new > machine once we "run out of CPU seconds" when it is running at > *maximum* speed. > > Furthermore, if we had a load which was low enough that we > *could* get it all done by running the CPU at half-speed, then > we (as the computer center) would *prefer* to run it at half- > speed. That way, we reduce power and cooling costs. However, > we will create extreme hostility in our users if we save that > money, only to charge them twice as much because they are now > forced to use "twice as many CPU seconds" when we run the CPU > at half-speed. Oddly enough, consistency is also the big issue > when it comes to charging. The user expects to see the exact > same charge every time they run a specific job, and not see > their charges vary by a dramatic amount due issues they have > no control over. > > Poul-Henning says the values will "not be as random as they are now". > If someone *is* charging by the CPU second, then they don't want > those values to be "random". People who receive random bills can > get really really hostile, perhaps to the point of bringing in > lawyers. (and I have seen that happen). > > So, it seems to me that this change is *always* the behavior that > everyone would prefer. Yes, we have to describe it. And maybe we > should call the value something other than "CPU seconds" to make > that clear, although I don't know what would be a better name. But > I think I have convinced myself that there is no downside to these > proposed changes. 
> > ...Assuming the changes work, of course! > Just call it 'cpu cycles'. If I have a job that calls nop 1 billion times, then I expect to get charged for 1 billion cycles regardless of if it takes 1 second or 5 days to run. I agree completely with your argument for consistency and that this will improve consistency and predictability. Scott From owner-freebsd-arch@FreeBSD.ORG Fri Jan 27 21:16:11 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 17CB216A422; Fri, 27 Jan 2006 21:16:11 +0000 (GMT) (envelope-from brdavis@odin.ac.hmc.edu) Received: from odin.ac.hmc.edu (Odin.AC.HMC.Edu [134.173.32.75]) by mx1.FreeBSD.org (Postfix) with ESMTP id 97D4F43D4C; Fri, 27 Jan 2006 21:16:10 +0000 (GMT) (envelope-from brdavis@odin.ac.hmc.edu) Received: from odin.ac.hmc.edu (localhost.localdomain [127.0.0.1]) by odin.ac.hmc.edu (8.13.0/8.13.0) with ESMTP id k0RLG9VP030304; Fri, 27 Jan 2006 13:16:09 -0800 Received: (from brdavis@localhost) by odin.ac.hmc.edu (8.13.0/8.13.0/Submit) id k0RLG9vr030303; Fri, 27 Jan 2006 13:16:09 -0800 Date: Fri, 27 Jan 2006 13:16:09 -0800 From: Brooks Davis To: Scott Long Message-ID: <20060127211609.GD20549@odin.ac.hmc.edu> References: <29245.1138186687@critter.freebsd.dk> <43DA871F.8020707@samsco.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="UfEAyuTBtIjiZzX6" Content-Disposition: inline In-Reply-To: <43DA871F.8020707@samsco.org> User-Agent: Mutt/1.4.1i X-Virus-Scanned: by amavisd-new X-Spam-Status: No, hits=0.0 required=8.0 tests=none autolearn=no version=2.63 X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on odin.ac.hmc.edu Cc: Poul-Henning Kamp , freebsd-current@freebsd.org, Garance A Drosehn , freebsd-arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org 
X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jan 2006 21:16:11 -0000 --UfEAyuTBtIjiZzX6 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Jan 27, 2006 at 01:48:31PM -0700, Scott Long wrote: > Garance A Drosehn wrote: > > >At 11:58 AM +0100 1/25/06, Poul-Henning Kamp wrote: > > > >> > >>With my definition you would be more likely to see lower numbers > >>maybe > >> user 0.20 sys 0.03 real 4.00 > >> > >>And they would have meaning, they should be pretty much the same > >>no matter what speed your CPU runs at any instant in time. > >> > >>In theory, it should be possible to compare user/sys numbers > >>you collect while running at 75 MHz with the ones you got > >>under full steam at 1600 MHz. > >> > >>In practice however, things that run on the real time, HZ > >>interrupting to run hardclock() for instance, will still make > >>comparison of such numbers quite shaky. > >> > >>But at least they will not be random as they are now. > > > > > >Here at RPI we used to have a mainframe, and we used to charge > >by the CPU second, so I am familiar with that side of the > >question. However, I am not too concerned by it for my own > >interests. For one, we don't charge by CPU second any more. For > >two, even if we did start charging again, we would just come up > >with some other metric, or simply pick a different rate for > >charging. > > > >The other big usage for timing programs is to compare the > >performance of various algorithms. We have always had users who > >cared very much about the accuracy and consistency of such > >measurements, whether or not we were charging people by the "CPU > >second".
Based on the above description, the new CPU accounting > >patches will make those comparisons more meaningful, since the > >values measured will be the same no matter what speed the CPU is > >running at. As such, I think it's a good idea, even if we ignore > >the performance improvement. > > > >Rambling part: > > > >Getting back to the question of charging, I can almost convince > >myself that these changes are also a good idea for when those > >values are used for charging. When we (RPI) charged for CPU time, > >we weren't really charging "for CPU seconds". We were charging > >to say "when we are forced to buy a new computer because this > >computer is maxed out, then how much of that load (and thus the > >expense for the new computer) is the fault of any given user?". > > > >Thus, if we had a computer which could vary it's speed, we don't > >really care about "running out of CPU seconds" if the CPU is in > >fact running at half-speed. We only incur the cost of a new > >machine once we "run out of CPU seconds" when it is running at > >*maximum* speed. > > > >Furthermore, if we had a load which was low enough that we > >*could* get it all done by running the CPU at half-speed, then > >we (as the computer center) would *prefer* to run it at half- > >speed. That way, we reduce power and cooling costs. However, > >we will create extreme hostility in our users if we save that > >money, only to charge them twice as much because they are now > >forced to use "twice as many CPU seconds" when we run the CPU > >at half-speed. Oddly enough, consistency is also the big issue > >when it comes to charging. The user expects to see the exact > >same charge every time they run a specific job, and not see > >their charges vary by a dramatic amount due issues they have > >no control over. > > > >Poul-Henning says the values will "not be as random as they are now". > >If someone *is* charging by the CPU second, then they don't want > >those values to be "random". 
People who receive random bills can > >get really really hostile, perhaps to the point of bringing in > >lawyers. (and I have seen that happen). > > > >So, it seems to me that this change is *always* the behavior that > >everyone would prefer. Yes, we have to describe it. And maybe we > >should call the value something other than "CPU seconds" to make > >that clear, although I don't know what would be a better name. But > >I think I have convinced myself that there is no downside to these > >proposed changes. > > > >...Assuming the changes work, of course! > > > > Just call it 'cpu cycles'. If I have a job that calls nop 1 billion > times, then I expect to get charged for 1 billion cycles regardless of > if it takes 1 second or 5 days to run. I agree completely with your > argument for consistency and that this will improve consistency and > predictability. I agree as well. Certainly if we were charging for use of our cluster, this is what we'd want. While I probably wouldn't run powerd on the cluster, I am thinking about seeing if I can step down the CPU speed when there aren't any queued jobs on the machine. That could save significant power some of the time (I'm in the process of upgrading the cluster portion of our server room to install 300KVA (~KW) of power and plan to use it all within a year or two). Once we have the infrastructure to deal with this correctly, an interesting test for someone to run would be to look at disk and memory bound applications at different CPU speeds. I suspect you'd find that while wallclock increased at lower CPU speeds, cpu cycles would decrease for many workloads because the relative bandwidth of storage and maybe memory would increase. -- Brooks -- Any statement of the form "X is the one, true Y" is FALSE.
PGP fingerprint 655D 519C 26A7 82E7 2529 9BF0 5D8E 8BE9 F238 1AD4 From owner-freebsd-arch@FreeBSD.ORG Sat Jan 28 03:15:17 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C5FD616A420; Sat, 28 Jan 2006 03:15:17 +0000 (GMT) (envelope-from andy@siliconlandmark.com) Received: from lexi.siliconlandmark.com (lexi.siliconlandmark.com [209.69.98.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0FDBF43D46; Sat, 28 Jan 2006 03:15:16 +0000 (GMT) (envelope-from andy@siliconlandmark.com) Received: from [10.7.6.254] ([63.76.235.163]) by lexi.siliconlandmark.com (8.13.3/8.13.3) with ESMTP id k0S3FDvf090203; Fri, 27 Jan 2006 22:15:13 -0500 (EST) (envelope-from andy@siliconlandmark.com) In-Reply-To: <43D9DECF.2060101@rogers.com> References: <20060127045553.F36B34503E@ptavv.es.net> <43D9DECF.2060101@rogers.com> Mime-Version: 1.0 (Apple Message framework v746.2) Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: <9C3809F0-05C0-41A4-BD82-5CD8BA3B2A81@siliconlandmark.com> Content-Transfer-Encoding: 7bit From: Andre Guibert de Bruet Date: Fri, 27 Jan 2006 22:15:07 -0500 To: Mike Jakubik X-Mailer: Apple Mail (2.746.2) X-Information: Please contact the ISP for more information X-SL-MailScanner: Found to be clean X-SL-SpamCheck: not spam, SpamAssassin (score=-1.457, required 6, BAYES_00 -2.60, SPF_FAIL 1.14) X-MailScanner-From: andy@siliconlandmark.com Cc: arch@freebsd.org, current@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list 
List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jan 2006 03:15:17 -0000 On Jan 27, 2006, at 3:50 AM, Mike Jakubik wrote: > Kevin Oberman wrote: >> Good accounting is very important to some, but the issue of >> dealing with reduced clock speed is almost certainly of no issue >> when it comes to charging for computer use. I can't imagine any >> reason someone would be paying for CPU time on a processor not >> running "full out". >> >> The only time that this might be an issue is when thermal >> management takes over. I'd hope that thermal management would >> never kick in on a commercial compute server, but, if it did, the >> customer should, at least, only pay for the number of seconds the >> job would have run had it been properly cooled. (Actually, he >> should probably pay less as his time is also being wasted.) > > As a user from the 2.x days, I would much rather have the great > increase of context switching performance than super accurate cpu > accounting that I will never use. FreeBSD needs to focus on > performance now. These are my exact thoughts on the matter! 
Andy /* Andre Guibert de Bruet * 6f43 6564 7020 656f 2e74 4220 7469 6a20 */ /* Code poet / Sysadmin * 636f 656b 2e79 5320 7379 6461 696d 2e6e */ /* GSM: +1 734 846 8758 * 5520 494e 2058 6c73 7565 6874 002e 0000 */ /* WWW: siliconlandmark.com * DP Xeon 3.0-1MB/12GB/570GB */ From owner-freebsd-arch@FreeBSD.ORG Sat Jan 28 14:54:03 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5F32C16A420 for ; Sat, 28 Jan 2006 14:54:03 +0000 (GMT) (envelope-from Thomas.Sparrevohn@btinternet.com) Received: from smtp801.mail.ukl.yahoo.com (smtp801.mail.ukl.yahoo.com [217.12.12.138]) by mx1.FreeBSD.org (Postfix) with SMTP id 27BB243D46 for ; Sat, 28 Jan 2006 14:54:01 +0000 (GMT) (envelope-from Thomas.Sparrevohn@btinternet.com) Received: (qmail 53011 invoked from network); 28 Jan 2006 14:54:01 -0000 Received: from unknown (HELO w2fzz0vc01.aah-go-on.com) (thomas.sparrevohn@btinternet.com@86.133.244.63 with plain) by smtp801.mail.ukl.yahoo.com with SMTP; 28 Jan 2006 14:54:01 -0000 From: Thomas Sparrevohn To: Mike Jakubik Date: Sat, 28 Jan 2006 14:53:46 +0000 User-Agent: KMail/1.9.1 References: <20060127045553.F36B34503E@ptavv.es.net> <43D9DECF.2060101@rogers.com> In-Reply-To: <43D9DECF.2060101@rogers.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200601281453.48564.Thomas.Sparrevohn@btinternet.com> Cc: current@freebsd.org, Kevin Oberman , Ian FREISLICH , arch@freebsd.org, Poul-Henning Kamp , freebsd-current@freebsd.org, Alexander Leidinger Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Thomas.Sparrevohn@btinternet.com List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Sat, 28 Jan 2006 14:54:03 -0000 On Friday 27 January 2006 08:50, Mike Jakubik wrote: > Kevin Oberman wrote: > > Good accounting is very important to some, but the issue of dealing with > > reduced clock speed is almost certainly of no issue when it comes to > > charging for computer use. I can't imagine any reason someone would be > > paying for CPU time on a processor not running "full out". > > > > The only time that this might be an issue is when thermal management > > takes over. I'd hope that thermal management would never kick in on a > > commercial compute server, but, if it did, the customer should, at least, > > only pay for the number of seconds the job would have run had it been > > properly cooled. (Actually, he should probably pay less as his time is > > also being wasted.) > > As a user from the 2.x days, I would much rather have the great increase > of context switching performance than super accurate cpu accounting that > I will never use. FreeBSD needs to focus on performance now. Well, both points are correct. Regarding the thermal issues, I believe we may see a movement towards "Infowatt accounting". Regarding accounting, I think it is worth understanding the limitations and challenges. I would rather have a "scalable" accounting approach, e.g. there are a number of other things that it could be nice to have the ability to track, but that we most likely would not want enabled by default. I am thinking about VM stats etc. 
From owner-freebsd-arch@FreeBSD.ORG Sat Jan 28 14:58:41 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 18D4F16A420 for ; Sat, 28 Jan 2006 14:58:41 +0000 (GMT) (envelope-from Thomas.Sparrevohn@btinternet.com) Received: from smtp813.mail.ukl.yahoo.com (smtp813.mail.ukl.yahoo.com [217.12.12.203]) by mx1.FreeBSD.org (Postfix) with SMTP id 83DEB43D5E for ; Sat, 28 Jan 2006 14:58:39 +0000 (GMT) (envelope-from Thomas.Sparrevohn@btinternet.com) Received: (qmail 20798 invoked from network); 28 Jan 2006 14:58:38 -0000 Received: from unknown (HELO w2fzz0vc01.aah-go-on.com) (thomas.sparrevohn@btinternet.com@86.133.244.63 with plain) by smtp813.mail.ukl.yahoo.com with SMTP; 28 Jan 2006 14:58:38 -0000 From: Thomas Sparrevohn To: "Poul-Henning Kamp" Date: Sat, 28 Jan 2006 14:58:35 +0000 User-Agent: KMail/1.9.1 References: <84017.1138351420@critter.freebsd.dk> In-Reply-To: <84017.1138351420@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200601281458.37502.Thomas.Sparrevohn@btinternet.com> Cc: Alexander Leidinger , Ian FREISLICH , freebsd-current@freebsd.org, current@freebsd.org, arch@freebsd.org Subject: Re: [TEST/REVIEW] CPU accounting patches X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Thomas.Sparrevohn@btinternet.com List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jan 2006 14:58:41 -0000 On Friday 27 January 2006 08:43, Poul-Henning Kamp wrote: > In message <200601270232.12528.Thomas.Sparrevohn@btinternet.com>, Thomas Sparrevohn writes: > >On Thursday 26 January 2006 06:06, Ian FREISLICH wrote: > >> I wonder how many people still bill for CPU time? 
I'd go for the > >> faster context switches. > > > >Almost all major ITO providers, from SUN, HP, IBM, EDS etc., have > > offerings that in some shape or other use a "Utility model" based upon > > some sort of financial model based upon actual CPU/IO etc. usage. It is > > a major area now and provides one of the cornerstones in the movement > > towards "Public Utility models" > > Should we also add that all these initiatives are spectacular commercial > failures because users hate to buy rubber band by the inch? That's true to some extent. However, I don't see anything wrong with the fundamental idea, and I am not going into the "dos and don'ts" of the financials behind utility models. It does look like that is the direction everything is taking, and an accounting model that allows a better understanding of whether it is indeed viable would benefit everybody.