From owner-freebsd-mobile@FreeBSD.ORG Mon Nov 10 19:41:20 2008 Return-Path: Delivered-To: freebsd-mobile@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CD9021065672; Mon, 10 Nov 2008 19:41:20 +0000 (UTC) (envelope-from aragon@phat.za.net) Received: from mail.geek.sh (decoder.geek.sh [196.36.198.81]) by mx1.freebsd.org (Postfix) with ESMTP id 6F0718FC1D; Mon, 10 Nov 2008 19:41:20 +0000 (UTC) (envelope-from aragon@phat.za.net) Received: by mail.geek.sh (Postfix, from userid 1000) id 0790324D22; Mon, 10 Nov 2008 21:41:19 +0200 (SAST) Date: Mon, 10 Nov 2008 21:41:19 +0200 From: Aragon Gouveia To: Alexander Motin Message-ID: <20081110194119.GA57753@phat.za.net> References: <491208D3.2050901@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <491208D3.2050901@FreeBSD.org> User-Agent: Mutt/1.4i X-Operating-System: FreeBSD 4.10-RELEASE-p2 i386 Cc: freebsd-mobile@FreeBSD.org Subject: Re: RFC: powerd algorithms enhancements X-BeenThere: freebsd-mobile@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Mobile computing with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Nov 2008 19:41:20 -0000 | By Alexander Motin | [ 2008-11-05 22:58 +0200 ] > I would like to propose the patch for powerd that fixes some issues, > makes it more universal and on my opinion more usable. The main ideas of > mine were: I've just tested your patch. Responsiveness from idle is definitely improved. I was running a polling frequency of 100 with the old powerd to get good responsiveness out of it, but it's not needed with this. Especially on my notebook with 16 frequencies... Hiadaptive makes things even faster, but it's a bit heavy for notebook use. I'm running hiadaptive on my workstation for now. Thanks for your work! 
Regards, Aragon From owner-freebsd-mobile@FreeBSD.ORG Wed Nov 12 13:12:01 2008 Return-Path: Delivered-To: freebsd-mobile@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3778C1065740; Wed, 12 Nov 2008 13:12:01 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id C53548FC1B; Wed, 12 Nov 2008 13:12:00 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from localhost.corp.yahoo.com (john@localhost [IPv6:::1]) (authenticated bits=0) by server.baldwin.cx (8.14.3/8.14.3) with ESMTP id mACDBNK9084446; Wed, 12 Nov 2008 08:11:54 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: freebsd-mobile@freebsd.org Date: Tue, 11 Nov 2008 12:06:53 -0500 User-Agent: KMail/1.9.7 References: <200811060901400000@466321507> <491319C0.8090201@freebsd.org> <49132585.4070601@FreeBSD.org> In-Reply-To: <49132585.4070601@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200811111206.53809.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [IPv6:::1]); Wed, 12 Nov 2008 08:11:55 -0500 (EST) X-Virus-Scanned: ClamAV 0.93.1/8620/Wed Nov 12 04:05:38 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.0 required=4.2 tests=AWL,BAYES_00, DATE_IN_PAST_12_24,NO_RELAYS autolearn=no version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: Alexander Motin , Sam Leffler Subject: Re: RFC: powerd algorithms enhancements X-BeenThere: freebsd-mobile@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Mobile computing with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 
Nov 2008 13:12:01 -0000 On Thursday 06 November 2008 12:12:37 pm Alexander Motin wrote: > Sam Leffler wrote: > > Alexander Motin wrote: > >>> The biggest problem I see with powerd is that when a system is > >>> running with a reduced clock frequency interrupts are not processed > >>> at full clock speed. This, for example, breaks the ath driver which > >>> can generate interrupts very quickly when h/w MIB counters overflow > >>> in a noisy environment. Because processing happens at the reduced > >>> frequency until powerd gets to run it causes livelock > >> > >> You wanted to say that ath driver/hardware unable to operate on slow > >> CPUs? Ok, but may be it is an ath driver problem? May be it must use > >> some kind of interrupt moderation to avoid it? > > > > You didn't understand me. I used ath as an example of the general problem. > > I understand you. The real problem I see here is that any hardware > interrupts now can livelock the system. It is not limited to ath. Big > packet rate on any fast enough interface that has any significant > receive processing is able to make system not responding, just because > interrupts will consume all available CPU time. On my laptop the ACPI SCI is the culprit. If I let the CPU drop below 400 mhz, the GPE handler for temperature updates takes so long to run the CPU spends the entire time processing GPEs and never runs userland. Thus, powerd never gets to run. This happens on a "modern" laptop, not a Pentium-100. And actually, at certain speeds it would eventually let userland run enough to bump up. I actually added KTR_SCHED events for ACPI GPE and Task handling and hacked schedgraph to parse them and thus had pretty pictures showing the GPE handler using all CPU time during the multiple-second "hangs" I would get on my laptop with powerd. > powerd just makes that situation more probable as it significantly > reduces CPU performance. 
Just insert gigabit card into Pentium-100
> system and you will not be able to get there onder the load of only did
> not using device polling mode. Rising frequency on interrupt processing
> _will_not_ fix the problem, but just hide it for some time, until newer
> network cards will be able to handle higher packet rate.

It will definitely fix the problem on my laptop.

> I think the only solutions for this case can be in allowing scheduler to
> really do it's job. Or by moving _everything_ out of interrupt threads
> to make them extremely fast and so to avoid the livelock problem, or in
> some other way allow scheduler to delay interrupt processing to allow
> other (for example user-level) threads to obtain at least some part of
> their CPU time slot according to their priorities.
>
> I don't see how powerd itself could do at least anything with this.

The point is that powerd is part of a CPU throttling strategy. If you are
going to mess with powerd you need to do so in the context of the overall
strategy.
-- John Baldwin From owner-freebsd-mobile@FreeBSD.ORG Wed Nov 12 22:40:39 2008 Return-Path: Delivered-To: freebsd-mobile@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E29261065673; Wed, 12 Nov 2008 22:40:38 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from cmail.optima.ua (cmail.optima.ua [195.248.191.121]) by mx1.freebsd.org (Postfix) with ESMTP id C775C8FC22; Wed, 12 Nov 2008 22:40:37 +0000 (UTC) (envelope-from mav@FreeBSD.org) X-Spam-Flag: SKIP X-Spam-Yversion: Spamooborona-2.1.0 Received: from [212.86.226.226] (account mav@alkar.net HELO mavbook.mavhome.dp.ua) by cmail.optima.ua (CommuniGate Pro SMTP 5.2.9) with ESMTPSA id 227680798; Thu, 13 Nov 2008 00:40:36 +0200 Message-ID: <491B5B62.40609@FreeBSD.org> Date: Thu, 13 Nov 2008 00:40:34 +0200 From: Alexander Motin User-Agent: Thunderbird 2.0.0.17 (X11/20081029) MIME-Version: 1.0 To: John Baldwin References: <200811060901400000@466321507> <491319C0.8090201@freebsd.org> <49132585.4070601@FreeBSD.org> <200811111206.53809.jhb@freebsd.org> In-Reply-To: <200811111206.53809.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Sam Leffler , freebsd-mobile@freebsd.org Subject: Re: RFC: powerd algorithms enhancements X-BeenThere: freebsd-mobile@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Mobile computing with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Nov 2008 22:40:39 -0000 John Baldwin wrote: > On my laptop the ACPI SCI is the culprit. If I let the CPU drop below 400 > mhz, the GPE handler for temperature updates takes so long to run the CPU > spends the entire time processing GPEs and never runs userland. Thus, powerd > never gets to run. This happens on a "modern" laptop, not a Pentium-100. > And actually, at certain speeds it would eventually let userland run enough > to bump up. 
I actually added KTR_SCHED events for ACPI GPE and Task handling > and hacked schedgraph to parse them and thus had pretty pictures showing the > GPE handler using all CPU time during the multiple-second "hangs" I would get > on my laptop with powerd. If your system completely freezes at 400MHz, then it spends about 20% of CPU time on this at 2GHz. Doesn't it? With such amount of idle activity you system just unable to save any power! Your 100% running CPU at 400MHz will probably consume more power then any other really idle at 2GHz. If you think that this is normal then disabling powerd is the only way out for you. >> powerd just makes that situation more probable as it significantly >> reduces CPU performance. Just insert gigabit card into Pentium-100 >> system and you will not be able to get there onder the load of only did >> not using device polling mode. Rising frequency on interrupt processing >> _will_not_ fix the problem, but just hide it for some time, until newer >> network cards will be able to handle higher packet rate. > > It will definitely fix the problem on my laptop. No. It only hides the problem. >> I think the only solutions for this case can be in allowing scheduler to >> really do it's job. Or by moving _everything_ out of interrupt threads >> to make them extremely fast and so to avoid the livelock problem, or in >> some other way allow scheduler to delay interrupt processing to allow >> other (for example user-level) threads to obtain at least some part of >> their CPU time slot according to their priorities. >> >> I don't see how powerd itself could do at least anything with this. > > The point is that powerd is part of a CPU throttling strategy. If you are > going to mess with powerd you need to do so in the context of the overall > strategy. Can you show me this strategy to work in context? There was no significant changes at powerd for years. 
Now it does not work well on SMP, it does not work well on systems with
a large number of power levels, and its functionality is absolutely
minimal. That's why I have touched it. Several good ideas for future
improvement were proposed, but nobody has given me any real objections
to what I have proposed.

All of your objections come down to your system being unable to operate
at low frequency. So how is that related to powerd and the proposed
patches?

Here is how I see a possible strategy:
- Give more information to the power-controlling application:
differentiate between power levels and throttling. Throttling is
completely ineffective for CPUs supporting C1E, C2 and deeper states.
This will give us better responsiveness at equal power consumption.
- Make the scheduler use some per-CPU power state priorities to allow us
to really disable unused cores/chips.
- Reduce interrupt time to allow the scheduler to better handle process
priorities and fight IRQ livelocks. This does not depend on frequencies.

What is your vision of the strategy?
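A rough check of the scaling argument made earlier in this message (a
load that keeps the CPU 100% busy at 400MHz should take about 20% at
2GHz). This is only a sketch in Python. It assumes the work costs a
fixed number of CPU cycles per second, and the name busy_fraction is
made up for illustration. Work dominated by fixed-time waits would not
scale this way.

```python
def busy_fraction(cycles_per_second: float, freq_hz: float) -> float:
    """Fraction of CPU time used by work that needs a fixed cycle rate.

    Assumption: the work is purely cycle-bound, so its cost in seconds
    shrinks in proportion to the clock frequency.
    """
    return min(1.0, cycles_per_second / freq_hz)


# Work that just saturates a 400 MHz clock needs 400e6 cycles per second.
demand = 400e6

print(busy_fraction(demand, 400e6))  # CPU fully busy at 400 MHz
print(busy_fraction(demand, 2e9))    # same work at 2 GHz
```

Under that assumption the second value is one fifth of the first, which
is where the "about 20%" figure comes from.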
-- Alexander Motin From owner-freebsd-mobile@FreeBSD.ORG Thu Nov 13 13:31:37 2008 Return-Path: Delivered-To: freebsd-mobile@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D378A106567C for ; Thu, 13 Nov 2008 13:31:37 +0000 (UTC) (envelope-from jorgen.asmussen@gmail.com) Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.172]) by mx1.freebsd.org (Postfix) with ESMTP id 68AAA8FC0A for ; Thu, 13 Nov 2008 13:31:37 +0000 (UTC) (envelope-from jorgen.asmussen@gmail.com) Received: by ug-out-1314.google.com with SMTP id 30so1340901ugs.39 for ; Thu, 13 Nov 2008 05:31:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:mime-version:content-type:content-transfer-encoding :content-disposition; bh=yLVX1EdFoGu9BkQfZpzlZzsbUJeLBB1zx9aG/IV70Tk=; b=D3XvYd80yrMawK1rzfEy3rS+yqSf7/FmxeRimAcJID+MPpMa7vJRiD3rxEwJVX46Pg n/5kJO89TePql81S2uo/utX9xgB5CeghZEVdFAS7BLAr1VZTZDqfUwJZe2yw7VX9xJu2 Fe3MjCY6EeKLA6UfSfGjqVDbjJFhwVtsxu9Xo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:mime-version:content-type :content-transfer-encoding:content-disposition; b=FKDfXmPUbJY+RN9mK2K/HbMvK6f6W1o0aVRXkVc9P4X9aG9Kox1wIFN1vkUH+aU3gQ 1HQA57OFlG/GCCPhjvJ3cKmtn5sL/KlMNYnMNFflDImVQiqG5I2DtnI7/hH/pFg3i73K nfhAZckW/PVcn60DaQTFax0yIUZ/Yi06/XvnE= Received: by 10.66.238.18 with SMTP id l18mr4164441ugh.20.1226583096232; Thu, 13 Nov 2008 05:31:36 -0800 (PST) Received: by 10.67.98.6 with HTTP; Thu, 13 Nov 2008 05:31:35 -0800 (PST) Message-ID: Date: Thu, 13 Nov 2008 14:31:35 +0100 From: Jok To: freebsd-mobile@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline Subject: Memory leak in 80211node on FreeBSD 7.1-BETA2 X-BeenThere: freebsd-mobile@freebsd.org X-Mailman-Version: 2.1.5 
Precedence: list List-Id: Mobile computing with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Nov 2008 13:31:37 -0000 # uname -a FreeBSD right.frequency.dk 7.1-BETA2 FreeBSD 7.1-BETA2 #0: Sun Oct 12 20:59:28 UTC 2008 root@driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64 # vmstat -m Type InUse MemUse HighUse Requests Size(s) 80211node 17716 212570K - 20452 16,1024 The only related things I can find are links from 2005 and 2006 for FreeBSD 6.0. Anyone who knows how to fix it? From owner-freebsd-mobile@FreeBSD.ORG Thu Nov 13 15:31:01 2008 Return-Path: Delivered-To: freebsd-mobile@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8AE161065676 for ; Thu, 13 Nov 2008 15:31:01 +0000 (UTC) (envelope-from onemda@gmail.com) Received: from yx-out-2324.google.com (yx-out-2324.google.com [74.125.44.30]) by mx1.freebsd.org (Postfix) with ESMTP id 498DE8FC20 for ; Thu, 13 Nov 2008 15:31:01 +0000 (UTC) (envelope-from onemda@gmail.com) Received: by yx-out-2324.google.com with SMTP id 8so411002yxb.13 for ; Thu, 13 Nov 2008 07:31:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:cc:in-reply-to:mime-version:content-type :content-transfer-encoding:content-disposition:references; bh=tq5wU2aklID2R+NM3U8vKn/LpdbXrofkTRtxgyu+MDA=; b=P4eno4nxu4O7iTguh8kmzixFQbQSoE2PngwcrGvPEBComNrUGG8rO49tw5xyCXUeid NjuhKXz18Wx7ru5YQF/lZYQ+GJEWU2RvGcRP/P6Mt4m0oKhyr5h3WkVrpRxcMfStJByp 54NuQlwxv9MZOGWnSNbkFVkZTLgNTvni7vRVM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version :content-type:content-transfer-encoding:content-disposition :references; b=LWro3Ui5a5sRm7KSYXdCPzIPRv1G4PXtqtTzfmocowPso4uev1ft1X6eJF1fxerRIM 
0vtDUYlX3wX1VcySu59JgWdd2cwtjTviZgQz4/ZJybpqR1mdnPCMAQadQkd8qCD/ukAG ZKuwyKd3PydhwQdHYalLov5lNsBrc9F8IPB7U= Received: by 10.64.243.19 with SMTP id q19mr10054406qbh.50.1226590260121; Thu, 13 Nov 2008 07:31:00 -0800 (PST) Received: by 10.65.216.9 with HTTP; Thu, 13 Nov 2008 07:31:00 -0800 (PST) Message-ID: <3a142e750811130731u537e24d9o7b57e5e022d66c82@mail.gmail.com> Date: Thu, 13 Nov 2008 16:31:00 +0100 From: "Paul B. Mahol" To: Jok In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: Cc: freebsd-mobile@freebsd.org Subject: Re: Memory leak in 80211node on FreeBSD 7.1-BETA2 X-BeenThere: freebsd-mobile@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Mobile computing with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Nov 2008 15:31:01 -0000 On 11/13/08, Jok wrote: > # uname -a > FreeBSD right.frequency.dk 7.1-BETA2 FreeBSD 7.1-BETA2 #0: Sun Oct 12 > 20:59:28 UTC 2008 > root@driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64 > > # vmstat -m > Type InUse MemUse HighUse Requests Size(s) > > 80211node 17716 212570K - 20452 16,1024 > > The only related things I can find are links from 2005 and 2006 for FreeBSD > 6.0. > > Anyone who knows how to fix it? How to reproduce it? It could be also driver fault. 
From owner-freebsd-mobile@FreeBSD.ORG Thu Nov 13 19:46:39 2008 Return-Path: Delivered-To: freebsd-mobile@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1D9311065676; Thu, 13 Nov 2008 19:46:39 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 39BEA8FC08; Thu, 13 Nov 2008 19:46:38 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from localhost.corp.yahoo.com (john@localhost [IPv6:::1]) (authenticated bits=0) by server.baldwin.cx (8.14.3/8.14.3) with ESMTP id mADJkO2n096236; Thu, 13 Nov 2008 14:46:31 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: Alexander Motin Date: Thu, 13 Nov 2008 11:45:39 -0500 User-Agent: KMail/1.9.7 References: <200811060901400000@466321507> <200811111206.53809.jhb@freebsd.org> <491B5B62.40609@FreeBSD.org> In-Reply-To: <491B5B62.40609@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200811131145.39747.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [IPv6:::1]); Thu, 13 Nov 2008 14:46:31 -0500 (EST) X-Virus-Scanned: ClamAV 0.93.1/8628/Thu Nov 13 10:57:02 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.3 required=4.2 tests=AWL,BAYES_00, DATE_IN_PAST_03_06,NO_RELAYS autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: Sam Leffler , freebsd-mobile@freebsd.org Subject: Re: RFC: powerd algorithms enhancements X-BeenThere: freebsd-mobile@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Mobile computing with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Nov 2008 
19:46:39 -0000 On Wednesday 12 November 2008 05:40:34 pm Alexander Motin wrote: > John Baldwin wrote: > > On my laptop the ACPI SCI is the culprit. If I let the CPU drop below 400 > > mhz, the GPE handler for temperature updates takes so long to run the CPU > > spends the entire time processing GPEs and never runs userland. Thus, powerd > > never gets to run. This happens on a "modern" laptop, not a Pentium-100. > > And actually, at certain speeds it would eventually let userland run enough > > to bump up. I actually added KTR_SCHED events for ACPI GPE and Task handling > > and hacked schedgraph to parse them and thus had pretty pictures showing the > > GPE handler using all CPU time during the multiple-second "hangs" I would get > > on my laptop with powerd. > > If your system completely freezes at 400MHz, then it spends about 20% of > CPU time on this at 2GHz. Doesn't it? Nope. It is usually very idle at full speed. You are free to go buy your own HP nc6220 if you want to see it for yourself. You can also grab the KTR trace and modified schedgraph.py at www.freebsd.org/~jhb/gpe/. > With such amount of idle activity > you system just unable to save any power! Your 100% running CPU at > 400MHz will probably consume more power then any other really idle at > 2GHz. If you think that this is normal then disabling powerd is the only > way out for you. Except I do get much better battery life with powerd even with lowest set to 400. > >> powerd just makes that situation more probable as it significantly > >> reduces CPU performance. Just insert gigabit card into Pentium-100 > >> system and you will not be able to get there onder the load of only did > >> not using device polling mode. Rising frequency on interrupt processing > >> _will_not_ fix the problem, but just hide it for some time, until newer > >> network cards will be able to handle higher packet rate. > > > > It will definitely fix the problem on my laptop. > > No. It only hides the problem. 
*sigh* FreeBSD is not usually used for batch-processing. Most of the work
FreeBSD does is interrupt-driven. For those sorts of loads, it does make
sense that you want to handle your interrupt with minimal latency and then
go back to sleep when it is done. The point Sam and I are making is that
the idea that all power management can be driven from userland is flawed.
It is a task that will need to be shared between the kernel and userland.
Sam is also suggesting that this might be the single biggest issue with
powerd. I'm not quite sure of the exact priority of the various
cpufreq/powerd problems, but I think it is on a similar scale to not
handling multiple CPUs properly.

> >> I think the only solutions for this case can be in allowing scheduler to
> >> really do it's job. Or by moving _everything_ out of interrupt threads
> >> to make them extremely fast and so to avoid the livelock problem, or in
> >> some other way allow scheduler to delay interrupt processing to allow
> >> other (for example user-level) threads to obtain at least some part of
> >> their CPU time slot according to their priorities.

This is completely backwards. Userland is not more important than interrupt
handling in the kernel. The problem is that CPU frequency handling is too
important to relegate entirely to userland. Instead of completely breaking
the entire userland/kernel model to get part of userland executed at a
kernel-level priority so CPU frequency handling is partially handled at a
kernel-level priority, why not just move the CPU frequency bits that need to
be kernel-level into the kernel? We are already doing the thermal management
for passive cooling in the kernel rather than in userland.

> >> I don't see how powerd itself could do at least anything with this.
> >
> > The point is that powerd is part of a CPU throttling strategy. If you are
> > going to mess with powerd you need to do so in the context of the overall
> > strategy.
> > Can you show me this strategy to work in context? There was no > significant changes at powerd for years. Now it does not works fine for > SMP, it does not works fine for systems with big number of power levels, > it's functionality is absolutely minimal. That's why I have touched it. > There is several good ideas of future improvement was proposed, but > nobody give me any real objections against what I have proposed. > > All of your objections is that your system unable to operate at low > frequency. So how it is related to powerd and proposed patches? Sam merely suggested that while you are working on improving other areas, that fixing this problem is one that is also worth looking at. In his opinion it is even more important. > Here is how I see possible strategy: > - Give more information to power controlling application: Differentiate > between power level and throttling. Throttling is completely ineffective > for CPUs supporting C1E, C2 and deeper states. It will give us better > responsibility at equal power consumption. > - Make scheduler to use some per-CPU power state priorities to allow us > really disable unused cores/chips. > - Reduce interrupt time to allow scheduler better handle process > priorities and fight against IRQ livelocks. It does not depends on > frequencies. > > What is your strategy vision? - Move the bits of CPU power management that are really important into the kernel. We should offload things to userland when possible, but interrupt handling isn't one you can offload to userland, and ensuring the system has enough CPU to process an interrupt when it occurs is the job of the kernel, _not_ of userland. 
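The argument above, that an interrupt should be handled with minimal
latency so the CPU can go back to sleep, can be illustrated with a toy
energy model. This is a sketch in Python, not a measurement: the
wattages, the cycle count, and the name energy_joules are hypothetical
placeholders, and the comparison can flip if the assumed numbers change.

```python
def energy_joules(cycles, freq_hz, active_watts, idle_watts, period_s):
    """Energy used over one period: run `cycles` of work, then idle.

    Assumptions (all hypothetical): the work is cycle-bound, the CPU draws
    `active_watts` while running and `idle_watts` while sleeping, and
    transitions between the two states cost nothing.
    """
    busy_s = cycles / freq_hz
    if busy_s > period_s:
        raise ValueError("work does not fit in the period at this frequency")
    return active_watts * busy_s + idle_watts * (period_s - busy_s)


# Placeholder numbers chosen only to show the shape of the trade-off.
CYCLES = 40e6      # interrupt work per period
PERIOD_S = 1.0     # length of the period in seconds
IDLE_WATTS = 1.0   # draw while in a deep sleep state

fast = energy_joules(CYCLES, 2e9, active_watts=20.0,
                     idle_watts=IDLE_WATTS, period_s=PERIOD_S)
slow = energy_joules(CYCLES, 400e6, active_watts=8.0,
                     idle_watts=IDLE_WATTS, period_s=PERIOD_S)

print(f"full speed: {fast:.2f} J, throttled: {slow:.2f} J")
```

With these placeholder values the full-speed case spends less time
awake and uses less energy overall, but that outcome depends entirely
on the assumed power figures.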
-- John Baldwin From owner-freebsd-mobile@FreeBSD.ORG Thu Nov 13 20:52:20 2008 Return-Path: Delivered-To: freebsd-mobile@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BA365106568A; Thu, 13 Nov 2008 20:52:20 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from cmail.optima.ua (cmail.optima.ua [195.248.191.121]) by mx1.freebsd.org (Postfix) with ESMTP id BA30E8FC14; Thu, 13 Nov 2008 20:52:19 +0000 (UTC) (envelope-from mav@FreeBSD.org) X-Spam-Flag: SKIP X-Spam-Yversion: Spamooborona-2.1.0 Received: from [212.86.226.226] (account mav@alkar.net HELO mavbook.mavhome.dp.ua) by cmail.optima.ua (CommuniGate Pro SMTP 5.2.9) with ESMTPSA id 227728420; Thu, 13 Nov 2008 22:52:18 +0200 Message-ID: <491C9380.7050007@FreeBSD.org> Date: Thu, 13 Nov 2008 22:52:16 +0200 From: Alexander Motin User-Agent: Thunderbird 2.0.0.17 (X11/20081029) MIME-Version: 1.0 To: John Baldwin References: <200811060901400000@466321507> <200811111206.53809.jhb@freebsd.org> <491B5B62.40609@FreeBSD.org> <200811131145.39747.jhb@freebsd.org> In-Reply-To: <200811131145.39747.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Sam Leffler , freebsd-mobile@freebsd.org Subject: Re: RFC: powerd algorithms enhancements X-BeenThere: freebsd-mobile@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Mobile computing with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Nov 2008 20:52:20 -0000 John Baldwin wrote: >> If your system completely freezes at 400MHz, then it spends about 20% of >> CPU time on this at 2GHz. Doesn't it? > > Nope. It is usually very idle at full speed. You are free to go buy your own > HP nc6220 if you want to see it for yourself. You can also grab the KTR > trace and modified schedgraph.py at www.freebsd.org/~jhb/gpe/. 
It's very strange to me that you have 100% load at 400MHz, but zero at
full speed. It shouldn't be so!

Just an idea. I have noticed a problem: my mobile Core2Duo does not
drop its TSC timer frequency under EST. This confuses kernel
timekeeping and leads to an incorrect, proportional increase of DELAY()
times. I have fixed this problem for myself with
"kern.timecounter.invariant_tsc=1". Could the same apply to your CPU?

>>>> I think the only solutions for this case can be in allowing scheduler to
>>>> really do it's job. Or by moving _everything_ out of interrupt threads
>>>> to make them extremely fast and so to avoid the livelock problem, or in
>>>> some other way allow scheduler to delay interrupt processing to allow
>>>> other (for example user-level) threads to obtain at least some part of
>>>> their CPU time slot according to their priorities.
>
> This is completely backwards. Userland is not more important than interrupt
> handling in the kernel. The problem is that CPU frequency handling is more
> important than relegating the entire task to userland. Instead of completely
> breaking the entire userland/kernel model to get part of userland executed at
> a kernel-level priority so CPU frequency handling is partially handled at a
> kernel-level priority, why not just move the CPU frequency bits that need to
> be kernel-level into the kernel? We already doing the thermal management for
> passive cooling in the kernel rather than in userland.

The fact that the system livelocks means that interrupt processing works
outside of any priorities! By saying that moving all processing into
interrupt handlers is a good approach, you are saying that having _all_
of our system outside of any priorities is a good idea. That's actually
the situation we can see now under heavy network load with polling
disabled. The system just dies, and there is no way to manage that
except enabling polling!

Heavy interrupt handlers are _evil_ from the scheduling point of view!
It may be faster in some situations, but it makes system unmanageable! There are never will be enough power to fulfill all requirements, so we must take care about the case when there will be more interrupts then we are able to handle. -- Alexander Motin From owner-freebsd-mobile@FreeBSD.ORG Thu Nov 13 22:17:47 2008 Return-Path: Delivered-To: freebsd-mobile@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 592F81065670; Thu, 13 Nov 2008 22:17:47 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id E902F8FC08; Thu, 13 Nov 2008 22:17:46 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from localhost.corp.yahoo.com (john@localhost [IPv6:::1]) (authenticated bits=0) by server.baldwin.cx (8.14.3/8.14.3) with ESMTP id mADMHYAI097195; Thu, 13 Nov 2008 17:17:34 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: Alexander Motin Date: Thu, 13 Nov 2008 16:06:24 -0500 User-Agent: KMail/1.9.7 References: <200811060901400000@466321507> <200811131145.39747.jhb@freebsd.org> <491C9380.7050007@FreeBSD.org> In-Reply-To: <491C9380.7050007@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200811131606.24804.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [IPv6:::1]); Thu, 13 Nov 2008 17:17:35 -0500 (EST) X-Virus-Scanned: ClamAV 0.93.1/8628/Thu Nov 13 10:57:02 2008 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00,NO_RELAYS autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: Sam Leffler , freebsd-mobile@freebsd.org Subject: Re: RFC: powerd algorithms enhancements 
X-BeenThere: freebsd-mobile@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Mobile computing with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Nov 2008 22:17:47 -0000 On Thursday 13 November 2008 03:52:16 pm Alexander Motin wrote: > John Baldwin wrote: > >> If your system completely freezes at 400MHz, then it spends about 20% of > >> CPU time on this at 2GHz. Doesn't it? > > > > Nope. It is usually very idle at full speed. You are free to go buy your own > > HP nc6220 if you want to see it for yourself. You can also grab the KTR > > trace and modified schedgraph.py at www.freebsd.org/~jhb/gpe/. > > It's very strange to me that you have 100% load at 400MHz, but zero at > full speed. It shouldn't be so! I think systems are more complex than you give them credit for. Imagine what CPU frequency changing does to SMI# handlers for example. > Just an idea. I have noticed a problem, that my mobile Core2Duo does not > drops TSC timer frequency on EST. It confuses kernel time counting and > leads to incorrect proportional increasing of DELAY() times. I have > fixed this problem to myself with "kern.timecounter.invariant_tsc=1". > Can't it just be applicable to your CPU? Very, very doubtful. This is a Pentium-M, and I know that the TSC slows down, because until Nate's fixes to make DELAY() work correctly, the 5-second delay on shutdown used to take a lot longer than 5 seconds when I was on battery (after being on A/C). > >>>> I think the only solutions for this case can be in allowing scheduler to > >>>> really do it's job. Or by moving _everything_ out of interrupt threads > >>>> to make them extremely fast and so to avoid the livelock problem, or in > >>>> some other way allow scheduler to delay interrupt processing to allow > >>>> other (for example user-level) threads to obtain at least some part of > >>>> their CPU time slot according to their priorities. 
> >
> > This is completely backwards. Userland is not more important than interrupt
> > handling in the kernel. The problem is that CPU frequency handling is too
> > important to relegate the entire task to userland. Instead of completely
> > breaking the entire userland/kernel model to get part of userland executed at
> > a kernel-level priority so CPU frequency handling is partially handled at a
> > kernel-level priority, why not just move the CPU frequency bits that need to
> > be kernel-level into the kernel? We are already doing the thermal management
> > for passive cooling in the kernel rather than in userland.
>
> The fact that the system livelocks means that interrupt processing works
> outside of any priorities! By saying that moving all processing into interrupt
> handlers is a good approach, you are saying that having _all_ of our system
> outside of any priorities is a good idea. That's actually the situation we
> can see now under heavy network load with polling disabled. The system
> just dies and there is no other way to manage that except enabling polling!
>
> Heavy interrupt handlers are _evil_ from the scheduling point of view! They
> may be faster in some situations, but they make the system unmanageable!
> There will never be enough power to fulfill all requirements, so we
> must take care of the case where there are more interrupts than we
> are able to handle.

I'm not advocating moving the entire system into interrupt handlers. Did you
actually read what I wrote? My point is that if you have something in
userland that is as important as what gets done in interrupt handlers, the
solution is not to rip up the entire scheduler to make certain bits of
userland have a higher priority than interrupts. The solution is to move the
one bit of userland code that is needed into the kernel. In this case I'm
not suggesting moving all of powerd into an interrupt handler.
What I am suggesting is that the kernel needs a policy to consider raising the
frequency when it gets an interrupt after being in a deep sleep. If the
power savings from C2/3/whatever are greater than running throttled, then it
is much better, when you get an interrupt while idle, to run at full
speed to service the interrupt and then return to C2/C3 ASAP rather than
running the interrupt handler at a throttled speed and spending less time in
C2/C3.

-- 
John Baldwin

From owner-freebsd-mobile@FreeBSD.ORG Fri Nov 14 10:26:47 2008
Message-ID: <491D5265.3020003@FreeBSD.org>
Date: Fri, 14 Nov 2008 12:26:45 +0200
From: Alexander Motin
To: John Baldwin
References: <200811060901400000@466321507> <200811131145.39747.jhb@freebsd.org> <491C9380.7050007@FreeBSD.org> <200811131606.24804.jhb@freebsd.org>
In-Reply-To: <200811131606.24804.jhb@freebsd.org>
Cc: Sam Leffler, freebsd-mobile@freebsd.org
Subject: Re: RFC: powerd algorithms enhancements

John Baldwin wrote:
> On Thursday 13 November 2008 03:52:16 pm Alexander Motin wrote:
>> John Baldwin wrote:
>>>> If your system completely freezes at 400MHz, then it spends about 20% of
>>>> CPU time on this at 2GHz. Doesn't it?
>>> Nope. It is usually very idle at full speed. You are free to go buy your own
>>> HP nc6220 if you want to see it for yourself. You can also grab the KTR
>>> trace and modified schedgraph.py at www.freebsd.org/~jhb/gpe/.
>> It's very strange to me that you have 100% load at 400MHz, but zero at
>> full speed. It shouldn't be so!
>
> I think systems are more complex than you give them credit for. Imagine what
> CPU frequency changing does to SMI# handlers for example.

You may be right; I am certainly not very strong on some hardware aspects, but
neither EST nor throttling affects system bus operation. I don't see a
direct relation there; it could easily be just some
hardware/ACPI/whatever bug.

>> The fact that the system livelocks means that interrupt processing works
>> outside of any priorities! By saying that moving all processing into interrupt
>> handlers is a good approach, you are saying that having _all_ of our system
>> outside of any priorities is a good idea. That's actually the situation we
>> can see now under heavy network load with polling disabled. The system
>> just dies and there is no other way to manage that except enabling polling!
>>
>> Heavy interrupt handlers are _evil_ from the scheduling point of view! They
>> may be faster in some situations, but they make the system unmanageable!
>> There will never be enough power to fulfill all requirements, so we
>> must take care of the case where there are more interrupts than we
>> are able to handle.
>
> I'm not advocating moving the entire system into interrupt handlers. Did you
> actually read what I wrote?
> My point is that if you have something in
> userland that is as important as what gets done in interrupt handlers, the
> solution is not to rip up the entire scheduler to make certain bits of
> userland have a higher priority than interrupts.

All I wanted to say is that CPU frequency should not be so important for
system operation. Yes, the system will be slower and have higher latency at a
lower frequency, but it must stay responsive. The scheduler must be able to
give every process (even user-level) its time quantum.

> The solution is to move the
> one bit of userland code that is needed into the kernel. In this case I'm
> not suggesting moving all of powerd into an interrupt handler. What I am
> suggesting is that the kernel needs a policy to consider raising the
> frequency when it gets an interrupt after being in a deep sleep. If the
> power savings from C2/3/whatever are greater than running throttled, then it
> is much better, when you get an interrupt while idle, to run at full
> speed to service the interrupt and then return to C2/C3 ASAP rather than
> running the interrupt handler at a throttled speed and spending less time in
> C2/C3.

C2 gives me no visible benefit over EST+C1E. C3 and deeper states (which
suspend the bus) theoretically could, but at the moment they do not work
on SMP systems because of the APIC timer problem, so IMHO we should sort
that out first. Also, in the interrupt case modern CPUs deliberately do not
come out of the C2/C3/... states completely. You can read about that in the
C2D datasheet. The CPU tries to avoid raising frequency/voltage for short
periods.
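
The trade-off argued over above (service the interrupt at full speed and return to a deep C-state quickly, versus service it throttled and sleep less) can be put into numbers. What follows is only a minimal sketch in Python. It assumes the handler needs a fixed number of cycles, so its run time scales inversely with clock frequency. The function name and every figure below (cycle count, period, wattages) are hypothetical, chosen for illustration and not taken from any datasheet or from powerd.

```python
def energy_per_period(work_cycles, freq_hz, active_watts, sleep_watts, period_s):
    """Energy (joules) spent in one period: run the handler, then sleep.

    Assumes the handler needs a fixed number of cycles, so its run time is
    work_cycles / freq_hz, and that the CPU sleeps for the rest of the period.
    """
    busy_s = work_cycles / freq_hz
    if busy_s > period_s:
        raise ValueError("handler does not finish within the period")
    return active_watts * busy_s + sleep_watts * (period_s - busy_s)


# Hypothetical figures, chosen only for illustration.
WORK_CYCLES = 2_000_000      # cycles needed to service one interrupt
PERIOD_S = 0.010             # one interrupt every 10 ms
SLEEP_WATTS = 1.0            # power drawn in a deep C-state

# Full speed: 2 GHz at an assumed 20 W while active.
full = energy_per_period(WORK_CYCLES, 2.0e9, 20.0, SLEEP_WATTS, PERIOD_S)
# Throttled: 400 MHz at an assumed 6 W while active.
slow = energy_per_period(WORK_CYCLES, 0.4e9, 6.0, SLEEP_WATTS, PERIOD_S)

print(f"full speed: {full * 1e3:.2f} mJ per period")
print(f"throttled:  {slow * 1e3:.2f} mJ per period")
```

With these invented numbers, running at full speed and going back to sleep uses less energy per period. Lowering the assumed throttled active power from 6 W to 4 W flips the result, so the outcome depends entirely on the real power figures of the CPU in question.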

-- 
Alexander Motin

From owner-freebsd-mobile@FreeBSD.ORG Sat Nov 15 18:01:52 2008
From: John Baldwin
To: Alexander Motin
Date: Sat, 15 Nov 2008 11:32:09 -0500
References: <200811060901400000@466321507> <200811131606.24804.jhb@freebsd.org> <491D5265.3020003@FreeBSD.org>
In-Reply-To: <491D5265.3020003@FreeBSD.org>
Message-Id: <200811151132.09851.jhb@freebsd.org>
Cc: Sam Leffler, freebsd-mobile@freebsd.org
Subject: Re: RFC: powerd algorithms enhancements

On Friday 14 November 2008 05:26:45 am Alexander Motin wrote:
> John Baldwin wrote:
> > On Thursday 13 November 2008 03:52:16 pm Alexander Motin wrote:
> >> John Baldwin wrote:
> >>>> If your system completely freezes at 400MHz, then it spends about 20% of
> >>>> CPU time on this at 2GHz. Doesn't it?
> >>> Nope. It is usually very idle at full speed. You are free to go buy your own
> >>> HP nc6220 if you want to see it for yourself. You can also grab the KTR
> >>> trace and modified schedgraph.py at www.freebsd.org/~jhb/gpe/.
> >> It's very strange to me that you have 100% load at 400MHz, but zero at
> >> full speed. It shouldn't be so!
> >
> > I think systems are more complex than you give them credit for. Imagine what
> > CPU frequency changing does to SMI# handlers for example.
>
> You may be right; I am certainly not very strong on some hardware aspects, but
> neither EST nor throttling affects system bus operation. I don't see a
> direct relation there; it could easily be just some
> hardware/ACPI/whatever bug.

Well, some more details are that I occasionally see one of the GPE handlers on
my laptop take 750ms (milli-seconds, not micro-seconds) to run at full CPU
speed. If I ever close the lid and then raise it, then I will see one of
these every few seconds (I added debugging printfs to output the time
for "long-running" GPEs to my kernel previously to debug this and they are
still there; saw them again this morning). They only take up 0.7% of my CPU
when they run at full speed. However, when I drop down from 1867 MHz to, say,
100 MHz, then each one now takes several seconds, and with them coming in
every few seconds, it ends up live-locking the system, even though at full
CPU speed it is < 1% CPU every 5 seconds or so.

> >> The fact that the system livelocks means that interrupt processing works
> >> outside of any priorities!
> >> By saying that moving all processing into interrupt
> >> handlers is a good approach, you are saying that having _all_ of our system
> >> outside of any priorities is a good idea. That's actually the situation we
> >> can see now under heavy network load with polling disabled. The system
> >> just dies and there is no other way to manage that except enabling polling!
> >>
> >> Heavy interrupt handlers are _evil_ from the scheduling point of view! They
> >> may be faster in some situations, but they make the system unmanageable!
> >> There will never be enough power to fulfill all requirements, so we
> >> must take care of the case where there are more interrupts than we
> >> are able to handle.
> >
> > I'm not advocating moving the entire system into interrupt handlers. Did you
> > actually read what I wrote? My point is that if you have something in
> > userland that is as important as what gets done in interrupt handlers, the
> > solution is not to rip up the entire scheduler to make certain bits of
> > userland have a higher priority than interrupts.
>
> All I wanted to say is that CPU frequency should not be so important for
> system operation. Yes, the system will be slower and have higher latency at a
> lower frequency, but it must stay responsive. The scheduler must be able to
> give every process (even user-level) its time quantum.

Well, but the problem is that it is that important. The scheduler is
responsible for managing the resource known as the "CPU", and we obviously
stick the scheduler in the kernel. :) It's not really userland's job to
ensure that kernel-level tasks have enough CPU horsepower to execute.

-- 
John Baldwin
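
Both the "20% at 2GHz" inference and the GPE figures in this exchange rest on the same back-of-the-envelope model: a handler needs a fixed number of cycles, so its run time scales inversely with the clock. A minimal sketch in Python under that assumption follows. The assumption itself is part of what is in dispute here, since SMI#/ACPI work need not scale this way. The function names are hypothetical, and the 5-second spacing is taken from the "every 5 seconds or so" figure in the message.

```python
def scaled_duration(duration_s, from_mhz, to_mhz):
    """Run time of the same work at another clock, assuming a fixed cycle count."""
    return duration_s * from_mhz / to_mhz


def load(duration_s, interval_s):
    """Fraction of CPU time used by work of duration_s arriving every interval_s."""
    return duration_s / interval_s


# The inference "100% load at 400MHz implies about 20% at 2GHz":
# a handler that fills a 1 s interval at 400 MHz, rerun at 2000 MHz.
implied_full_speed_load = load(scaled_duration(1.0, 400, 2000), 1.0)

# The GPE figures from the message: 750 ms at 1867 MHz, slowed to 100 MHz.
gpe_at_100mhz = scaled_duration(0.750, 1867, 100)
# Assumed arrival spacing of about 5 s; a handler longer than that never catches up.
livelocked = gpe_at_100mhz > 5.0

print(f"implied load at 2 GHz: {implied_full_speed_load:.0%}")
print(f"GPE handler at 100 MHz: {gpe_at_100mhz:.1f} s, exceeds interval: {livelocked}")
```

Under this model each handler would take about 14 seconds at 100 MHz, far longer than the gap between arrivals, which is consistent with the reported live-lock.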