From: John Baldwin
To: acpi@freebsd.org
Cc: njl@freebsd.org
Date: Thu, 27 Dec 2007 14:49:57 -0500
Subject: An issue with powerd..
Message-Id: <200712271449.58285.jhb@freebsd.org>
List-Id: ACPI and power management development

So I've had some issues where I get weird hangs when I run with powerd enabled on my laptop, and I think I've finally tracked it down. Note that this is on an older current with the older EC code, but the design flaw in powerd is still relevant even if the new EC code makes my laptop happier (I'm trying to update my laptop to a more recent HEAD, but there's some weird scheduling bug in the newer code that I haven't fixed yet).

Anyway, I was trying to debug the weird hangs I had when running with powerd (the machine would go unresponsive and the fans would spin up, and after a variable number of seconds it would come back and all the pending input (mouse movements, keypresses, etc.) would be processed). I added some code to track how long GPEs take to run and to print a message on the console if one took more than 750ms, as I had a feeling that something in ACPI was keeping the system busy. It was also far worse in console mode than in X. In console mode I found that sometimes the system would never "come back".

So I was running in console mode recently with my timing patches and noticed that when it hung it started warning about GPE events taking several _seconds_ to process, e.g. 2-3 seconds, or in some cases up to _30_ seconds.

So, my theory is that powerd has lowered my CPU all the way down to 100 MHz (easy to reproduce outside of X: just let the box sit with no apps running) and that for some reason the machine ends up in a "GPE storm" where it spends all its time handling GPEs and never has any CPU left for userland apps (due to being at 100 MHz). The problem then is that powerd never runs to bump my CPU up to some reasonable speed. In fact, anytime a completely idle box suddenly gets a lot of kernel work (e.g.
a sudden flow of packets) it could in theory end up trying to handle all this work at the reduced speed, since the work has a higher priority than the powerd process.

To that end, I think that at least part of powerd needs to be in the kernel, or at least that the kernel should be more proactive about bumping the speed up when it resumes from Cx due to an interrupt. A simple policy would be to bump up to full speed for any non-clock interrupt (possibly bumping up for a clock interrupt if we wake up softclock as well).

Thoughts?

--
John Baldwin
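
As a rough illustration of the policy proposed above (go to full speed on any non-clock interrupt, and on a clock interrupt only if it also wakes softclock), here is a minimal C sketch. The names `irq_kind`, `cpu_state`, and `wake_freq` are hypothetical and made up for this example; this is not the FreeBSD cpufreq or interrupt API, and it only models the decision, not the actual frequency change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical interrupt classes; not real FreeBSD kernel types. */
enum irq_kind { IRQ_CLOCK, IRQ_OTHER };

/* Hypothetical per-CPU state: current and maximum frequency in MHz. */
struct cpu_state {
    int cur_mhz;
    int max_mhz;
};

/*
 * Pick the frequency to run at when the CPU leaves an idle (Cx) state
 * because of an interrupt.
 *
 * Any non-clock interrupt selects full speed. A clock interrupt selects
 * full speed only if it also wakes softclock; otherwise the current
 * (possibly reduced) frequency is kept.
 */
int
wake_freq(const struct cpu_state *cpu, enum irq_kind kind, bool wakes_softclock)
{
    assert(cpu != NULL);

    if (kind != IRQ_CLOCK || wakes_softclock)
        return cpu->max_mhz;
    return cpu->cur_mhz;
}
```

With a CPU throttled to 100 MHz and an assumed (illustrative) maximum of 1600 MHz, `wake_freq` returns the maximum for a device interrupt such as a GPE or a network packet, and leaves the frequency alone for a plain clock tick. Lowering the speed again after the burst of work would still be left to powerd or a similar policy.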