Date:      Fri, 28 Dec 2007 13:38:13 +1100 (EST)
From:      Ian Smith <smithi@nimnet.asn.au>
To:        John Baldwin <jhb@freebsd.org>
Cc:        acpi@freebsd.org, njl@freebsd.org
Subject:   Re: An issue with powerd..
Message-ID:  <Pine.BSF.3.96.1071228125242.11357A-100000@gaia.nimnet.asn.au>
In-Reply-To: <200712271449.58285.jhb@freebsd.org>

On Thu, 27 Dec 2007, John Baldwin wrote:
 > So I've had some issues where I get weird hangs when I run with powerd enabled 
 > on my laptop and I think I've finally tracked it down.  Note that this is on 
 > an older current with the older EC code, but the design flaw in powerd is 
 > still relevant even if the new EC code makes my laptop happier (I'm trying to 
 > update my laptop to more recent HEAD, but there's some weird scheduling bug 
 > that I haven't fixed yet in newer stuff).
 > 
 > Anyways, I was trying to debug the weird hangs I had when running with powerd 
 > (machine would go unresponsive and fans would spin up, and after a variable 
 > number of seconds it would come back and all the pending input (mouse 
 > movements, keypresses, etc.) would be processed).  I added some code to track 
 > how long it takes for GPE's to run that would print out on the console if one 
 > took more than 750ms as I had a feeling that something with ACPI was making 
 > the system busy.

Fans spinning up is perhaps interesting?  As noted in my recent whinge
about lack of component documentation, I've yet to suss out interactions
between acpi_thermal (wrt both fans and passive cooling itself modifying
cpu freqs - could this fight with powerd?), devd and other subsystems. 

Yeah I'm slowly beating through the ACPI spec, up to page 46 of >600
pages, but it's reminiscent of reading govt legislation .. I'd love to
find the ~50 page precis, then I may be better able to follow some code. 

 > It was also far worse in console mode than in X.  In console mode I found that 
 > sometimes the system would never "come back".

Presumably X itself keeps it busy enough to keep cpu freq 'reasonable'?
I use gkrellm to keep an eye on cpu freq, temp, load avg .. but my T23
is only a two-speed, min 733MHz, so I can't see what you're seeing (and
that's my faster laptop :)

 > So I was running in console mode recently with my timing patches and noticed 
 > that when it hung it started warning about GPE events taking several 
 > _seconds_ to process, e.g. 2-3 seconds, or in some cases up to _30_ seconds.  
 > So, my theory is that powerd has lowered my CPU all the way down to 100MHz 
 > (easy to reproduce in non-X, just let the box sit with no apps running) and 
 > that for some reason the machine ends up in a "GPE storm" where it is 
 > spending all its time handling GPE's and never has any CPU left for userland 
 > apps (due to being at 100MHz).  The problem then is that powerd never runs to 
 > bump my CPU up to some reasonable speed.

One workaround some have reported is to set debug.cpufreq.lowest to a
value considerably higher than 100MHz, say 500MHz or more, to maintain
reasonable responsiveness, at a cost of higher power use when idle.
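
To put rough numbers on that trade-off, here is a back-of-envelope
sketch.  It assumes CPU-bound work scales inversely with clock speed,
and the 1600MHz top speed and 200ms burst are made-up figures for
illustration, not measurements from this thread.

```python
def stretched_ms(work_ms_at_full, full_mhz, current_mhz):
    """Time to finish CPU-bound work at a lower clock, assuming linear scaling."""
    return work_ms_at_full * full_mhz / current_mhz


FULL_MHZ = 1600  # hypothetical top speed, not taken from the thread
BURST_MS = 200   # hypothetical burst of kernel work, timed at full speed

# Compare the 100MHz floor with two higher floors for debug.cpufreq.lowest.
for floor_mhz in (100, 500, 800):
    ms = stretched_ms(BURST_MS, FULL_MHZ, floor_mhz)
    print(f"{floor_mhz} MHz floor: {ms:.0f} ms")
```

With these assumed numbers the burst takes about 3.2 seconds at a
100MHz floor and well under a second at 500MHz.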

 > In fact, anytime a completely idle box suddenly gets a lot of kernel work 
 > (e.g. a sudden flow of packets) it could in theory end up trying to handle 
 > all this work at the reduced speed since the work has a higher priority than 
 > the powerd process.  To that end, I think that at least part of powerd needs 
 > to be in the kernel, or at least that the kernel should be more proactive 
 > about bumping the speed up when it resumes from Cx due to an interrupt.  A 
 > simple policy would be to bump up to full speed for any non-clock interrupt 
 > (possibly bumping up for a clock interrupt if we wake up softclock as well).
 > 
 > Thoughts?

Just humble grasshopper droppings, master .. but the default powerd
polling interval is 500ms, which is a really long time on a fast box, so
-p 100 or even less might make a considerable difference?

Can't comment on any in-kernel component, but responding to any single
interrupt sounds way too trigger-happy compared to monitoring load,
assuming that the likes of vm.loadavg and kern.cp_time are themselves
updated promptly in high-stress times?
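
A toy model of the two approaches, for what it's worth.  Everything in
it is an assumption made for illustration: the 1000ms trace, the 50%
load threshold, and the function names are hypothetical, and none of it
is powerd's actual algorithm or any kernel code.

```python
def first_reaction_ms(busy, interval_ms, threshold=0.5):
    """End time (ms) of the first polling window whose load exceeds threshold."""
    for start in range(0, len(busy), interval_ms):
        window = busy[start:start + interval_ms]
        if sum(window) / len(window) > threshold:
            return start + len(window)
    return None  # the poller never saw enough load to react


def interrupt_bumps(busy):
    """Number of idle-to-busy edges, i.e. bumps under a per-interrupt policy."""
    bumps = 0
    previous = False
    for now in busy:
        if now and not previous:
            bumps += 1
        previous = now
    return bumps


# Hypothetical 1000 ms trace, one entry per millisecond (True = CPU busy).
trace = [False] * 1000
for t in range(0, 500, 100):   # five isolated 1 ms interrupts
    trace[t] = True
for t in range(600, 900):      # one sustained 300 ms burst starting at 600 ms
    trace[t] = True

print("poll 500 ms reacts at:", first_reaction_ms(trace, 500))
print("poll 100 ms reacts at:", first_reaction_ms(trace, 100))
print("per-interrupt bumps:  ", interrupt_bumps(trace))
```

In this made-up trace, 100ms polling reacts 100ms after the burst
starts and 500ms polling reacts 400ms after it, while the per-interrupt
policy bumps six times, five of them for isolated interrupts.  The
model also assumes the poller actually gets to run on schedule, which
is the very thing in doubt at 100MHz.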

AU$0.02, which rounds down to 0 since we abandoned coins less than 5c ..

cheers, Ian



