From owner-freebsd-current@FreeBSD.ORG Mon Aug 23 07:23:57 2010
Delivered-To: freebsd-current@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B4B491065694;
	Mon, 23 Aug 2010 07:23:57 +0000 (UTC)
	(envelope-from avg@icyb.net.ua)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id C34E98FC13;
	Mon, 23 Aug 2010 07:23:56 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id KAA12193;
	Mon, 23 Aug 2010 10:23:55 +0300 (EEST)
	(envelope-from avg@icyb.net.ua)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1OnRNq-000DsE-RL; Mon, 23 Aug 2010 10:23:55 +0300
Message-ID: <4C722209.1020405@icyb.net.ua>
Date: Mon, 23 Aug 2010 10:23:53 +0300
From: Andriy Gapon
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.8)
	Gecko/20100822 Lightning/1.0b2 Thunderbird/3.1.2
MIME-Version: 1.0
To: Doug Barton
References: <4C71E858.90009@FreeBSD.org> <4C721334.1050000@icyb.net.ua>
	<4C7219B2.4070303@FreeBSD.org>
In-Reply-To: <4C7219B2.4070303@FreeBSD.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-current@FreeBSD.org
Subject: Re: runaway intr problems: powerd and/or hw.acpi.cpu.cx_lowest related
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Mon, 23 Aug 2010 07:23:57 -0000

on 23/08/2010 09:48 Doug Barton said the following:
> On 08/22/2010 23:20, Andriy Gapon wrote:
>> on 23/08/2010 06:17 Doug Barton said the following:
>>
>>> http://people.freebsd.org/~dougb/intr-out-3.txt
>>
>> So, hm, npviewer.bin eats all the CPU time?
>
> No, the odd bits of that one are the fact that the intr threads irq17, irq256,
> and irq20; are showing up at all, and/or showing up with more than a fraction of
> a percent of cpu time.

DTrace output doesn't show anything abnormal for those, but it does show that
those interrupts do happen and those drivers do work.
E.g. there is hdac (sound) activity [irq256: hdac0] and wireless activity
[irq17: wpi0]. irq20 is hpet + usb.
So did you do anything wireless? Did you play sound?
The %WCPU may be _reported_ higher than what it actually is due to other issues
with your system (high load by npviewer.bin, HPET+USB interrupt sharing, C3 with
LAPIC timer).

> Usually they don't, and the fact that they did at that
> point in time was indicative of the fact that the "runaway intr" problem was
> happening. _Incidentally_ npviewer.bin was taking up more cpu than it usually
> does, but I think that's another symptom of the underlying problem.

In complex systems it's not always trivially obvious what's incidental and
what's not.

> Here is a typical, non-problematic top output while running a flash video:
>
> last pid: 10841;  load averages:  0.22,  0.12,  0.19    up 0+04:15:49  23:46:11
> 171 processes: 3 running, 148 sleeping, 20 waiting
> CPU 0: 14.8% user,  0.0% nice,  3.1% system,  0.0% interrupt, 82.0% idle
> CPU 1: 18.8% user,  0.0% nice,  0.0% system,  0.0% interrupt, 81.3% idle
> Mem: 342M Active, 1397M Inact, 168M Wired, 49M Cache, 112M Buf, 45M Free
> Swap: 1024M Total, 1444K Used, 1022M Free
>
>   PID USERNAME PRI NICE  SIZE    RES STATE  C   TIME   WCPU COMMAND
>    10 root     171 ki31    0K    16K CPU0   0 203:29 82.86% {idle: cpu0}
>    10 root     171 ki31    0K    16K RUN    1 191:24 81.05% {idle: cpu1}
> 10813 dougb     54    0  420M 78196K select 0   0:18 17.77% npviewer.bin
> 10822 dougb     47    0  420M 78196K futex  1   0:05  6.30% npviewer.bin
> 10839 dougb     45    0  420M 78196K futex  1   0:03  3.66% npviewer.bin
> 10840 dougb     45    0  420M 78196K futex  1   0:03  3.66% npviewer.bin
> 10832 dougb     45    0  420M 78196K pcmwrv 1   0:03  2.88% npviewer.bin
>  1598 dougb     44    0  163M   142M select 1  12:06  1.56% Xorg
>    11 root     -68    -    0K   160K WAIT   1   1:10  0.49% {irq17: wpi0}
> 10770 dougb     44    0  178M   136M ucond  0   0:00  0.39% {firefox-bin}
> 10770 dougb     45    0  178M   136M select 1   0:15  0.29% {initial thread
>    11 root     -80    -    0K   160K WAIT   0   0:45  0.10% {irq256: hdac0}

Well, notice that in this case your npviewer.bin processes are not "run away"
either. They spend most of the time waiting for something and use only a
fraction of CPU time. In the report that I commented on they were mostly
running on CPU (and who knows what else they were doing, like driving sound
card crazy etc).

> I really wish people would stop focusing on flash here. :) It's simply the
> easiest and most consistent way that I have triggered this problem, it's not the
> only one.

Well, I just interpreted the DTrace output you gave.
No prejudice against flash, although all those reports/complaints by Linux folks
are suspicious.

-- 
Andriy Gapon