From owner-freebsd-stable@FreeBSD.ORG Wed Jul 21 17:36:24 2010
From: John Baldwin
To: Markus Gebert
Cc: freebsd-stable@freebsd.org
Date: Wed, 21 Jul 2010 10:10:12 -0400
Subject: Re: 8.1-RC2 MCE caused by some LAPIC/clock changes? (was: 8.1-RC2 - PCI fatal error or MCE triggered by USB/ehci on Sun X4100M2?)
Message-Id: <201007211010.12409.jhb@freebsd.org>
In-Reply-To: <6781BC8B-51E0-4F8B-9307-9C062DE70C21@hostpoint.ch>
References: <6B57591F-9FA2-45EB-825F-1DB025C0635D@hostpoint.ch> <201007201559.45081.jhb@freebsd.org> <6781BC8B-51E0-4F8B-9307-9C062DE70C21@hostpoint.ch>

On Tuesday, July 20, 2010 8:57:07 pm Markus Gebert wrote:
>
> On 20.07.2010, at 21:59, John Baldwin wrote:
>
> >> I started narrowing the revisions down until I
> >> found out, that while on r202386 I'm still able to trigger the MCE, r202387
> >> seems to solve the problem on CURRENT:
> >>
> >> http://svn.freebsd.org/viewvc/base?view=revision&revision=202387
> >
> > Although this change was MFC'd, it was later disabled by default because it
> > causes issues on other machines.  I think there is a tunable you need to set
> > in loader.conf to enable it for 8.1.  Attilio (the author of that commit)
> > should know which tunable to set.
>
> Might be this one in sys/amd64/amd64/clock.c:
>
> ----
> static int lapic_allclocks = 1;
> TUNABLE_INT("machdep.lapic_allclocks", &lapic_allclocks);
> ----
>
> The r202387 changes put this into local_apic.c, guess it was moved later on
> (or after MFC), and that's why I couldn't find it on 8-stable. And, indeed,
> this tunable seems to be gone again in current. Testing with
> machdep.lapic_allclocks=0 right now. So far it looks very promising. I'll
> let it run overnight.
>
> Another thing though: Today I compared verbose boot output from 8-stable and
> the current box. I saw that the ioapic sets up IRQ routing differently on
> these two systems although the hardware is the same. This seemed not so
> interesting at first, but then I noticed that 8-stable sets up two routes
> (to lapic0 and lapic2, or sometimes lapic3) for IRQ58 (mpt0), while current
> only uses one route (to lapic0).

There is only one route.  The sort is probably making it confusing.  What
happens is that during boot we route all interrupts to lapic 0 since the
other CPUs aren't ready to handle interrupts when we probe devices.
However, once all the CPUs are up and running we redistribute the interrupts
round-robin among the available CPUs.

> I used 'cpuset -c -l 0 -x 58' in an attempt to make my 8-stable box behave
> like the one running current. Indeed, this seems to have changed IRQ58 to be
> routed to lapic0 only. And the box was running for hours without showing the
> symptoms.

Probably this is because using lapic_allclocks enables an extra interrupt
(IRQ0 or IRQ8) that alters the round-robin allocation so that IRQ 58 now
lands on CPU 0 instead of some other CPU.  The routing is not "wrong", just
different.  Any interrupt in an I/O APIC (or MSI message) can be routed to
any CPU.  The OS is free to make that choice arbitrarily.  However, it may be
that having mpt send its interrupts to CPU 0 masks the hardware fault you are
encountering.

-- 
John Baldwin