From owner-freebsd-stable@FreeBSD.ORG  Wed Jul 21 17:36:24 2010
From: John Baldwin <jhb@freebsd.org>
To: Markus Gebert <markus.gebert@hostpoint.ch>
Date: Wed, 21 Jul 2010 10:10:12 -0400
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100217; KDE/4.4.5; amd64; ; )
References: <6B57591F-9FA2-45EB-825F-1DB025C0635D@hostpoint.ch>
	<201007201559.45081.jhb@freebsd.org>
	<6781BC8B-51E0-4F8B-9307-9C062DE70C21@hostpoint.ch>
In-Reply-To: <6781BC8B-51E0-4F8B-9307-9C062DE70C21@hostpoint.ch>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-15"
Content-Transfer-Encoding: 7bit
Message-Id: <201007211010.12409.jhb@freebsd.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: 8.1-RC2 MCE caused by some LAPIC/clock changes? (was: 8.1-RC2 -
	PCI fatal error or MCE triggered by USB/ehci on Sun X4100M2?)

On Tuesday, July 20, 2010 8:57:07 pm Markus Gebert wrote:
> 
> On 20.07.2010, at 21:59, John Baldwin wrote:
> 
> >> I started narrowing the revisions down until I found out that, while
> >> on r202386 I'm still able to trigger the MCE, r202387 seems to solve
> >> the problem on CURRENT:
> >> 
> >> http://svn.freebsd.org/viewvc/base?view=revision&revision=202387
> > 
> > Although this change was MFC'd, it was later disabled by default because
> > it causes issues on other machines.  I think there is a tunable you need
> > to set in loader.conf to enable it for 8.1.  Attilio (the author of that
> > commit) should know which tunable to set.
> 
> Might be this one in sys/amd64/amd64/clock.c:
> 
> ----
> static int lapic_allclocks = 1;
> TUNABLE_INT("machdep.lapic_allclocks", &lapic_allclocks);
> ----
> 
> The r202387 changes put this into local_apic.c, guess it was moved later on
> (or after MFC), and that's why I couldn't find it on 8-stable. And, indeed,
> this tunable seems to be gone again in current. Testing with
> machdep.lapic_allclocks=0 right now. So far it looks very promising. I'll
> let it run overnight.
> 
> Another thing though: Today I compared verbose boot output from 8-stable
> and the current box. I saw that the ioapic sets up IRQ routing differently
> on these two systems although the hardware is the same. This seemed not so
> interesting at first, but then I noticed that 8-stable sets up two routes
> (to lapic0 and lapic2, or sometimes lapic3) for IRQ58 (mpt0), while current
> only uses one route (to lapic0).

There is only one route.  The sort is probably making it confusing.  What 
happens is that during boot we route all interrupts to lapic 0 since the other 
CPUs aren't ready to handle interrupts when we probe devices.  However, once 
all the CPUs are up and running we redistribute the interrupts round-robin 
among the available CPUs.
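
The two phases described above can be sketched roughly as follows.  This is
only an illustration, not the kernel's actual code: struct irq_route and both
function names are made up for the example, and the real redistribution order
and policy may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one interrupt source; not a FreeBSD kernel type. */
struct irq_route {
	int irq;	/* interrupt number, e.g. 58 */
	int cpu;	/* CPU (local APIC) the interrupt is routed to */
};

/* Boot-time policy: only CPU 0 is ready, so every interrupt goes there. */
void
route_all_to_boot_cpu(struct irq_route *routes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		routes[i].cpu = 0;
}

/* Once all CPUs are running, hand interrupts out round-robin in table order. */
void
redistribute_round_robin(struct irq_route *routes, size_t n, int ncpus)
{
	assert(ncpus > 0);
	for (size_t i = 0; i < n; i++)
		routes[i].cpu = (int)(i % (size_t)ncpus);
}
```

A verbose boot log that prints the route after each phase would show the same
IRQ pointing first at lapic0 and later at another lapic, even though only one
route is active at any time.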

> I used 'cpuset -c -l 0 -x 58' in an attempt to make my 8-stable box behave
> like the one running current. Indeed, this seems to have changed IRQ58 to
> be routed to lapic0 only. And the box was running for hours without showing
> the symptoms.

Probably this is because using the lapic_allclocks tunable enables an extra 
interrupt (IRQ0 or IRQ8) that alters the round-robin allocation so that IRQ 58 
now lands on CPU 0 instead of some other CPU.
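
To illustrate how one extra interrupt can shift the assignment, here is a
small sketch.  It assumes a plain "position modulo number of CPUs" policy, and
the helper name is invented for the example; the kernel's real ordering may
not match.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the CPU that a round-robin pass over irqs[] would assign to
 * `target`, or -1 if `target` is not in the list.
 */
int
round_robin_cpu_for(const int *irqs, size_t n, int ncpus, int target)
{
	assert(ncpus > 0);
	for (size_t i = 0; i < n; i++) {
		if (irqs[i] == target)
			return ((int)(i % (size_t)ncpus));
	}
	return (-1);
}
```

With a made-up list of active interrupts {16, 17, 19, 58} and 4 CPUs, IRQ 58
is fourth in line and gets CPU 3.  Add IRQ 8 at the front, giving
{8, 16, 17, 19, 58}, and IRQ 58 becomes fifth and wraps around to CPU 0.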

The routing is not "wrong", just different.  Any interrupt in an I/O APIC (or 
MSI message) can be routed to any CPU.  The OS is free to make that choice 
arbitrarily.  However, it may be that having mpt send its interrupts to CPU 0 
masks the hardware fault you are encountering.

-- 
John Baldwin