Date:      Mon, 12 Jul 2010 08:48:48 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        Markus Gebert <markus.gebert@hostpoint.ch>
Cc:        alc@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: 8.1-RC2 - PCI fatal error or MCE triggered by USB/ehci on Sun X4100M2?
Message-ID:  <201007120848.48162.jhb@freebsd.org>
In-Reply-To: <591666AA-E6CA-4478-9E96-3A2D558BD6B4@hostpoint.ch>
References:  <6B57591F-9FA2-45EB-825F-1DB025C0635D@hostpoint.ch> <AANLkTinhQq9V8qIlD68l7LRLf1P5Iz5Kq5XDuIYzLOim@mail.gmail.com> <591666AA-E6CA-4478-9E96-3A2D558BD6B4@hostpoint.ch>

On Monday, July 12, 2010 8:25:54 am Markus Gebert wrote:
> 
> On 10.07.2010, at 19:37, Alan Cox wrote:
> 
> > On Fri, Jul 9, 2010 at 6:53 PM, Markus Gebert <markus.gebert@hostpoint.ch> wrote:
> > [snip]
> > 
> > Yes, this hardware comes from Sun directly, but getting Sun (/Oracle)
> > support for this issue is gonna be tough. FreeBSD is unsupported, and in a
> > short test we couldn't reproduce the problem with a Linux kernel. While I
> > agree that a hardware issue has always been and still is a possibility to
> > be considered, the fact remains that we tested this on two machines and
> > that 6.x and 7.x do not show the behavior. Another possibility is, of
> > course, that the X4100 is prone to such issues and 6.x and 7.x somehow
> > have workarounds we're not aware of, or just do something differently in
> > a way that does not trigger this issue.
> > 
> > 
> > 8.1 is our first release to have the driver for configuring and reporting
> > machine check exceptions enabled by default.  Prior to 8.1, you had to
> > explicitly enable the driver at boot time.
> 
> 
> I was aware of that, but I don't think that is the cause. Disabling MCA
> just makes the reporting go away, but the MCE and subsequent fatal trap
> remain. With default BIOS settings, the OS does not even get a chance to
> panic; the system just forces a reset before the OS can do anything. And,
> as far as I can tell, that did not happen on previous stable branches.

Hmm, with MCA disabled in the loader you should not be getting any MCEs at
all, as we don't enable the MCE interrupt in the CPU in that case.  Are you
disabling it in the BIOS rather than loader.conf?

> I don't know, though, whether MCA changes the situation even when disabled
> in loader.conf (hw.mca.enabled=0). I just checked our 7.2 setup, and MCA
> does not seem to be in a 7.2 kernel, so I guess this was added to 8.0 and
> activated by default in 8.1. To be honest, we did not check whether 8.0
> shows the same behavior, but I guess running 8.1 with hw.mca.enabled=0
> should pretty much give the same situation as far as MCA is concerned.

7.3 has MCA support, but disabled by default.

> Is there a way to get rid of MCA completely? (as opposed to just "turning
> it off" via loader.conf)

Turning it off in loader.conf does get rid of it completely as it prevents us 
from initializing the MSRs.
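
For reference, a minimal sketch of that setting.  The tunable name is the
one quoted above; /boot/loader.conf is the standard location, and the
sysctl check mentioned below assumes the tunable is also exported
read-only under the same name:

```
# /boot/loader.conf
# Skip machine check setup at boot, so the MCA MSRs are never initialized.
hw.mca.enabled=0
```

After a reboot, "sysctl hw.mca.enabled" should report 0 if the setting
took effect.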

-- 
John Baldwin