Date:      Fri, 9 Jul 2010 16:03:31 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-stable@freebsd.org
Cc:        Markus Gebert <markus.gebert@hostpoint.ch>
Subject:   Re: 8.1-RC2 - PCI fatal error or MCE triggered by USB/ehci on Sun X4100M2?
Message-ID:  <201007091603.31843.jhb@freebsd.org>
In-Reply-To: <6B57591F-9FA2-45EB-825F-1DB025C0635D@hostpoint.ch>
References:  <6B57591F-9FA2-45EB-825F-1DB025C0635D@hostpoint.ch>

On Friday, July 09, 2010 11:26:00 am Markus Gebert wrote:
> --
> MCA: Bank 4, Status 0xb400004000030c2b
> MCA: Global Cap 0x0000000000000105, Status 0x0000000000000007
> MCA: Vendor "AuthenticAMD", ID 0x40f13, APIC ID 2
> MCA: CPU 2 UNCOR BUSLG Observer WR I/O
> MCA: Address 0xfd00000000

Using my local port of mcelog this is what I get for this check:

CPU 2 4 northbridge 
ADDR fd00000000 
  Northbridge Master abort
  link number = 4
       bit61 = error uncorrected
  bus error 'local node observed, request didn't time out
             generic write mem transaction
             i/o access, level generic'
STATUS b400004000030c2b MCGSTATUS 7
MCGCAP 105 APICID 2 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 65
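
For reference, the generic part of that decode can be reproduced by hand. The following is a minimal sketch, not the mcelog code: it assumes only the architectural x86 MCi_STATUS layout (flag bits 63..57, the MCA error code in bits 15..0, and the bus/interconnect compound code 0000 1PPT RRRR IILL). The function and table names are made up for illustration, and the AMD northbridge-specific fields (the extended error code that mcelog prints as "Master abort", and the link number) are left as a raw value rather than guessed at.

```python
# Flag bits of an x86 MCi_STATUS register (architectural layout).
STATUS_FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
                59: "MISCV", 58: "ADDRV", 57: "PCC"}

# Field encodings of the bus/interconnect error code 0000 1PPT RRRR IILL.
PARTICIPATION = ["SRC", "RES", "OBS", "generic"]          # PP
REQUEST = ["ERR", "RD", "WR", "DRD", "DWR", "IRD",
           "PREFETCH", "EVICT", "SNOOP"]                  # RRRR
MEM_IO = ["MEM", "reserved", "I/O", "generic"]            # II
LEVEL = ["L0", "L1", "L2", "LG"]                          # LL


def decode_status(status):
    """Decode the architectural fields of a 64-bit MCi_STATUS value.

    Hypothetical helper for illustration only.
    """
    flags = [name for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
             if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {
        "flags": flags,
        "mca_code": code,
        # Bits 31..16 are model specific; kept raw on purpose.
        "model_specific_code": (status >> 16) & 0xFFFF,
    }
    if code & 0xF800 == 0x0800:  # matches 0000 1xxx xxxx xxxx: bus error
        rrrr = (code >> 4) & 0xF
        result["bus"] = {
            "participation": PARTICIPATION[(code >> 9) & 0x3],
            "timeout": bool((code >> 8) & 0x1),
            "request": REQUEST[rrrr] if rrrr < len(REQUEST) else "reserved",
            "mem_io": MEM_IO[(code >> 2) & 0x3],
            "level": LEVEL[code & 0x3],
        }
    return result


if __name__ == "__main__":
    print(decode_status(0xB400004000030C2B))
```

For the status above this gives the flags VAL, UC, EN and ADDRV, and a bus error with participation OBS, no timeout, request WR, I/O access, level LG, which lines up with the kernel's "UNCOR BUSLG Observer WR I/O" line.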

I don't know what to tell you offhand.  Did you buy this hardware from Sun 
directly?  If so, I would try bugging them about this, especially given the 
error that the BIOS is logging.  It does sound like a hardware issue, but in 
the chipset, not in the RAM, so you might need to swap out the main board 
rather than the RAM.

I'm curious whether the machine still dies with USB legacy support disabled in 
the BIOS, even when ehci is not loaded.  If so, then the SMI# for the ehci 
controller must somehow prevent the issue, perhaps by triggering frequently 
enough to slow down the rate of I/O requests?

-- 
John Baldwin


