Date: Mon, 13 Sep 2010 10:11:18 -0400
From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Cc: Simon <simon@optinet.com>
Subject: Re: MCE Decoding - MCA: Bank 8, Status 0xcc0031800001009f/0xc8000980000200cf
Message-ID: <201009131011.19089.jhb@freebsd.org>
In-Reply-To: <20100911060704.55B611065670@hub.freebsd.org>
References: <20100911060704.55B611065670@hub.freebsd.org>
On Saturday, September 11, 2010 1:40:28 am Simon wrote:
> Hello,
>
> Can someone please help me decode these two errors on FreeBSD 8.1-R:
>
> MCA: Bank 8, Status 0xcc0031800001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x106a5, APIC ID 16
> MCA: CPU 0 COR (198) OVER RD channel ?? memory error
> MCA: Address 0x1b6188d80
> MCA: Misc 0x72ae242000000084
>
> MCA: Bank 8, Status 0xc8000980000200cf
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x106a5, APIC ID 16
> MCA: CPU 0 COR (38) OVER MS channel ?? memory error
> MCA: Misc 0x72ae242000000140

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 8
MISC 72ae242000000084 ADDR 1b6188d80
MCG status:
MCi status:
Error overflow
MCi_MISC register valid
MCi_ADDR register valid
MCA: MEMORY CONTROLLER RD_CHANNELunspecified_ERR
Transaction: Memory read error
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 198
Memory transaction Tracker ID (RTId): 84
Memory DIMM ID of error: 0
Memory channel ID of error: 0
Memory ECC syndrome: 72ae2420
STATUS cc0031800001009f MCGSTATUS 0
MCGCAP 1c09 APICID 10 SOCKETID 0
CPUID Vendor Intel Family 6 Model 26

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 8
MISC 72ae242000000140
MCG status:
MCi status:
Error overflow
MCi_MISC register valid
MCA: MEMORY CONTROLLER MS_CHANNELunspecified_ERR
Transaction: Memory scrubbing error
Memory ECC error occurred during scrub
Memory corrected error count (CORE_ERR_CNT): 38
Memory transaction Tracker ID (RTId): 40
Memory DIMM ID of error: 0
Memory channel ID of error: 0
Memory ECC syndrome: 72ae2420
STATUS c8000980000200cf MCGSTATUS 0
MCGCAP 1c09 APICID 10 SOCKETID 0
CPUID Vendor Intel Family 6 Model 26

You have some corrected memory errors (198+38 = 236) in the first DIMM (on
the SuperMicro boards we have at work, it would correspond to the DIMM slot
labeled P1_DIMM1A). In my experience I would just ignore them unless the
count gets much higher (say 10000+ / per hour).

-- 
John Baldwin
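
For readers who want to see where the counts and the RD/MS labels come from,
here is a minimal sketch that pulls a few fields out of the two Status values
quoted above. It is not the code that produced the decoded output in this
message. The function name decode_mci_status and the table MEM_TRANSACTIONS
are made up for illustration, and the bit positions are assumed from the
architectural IA32_MCi_STATUS layout (corrected error count in bits 52:38,
memory controller error code of the form 0000 0000 1MMM CCCC in bits 15:0).
The DIMM, channel and syndrome fields in Misc are model-specific and are not
decoded here.

```python
# Hypothetical helper: decodes only the architectural parts of IA32_MCi_STATUS.
# Bit positions are an assumption taken from the generic register layout.

# MMM field of a memory controller error code (0000 0000 1MMM CCCC).
MEM_TRANSACTIONS = {
    0b000: "generic",
    0b001: "read",             # shown as RD in the kernel message
    0b010: "write",
    0b011: "address/command",
    0b100: "scrub",            # shown as MS in the kernel message
}


def decode_mci_status(status: int) -> dict:
    """Return selected fields of a 64-bit machine check bank status value."""
    mca_code = status & 0xFFFF
    info = {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),     # the OVER flag
        "uncorrected": bool((status >> 61) & 1),  # clear means corrected (COR)
        "misc_valid": bool((status >> 59) & 1),
        "addr_valid": bool((status >> 58) & 1),
        # 15-bit corrected error count, bits 52:38.
        "corrected_count": (status >> 38) & 0x7FFF,
        "mca_code": mca_code,
    }
    # Memory controller errors have bit 7 set and bits 15:8 clear.
    if (mca_code & 0xFF80) == 0x0080:
        info["transaction"] = MEM_TRANSACTIONS.get((mca_code >> 4) & 0x7, "reserved")
        channel = mca_code & 0xF
        # Channel 1111 means "not specified" (printed as ?? above).
        info["channel"] = None if channel == 0xF else channel
    return info


if __name__ == "__main__":
    total = 0
    for raw in (0xCC0031800001009F, 0xC8000980000200CF):
        fields = decode_mci_status(raw)
        total += fields["corrected_count"]
        print(hex(raw), fields)
    print("total corrected errors:", total)
```

Run against the two values from the original report, the corrected counts
come out as 198 and 38, which is where the total of 236 comes from, and only
the first value has the address-valid bit set, matching the single Address
line in the report.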