From owner-freebsd-hardware@freebsd.org Wed Nov 11 23:30:25 2015
From: John Baldwin
To: freebsd-hardware@freebsd.org
Cc: "Pokala, Ravi"
Subject: Re: ECC support
Date: Wed, 11 Nov 2015 15:28:58 -0800
Message-ID: <1678090.72K5KqGPGp@ralph.baldwin.cx>
In-Reply-To: <1917A1AA-B9AB-4612-A4E3-18FF4C909FC3@panasas.com>
References: <1917A1AA-B9AB-4612-A4E3-18FF4C909FC3@panasas.com>
List-Id: General discussion of FreeBSD hardware

On Friday, October 23, 2015 03:22:54 PM Pokala, Ravi wrote:
> -----Original Message-----
> >
> >Date: Thu, 22 Oct 2015 11:09:50 -0700
> >From: John Baldwin
> >To: freebsd-hardware@freebsd.org
> >Cc: Dieter BSD, freebsd-hackers@freebsd.org
> >Subject: Re: ECC support
> >Message-ID: <1492434.22kxSKhHEJ@ralph.baldwin.cx>
> >Content-Type: text/plain; charset="us-ascii"
> >
> >The problem is that there are other fields to decode and you can only fit so much in one line.
>
> At Panasas, we did in-kernel parsing and got it down to a one-liner like this:
>
> Detected HW Err (CMC) - Correctable ECC error Channel:0; Dimm:0; Syndrome:2151686160
>
> But that was only for main-memory corrected ECCs; for all other MCAs, it was a multi-line format (which I think we got from backporting MCA support from (8-STABLE?)):
>
> MCA: Bank 8, Status 0xb20000000004008f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000004
> MCA: Vendor "GenuineIntel", ID 0x106e4, APIC ID 0
> MCA: CPU 0 UNCOR PCC GEN channel ?? memory error

Yeah, that's the generic MCA stuff in stock FreeBSD.

> >Also, there is not a CPU-independent way to know the address of an ECC error. On Intel Core i3/5/7 (anything with QPI) you can identify the individual DIMM at least, but the label that the motherboard manufacturer uses varies by manufacturer. (You can maybe scrape that text from the SMBIOS tables,
>
> That's exactly what we did when using off-the-shelf motherboards. We were able to extract the name of the DIMM slot, as defined in SMBIOS, as well as the part and serial numbers of the DIMM, and the physical address range of the DIMM.
> For example:
>
> hw.mem.dimm.s: locator serial#  part#              bank                      size    addr0         addrN
> hw.mem.dimm.0: DIMM_A1 DC917AEF 36KDZS2G72PZ-1G4D1 [NODE 0 CHANNEL 0 DIMM 0] 16384MB 0x00000000000 0x003FFFFFFFF
> hw.mem.dimm.1: DIMM_B1 DDA0C793 36KDZS2G72PZ-1G4D1 [NODE 0 CHANNEL 1 DIMM 0] 16384MB 0x00400000000 0x007FFFFFFFF
> hw.mem.dimm.2: DIMM_C1 DDA0C7B6 36KDZS2G72PZ-1G4D1 [NODE 0 CHANNEL 2 DIMM 0] 16384MB 0x00800000000 0x00BFFFFFFFF
> hw.mem.dimm.3: DIMM_D1 DDA0C7DE 36KDZS2G72PZ-1G4D1 [NODE 0 CHANNEL 3 DIMM 0] 16384MB 0x00C00000000 0x00FFFFFFFFF
>
> Re-whacking that code for -CURRENT and getting it upstream has been on my to-do list for a depressingly long time; it keeps getting pre-empted. :-S
>
> >but only if they aren't wrong which they sometimes are, and good luck knowing if they are wrong or right.)
>
> Making sure the SMBIOS identifier matches the label on the motherboard is part of the process of validating the motherboard as usable by us. :-)

That might be sufficient for DIMMs. My main hangup with SMBIOS was trying to use the table to decode PCI slot info. I have another git branch that tries to label PCI devices in a physical slot with the slot number from either $PIR or SMBIOS, as well as an alternate view that lists the physical slots in the chassis and what devices are in them. However, when I was playing with this on X8-X9 Supermicro boards, most of them had SMBIOS tables that were completely wrong. Most of them had mostly correct $PIR tables, but SMBIOS was all over the map.

https://github.com/freebsd/freebsd/compare/master...bsdjhb:pciconf_slot_smbios

> >Digital UNIX had the luxury of running on hardware built by the same company, not on a random assortment of boards built by various vendors. FreeBSD does not.
>
> Yeah. Like I said, we scraped SMBIOS *for off-the-shelf motherboards*. For our in-house designs, we hardcoded the Channel/DIMM mapping into an unambiguous form inside the driver itself.
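
[Editorial sketch, not part of the original message.] The hw.mem.dimm.* rows quoted in this thread carry a physical address range per DIMM, so a reported error address could in principle be mapped back to a slot label. The Python below is only an illustration of that lookup: the row layout is inferred from the quoted sample, the names parse_dimms and find_dimm are hypothetical, and it assumes each DIMM owns one contiguous, non-interleaved range, which may not hold on other platforms.

```python
import re

# Sample rows copied from the hw.mem.dimm.* output quoted in the thread.
SYSCTL_TEXT = """\
hw.mem.dimm.s: locator serial# part# bank size addr0 addrN
hw.mem.dimm.0: DIMM_A1 DC917AEF 36KDZS2G72PZ-1G4D1 [NODE 0 CHANNEL 0 DIMM 0] 16384MB 0x00000000000 0x003FFFFFFFF
hw.mem.dimm.1: DIMM_B1 DDA0C793 36KDZS2G72PZ-1G4D1 [NODE 0 CHANNEL 1 DIMM 0] 16384MB 0x00400000000 0x007FFFFFFFF
hw.mem.dimm.2: DIMM_C1 DDA0C7B6 36KDZS2G72PZ-1G4D1 [NODE 0 CHANNEL 2 DIMM 0] 16384MB 0x00800000000 0x00BFFFFFFFF
hw.mem.dimm.3: DIMM_D1 DDA0C7DE 36KDZS2G72PZ-1G4D1 [NODE 0 CHANNEL 3 DIMM 0] 16384MB 0x00C00000000 0x00FFFFFFFFF
"""

# Assumed row layout (inferred from the sample, not from any driver source):
# index, locator, serial, part, [bank], size in MB, first address, last address.
ROW = re.compile(
    r"^hw\.mem\.dimm\.(\d+):\s+(\S+)\s+(\S+)\s+(\S+)\s+"
    r"\[([^\]]*)\]\s+(\d+)MB\s+(0x[0-9A-Fa-f]+)\s+(0x[0-9A-Fa-f]+)\s*$"
)


def parse_dimms(text):
    """Turn the quoted table into a list of dicts; non-matching rows are skipped."""
    dimms = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue  # e.g. the "hw.mem.dimm.s" header row
        index, locator, serial, part, bank, size_mb, lo, hi = m.groups()
        dimms.append({
            "index": int(index),
            "locator": locator,
            "serial": serial,
            "part": part,
            "bank": bank,
            "size_mb": int(size_mb),
            "addr0": int(lo, 16),
            "addrN": int(hi, 16),
        })
    return dimms


def find_dimm(dimms, phys_addr):
    """Return the DIMM whose [addr0, addrN] range contains phys_addr, or None."""
    for d in dimms:
        if d["addr0"] <= phys_addr <= d["addrN"]:
            return d
    return None


dimms = parse_dimms(SYSCTL_TEXT)
hit = find_dimm(dimms, 0x00400000000)
print(hit["locator"] if hit else "no DIMM covers that address")
```

A linear scan is enough for a handful of DIMMs. Whether the slot label it returns is trustworthy still depends on the SMBIOS data behind the table, which is the concern raised in the surrounding discussion.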
>
> >sysutils/mcelog does some more verbose decoding of MCA records, but I find it to be equally gibberish for anyone not intimately familiar with a specific CPU.
> >
> >I wrote a tool for a previous employer that was able to do some simple parsing of MCA errors for Supermicro X7-X10 boards (Intel CPUs) and give a short summary that was used in a Nagios check. However, it only handles a narrow set of systems.
> >
> >https://github.com/freebsd/freebsd/compare/master...bsdjhb:ecc
>
> Oooo, that looks nice! Is this something that can be committed to the main tree? If nothing else, I'll need to make a note of the way you're getting the MCA records into userland.

I think it might be a starting place for something that could go into the tree, sure. Perhaps we could augment the DIMM lookup code to parse the SMBIOS table instead of the Supermicro-specific formatting it has now?

-- 
John Baldwin
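
[Editorial sketch, not part of the original messages.] The MCA record quoted in the ECC thread pairs a raw status word (0xb20000000004008f) with a decoded summary ("UNCOR PCC GEN channel ?? memory error"). The Python below illustrates how such a summary might be derived. The flag bit positions and the memory-controller error-code layout follow the architectural IA32_MCi_STATUS format as described in the Intel SDM; the function names are hypothetical, the mnemonics are chosen to match the quoted log line, and this is not the FreeBSD kernel's code.

```python
# Architectural flag bits in IA32_MCi_STATUS (bits 63..57).
STATUS_FLAGS = [
    ("VAL", 1 << 63),    # register contents are valid
    ("OVER", 1 << 62),   # an earlier error was overwritten
    ("UC", 1 << 61),     # uncorrected error
    ("EN", 1 << 60),     # error reporting was enabled
    ("MISCV", 1 << 59),  # MISC register holds extra information
    ("ADDRV", 1 << 58),  # ADDR register holds an address
    ("PCC", 1 << 57),    # processor context corrupt
]

# Memory transaction types (the MMM field); mnemonics mirror the quoted log.
MEM_OPS = {0: "GEN", 1: "RD", 2: "WR", 3: "AC", 4: "MS"}


def decode_status(status):
    """Split a 64-bit status word into flag names and the two error-code fields."""
    return {
        "flags": [name for name, bit in STATUS_FLAGS if status & bit],
        "mca_code": status & 0xFFFF,            # bits 15:0
        "model_code": (status >> 16) & 0xFFFF,  # bits 31:16, model specific
    }


def describe_memory_error(mca_code):
    """Describe a memory-controller error code (pattern 000F 0000 1MMM CCCC).

    Returns None when the code is some other class of error.
    """
    if (mca_code & 0xEF80) != 0x0080:
        return None
    op = MEM_OPS.get((mca_code >> 4) & 0x7, "???")
    channel = mca_code & 0xF
    # Channel value 0xF means "channel not specified".
    channel_text = "??" if channel == 0xF else str(channel)
    return f"{op} channel {channel_text} memory error"


decoded = decode_status(0xB20000000004008F)
print(decoded["flags"], describe_memory_error(decoded["mca_code"]))
```

For the quoted status word, the UC and PCC bits are set and the low 16 bits are 0x008f, which lines up with the "UNCOR PCC GEN channel ??" text in the log. Mapping an error to a specific DIMM needs model-specific fields that this sketch does not attempt to interpret.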