From: Alan Cox
To: John Hay
Cc: freebsd-stable@freebsd.org
Date: Wed, 11 May 2011 17:26:50 -0500
Subject: Re: MCA: CPU 0 UNCOR PCC DTLB L1 error
In-Reply-To: <20110510125220.GA88338@zibbi.meraka.csir.co.za>
References: <20110510125220.GA88338@zibbi.meraka.csir.co.za>
Reply-To: alc@freebsd.org
List-Id: Production branch of FreeBSD source code

On Tue, May 10, 2011 at 7:52 AM, John Hay wrote:

> Hi,
>
> I have seen this panic a few times on a Gigabyte E350N-USB3 running
> 8-STABLE. I have only seen it while in X, but then the machine is
> always in X. At first, I just got these hangs, so bought a PCI-express
> RS232 card and could see these at last. For some reason it does not go
> past this, so I have not been able to get a dump yet.
>
> Have anybody an idea of why this is or how to debug it further? I
> searched the archives and found something similar about a year ago,
> but it looks like it was solved with a fix that got committed.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=140338
>
> I have now disabled mca in loader.conf with 'hw.mca.enabled="0"' and I
> have not seen that panic again. I do occasionally see a panic in
> devfs_open(), but I guess that should be handled in another thread.
>
> The kernel is basically a GENERIC kernel with puc uncommented and the
> following in loader.conf
>
> vm.kmem_size="12G"
> hw.mca.enabled="0"
> zfs_load="YES"
> ahci_load="YES"
> xhci_load="YES"
> amdtemp_load="YES"
> ng_ubt_load="YES"
> uplcom_load="YES"
>
> Here is the panic message and after that dmesg.
>
> John
> --
> John Hay -- jhay@meraka.csir.co.za / jhay@FreeBSD.org
>
> ####################################################
> MCA: Bank 0, Status 0xb600000000010015
> MCA: Global Cap 0x0000000000000106, Status 0x0000000000000004
> MCA: Vendor "AuthenticAMD", ID 0x500f10, APIC ID 0
> MCA: CPU 0 UNCOR PCC DTLB L1 error
> MCA: Address 0x8016c4000
>
>
> Fatal trap 28: machine check trap while in user mode
> cpuid = 0; apic id = 00
> instruction pointer     = 0x43:0x80156af85
> stack pointer           = 0x3b:0x7fffffffcb18
> frame pointer           = 0x3b:0x80fe87800
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 3, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, IOPL = 0
> current process         = 2484 (initial thread)
> trap number             = 28
> panic: machine check trap
> cpuid = 0
> KDB: stack backtrace:
> #0 0xffffffff80608d5e at kdb_backtrace+0x5e
> #1 0xffffffff805d6707 at panic+0x187
> #2 0xffffffff808bf4c0 at trap_fatal+0x290
> #3 0xffffffff808bfaa9 at trap+0x109
> #4 0xffffffff808a7d94 at calltrap+0x8
> ####################################################
>

Please try the following patch:

Index: x86/x86/mca.c
===================================================================
--- x86/x86/mca.c	(revision 219060)
+++ x86/x86/mca.c	(working copy)
@@ -665,7 +665,8 @@ mca_setup(uint64_t mcg_cap)
 	 * for Erratum 383.
 	 */
 	if (cpu_vendor_id == CPU_VENDOR_AMD &&
-	    CPUID_TO_FAMILY(cpu_id) == 0x10 && amd10h_L1TP)
+	    (CPUID_TO_FAMILY(cpu_id) == 0x10 ||
+	    CPUID_TO_FAMILY(cpu_id) == 0x14) && amd10h_L1TP)
 		workaround_erratum383 = 1;
 
 	mtx_init(&mca_lock, "mca", NULL, MTX_SPIN);
Index: i386/i386/pmap.c
===================================================================
--- i386/i386/pmap.c	(revision 219060)
+++ i386/i386/pmap.c	(working copy)
@@ -758,7 +758,8 @@ pmap_init(void)
 	 * machine monitor.
 	 */
 	if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
-	    CPUID_TO_FAMILY(cpu_id) == 0x10)
+	    (CPUID_TO_FAMILY(cpu_id) == 0x10 ||
+	    CPUID_TO_FAMILY(cpu_id) == 0x14))
 		workaround_erratum383 = 1;
 
 	/*
Index: amd64/amd64/pmap.c
===================================================================
--- amd64/amd64/pmap.c	(revision 219060)
+++ amd64/amd64/pmap.c	(working copy)
@@ -727,7 +727,8 @@ pmap_init(void)
 	 * machine monitor.
 	 */
 	if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
-	    CPUID_TO_FAMILY(cpu_id) == 0x10)
+	    (CPUID_TO_FAMILY(cpu_id) == 0x10 ||
+	    CPUID_TO_FAMILY(cpu_id) == 0x14))
 		workaround_erratum383 = 1;
 
 	/*
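
For anyone wondering why the existing family 0x10 test does not match this CPU, here is a minimal standalone sketch (not kernel code) that decodes the two values from the panic output: the ID 0x500f10 and the bank status 0xb600000000010015. It assumes the standard x86 CPUID signature layout and the architectural machine-check status format (flag bits in the top byte, a TLB error code of the form 0000 0000 0001 TTLL in the low 16 bits). The helper names sig_family(), is_dtlb_l1() and summarize() are made up for illustration and do not exist in the tree.

```c
#include <stdint.h>
#include <stdio.h>

/* CPUID leaf 1 EAX fields (standard x86 layout). */
#define SIG_FAMILY	0x00000f00u	/* base family, bits 11:8 */
#define SIG_EXT_FAMILY	0x0ff00000u	/* extended family, bits 27:20 */

/* MCi_STATUS flag bits and the TLB error-code pattern 0000 0000 0001 TTLL. */
#define ST_VAL		(1ULL << 63)	/* register contents valid */
#define ST_UC		(1ULL << 61)	/* uncorrected error */
#define ST_PCC		(1ULL << 57)	/* processor context corrupt */
#define CODE_TLB_MASK	0xfff0u
#define CODE_TLB	0x0010u

/* Hypothetical helper: display family of a CPUID signature. */
static unsigned
sig_family(uint32_t sig)
{
	unsigned family = (sig & SIG_FAMILY) >> 8;

	/* The extended family is added only when the base family is 0xf. */
	if (family == 0xf)
		family += (sig & SIG_EXT_FAMILY) >> 20;
	return (family);
}

/* Hypothetical helper: nonzero if the status reports a data TLB L1 error. */
static int
is_dtlb_l1(uint64_t status)
{
	unsigned code = (unsigned)(status & 0xffff);

	if ((status & ST_VAL) == 0 || (code & CODE_TLB_MASK) != CODE_TLB)
		return (0);
	/* TT (bits 3:2) == 01 means data, LL (bits 1:0) == 01 means L1. */
	return (((code >> 2) & 3) == 1 && (code & 3) == 1);
}

/* Hypothetical helper: print a one-line summary of the two values. */
static void
summarize(uint32_t sig, uint64_t status)
{
	printf("family 0x%x%s%s%s\n", sig_family(sig),
	    (status & ST_UC) ? " UNCOR" : "",
	    (status & ST_PCC) ? " PCC" : "",
	    is_dtlb_l1(status) ? " DTLB L1" : "");
}
```

Calling summarize(0x500f10, 0xb600000000010015ULL) from a small main() prints "family 0x14 UNCOR PCC DTLB L1". That is consistent with the "UNCOR PCC DTLB L1 error" line in the report, and the family value 0x14 is the one the patch adds to the three checks.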