From owner-svn-src-all@freebsd.org Mon Oct 24 16:40:29 2016 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 93C33C1F1CE; Mon, 24 Oct 2016 16:40:29 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5FCCB83C; Mon, 24 Oct 2016 16:40:29 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id u9OGeSkn009501; Mon, 24 Oct 2016 16:40:28 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id u9OGeRFB009494; Mon, 24 Oct 2016 16:40:27 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201610241640.u9OGeRFB009494@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Mon, 24 Oct 2016 16:40:27 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r307866 - in head/sys: amd64/amd64 i386/i386 kern x86/include x86/x86 X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Oct 2016 16:40:29 -0000 Author: kib Date: Mon Oct 24 16:40:27 2016 New Revision: 307866 URL: https://svnweb.freebsd.org/changeset/base/307866 Log: Handle broadcast NMIs. On several Intel chipsets, diagnostic NMIs sent from BMC or NMIs reporting hardware errors are broadcasted to all CPUs. When kernel is configured to enter kdb on NMI, the outcome is problematic, because each CPU tries to enter kdb. All CPUs are executing NMI handlers, which set the latches disabling the nested NMI delivery; this means that stop_cpus_hard(), used by kdb_enter() to stop other cpus by broadcasting IPI_STOP_HARD NMI, cannot work. One indication of this is the harmless but annoying diagnostic "timeout stopping cpus". Much more harming behaviour is that because all CPUs try to enter kdb, and if ddb is used as debugger, all CPUs issue prompt on console and race for the input, not to mention the simultaneous use of the ddb shared state. Try to fix this by introducing a pseudo-lock for simultaneous attempts to handle NMIs. If one core happens to enter NMI trap handler, other cores see it and simulate reception of the IPI_STOP_HARD. More, generic_stop_cpus() avoids sending IPI_STOP_HARD and avoids waiting for the acknowledgement, relying on the nmi handler on other cores suspending and then restarting the CPU. Since it is impossible to detect at runtime whether some stray NMI is broadcast or unicast, add a knob for administrator (really developer) to configure debugging NMI handling mode. The updated patch was debugged with the help from Andrey Gapon (avg) and discussed with him. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D8249 Modified: head/sys/amd64/amd64/trap.c head/sys/i386/i386/trap.c head/sys/kern/subr_smp.c head/sys/x86/include/x86_smp.h head/sys/x86/include/x86_var.h head/sys/x86/x86/cpu_machdep.c head/sys/x86/x86/mp_x86.c Modified: head/sys/amd64/amd64/trap.c ============================================================================== --- head/sys/amd64/amd64/trap.c Mon Oct 24 16:28:54 2016 (r307865) +++ head/sys/amd64/amd64/trap.c Mon Oct 24 16:40:27 2016 (r307866) @@ -144,11 +144,6 @@ static char *trap_msg[] = { "DTrace pid return trap", /* 32 T_DTRACE_RET */ }; -#ifdef KDB -static int kdb_on_nmi = 1; -SYSCTL_INT(_machdep, OID_AUTO, kdb_on_nmi, CTLFLAG_RWTUN, - &kdb_on_nmi, 0, "Go to KDB on NMI"); -#endif static int panic_on_nmi = 1; SYSCTL_INT(_machdep, OID_AUTO, panic_on_nmi, CTLFLAG_RWTUN, &panic_on_nmi, 0, "Panic on NMI"); @@ -377,21 +372,7 @@ trap(struct trapframe *frame) #ifdef DEV_ISA case T_NMI: - /* machine/parity/power fail/"kitchen sink" faults */ - if (isa_nmi(frame->tf_err) == 0) { -#ifdef KDB - /* - * NMI can be hooked up to a pushbutton - * for debugging. - */ - if (kdb_on_nmi) { - printf ("NMI ... going to debugger\n"); - kdb_trap(type, 0, frame); - } -#endif /* KDB */ - goto userout; - } else if (panic_on_nmi) - panic("NMI indicates hardware failure"); + nmi_handle_intr(type, frame, true); break; #endif /* DEV_ISA */ @@ -563,20 +544,8 @@ trap(struct trapframe *frame) #ifdef DEV_ISA case T_NMI: - /* machine/parity/power fail/"kitchen sink" faults */ - if (isa_nmi(frame->tf_err) == 0) { -#ifdef KDB - /* - * NMI can be hooked up to a pushbutton - * for debugging. - */ - if (kdb_on_nmi) { - printf ("NMI ... going to debugger\n"); - kdb_trap(type, 0, frame); - } -#endif /* KDB */ - goto out; - } else if (panic_on_nmi == 0) + if (nmi_handle_intr(type, frame, false) || + !panic_on_nmi) goto out; /* FALLTHROUGH */ #endif /* DEV_ISA */ Modified: head/sys/i386/i386/trap.c ============================================================================== --- head/sys/i386/i386/trap.c Mon Oct 24 16:28:54 2016 (r307865) +++ head/sys/i386/i386/trap.c Mon Oct 24 16:40:27 2016 (r307866) @@ -467,21 +467,7 @@ user_trctrap_out: } goto userout; #else /* !POWERFAIL_NMI */ - /* machine/parity/power fail/"kitchen sink" faults */ - if (isa_nmi(frame->tf_err) == 0) { -#ifdef KDB - /* - * NMI can be hooked up to a pushbutton - * for debugging. - */ - if (kdb_on_nmi) { - printf ("NMI ... going to debugger\n"); - kdb_trap(type, 0, frame); - } -#endif /* KDB */ - goto userout; - } else if (panic_on_nmi) - panic("NMI indicates hardware failure"); + nmi_handle_intr(type, frame, true); break; #endif /* POWERFAIL_NMI */ #endif /* DEV_ISA */ @@ -730,20 +716,8 @@ kernel_trctrap: } goto out; #else /* !POWERFAIL_NMI */ - /* machine/parity/power fail/"kitchen sink" faults */ - if (isa_nmi(frame->tf_err) == 0) { -#ifdef KDB - /* - * NMI can be hooked up to a pushbutton - * for debugging. - */ - if (kdb_on_nmi) { - printf ("NMI ... going to debugger\n"); - kdb_trap(type, 0, frame); - } -#endif /* KDB */ - goto out; - } else if (panic_on_nmi == 0) + if (nmi_handle_intr(type, frame, false) || + !panic_on_nmi) goto out; /* FALLTHROUGH */ #endif /* POWERFAIL_NMI */ Modified: head/sys/kern/subr_smp.c ============================================================================== --- head/sys/kern/subr_smp.c Mon Oct 24 16:28:54 2016 (r307865) +++ head/sys/kern/subr_smp.c Mon Oct 24 16:40:27 2016 (r307866) @@ -209,6 +209,11 @@ forward_signal(struct thread *td) * 1: ok * */ +#if defined(__amd64__) || defined(__i386__) +#define X86 1 +#else +#define X86 0 +#endif static int generic_stop_cpus(cpuset_t map, u_int type) { @@ -220,12 +225,11 @@ generic_stop_cpus(cpuset_t map, u_int ty volatile cpuset_t *cpus; KASSERT( -#if defined(__amd64__) || defined(__i386__) - type == IPI_STOP || type == IPI_STOP_HARD || type == IPI_SUSPEND, -#else - type == IPI_STOP || type == IPI_STOP_HARD, + type == IPI_STOP || type == IPI_STOP_HARD +#if X86 + || type == IPI_SUSPEND #endif - ("%s: invalid stop type", __func__)); + , ("%s: invalid stop type", __func__)); if (!smp_started) return (0); @@ -233,7 +237,7 @@ generic_stop_cpus(cpuset_t map, u_int ty CTR2(KTR_SMP, "stop_cpus(%s) with %u type", cpusetobj_strprint(cpusetbuf, &map), type); -#if defined(__amd64__) || defined(__i386__) +#if X86 /* * When suspending, ensure there are are no IPIs in progress. * IPIs that have been issued, but not yet delivered (e.g. @@ -245,6 +249,9 @@ generic_stop_cpus(cpuset_t map, u_int ty mtx_lock_spin(&smp_ipi_mtx); #endif +#if X86 + if (!nmi_is_broadcast || nmi_kdb_lock == 0) { +#endif if (stopping_cpu != PCPU_GET(cpuid)) while (atomic_cmpset_int(&stopping_cpu, NOCPU, PCPU_GET(cpuid)) == 0) @@ -253,8 +260,11 @@ generic_stop_cpus(cpuset_t map, u_int ty /* send the stop IPI to all CPUs in map */ ipi_selected(map, type); +#if X86 + } +#endif -#if defined(__amd64__) || defined(__i386__) +#if X86 if (type == IPI_SUSPEND) cpus = &suspended_cpus; else @@ -272,7 +282,7 @@ generic_stop_cpus(cpuset_t map, u_int ty } } -#if defined(__amd64__) || defined(__i386__) +#if X86 if (type == IPI_SUSPEND) mtx_unlock_spin(&smp_ipi_mtx); #endif @@ -295,7 +305,7 @@ stop_cpus_hard(cpuset_t map) return (generic_stop_cpus(map, IPI_STOP_HARD)); } -#if defined(__amd64__) || defined(__i386__) +#if X86 int suspend_cpus(cpuset_t map) { @@ -325,20 +335,18 @@ generic_restart_cpus(cpuset_t map, u_int #endif volatile cpuset_t *cpus; - KASSERT( -#if defined(__amd64__) || defined(__i386__) - type == IPI_STOP || type == IPI_STOP_HARD || type == IPI_SUSPEND, -#else - type == IPI_STOP || type == IPI_STOP_HARD, + KASSERT(type == IPI_STOP || type == IPI_STOP_HARD +#if X86 + || type == IPI_SUSPEND #endif - ("%s: invalid stop type", __func__)); + , ("%s: invalid stop type", __func__)); if (!smp_started) - return 0; + return (0); CTR1(KTR_SMP, "restart_cpus(%s)", cpusetobj_strprint(cpusetbuf, &map)); -#if defined(__amd64__) || defined(__i386__) +#if X86 if (type == IPI_SUSPEND) cpus = &suspended_cpus; else @@ -348,11 +356,17 @@ generic_restart_cpus(cpuset_t map, u_int /* signal other cpus to restart */ CPU_COPY_STORE_REL(&map, &started_cpus); +#if X86 + if (!nmi_is_broadcast || nmi_kdb_lock == 0) { +#endif /* wait for each to clear its bit */ while (CPU_OVERLAP(cpus, &map)) cpu_spinwait(); +#if X86 + } +#endif - return 1; + return (1); } int @@ -362,7 +376,7 @@ restart_cpus(cpuset_t map) return (generic_restart_cpus(map, IPI_STOP)); } -#if defined(__amd64__) || defined(__i386__) +#if X86 int resume_cpus(cpuset_t map) { @@ -370,6 +384,7 @@ resume_cpus(cpuset_t map) return (generic_restart_cpus(map, IPI_SUSPEND)); } #endif +#undef X86 /* * All-CPU rendezvous. CPUs are signalled, all execute the setup function Modified: head/sys/x86/include/x86_smp.h ============================================================================== --- head/sys/x86/include/x86_smp.h Mon Oct 24 16:28:54 2016 (r307865) +++ head/sys/x86/include/x86_smp.h Mon Oct 24 16:40:27 2016 (r307866) @@ -45,6 +45,9 @@ extern u_int ipi_page; extern u_int ipi_range; extern u_int ipi_range_size; +extern int nmi_kdb_lock; +extern int nmi_is_broadcast; + struct cpu_info { int cpu_present:1; int cpu_bsp:1; Modified: head/sys/x86/include/x86_var.h ============================================================================== --- head/sys/x86/include/x86_var.h Mon Oct 24 16:28:54 2016 (r307865) +++ head/sys/x86/include/x86_var.h Mon Oct 24 16:40:27 2016 (r307866) @@ -85,6 +85,7 @@ struct reg; struct fpreg; struct dbreg; struct dumperinfo; +struct trapframe; /* * The interface type of the interrupt handler entry point cannot be @@ -107,6 +108,10 @@ bool fix_cpuid(void); void fillw(int /*u_short*/ pat, void *base, size_t cnt); int is_physical_memory(vm_paddr_t addr); int isa_nmi(int cd); +bool nmi_call_kdb(u_int cpu, u_int type, struct trapframe *frame, + bool panic); +bool nmi_call_kdb_smp(u_int type, struct trapframe *frame, bool panic); +int nmi_handle_intr(u_int type, struct trapframe *frame, bool panic); void pagecopy(void *from, void *to); void printcpuinfo(void); int user_dbreg_trap(void); Modified: head/sys/x86/x86/cpu_machdep.c ============================================================================== --- head/sys/x86/x86/cpu_machdep.c Mon Oct 24 16:28:54 2016 (r307865) +++ head/sys/x86/x86/cpu_machdep.c Mon Oct 24 16:40:27 2016 (r307866) @@ -47,6 +47,7 @@ __FBSDID("$FreeBSD$"); #include "opt_ddb.h" #include "opt_inet.h" #include "opt_isa.h" +#include "opt_kdb.h" #include "opt_kstack_pages.h" #include "opt_maxmem.h" #include "opt_mp_watchdog.h" @@ -522,3 +523,52 @@ idle_sysctl(SYSCTL_HANDLER_ARGS) SYSCTL_PROC(_machdep, OID_AUTO, idle, CTLTYPE_STRING | CTLFLAG_RW, 0, 0, idle_sysctl, "A", "currently selected idle function"); + +int nmi_is_broadcast = 1; +SYSCTL_INT(_machdep, OID_AUTO, nmi_is_broadcast, CTLFLAG_RWTUN, + &nmi_is_broadcast, 0, + "Chipset NMI is broadcast"); +#ifdef KDB +int kdb_on_nmi = 1; +SYSCTL_INT(_machdep, OID_AUTO, kdb_on_nmi, CTLFLAG_RWTUN, + &kdb_on_nmi, 0, + "Go to KDB on NMI"); +#endif + +#ifdef DEV_ISA +bool +nmi_call_kdb(u_int cpu, u_int type, struct trapframe *frame, bool do_panic) +{ + + /* machine/parity/power fail/"kitchen sink" faults */ + if (isa_nmi(frame->tf_err) == 0) { +#ifdef KDB + /* + * NMI can be hooked up to a pushbutton for debugging. + */ + if (kdb_on_nmi) { + printf ("NMI/cpu%d ... going to debugger\n", cpu); + kdb_trap(type, 0, frame); + return (true); + } + } else +#endif /* KDB */ + if (do_panic) + panic("NMI indicates hardware failure"); + return (false); +} +#endif + +int +nmi_handle_intr(u_int type, struct trapframe *frame, bool panic) +{ + +#ifdef DEV_ISA +#ifdef SMP + if (nmi_is_broadcast) + return (nmi_call_kdb_smp(type, frame, panic)); + else +#endif + return (nmi_call_kdb(0, type, frame, panic)); +#endif +} Modified: head/sys/x86/x86/mp_x86.c ============================================================================== --- head/sys/x86/x86/mp_x86.c Mon Oct 24 16:28:54 2016 (r307865) +++ head/sys/x86/x86/mp_x86.c Mon Oct 24 16:40:27 2016 (r307866) @@ -31,6 +31,7 @@ __FBSDID("$FreeBSD$"); #include "opt_apic.h" #endif #include "opt_cpu.h" +#include "opt_isa.h" #include "opt_kstack_pages.h" #include "opt_pmap.h" #include "opt_sched.h" @@ -140,6 +141,7 @@ int cpu_apic_ids[MAXCPU]; volatile u_int cpu_ipi_pending[MAXCPU]; static void release_aps(void *dummy); +static void cpustop_handler_post(u_int cpu); static int hyperthreading_allowed = 1; SYSCTL_INT(_machdep, OID_AUTO, hyperthreading_allowed, CTLFLAG_RDTUN, @@ -1211,6 +1213,34 @@ ipi_nmi_handler(void) return (0); } +#ifdef DEV_ISA +int nmi_kdb_lock; + +bool +nmi_call_kdb_smp(u_int type, struct trapframe *frame, bool do_panic) +{ + int cpu; + bool call_post, ret; + + cpu = PCPU_GET(cpuid); + if (atomic_cmpset_acq_int(&nmi_kdb_lock, 0, 1)) { + ret = nmi_call_kdb(cpu, type, frame, do_panic); + call_post = false; + } else { + ret = true; + savectx(&stoppcbs[cpu]); + CPU_SET_ATOMIC(cpu, &stopped_cpus); + while (!atomic_cmpset_acq_int(&nmi_kdb_lock, 0, 1)) + ia32_pause(); + call_post = true; + } + atomic_store_rel_int(&nmi_kdb_lock, 0); + if (call_post) + cpustop_handler_post(cpu); + return (ret); +} +#endif + /* * Handle an IPI_STOP by saving our current context and spinning until we * are resumed. @@ -1231,6 +1261,13 @@ cpustop_handler(void) while (!CPU_ISSET(cpu, &started_cpus)) ia32_pause(); + cpustop_handler_post(cpu); +} + +static void +cpustop_handler_post(u_int cpu) +{ + CPU_CLR_ATOMIC(cpu, &started_cpus); CPU_CLR_ATOMIC(cpu, &stopped_cpus);