From owner-svn-src-all@freebsd.org Mon Oct 24 11:47:29 2016
Message-Id: <201610241147.u9OBlRce097499@repo.freebsd.org>
From: Konstantin Belousov
Date: Mon, 24 Oct 2016 11:47:27 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org
Subject: svn commit: r307856 - in stable/11/sys: amd64/amd64 amd64/include i386/include x86/include x86/x86

Author: kib
Date: Mon Oct 24 11:47:27 2016
New Revision: 307856
URL: https://svnweb.freebsd.org/changeset/base/307856

Log:
  MFC r306680:
  Reduce the cost of TLB
  invalidation on x86 by using per-CPU completion flags.

Modified:
  stable/11/sys/amd64/amd64/mp_machdep.c
  stable/11/sys/amd64/include/pcpu.h
  stable/11/sys/i386/include/pcpu.h
  stable/11/sys/x86/include/x86_smp.h
  stable/11/sys/x86/x86/mp_x86.c
Directory Properties:
  stable/11/   (props changed)

Modified: stable/11/sys/amd64/amd64/mp_machdep.c
==============================================================================
--- stable/11/sys/amd64/amd64/mp_machdep.c	Mon Oct 24 11:33:42 2016	(r307855)
+++ stable/11/sys/amd64/amd64/mp_machdep.c	Mon Oct 24 11:47:27 2016	(r307856)
@@ -409,6 +409,7 @@ void
 invltlb_invpcid_handler(void)
 {
 	struct invpcid_descr d;
+	uint32_t generation;
 
 #ifdef COUNT_XINVLTLB_HITS
 	xhits_gbl[PCPU_GET(cpuid)]++;
@@ -417,17 +418,20 @@ invltlb_invpcid_handler(void)
 	(*ipi_invltlb_counts[PCPU_GET(cpuid)])++;
 #endif /* COUNT_IPIS */
 
+	generation = smp_tlb_generation;
 	d.pcid = smp_tlb_pmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid;
 	d.pad = 0;
 	d.addr = 0;
 	invpcid(&d, smp_tlb_pmap == kernel_pmap ?
 	    INVPCID_CTXGLOB : INVPCID_CTX);
-	atomic_add_int(&smp_tlb_wait, 1);
+	PCPU_SET(smp_tlb_done, generation);
 }
 
 void
 invltlb_pcid_handler(void)
 {
+	uint32_t generation;
+
 #ifdef COUNT_XINVLTLB_HITS
 	xhits_gbl[PCPU_GET(cpuid)]++;
 #endif /* COUNT_XINVLTLB_HITS */
@@ -435,6 +439,7 @@ invltlb_pcid_handler(void)
 	(*ipi_invltlb_counts[PCPU_GET(cpuid)])++;
 #endif /* COUNT_IPIS */
 
+	generation = smp_tlb_generation;	/* Overlap with serialization */
 	if (smp_tlb_pmap == kernel_pmap) {
 		invltlb_glob();
 	} else {
@@ -450,5 +455,5 @@ invltlb_pcid_handler(void)
 			    smp_tlb_pmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid);
 		}
 	}
-	atomic_add_int(&smp_tlb_wait, 1);
+	PCPU_SET(smp_tlb_done, generation);
 }

Modified: stable/11/sys/amd64/include/pcpu.h
==============================================================================
--- stable/11/sys/amd64/include/pcpu.h	Mon Oct 24 11:33:42 2016	(r307855)
+++ stable/11/sys/amd64/include/pcpu.h	Mon Oct 24 11:47:27 2016	(r307856)
@@ -65,7 +65,8 @@
 	u_int	pc_vcpu_id;		/* Xen vCPU ID */		\
 	uint32_t pc_pcid_next;						\
 	uint32_t pc_pcid_gen;						\
-	char	__pad[149]		/* be divisor of PAGE_SIZE	\
+	uint32_t pc_smp_tlb_done;	/* TLB op acknowledgement */	\
+	char	__pad[145]		/* be divisor of PAGE_SIZE	\
 					   after cache alignment */
 
 #define	PC_DBREG_CMD_NONE	0

Modified: stable/11/sys/i386/include/pcpu.h
==============================================================================
--- stable/11/sys/i386/include/pcpu.h	Mon Oct 24 11:33:42 2016	(r307855)
+++ stable/11/sys/i386/include/pcpu.h	Mon Oct 24 11:47:27 2016	(r307856)
@@ -59,7 +59,8 @@
 	u_int	pc_cmci_mask;		/* MCx banks for CMCI */	\
 	u_int	pc_vcpu_id;		/* Xen vCPU ID */		\
 	vm_offset_t pc_qmap_addr;	/* KVA for temporary mappings */\
-	char	__pad[229]
+	uint32_t pc_smp_tlb_done;	/* TLB op acknowledgement */	\
+	char	__pad[225]
 
 #ifdef _KERNEL
 

Modified: stable/11/sys/x86/include/x86_smp.h
==============================================================================
--- stable/11/sys/x86/include/x86_smp.h	Mon Oct 24 11:33:42 2016	(r307855)
+++ stable/11/sys/x86/include/x86_smp.h	Mon Oct 24 11:47:27 2016	(r307856)
@@ -35,7 +35,7 @@ extern volatile int aps_ready;
 extern struct mtx ap_boot_mtx;
 extern int cpu_logical;
 extern int cpu_cores;
-extern volatile int smp_tlb_wait;
+extern volatile uint32_t smp_tlb_generation;
 extern struct pmap *smp_tlb_pmap;
 extern u_int xhits_gbl[];
 extern u_int xhits_pg[];

Modified: stable/11/sys/x86/x86/mp_x86.c
==============================================================================
--- stable/11/sys/x86/x86/mp_x86.c	Mon Oct 24 11:33:42 2016	(r307855)
+++ stable/11/sys/x86/x86/mp_x86.c	Mon Oct 24 11:47:27 2016	(r307856)
@@ -1308,12 +1308,22 @@ cpususpend_handler(void)
 void
 invlcache_handler(void)
 {
+	uint32_t generation;
+
 #ifdef COUNT_IPIS
 	(*ipi_invlcache_counts[PCPU_GET(cpuid)])++;
 #endif /* COUNT_IPIS */
 
+	/*
+	 * Reading the generation here allows greater parallelism
+	 * since wbinvd is a serializing instruction.  Without the
+	 * temporary, we'd wait for wbinvd to complete, then the read
+	 * would execute, then the dependent write, which must then
+	 * complete before return from interrupt.
+	 */
+	generation = smp_tlb_generation;
 	wbinvd();
-	atomic_add_int(&smp_tlb_wait, 1);
+	PCPU_SET(smp_tlb_done, generation);
 }
 
 /*
@@ -1371,7 +1381,7 @@ SYSINIT(mp_ipi_intrcnt, SI_SUB_INTR, SI_
 /* Variables needed for SMP tlb shootdown. */
 static vm_offset_t smp_tlb_addr1, smp_tlb_addr2;
 pmap_t smp_tlb_pmap;
-volatile int smp_tlb_wait;
+volatile uint32_t smp_tlb_generation;
 
 #ifdef __amd64__
 #define	read_eflags() read_rflags()
@@ -1381,15 +1391,16 @@ static void
 smp_targeted_tlb_shootdown(cpuset_t mask, u_int vector, pmap_t pmap,
     vm_offset_t addr1, vm_offset_t addr2)
 {
-	int cpu, ncpu, othercpus;
-
-	othercpus = mp_ncpus - 1;	/* does not shootdown self */
+	cpuset_t other_cpus;
+	volatile uint32_t *p_cpudone;
+	uint32_t generation;
+	int cpu;
 
 	/*
 	 * Check for other cpus.  Return if none.
 	 */
 	if (CPU_ISFULLSET(&mask)) {
-		if (othercpus < 1)
+		if (mp_ncpus <= 1)
 			return;
 	} else {
 		CPU_CLR(PCPU_GET(cpuid), &mask);
@@ -1403,23 +1414,28 @@ smp_targeted_tlb_shootdown(cpuset_t mask
 	smp_tlb_addr1 = addr1;
 	smp_tlb_addr2 = addr2;
 	smp_tlb_pmap = pmap;
-	smp_tlb_wait = 0;
+	generation = ++smp_tlb_generation;
 	if (CPU_ISFULLSET(&mask)) {
-		ncpu = othercpus;
 		ipi_all_but_self(vector);
+		other_cpus = all_cpus;
+		CPU_CLR(PCPU_GET(cpuid), &other_cpus);
 	} else {
-		ncpu = 0;
+		other_cpus = mask;
 		while ((cpu = CPU_FFS(&mask)) != 0) {
 			cpu--;
 			CPU_CLR(cpu, &mask);
 			CTR3(KTR_SMP, "%s: cpu: %d ipi: %x", __func__,
 			    cpu, vector);
 			ipi_send_cpu(cpu, vector);
-			ncpu++;
 		}
 	}
-	while (smp_tlb_wait < ncpu)
-		ia32_pause();
+	while ((cpu = CPU_FFS(&other_cpus)) != 0) {
+		cpu--;
+		CPU_CLR(cpu, &other_cpus);
+		p_cpudone = &cpuid_to_pcpu[cpu]->pc_smp_tlb_done;
+		while (*p_cpudone != generation)
+			ia32_pause();
+	}
 	mtx_unlock_spin(&smp_ipi_mtx);
 }
 
@@ -1477,6 +1493,8 @@ smp_cache_flush(void)
 void
 invltlb_handler(void)
 {
+	uint32_t generation;
+
 #ifdef COUNT_XINVLTLB_HITS
 	xhits_gbl[PCPU_GET(cpuid)]++;
 #endif /* COUNT_XINVLTLB_HITS */
@@ -1484,16 +1502,23 @@ invltlb_handler(void)
 	(*ipi_invltlb_counts[PCPU_GET(cpuid)])++;
 #endif /* COUNT_IPIS */
 
+	/*
+	 * Reading the generation here allows greater parallelism
+	 * since invalidating the TLB is a serializing operation.
+	 */
+	generation = smp_tlb_generation;
 	if (smp_tlb_pmap == kernel_pmap)
 		invltlb_glob();
 	else
 		invltlb();
-	atomic_add_int(&smp_tlb_wait, 1);
+	PCPU_SET(smp_tlb_done, generation);
 }
 
 void
 invlpg_handler(void)
 {
+	uint32_t generation;
+
 #ifdef COUNT_XINVLTLB_HITS
 	xhits_pg[PCPU_GET(cpuid)]++;
 #endif /* COUNT_XINVLTLB_HITS */
@@ -1501,14 +1526,16 @@ invlpg_handler(void)
 	(*ipi_invlpg_counts[PCPU_GET(cpuid)])++;
 #endif /* COUNT_IPIS */
 
+	generation = smp_tlb_generation;	/* Overlap with serialization */
 	invlpg(smp_tlb_addr1);
-	atomic_add_int(&smp_tlb_wait, 1);
+	PCPU_SET(smp_tlb_done, generation);
 }
 
 void
 invlrng_handler(void)
 {
-	vm_offset_t addr;
+	vm_offset_t addr, addr2;
+	uint32_t generation;
 
 #ifdef COUNT_XINVLTLB_HITS
 	xhits_rng[PCPU_GET(cpuid)]++;
@@ -1518,10 +1545,12 @@ invlrng_handler(void)
 #endif /* COUNT_IPIS */
 
 	addr = smp_tlb_addr1;
+	addr2 = smp_tlb_addr2;
+	generation = smp_tlb_generation;	/* Overlap with serialization */
 	do {
 		invlpg(addr);
 		addr += PAGE_SIZE;
-	} while (addr < smp_tlb_addr2);
+	} while (addr < addr2);
 
-	atomic_add_int(&smp_tlb_wait, 1);
+	PCPU_SET(smp_tlb_done, generation);
 }
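
The acknowledgement scheme introduced by this change can be illustrated in isolation. The following is a minimal single-threaded sketch in C and is not code from the commit. The names NCPU, sim_generation, sim_done, sim_handler and sim_shootdown are hypothetical stand-ins for mp_ncpus, smp_tlb_generation, the per-CPU pc_smp_tlb_done field, the IPI handlers and smp_targeted_tlb_shootdown(). IPI delivery is modeled as a direct function call and the CPU set as a plain bitmask, so the initiator's spin loop reduces to a single check per target.

```c
#include <stdbool.h>
#include <stdint.h>

#define NCPU 4	/* hypothetical CPU count for the simulation */

/* Stand-ins for smp_tlb_generation and the per-CPU pc_smp_tlb_done fields. */
static uint32_t sim_generation;
static uint32_t sim_done[NCPU];

/* Simulated IPI handler: snapshot the generation, do the work, acknowledge. */
static void
sim_handler(int cpu)
{
	uint32_t generation;

	generation = sim_generation;	/* read before the (simulated) flush */
	/* ... the TLB or cache invalidation would happen here ... */
	sim_done[cpu] = generation;	/* plain store to this CPU's own slot */
}

/*
 * Simulated initiator: start a new generation, "send" the IPI to every CPU
 * in the bitmask except self, then check each target's slot.  Returns true
 * if every target acknowledged the current generation.
 */
static bool
sim_shootdown(int self, unsigned mask)
{
	uint32_t generation;
	int cpu;

	mask &= ~(1u << self);		/* never target the initiating CPU */
	generation = ++sim_generation;
	for (cpu = 0; cpu < NCPU; cpu++)
		if (mask & (1u << cpu))
			sim_handler(cpu);	/* delivered synchronously here */
	for (cpu = 0; cpu < NCPU; cpu++)
		if ((mask & (1u << cpu)) && sim_done[cpu] != generation)
			return (false);	/* the kernel would spin with ia32_pause() */
	return (true);
}
```

Calling sim_shootdown(0, 0xf) and then sim_shootdown(1, 0x5) leaves each targeted slot equal to the generation of the last request that included it. Because each CPU writes only its own slot, the handlers need no atomic read-modify-write on a shared counter, which is the cost the log message refers to.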