From: Alan Cox
Date: Sat, 7 Jun 2014 17:12:27 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r267213 - in head/sys: amd64/amd64 arm/arm i386/i386 vm
Message-Id: <201406071712.s57HCR70021295@svn.freebsd.org>

Author: alc
Date: Sat Jun 7 17:12:26 2014
New Revision: 267213
URL: http://svnweb.freebsd.org/changeset/base/267213

Log:
  Add a page size field to struct vm_page.  Increase the page size field
  when a partially populated reservation becomes fully populated, and
  decrease this field when a fully populated reservation becomes partially
  populated.

  Use this field to simplify the implementation of pmap_enter_object() on
  amd64, arm, and i386.

  On all architectures where we support superpages, the cost of creating a
  superpage mapping is roughly the same as creating a base page mapping.
  For example, both kinds of mappings entail the creation of a single PTE
  and PV entry.  With this in mind, use the page size field to make the
  implementation of vm_map_pmap_enter(..., MAP_PREFAULT_PARTIAL) a little
  smarter.  Previously, if MAP_PREFAULT_PARTIAL was specified to
  vm_map_pmap_enter(), that function would only map base pages.  Now, it
  will create up to 96 base page or superpage mappings.
  Reviewed by:	kib
  Sponsored by:	EMC / Isilon Storage Division

Modified:
  head/sys/amd64/amd64/pmap.c
  head/sys/arm/arm/pmap-v6.c
  head/sys/i386/i386/pmap.c
  head/sys/vm/vm_map.c
  head/sys/vm/vm_page.c
  head/sys/vm/vm_page.h
  head/sys/vm/vm_reserv.c

Modified: head/sys/amd64/amd64/pmap.c
==============================================================================
--- head/sys/amd64/amd64/pmap.c	Sat Jun 7 15:51:29 2014	(r267212)
+++ head/sys/amd64/amd64/pmap.c	Sat Jun 7 17:12:26 2014	(r267213)
@@ -4428,9 +4428,7 @@ pmap_enter_object(pmap_t pmap, vm_offset
 	while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) {
 		va = start + ptoa(diff);
 		if ((va & PDRMASK) == 0 && va + NBPDR <= end &&
-		    (VM_PAGE_TO_PHYS(m) & PDRMASK) == 0 &&
-		    pmap_ps_enabled(pmap) &&
-		    vm_reserv_level_iffullpop(m) == 0 &&
+		    m->psind == 1 && pmap_ps_enabled(pmap) &&
 		    pmap_enter_pde(pmap, va, m, prot, &lock))
 			m = &m[NBPDR / PAGE_SIZE - 1];
 		else

Modified: head/sys/arm/arm/pmap-v6.c
==============================================================================
--- head/sys/arm/arm/pmap-v6.c	Sat Jun 7 15:51:29 2014	(r267212)
+++ head/sys/arm/arm/pmap-v6.c	Sat Jun 7 17:12:26 2014	(r267213)
@@ -3228,8 +3228,7 @@ pmap_enter_object(pmap_t pmap, vm_offset
 	while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) {
 		va = start + ptoa(diff);
 		if ((va & L1_S_OFFSET) == 0 && L2_NEXT_BUCKET(va) <= end &&
-		    (VM_PAGE_TO_PHYS(m) & L1_S_OFFSET) == 0 &&
-		    sp_enabled && vm_reserv_level_iffullpop(m) == 0 &&
+		    m->psind == 1 && sp_enabled &&
 		    pmap_enter_section(pmap, va, m, prot))
 			m = &m[L1_S_SIZE / PAGE_SIZE - 1];
 		else

Modified: head/sys/i386/i386/pmap.c
==============================================================================
--- head/sys/i386/i386/pmap.c	Sat Jun 7 15:51:29 2014	(r267212)
+++ head/sys/i386/i386/pmap.c	Sat Jun 7 17:12:26 2014	(r267213)
@@ -3733,8 +3733,7 @@ pmap_enter_object(pmap_t pmap, vm_offset
 	while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) {
 		va = start + ptoa(diff);
 		if ((va & PDRMASK) == 0 && va + NBPDR <= end &&
-		    (VM_PAGE_TO_PHYS(m) & PDRMASK) == 0 &&
-		    pg_ps_enabled && vm_reserv_level_iffullpop(m) == 0 &&
+		    m->psind == 1 && pg_ps_enabled &&
 		    pmap_enter_pde(pmap, va, m, prot))
 			m = &m[NBPDR / PAGE_SIZE - 1];
 		else

Modified: head/sys/vm/vm_map.c
==============================================================================
--- head/sys/vm/vm_map.c	Sat Jun 7 15:51:29 2014	(r267212)
+++ head/sys/vm/vm_map.c	Sat Jun 7 17:12:26 2014	(r267213)
@@ -1773,20 +1773,22 @@ vm_map_submap(
 }
 
 /*
- * The maximum number of pages to map
+ * The maximum number of pages to map if MAP_PREFAULT_PARTIAL is specified
  */
 #define	MAX_INIT_PT	96
 
 /*
  *	vm_map_pmap_enter:
  *
- *	Preload read-only mappings for the specified object's resident pages
- *	into the target map.  If "flags" is MAP_PREFAULT_PARTIAL, then only
- *	the resident pages within the address range [addr, addr + ulmin(size,
- *	ptoa(MAX_INIT_PT))) are mapped.  Otherwise, all resident pages within
- *	the specified address range are mapped.  This eliminates many soft
- *	faults on process startup and immediately after an mmap(2).  Because
- *	these are speculative mappings, cached pages are not reactivated and
+ *	Preload the specified map's pmap with mappings to the specified
+ *	object's memory-resident pages.  No further physical pages are
+ *	allocated, and no further virtual pages are retrieved from secondary
+ *	storage.  If the specified flags include MAP_PREFAULT_PARTIAL, then a
+ *	limited number of page mappings are created at the low-end of the
+ *	specified address range.  (For this purpose, a superpage mapping
+ *	counts as one page mapping.)  Otherwise, all resident pages within
+ *	the specified address range are mapped.  Because these mappings are
+ *	being created speculatively, cached pages are not reactivated and
  *	mapped.
  */
 void
@@ -1795,7 +1797,7 @@ vm_map_pmap_enter(vm_map_t map, vm_offse
 {
 	vm_offset_t start;
 	vm_page_t p, p_start;
-	vm_pindex_t psize, tmpidx;
+	vm_pindex_t mask, psize, threshold, tmpidx;
 
 	if ((prot & (VM_PROT_READ | VM_PROT_EXECUTE)) == 0 || object == NULL)
 		return;
@@ -1813,8 +1815,6 @@ vm_map_pmap_enter(vm_map_t map, vm_offse
 	}
 
 	psize = atop(size);
-	if (psize > MAX_INIT_PT && (flags & MAP_PREFAULT_PARTIAL) != 0)
-		psize = MAX_INIT_PT;
 	if (psize + pindex > object->size) {
 		if (object->size < pindex) {
 			VM_OBJECT_RUNLOCK(object);
@@ -1825,6 +1825,7 @@ vm_map_pmap_enter(vm_map_t map, vm_offse
 
 	start = 0;
 	p_start = NULL;
+	threshold = MAX_INIT_PT;
 
 	p = vm_page_find_least(object, pindex);
 	/*
@@ -1839,8 +1840,10 @@ vm_map_pmap_enter(vm_map_t map, vm_offse
 		 * don't allow an madvise to blow away our really
 		 * free pages allocating pv entries.
 		 */
-		if ((flags & MAP_PREFAULT_MADVISE) &&
-		    vm_cnt.v_free_count < vm_cnt.v_free_reserved) {
+		if (((flags & MAP_PREFAULT_MADVISE) != 0 &&
+		    vm_cnt.v_free_count < vm_cnt.v_free_reserved) ||
+		    ((flags & MAP_PREFAULT_PARTIAL) != 0 &&
+		    tmpidx >= threshold)) {
 			psize = tmpidx;
 			break;
 		}
@@ -1849,6 +1852,16 @@ vm_map_pmap_enter(vm_map_t map, vm_offse
 				start = addr + ptoa(tmpidx);
 				p_start = p;
 			}
+			/* Jump ahead if a superpage mapping is possible. */
+			if (p->psind > 0 && ((addr + ptoa(tmpidx)) &
+			    (pagesizes[p->psind] - 1)) == 0) {
+				mask = atop(pagesizes[p->psind]) - 1;
+				if (tmpidx + mask < psize &&
+				    vm_page_ps_is_valid(p)) {
+					p += mask;
+					threshold += mask;
+				}
+			}
 		} else if (p_start != NULL) {
 			pmap_enter_object(map->pmap, start, addr +
 			    ptoa(tmpidx), p_start, prot);

Modified: head/sys/vm/vm_page.c
==============================================================================
--- head/sys/vm/vm_page.c	Sat Jun 7 15:51:29 2014	(r267212)
+++ head/sys/vm/vm_page.c	Sat Jun 7 17:12:26 2014	(r267213)
@@ -3044,6 +3044,31 @@ vm_page_is_valid(vm_page_t m, int base,
 }
 
 /*
+ * vm_page_ps_is_valid:
+ *
+ *	Returns TRUE if the entire (super)page is valid and FALSE otherwise.
+ */
+boolean_t
+vm_page_ps_is_valid(vm_page_t m)
+{
+	int i, npages;
+
+	VM_OBJECT_ASSERT_LOCKED(m->object);
+	npages = atop(pagesizes[m->psind]);
+
+	/*
+	 * The physically contiguous pages that make up a superpage, i.e., a
+	 * page with a page size index ("psind") greater than zero, will
+	 * occupy adjacent entries in vm_page_array[].
+	 */
+	for (i = 0; i < npages; i++) {
+		if (m[i].valid != VM_PAGE_BITS_ALL)
+			return (FALSE);
+	}
+	return (TRUE);
+}
+
+/*
  * Set the page's dirty bits if the page is modified.
  */
 void

Modified: head/sys/vm/vm_page.h
==============================================================================
--- head/sys/vm/vm_page.h	Sat Jun 7 15:51:29 2014	(r267212)
+++ head/sys/vm/vm_page.h	Sat Jun 7 17:12:26 2014	(r267213)
@@ -149,6 +149,7 @@ struct vm_page {
 	uint8_t		aflags;		/* access is atomic */
 	uint8_t		oflags;		/* page VPO_* flags (O) */
 	uint8_t		queue;		/* page queue index (P,Q) */
+	int8_t		psind;		/* pagesizes[] index (O) */
 	int8_t		segind;
 	uint8_t		order;		/* index of the buddy queue */
 	uint8_t		pool;
@@ -447,6 +448,7 @@ vm_page_t vm_page_next(vm_page_t m);
 int vm_page_pa_tryrelock(pmap_t, vm_paddr_t, vm_paddr_t *);
 struct vm_pagequeue *vm_page_pagequeue(vm_page_t m);
 vm_page_t vm_page_prev(vm_page_t m);
+boolean_t vm_page_ps_is_valid(vm_page_t m);
 void vm_page_putfake(vm_page_t m);
 void vm_page_readahead_finish(vm_page_t m);
 void vm_page_reference(vm_page_t m);

Modified: head/sys/vm/vm_reserv.c
==============================================================================
--- head/sys/vm/vm_reserv.c	Sat Jun 7 15:51:29 2014	(r267212)
+++ head/sys/vm/vm_reserv.c	Sat Jun 7 17:12:26 2014	(r267213)
@@ -249,6 +249,11 @@ vm_reserv_depopulate(vm_reserv_t rv, int
 	if (rv->inpartpopq) {
 		TAILQ_REMOVE(&vm_rvq_partpop, rv, partpopq);
 		rv->inpartpopq = FALSE;
+	} else {
+		KASSERT(rv->pages->psind == 1,
+		    ("vm_reserv_depopulate: reserv %p is already demoted",
+		    rv));
+		rv->pages->psind = 0;
 	}
 	clrbit(rv->popmap, index);
 	rv->popcnt--;
@@ -302,6 +307,8 @@ vm_reserv_populate(vm_reserv_t rv, int i
 	    index));
 	KASSERT(rv->popcnt < VM_LEVEL_0_NPAGES,
 	    ("vm_reserv_populate: reserv %p is already full", rv));
+	KASSERT(rv->pages->psind == 0,
+	    ("vm_reserv_populate: reserv %p is already promoted", rv));
 	if (rv->inpartpopq) {
 		TAILQ_REMOVE(&vm_rvq_partpop, rv, partpopq);
 		rv->inpartpopq = FALSE;
@@ -311,7 +318,8 @@ vm_reserv_populate(vm_reserv_t rv, int i
 	if (rv->popcnt < VM_LEVEL_0_NPAGES) {
 		rv->inpartpopq = TRUE;
 		TAILQ_INSERT_TAIL(&vm_rvq_partpop, rv, partpopq);
-	}
+	} else
+		rv->pages->psind = 1;
 }
 
 /*
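
As a minimal standalone sketch of the accounting change described in the log (not part of the commit, and not kernel code): with the new vm_map_pmap_enter() logic, a fully populated reservation (psind == 1) is charged as a single mapping against the MAX_INIT_PT budget, because the prefault loop advances both the page index and the cutoff threshold by the size of the superpage run.  The "struct page", BASE_PAGES_PER_SP, and prefault_count() below are simplified stand-ins for the kernel's vm_page, pagesizes[], and vm_map_pmap_enter(); the real loop additionally checks virtual-address alignment and vm_page_ps_is_valid().

/*
 * Userland illustration of the MAP_PREFAULT_PARTIAL mapping budget.
 * Simplifying assumptions: the mapping starts superpage-aligned, a
 * superpage spans BASE_PAGES_PER_SP base pages, and a page's "valid"
 * flag stands in for the per-page valid bits.
 */
#include <stdio.h>
#include <stdlib.h>

#define	MAX_INIT_PT		96	/* same cap as vm_map.c */
#define	BASE_PAGES_PER_SP	512	/* e.g., 2MB / 4KB on amd64 */

struct page {
	int	psind;	/* 0 = base page, 1 = head of a fully valid superpage */
	int	valid;	/* nonzero = resident and fully valid */
};

/* Count how many mappings the prefault loop would create. */
static int
prefault_count(struct page *pages, int npages)
{
	int mappings, threshold, tmpidx;

	mappings = 0;
	threshold = MAX_INIT_PT;
	for (tmpidx = 0; tmpidx < npages; tmpidx++) {
		if (tmpidx >= threshold)
			break;		/* mapping budget exhausted */
		if (!pages[tmpidx].valid)
			continue;
		mappings++;
		/* Jump ahead if a superpage mapping is possible. */
		if (pages[tmpidx].psind > 0 &&
		    tmpidx % BASE_PAGES_PER_SP == 0 &&
		    tmpidx + BASE_PAGES_PER_SP - 1 < npages) {
			tmpidx += BASE_PAGES_PER_SP - 1;
			threshold += BASE_PAGES_PER_SP - 1;
		}
	}
	return (mappings);
}

int
main(void)
{
	struct page *pages;
	int i, npages;

	/* Four fully populated reservations followed by plain base pages. */
	npages = 8 * BASE_PAGES_PER_SP;
	pages = calloc(npages, sizeof(*pages));
	if (pages == NULL)
		return (1);
	for (i = 0; i < npages; i++) {
		pages[i].valid = 1;
		if (i < 4 * BASE_PAGES_PER_SP && i % BASE_PAGES_PER_SP == 0)
			pages[i].psind = 1;
	}
	printf("mappings created: %d\n", prefault_count(pages, npages));
	free(pages);
	return (0);
}

With four fully valid superpage runs at the front of the object and base pages after them, the sketch prints "mappings created: 96": 4 superpage mappings plus 92 base page mappings, i.e., the "up to 96 base page or superpage mappings" behavior described in the log, whereas the old code would have stopped after 96 base pages.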