From: Neel Natu
To: Konstantin Belousov
Cc: alc@freebsd.org, davide@freebsd.org, hackers@freebsd.org, avg@freebsd.org, rank1seeker@gmail.com
Date: Tue, 5 Feb 2013 09:12:41 -0800
Subject: Re: dynamically calculating NKPT [was: Re: huge ktr buffer]
In-Reply-To: <20130205151413.GL2522@kib.kiev.ua>
References: <20130205151413.GL2522@kib.kiev.ua>
List-Id: Technical Discussions relating to FreeBSD

Hi Konstantin,

On Tue, Feb 5, 2013 at 7:14 AM, Konstantin Belousov wrote:
> On Mon, Feb 04, 2013 at 03:05:15PM -0800, Neel Natu wrote:
>> Hi,
>>
>> I have a patch to dynamically calculate NKPT for amd64 kernels. This
>> should fix the various issues that people pointed out in the email
>> thread.
>>
>> Please review and let me know if there are any objections to committing this.
>>
>> Also, thanks to Alan (alc@) for reviewing and providing feedback on
>> the initial version of the patch.
>>
>> Patch (also available at http://people.freebsd.org/~neel/patches/nkpt_diff.txt):
>>
>> Index: sys/amd64/include/pmap.h
>> ===================================================================
>> --- sys/amd64/include/pmap.h (revision 246277)
>> +++ sys/amd64/include/pmap.h (working copy)
>> @@ -113,13 +113,7 @@
>>          ((unsigned long)(l2) << PDRSHIFT) | \
>>          ((unsigned long)(l1) << PAGE_SHIFT))
>>
>> -/* Initial number of kernel page tables. */
>> -#ifndef NKPT
>> -#define NKPT 32
>> -#endif
>> -
>>  #define NKPML4E 1 /* number of kernel PML4 slots */
>> -#define NKPDPE howmany(NKPT, NPDEPG)/* number of kernel PDP slots */
>>
>>  #define NUPML4E (NPML4EPG/2) /* number of userland PML4 pages */
>>  #define NUPDPE (NUPML4E*NPDPEPG)/* number of userland PDP pages */
>> @@ -181,6 +175,7 @@
>>  #define PML4map ((pd_entry_t *)(addr_PML4map))
>>  #define PML4pml4e ((pd_entry_t *)(addr_PML4pml4e))
>>
>> +extern int nkpt; /* Initial number of kernel page tables */
>>  extern u_int64_t KPDPphys; /* physical address of kernel level 3 */
>>  extern u_int64_t KPML4phys; /* physical address of kernel level 4 */
>>
>> Index: sys/amd64/amd64/minidump_machdep.c
>> ===================================================================
>> --- sys/amd64/amd64/minidump_machdep.c (revision 246277)
>> +++ sys/amd64/amd64/minidump_machdep.c (working copy)
>> @@ -232,7 +232,7 @@
>>      /* Walk page table pages, set bits in vm_page_dump */
>>      pmapsize = 0;
>>      pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys);
>> -    for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR,
>> +    for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR,
>>          kernel_vm_end); ) {
>>          /*
>>           * We always write a page, even if it is zero. Each
>> @@ -364,7 +364,7 @@
>>      /* Dump kernel page directory pages */
>>      bzero(fakepd, sizeof(fakepd));
>>      pdp = (uint64_t *)PHYS_TO_DMAP(KPDPphys);
>> -    for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + NKPT * NBPDR,
>> +    for (va = VM_MIN_KERNEL_ADDRESS; va < MAX(KERNBASE + nkpt * NBPDR,
>>          kernel_vm_end); va += NBPDP) {
>>          i = (va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1);
>>
>> Index: sys/amd64/amd64/pmap.c
>> ===================================================================
>> --- sys/amd64/amd64/pmap.c (revision 246277)
>> +++ sys/amd64/amd64/pmap.c (working copy)
>> @@ -202,6 +202,10 @@
>>  vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */
>>  vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */
>>
>> +int nkpt;
>> +SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0,
>> +    "Number of kernel page table pages allocated on bootup");
>> +
>>  static int ndmpdp;
>>  static vm_paddr_t dmaplimit;
>>  vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS;
>> @@ -495,17 +499,42 @@
>>
>>  CTASSERT(powerof2(NDMPML4E));
>>
>> +/* number of kernel PDP slots */
>> +#define NKPDPE(ptpgs) howmany((ptpgs), NPDEPG)
>> +
>>  static void
>> +nkpt_init(vm_paddr_t addr)
>> +{
>> +    int pt_pages;
>> +
>> +#ifdef NKPT
>> +    pt_pages = NKPT;
>> +#else
>> +    pt_pages = howmany(addr, 1 << PDRSHIFT);
>> +    pt_pages += NKPDPE(pt_pages);
>> +
>> +    /*
>> +     * Add some slop beyond the bare minimum required for bootstrapping
>> +     * the kernel.
>> +     *
>> +     * This is quite important when allocating KVA for kernel modules.
>> +     * The modules are required to be linked in the negative 2GB of
>> +     * the address space. If we run out of KVA in this region then
>> +     * pmap_growkernel() will need to allocate page table pages to map
>> +     * the entire 512GB of KVA space which is an unnecessary tax on
>> +     * physical memory.
>> +     */
>> +    pt_pages += 4; /* 8MB additional slop for kernel modules */
> 8MB might be to low. I just checked one of my machines with fully
> modularized kernel, it takes slightly more than 6 MB to load 50 modules.
> I think that 16MB would be safer, but it probably needs to be scaled
> down based on the available phys memory. amd64 kernel could be booted
> on 128MB machine still.

Sounds fine. I can bump it up to 8 pages.

Also, wrt your comment about scaling this number based on available
memory, I wonder if it makes sense to optimize for 16KB of additional
space.

I would much rather work with you and Alan to fix pmap_growkernel() so
we don't need to care about this slack in the first place :-)

best
Neel