From owner-freebsd-current@FreeBSD.ORG Tue Jul 16 21:12:45 2013
From: Alan Cox <alan.l.cox@gmail.com>
Reply-To: alc@freebsd.org
To: Kurt Lidl
Cc: freebsd-current
Date: Tue, 16 Jul 2013 14:12:42 -0700
Subject: Re: expanding past 1 TB on amd64
In-Reply-To: <51E553C4.9000207@pix.net>
References: <51E553C4.9000207@pix.net>
List-Id: Discussions about the use of FreeBSD-current

On Tue, Jul 16, 2013 at 7:08 AM, Kurt Lidl wrote:

>> On Wed, Jun 19, 2013 at 1:32 AM, Chris Torek wrote:
>>
>>> In src/sys/amd64/include/vmparam.h is this handy map:
>>>
>>> * 0x0000000000000000 - 0x00007fffffffffff user map
>>> * 0x0000800000000000 - 0xffff7fffffffffff does not exist (hole)
>>> * 0xffff800000000000 - 0xffff804020100fff recursive page table (512GB slot)
>>> * 0xffff804020101000 - 0xfffffdffffffffff unused
>>> * 0xfffffe0000000000 - 0xfffffeffffffffff 1TB direct map
>>> * 0xffffff0000000000 - 0xffffff7fffffffff unused
>>> * 0xffffff8000000000 - 0xffffffffffffffff 512GB kernel map
>>>
>>> showing that the system can deal with at most 1 TB of address space
>>> (because of the direct map), using at most half of that for kernel
>>> memory (less, really, due to the inevitable VM fragmentation).
>>>
>>> New boards are coming soonish that will have the ability to go
>>> past that (24 DIMMs of 64 GB each = 1.5 TB).  Or, if some crazy
>>> people :-) might want to use most of a 768 GB board (24 DIMMs of
>>> 32 GB each, possible today although the price is kind of
>>> staggering) as wired-down kernel memory, the 512 GB VM area is
>>> already a problem.
>>>
>>> I have not wrapped my head around the amd64 pmap code but figured
>>> I'd ask: what might need to change to support larger spaces?
>>> Obviously NKPML4E in amd64/include/pmap.h, for the kernel start
>>> address; and NDMPML4E for the direct map.  It looks like this
>>> would adjust KERNBASE and the direct map appropriately.  But would
>>> that suffice, or have I missed something?
>>>
>>> For that matter, if these are changed to make space for future
>>> expansion, what would be a good expansion size?  Perhaps multiply
>>> the sizes by 16?  (If memory doubles roughly every 18 months,
>>> that should give room for at least 5 years.)
>>
>> Chris, Neel,
>>
>> The actual data that I've seen shows that DIMMs are doubling in size
>> at about half that pace, about every three years.  For example, see
>> http://users.ece.cmu.edu/~omutlu/pub/mutlu_memory-scaling_imw13_invited-talk.pdf
>> slide #8.  So, I think that a factor of 16 is a lot more than we'll
>> need in the next five years.  I would suggest configuring the kernel
>> virtual address space for 4 TB.  Once you go beyond 512 GB, 4 TB is
>> the next "plateau" in terms of address translation cost.  At 4 TB all
>> of the PML4 entries for the kernel virtual address space will reside
>> in the same L2 cache line, so a page table walk on a TLB miss for an
>> instruction fetch will effectively prefetch the PML4 entry for the
>> kernel heap and vice versa.
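
To make the 4 TB "plateau" concrete: a PML4 entry is 8 bytes and maps
512 GB (2^39 bytes) of virtual address space, so one 64-byte cache line
holds 8 entries covering 8 x 512 GB = 4 TB.  A minimal stand-alone
sketch of that arithmetic (the 8-byte entry and 64-byte line sizes are
the usual amd64 values, assumed here rather than taken from the kernel
headers):

    #include <stdio.h>

    int
    main(void)
    {
        const unsigned long pml4e_size = 8;     /* bytes per PML4 entry */
        const unsigned long line_size = 64;     /* bytes per L2 cache line */
        const unsigned long gb_per_slot = 512;  /* KVA mapped per PML4 entry */

        /* Entries sharing one cache line, and the KVA they cover. */
        unsigned long entries = line_size / pml4e_size;
        printf("%lu PML4 entries per line -> %lu GB (%lu TB) of KVA\n",
            entries, entries * gb_per_slot, entries * gb_per_slot / 1024);
        return (0);
    }

Compiled and run, this prints "8 PML4 entries per line -> 4096 GB
(4 TB) of KVA", which is where the 4 TB figure comes from.
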
> The largest commodity motherboards that are shipping today support
> 24 DIMMs, at a max size of 32GB per DIMM.  That's 768GB, right now.
> (So FreeBSD is already "out of bits" in terms of supporting current
> shipping hardware.)

Actually, this scenario with 768 GB of RAM on amd64 as it is today is
analogous to the typical 32-bit i386 machine, where the amount of RAM
has long exceeded the default 1 GB size of the kernel virtual address
space.  In theory, we could currently handle up to 1 TB of RAM, but the
kernel virtual address space would only be 512 GB.

...

> The Haswell line of CPUs is widely reported to support DIMMs twice as
> large, and it's due in September.  That would make the systems of
> late 2013 hold up to 1536GB of memory.
>
> Using your figure of doubling in 3 years, we'll see 3072GB systems by
> ~2016.  And in ~2019, we'll see 6TB systems, and need to finally
> expand to using more than a single cache line to hold all the PML4
> entries.

Yes, this is a reasonable prognostication.

Alan

> Of course, that's speculating furiously about two generations out,
> and assumes keeping the current memory architecture / board design
> constraints.
>
> -Kurt
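
For reference, the knobs Chris asks about come down to a couple of
constants in sys/amd64/include/pmap.h.  The snippet below is only a
rough, hypothetical sketch of the relationship, not the actual header:
the values are inferred from the 512 GB kernel map and 1 TB direct map
in the quoted layout, with a note on what a 4 TB direct map along the
lines Alan suggests would imply.

    /*
     * Rough sketch only: the names NKPML4E and NDMPML4E come from the
     * discussion above (sys/amd64/include/pmap.h); the values here are
     * inferred from the quoted address-space map, not copied from the
     * header.  Each PML4 slot maps 2^39 bytes = 512 GB of KVA.
     */
    #define PML4SHIFT       39      /* log2(bytes mapped by one PML4 entry) */

    #define NKPML4E         1       /* 1 slot  = 512 GB kernel map */
    #define NDMPML4E        2       /* 2 slots = 1 TB direct map */

    /*
     * Growing the direct map to 4 TB, as suggested above, would mean
     * NDMPML4E = 8, i.e. eight PML4 slots -- still just one 64-byte
     * cache line's worth of 8-byte entries.
     */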