Message-ID: <3DA7C997.95F3C4FF@softweyr.com>
Date: Sat, 12 Oct 2002 00:04:55 -0700
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr LLC
To: Terry Lambert
Cc: Matthew Dillon, arch@FreeBSD.ORG
Subject: Re: Database indexes and ram (was Re: using mem above 4Gb was: swapon some regular file)

Terry Lambert wrote:
>
> Wes Peters wrote:
> > Linux solved this problem by refusing to do it.  The candidates for DMA
> > transfers include skbufs and buffers from the disk buffer pool, both of
> > which are allocated from the lowest 4GB of physical RAM when using PAE
> > mode.
>
> Yes; this is the "Fast RAM/bounce buffer" approach I mentioned
> already.
> Linux has an advantage here, in that they already run
> software virtualization on the VM system, in order to try to be
> architecture independent.  The result is overhead in reverse
> lookups that has only recently been fixed (and you need patches
> to use it).  FreeBSD would eat more overhead doing this, where
> it sort of "fell out" of the extra overhead they already eat in
> the Linux case.

Yup.  We could do much the same, but it'll take a bit of architecting.
Adding some physical locality preferences to pools in JeffR's slab
allocator would be a way to start investigating this, at a guess.

> > Nah, it works great.  Each process gets 3GB of process virtual address
> > space and 1GB of kernel virtual address space, and all of the program
> > text+data can be located anywhere in physical RAM.  For things like
> > databases that need large indices in memory, this is a big win.
>
> This, I don't get: I don't understand how they can live with only
> 1G of KVA space.  I guess they are expecting a small number of net
> connections...

Per-process.  I don't know whether they've made socket pcbs (or their
equivalent) per-process, but it seems a logical leap.  I haven't looked
into any of this because for our application, with a relatively small
number of connections, it just works.

> > Neither will help you with index sizes if you're using really honking big
> > tables, where the index just won't fit.  We actually use multiple processes
> > to hold cached data, including indexes, in order to make use of the extra
> > RAM.  I should shut up now.  ;^)
>
> ...or you'll have to kill you.  8-)

8-).  Gurk!  Sad, but true.

> > > of accesses to the index that might result in cacheable table data are
> > > also the types of accesses to the index that will likely result in
> > > cacheable index data.
> > > Using the same argument, the types of accesses
> > > that might result in an uncacheable index would also likely result in
> > > uncacheable table data, which means you are going to run up against
> > > seek/read problems on the table data, making it more worthwhile to
> > > spend the money on beefing up the storage subsystem.
> >
> > That's only true if your database server is I/O bound.  Depending on your
> > job mix, this may or may not be the problem.
>
> Likely, it will not be true, for any very large database, particularly
> if you end up doing a reasonable number of joins.  Hardly anybody goes
> past 3rd normal form, and some people never even get that far.  8-).

Some?  You've seen a production database that was normalized at ALL?
Gee, that'd be... nice?  Astonishing?  Like seeing the Pope tour Temple
Square?  DBA stands for Data Base A.....

The key to accelerating database access rarely has much to do with I/O
speed.  How many Oracle servers do you know that can stuff a gigabit
channel full, even doing straight selects?  Memory usage is VERY
important, and DBAs are not famous for optimizing queries to make
efficient use of the processor cache.  Or anything else, for that
matter.

--
        "Where am I, and what am I doing in this handbasket?"

Wes Peters                                                Softweyr LLC
wes@softweyr.com                                   http://softweyr.com/