From owner-freebsd-arch  Tue Oct  8 13:15:17 2002
Date: Tue, 8 Oct 2002 13:15:15 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200210082015.g98KFFrq084625@apollo.backplane.com>
To: "Vladimir B. " Grebenschikov <vova@sw.ru>
Cc: Nate Lawson <nate@root.org>, arch@FreeBSD.ORG
Subject: Database indexes and ram (was Re: using mem above 4Gb was: swapon some regular file)
References: <Pine.BSF.4.21.0210081209010.11243-100000@root.org> <1034105993.913.1.camel@vbook.express.ru>

:..
:> It's often surprisingly effective to just access the index on disk and
:> tune your VM cache instead.  You can lose performance by double-caching
:> data.
:
:I don't want cache disk data in extra memory - simply store index in RAM
:(no disk access at all) - I think it must be faster.
:
:> -Nate
:
:-- 
:Vladimir B. Grebenschikov
:vova@sw.ru, SWsoft, Inc.

    If you have enough ram to hold the index, copying the index into
    anonymous memory will be neither slower nor faster than mmap()ing it
    into ram.

    If you do not have enough ram to hold the index then trying to store 
    it in ram won't work.
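
    A minimal sketch of the mmap() route, for illustration only.  The helper
    name index_map() and the idea that the whole index lives in one file are
    assumptions, not something from this thread.  Once the pages have been
    touched they sit in the VM cache just as a private in-ram copy would.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map an entire index file read-only.
 * Returns the base of the mapping and stores its length in *len_out,
 * or returns NULL if the file cannot be opened, is empty, or cannot
 * be mapped.
 */
static const void *
index_map(const char *path, size_t *len_out)
{
    struct stat st;
    void *base;
    int fd;

    assert(path != NULL && len_out != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);      /* the mapping stays valid after the close */
    if (base == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return base;
}
```

    The caller reads index nodes straight out of the returned pointer and
    releases the mapping with munmap() when done.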

    Database indexes, typically B+Trees or similar structures, are
    highly cacheable and designed to reduce the number of seek/reads
    required to do a lookup as much as possible.  As a result, what our
    VM system chooses to keep resident tends to match a fairly optimal
    caching of the index.

    For example, take a B+Tree with 64 elements per node and a database with
    16 million records in it.  16 million records (64^4) can be represented
    by four levels in the B+Tree.  The first three levels hold roughly
    64*64*64 = 262144 elements, i.e. 262144 * sizeof(btreeelm) bytes or,
    typically, less than 16 MB of data, which the VM system will cache at a
    high priority due to the frequency of accesses.  The last B+Tree level in
    this example represents the only seek/read that would have to occur on
    the disk (if you didn't have enough memory to hold the entire index).
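
    The arithmetic above can be checked with a small sketch.  The function
    names are made up for illustration, and any element size you plug in is
    an assumption since sizeof(btreeelm) is not given here.  Summing the
    upper levels exactly gives 64 + 4096 + 262144 = 266304 elements, of
    which the 262144 term dominates.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest number of levels such that fanout^levels >= records. */
static unsigned
btree_levels(uint64_t records, uint64_t fanout)
{
    unsigned levels = 0;
    uint64_t capacity = 1;

    assert(fanout >= 2);
    while (capacity < records) {
        capacity *= fanout;     /* sketch: no overflow check */
        levels++;
    }
    return levels;
}

/*
 * Elements held in every level except the last one:
 * fanout + fanout^2 + ... + fanout^(levels - 1).
 */
static uint64_t
btree_upper_elements(unsigned levels, uint64_t fanout)
{
    uint64_t total = 0;
    uint64_t per_level = fanout;

    for (unsigned i = 1; i < levels; i++) {
        total += per_level;
        per_level *= fanout;
    }
    return total;
}
```

    With a hypothetical 32-byte element, btree_upper_elements(4, 64) * 32
    comes to about 8 MB, comfortably under the 16 MB figure.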

    The only *PROBLEM* with using mmap() is that the database will not have
    a very good idea about whether a particular mapped memory location is
    resident or whether it will stall the process while doing a disk read,
    which can seriously impact multi-threaded access to the database.
    madvise() and mincore() can be used to some effect but that still means
    making system calls that one would rather not have to make.  Still,
    mmap() can be used to good effect and I usually find it easier to use
    than having to write a userland shared memory disk cache manager.
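
    A rough sketch of the mincore()/madvise() check mentioned above.  The
    helper name range_resident() and its return convention are assumptions
    for illustration.  It costs one or two system calls per check, which is
    exactly the overhead being complained about, and the answer can be stale
    by the time the caller touches the memory.

```c
#define _DEFAULT_SOURCE     /* glibc: expose mincore() and madvise() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if every page covering
 * [addr, addr + len) is resident, 0 if at least one page is not
 * (after hinting the kernel to start paging the range in), and
 * -1 on error.
 */
static int
range_resident(const void *addr, size_t len)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)addr & ~(uintptr_t)(pagesz - 1);
    size_t span = ((uintptr_t)addr + len) - start;
    size_t npages = (span + pagesz - 1) / pagesz;
    unsigned char *vec;
    int resident = 1;

    assert(len > 0);

    vec = malloc(npages);
    if (vec == NULL)
        return -1;
    /*
     * The vector is char * on FreeBSD and unsigned char * on Linux;
     * passing it as void * compiles on both.
     */
    if (mincore((void *)start, span, (void *)vec) < 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < npages; i++) {
        if ((vec[i] & 1) == 0) {    /* low bit set means "in core" */
            resident = 0;
            break;
        }
    }
    free(vec);
    if (!resident)
        (void)madvise((void *)start, span, MADV_WILLNEED);
    return resident;
}
```

    A multi-threaded database could call this before descending into a
    node and hand the lookup to another thread when it returns 0, rather
    than stalling on the page fault.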

						-Matt

