From owner-freebsd-hackers@FreeBSD.ORG Mon Oct 27 09:09:34 2003 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B063916A4B3 for ; Mon, 27 Oct 2003 09:09:34 -0800 (PST) Received: from VARK.homeunix.com (adsl-68-123-40-77.dsl.pltn13.pacbell.net [68.123.40.77]) by mx1.FreeBSD.org (Postfix) with ESMTP id 44CCB43FA3 for ; Mon, 27 Oct 2003 09:09:31 -0800 (PST) (envelope-from das@FreeBSD.ORG) Received: from VARK.homeunix.com (localhost [127.0.0.1]) by VARK.homeunix.com (8.12.9/8.12.9) with ESMTP id h9RH90en030270; Mon, 27 Oct 2003 09:09:00 -0800 (PST) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by VARK.homeunix.com (8.12.9/8.12.9/Submit) id h9RH8xKO030269; Mon, 27 Oct 2003 09:08:59 -0800 (PST) (envelope-from das@FreeBSD.ORG) Date: Mon, 27 Oct 2003 09:08:59 -0800 From: David Schultz To: Lowell Gilbert Message-ID: <20031027170859.GA30164@VARK.homeunix.com> Mail-Followup-To: Lowell Gilbert , freebsd-hackers@FreeBSD.ORG References: <1066789354.21430.39.camel@boxster.onthenet.com.au> <20031022082953.GA69506@rot13.obsecurity.org> <1066816287.25609.34.camel@boxster.onthenet.com.au> <20031022095754.GA70026@rot13.obsecurity.org> <1066820436.25609.93.camel@boxster.onthenet.com.au> <20031026052854.GA20701@VARK.homeunix.com> <44he1wi2yy.fsf@be-well.ilk.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <44he1wi2yy.fsf@be-well.ilk.org> cc: freebsd-hackers@FreeBSD.ORG Subject: Re: Some mmap observations compared to Linux 2.6/OpenBSD X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Oct 2003 17:09:34 -0000 On Sun, Oct 26, 2003, Lowell Gilbert wrote: > David Schultz writes: > > > On Sun, Oct 26, 2003, Dag-Erling Smrgrav wrote: > > > Q writes: > > > > Yes, it would appear this is a legacy thing that existed in the original > > > > 1994 import of the BSD 4.4 Lite source. Both FreeBSD and NetBSD still > > > > use this technique, but OpenBSD changed to using Red-Black trees back in > > > > Feb 2002. > > > > [...] > > > > I am wondering if there is a compelling reason why the technique used by > > > > OpenBSD could not be adapted to FreeBSD's VM system. > > > > > > Adapting OpenBSD's red-balck patches would require quite a bit of work > > > as FreeBSD and OpenBSD have diverged quite a bit in this area. Though > > > it is a good idea to change the list into a tree, I think you'd get > > > more mileage by addressing the fundamental problem, which is the lack > > > of a free list. The current code (in both FreeBSD and OpenBSD) > > > searches a list or tree of allocated extents, sorted by location, > > > looking for a pair that have sufficient space between them for the > > > extent you want to create. We should instead keep track of free > > > extents in a structure that makes it easy to locate one of the correct > > > size. We probably need a dual structure, though, because we need to > > > keep the free extents sorted both by size (to quickly find what we > > > need) and by location (to facilitate aggregation of adjacent extents, > > > without which we'd suffer horribly from address space fragmentation). > > > > > > I have no idea how much this means for real-life workloads though. > > > > Your idea of using a size-hashed freelist as well as a > > location-sorted list is appealing in its simplicity. Though it > > can cause a bit of fragmentation, it gives you constant time > > lookup. Bonwick's vmem allocator ([1], section 4.4.2 and > > following), apparently works quite well using this principle. > > A hash probably isn't the right data structure for either dimension > (DES didn't say it was, I notice). Finding the next-largest available > entry is a useful operation, here, so a list would be better than a > hash. [Or a tree; the point is that exact-match isn't the only kind > of search you need.] Erm, did you read the paper I referred to? If you have, say, 32 power-of-two buckets, you can use a bitmask representing which buckets are non-empty to locate spcae in constant time. The caveat (also in the paper) is that the price of the constant time operation is that your allocation may be up to twice as large as necessary. A tree lacks this disadvantage, but also carries with it some additional overhead.