From: Joerg Sonnenberger
To: freebsd-hackers@freebsd.org
Date: Fri, 9 May 2008 19:06:33 +0200
Subject: Re: Adding .db support to pkg_tools
Message-ID: <20080509170633.GB3571@britannica.bec.de>
References: <20080509124308.GA596@britannica.bec.de>

On Fri, May 09, 2008 at 06:50:10PM +0200, Anders Nore wrote:
> Yes that would probably be bad for the database, but I'm sure one can
> manage to get around this problem by copying it before changing the db and
> delete the copy if it doesn't fail. At the next time executed it will check
> for a copy, use that and assume that the last run was unsuccessful.

/var/db/pkg contains 10MB for the various packages installed on my
laptop and 10MB for the cache of +CONTENTS.
You don't want to copy that around all the time.

>> Secondly, I would also advise against just storing all meta data in a
>> single key/value pair. For example, +CONTENTS can be extremely large.
>> Check texmf for a good example.
>
> When it comes to storing large values in a key/value pair, I think that's
> what bdb was designed for, handling large amounts of data (in the orders of
> gigabytes even in keys) fast.

No, actually that is exactly what it was *not* designed for. Having
billions of keys is fine, but data that exceeds the size of a database
page is going to slow things down. While it might not be a problem if you
are copying the data to a new file anyway, it also means that
fragmentation in the database will be more problematic.

My main point is that for the interesting operations you want to
actually look up with fine-grained keys, and that is not possible if
you store the meta data as a blob. In fact, storing the meta data as a
blob is not faster than just using the filesystem.

Joerg
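
To make the fine-grained key argument concrete, here is a minimal sketch
in Python. It uses the standard library's dbm.dumb as a stand-in for
Berkeley DB, and everything in it is an assumption for illustration: the
package names, the file lists, the "file:" key prefix and the function
names are hypothetical and not part of pkg_tools. It contrasts answering
"which package owns this file?" against one blob per package with a
single lookup against one key per file.

```python
import dbm.dumb
import os
import tempfile

# Hypothetical sample data: package name -> files listed in its +CONTENTS.
PACKAGES = {
    "texmf-1.0": ["share/texmf/tex/latex/base/article.cls",
                  "share/texmf/web2c/texmf.cnf"],
    "zsh-4.3": ["bin/zsh", "man/man1/zsh.1"],
}


def build_blob_db(path):
    """One key per package; the value is the whole file list as one blob."""
    with dbm.dumb.open(path, "n") as db:
        for pkg, files in PACKAGES.items():
            db[pkg] = "\n".join(files)


def build_fine_grained_db(path):
    """One key per installed file; the value is the owning package."""
    with dbm.dumb.open(path, "n") as db:
        for pkg, files in PACKAGES.items():
            for name in files:
                db["file:" + name] = pkg


def owner_from_blobs(path, filename):
    """Reads and splits package blobs one by one until a match is found."""
    wanted = filename.encode()
    with dbm.dumb.open(path, "r") as db:
        for pkg in db.keys():
            if wanted in db[pkg].split(b"\n"):
                return pkg.decode()
    return None


def owner_from_keys(path, filename):
    """A single keyed lookup; no other record is touched."""
    with dbm.dumb.open(path, "r") as db:
        value = db.get("file:" + filename)
    return value.decode() if value is not None else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        blob_path = os.path.join(tmp, "blob")
        fine_path = os.path.join(tmp, "fine")
        build_blob_db(blob_path)
        build_fine_grained_db(fine_path)
        print(owner_from_blobs(blob_path, "bin/zsh"))
        print(owner_from_keys(fine_path, "bin/zsh"))
```

With the blob layout the cost of the query grows with the total size of
all +CONTENTS data, which is about what scanning the files under
/var/db/pkg would cost as well. With per-file keys the query is one
lookup regardless of how large any single package is.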