From owner-freebsd-hackers@FreeBSD.ORG Tue May 6 07:49:03 2008 Return-Path: Delivered-To: hackers@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5485A1065679 for ; Tue, 6 May 2008 07:49:03 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from vlakno.cz (vlk.vlakno.cz [62.168.28.247]) by mx1.freebsd.org (Postfix) with ESMTP id 132448FC24 for ; Tue, 6 May 2008 07:49:02 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from localhost (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id E15F467E447 for ; Tue, 6 May 2008 09:48:31 +0200 (CEST) X-Virus-Scanned: amavisd-new at vlakno.cz Received: from vlakno.cz ([127.0.0.1]) by localhost (vlk.vlakno.cz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id vWnTQ88GujFn for ; Tue, 6 May 2008 09:48:30 +0200 (CEST) Received: from vlk.vlakno.cz (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 66FE567E446 for ; Tue, 6 May 2008 09:48:30 +0200 (CEST) Received: (from rdivacky@localhost) by vlk.vlakno.cz (8.14.2/8.14.2/Submit) id m467mUHW070988 for hackers@FreeBSD.ORG; Tue, 6 May 2008 09:48:30 +0200 (CEST) (envelope-from rdivacky) Date: Tue, 6 May 2008 09:48:30 +0200 From: Roman Divacky To: hackers@FreeBSD.ORG Message-ID: <20080506074830.GA70848@freebsd.org> References: <20080505204350.GA45321@freebsd.org> <20080506062556.GA68171@zim.MIT.EDU> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080506062556.GA68171@zim.MIT.EDU> User-Agent: Mutt/1.4.2.3i Cc: Subject: Re: hashinit versus phashinit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 May 2008 07:49:03 -0000 On Tue, May 06, 2008 at 02:25:56AM -0400, David Schultz wrote: > On Mon, May 05, 2008, Roman Divacky wrote: > > hi > > > > when we want to use a hash table in kernel we call "hashinit" which > > initializes a hash table with power-of-2 size. There's also "phashinit" > > that creates hash table of size that is a prime number. This was > > added in 1995 by davidg@ but it is not used anywhere in the kernel. > > > > phk@ commited rev. 1.30 of vfs_cache.c replacing phashinit with hashinit > > stating that it's better because it replaces a division with logical and. > > is this reason still valid today? (the commit was done almost 11 years ago) > > > > is there still any reason why not use the phashinit instead of hashinit? > > I believe using prime-sized hash table might have positive performance > > impact... > > There's a tradeoff. > > The argument for using powers of 2 is that division takes many > times longer than a logical AND. > > The argument for using primes is that if your hash function isn't > as good as you thought it was, or if the data has some regularity > you weren't expecting, you can get screwed a lot more easily with > a power of 2 hash table. With a prime-sized hash table, you only > get screwed if lots of your data is congruent modulo the prime, > which is very rare. > > Most general-purpose hash implementations I've used (e.g., GNU > libstdc++, Sun JDK, Microsoft .NET) use prime table sizes, > probably to make it less likely that programmers will shoot > themselves in the foot with pathological data or bad hash functions. yes... a division takes roughly 40 cycles on modern i386 hw, while and is just 1, but does it matter compared to the access times of memory these days? the ratio between cpu-speed/mem-speed has changed a lot. I am not arguing if it's still true that avoiding the division helps the performance these days...