Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 4 Aug 2014 23:56:41 +0200
From:      Jilles Tjoelker <jilles@stack.nl>
To:        Baptiste Daroussin <bapt@FreeBSD.org>
Cc:        arch@FreeBSD.org, hackers@FreeBSD.org
Subject:   Re: Add ohash (OpenBSD hash) into libutil
Message-ID:  <20140804215641.GA80850@stack.nl>
In-Reply-To: <20140727230110.GI50802@ivaldir.etoilebsd.net>
References:  <20140727230110.GI50802@ivaldir.etoilebsd.net>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, Jul 28, 2014 at 01:01:10AM +0200, Baptiste Daroussin wrote:
> If noone objects I would like to bring ohash from OpenBSD into our
> libutil.

> Ohash is already in base (usr.bin/m4/ohash) having it into libutil
> will increase compatibility with OpenBSD as well as allowing the rest
> of the base system to be able to benefit from ohash implementation.

Following OpenBSD's general low priority for binary compatibility
issues, the functions have the application allocate struct ohash. For
use in a symbol-versioned library that promises backwards compatibility,
it would be better to leave the type incomplete in the public header and
have the library allocate it.

Adding these functions encourages using open addressing (no chaining)
because that's what's available. This may be OK but it might be bad in
specific situations. The automatic resizing of the hash table, absent in
many home-grown hash table implementations, may help avoid such problems
somewhat.

The code in ohash_resize() that may set the hash table size to UINT_MAX
violates the requirement that the probe interval (incr) must be relative
prime to the hash table size (otherwise ohash_lookup_interval() might
loop indefinitely examining only a subset of the table, while the rest
of the table still has free space). Normally, the code enforces this
requirement by making the hash table size a power of two and the probe
interval an odd number.

On another note, if the hash table size is always a power of two, the
slow hv % h->size can be replaced by hv & (h->size - 1). The incr should
probably only be calculated when needed, or using a method that does not
involve dividing by a variable. (Note that hv doesn't contain sufficient
bits for an independent i and hv if the hash table is larger than 65536
entries, so such large hash tables will be used less uniformly.)

Utilities that value startup and fork performance (such as sh(1)) do not
want to depend on additional DSOs (why is libutil separate from libc?).

-- 
Jilles Tjoelker



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140804215641.GA80850>