Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 19 Jun 2001 20:16:44 +0100
From:      Ian Dowse <iedowse@maths.tcd.ie>
To:        freebsd-fs@freebsd.org
Subject:   CFR: UFS directory hashing
Message-ID:   <200106192016.aa70837@salmon.maths.tcd.ie>

next in thread | raw e-mail | index | archive | help

If there are no objections, I would like to commit the UFS_DIRHASH
patch in the next few days. The current patch is at

	http://www.maths.tcd.ie/~iedowse/FreeBSD/dirhash.diff4

and if anyone is interested, there is a RELENG_4 version at:

	http://www.maths.tcd.ie/~iedowse/FreeBSD/dirhash_releng_4.diff

This patch has been discussed already on -hackers a few weeks ago,
and I have incorporated many of the suggested improvements. For
anyone who missed that discussion, a brief description is below.

Ian


The idea of this code is to maintain a throw-away in-core data
structure for large directories, allowing all operations to be
performed quickly without the need for linear searches. This new
behaviour is completely optional, and it is only enabled when the
kernel is compiled with "options UFS_DIRHASH".

The implementation uses a hash array that maps filenames to the
directory offset at which the corresponding directory entry exists.
A simple spillover mechanism is used to deal with hash collisions,
and some extra summary information permits the quick location of
free space within the directory itself for create operations.

The in-core data structures have a memory requirement approximately
equal to half of the on-disk directory size. Currently there are
two sysctls that determine which directories get hashed:

  vfs.ufs.dirhashminsize	Minimum directory on-disk size for which
				hashing should be used (default 2.5k).
  vfs.ufs.dirhashmaxmem		Maximum system-wide amount of memory to
				use for directory hashes (default 2Mb).

When the memory limit is reached, the implementation will try to
recycle dirhash memory from other directories as necessary. If it
fails to build up the hash structures for any reason, the code
reverts to the normal linear searches.

Even on a relatively slow machine (200Mhz P5), I'm seeing a file
creation speed that remains at around 1000 creations/second for
directories with more than 100,000 entries. Without this patch, I
get less than 20 creations per second on the same directory (in
both cases soft-updates is enabled).


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi? <200106192016.aa70837>