Date:      Fri, 1 Jun 2001 12:39:44 -0400 (EDT)
From:      Robert Watson <rwatson@freebsd.org>
To:        Ian Dowse <iedowse@maths.tcd.ie>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: UFS large directory performance
Message-ID:  <Pine.NEB.3.96L.1010601123814.65702E-100000@fledge.watson.org>
In-Reply-To: <200106011541.aa41502@salmon.maths.tcd.ie>


This is great -- once I finish moving back to Maryland (sometime mid
next week) I'd be very interested in running this code on a -CURRENT
mock-up of my Cyrus server, which regularly runs with directories of
65,000+ files.  I assume this is a -CURRENT patch set?

(Mind you, I've found that most of the perceived "large directory
suffering" people tell me about comes down to running ls with sorting
enabled :-).)

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services

On Fri, 1 Jun 2001, Ian Dowse wrote:

> 
> Prompted by the recent discussion about performance with large
> directories, I had a go at writing some code to improve the situation
> without requiring any filesystem changes. Large directories can
> usually be avoided by design, but the performance hit is very
> annoying when it occurs. The namei cache may help for lookups, but
> each create, rename or delete operation always involves a linear
> sweep of the directory.
> 
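To make the cost concrete: without any index, each create, rename or
delete steps through the directory's blocks one entry at a time,
roughly as in the sketch below.  This is only an illustration of the
linear sweep, not the actual ufs_lookup()/ufs_direnter() code, and the
struct abbreviates the on-disk struct direct from <ufs/ufs/dir.h>.

/*
 * Simplified sketch of the linear sweep: each directory block is
 * walked one variable-length entry at a time until the name (or a
 * big enough free slot) turns up, so the cost grows with the size
 * of the directory.
 */
#include <stdint.h>
#include <string.h>

struct direct_sketch {			/* abbreviated struct direct */
	uint32_t d_ino;			/* inode number, 0 == unused */
	uint16_t d_reclen;		/* length of this record */
	uint8_t	 d_type;		/* file type */
	uint8_t	 d_namlen;		/* length of d_name */
	char	 d_name[255 + 1];	/* name (up to MAXNAMLEN) */
};

/* Return the offset of `name' within one directory block, or -1. */
static int
scan_block(char *blk, int blksize, const char *name, int namelen)
{
	struct direct_sketch *ep;
	int off;

	for (off = 0; off < blksize; off += ep->d_reclen) {
		ep = (struct direct_sketch *)(blk + off);
		if (ep->d_reclen == 0)
			break;		/* corrupted block, give up */
		if (ep->d_ino != 0 && ep->d_namlen == namelen &&
		    memcmp(ep->d_name, name, namelen) == 0)
			return (off);	/* found it */
	}
	return (-1);	/* not here; the caller reads the next block */
}
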
> The idea of this code is to maintain a throw-away in-core data
> structure for large directories, allowing all operations to be
> performed quickly without the need for a linear search. The
> experimental (read 'may trash your system'!) proof-of-concept patch
> is available at:
> 
> 	http://www.maths.tcd.ie/~iedowse/FreeBSD/dirhash.diff
> 
> The implementation uses a hash array that maps filenames to the
> directory offset where the corresponding directory entry exists.
> A simple spillover mechanism is used to deal with hash collisions,
> and some extra summary information permits the quick location of
> free space within the directory itself for create operations.
> 
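For anyone who wants the shape of this without reading the diff yet,
a table along these lines is what I picture.  To be clear, this is my
own sketch and not code from dirhash.diff; the real structure, field
names and sizing in the patch will differ.

/*
 * Illustrative only: map a filename hash to the directory offset of
 * its entry, spill collisions into the next slot, and keep a
 * per-block free-byte summary so a create can go straight to a
 * block with enough room.
 */
#include <stdint.h>

#define DH_NSLOTS	1024	/* power of two; slots start out DH_EMPTY */
#define DH_EMPTY	(-1)

struct dirhash_sketch {
	int32_t	dh_offsets[DH_NSLOTS];	/* slot -> directory offset */
	int	dh_blkfree[128];	/* free bytes per directory block */
	int	dh_nblk;		/* directory blocks covered */
};

static unsigned int
dh_hashname(const char *name, int namelen)
{
	unsigned int h = 5381;		/* simple string hash */

	while (namelen-- > 0)
		h = h * 33 + (unsigned char)*name++;
	return (h);
}

/* Record that `name' lives at directory offset `offset'. */
static void
dh_insert(struct dirhash_sketch *dh, const char *name, int namelen,
    int32_t offset)
{
	unsigned int slot = dh_hashname(name, namelen) & (DH_NSLOTS - 1);

	/* Spill over to the next slot on collision; the sketch assumes
	   the table never fills up (a real one would resize). */
	while (dh->dh_offsets[slot] != DH_EMPTY)
		slot = (slot + 1) & (DH_NSLOTS - 1);
	dh->dh_offsets[slot] = offset;
}

/*
 * Look up `name': walk the probe chain, handing each stored offset to
 * `entry_matches', which reads that directory block and compares the
 * on-disk name.  Only the blocks in the chain get touched, never the
 * whole directory.  (A real version also needs tombstones so deletes
 * don't break the chain; that is omitted here.)
 */
static int32_t
dh_lookup(struct dirhash_sketch *dh, const char *name, int namelen,
    int (*entry_matches)(int32_t offset, const char *name, int namelen))
{
	unsigned int slot = dh_hashname(name, namelen) & (DH_NSLOTS - 1);

	while (dh->dh_offsets[slot] != DH_EMPTY) {
		if (entry_matches(dh->dh_offsets[slot], name, namelen))
			return (dh->dh_offsets[slot]);
		slot = (slot + 1) & (DH_NSLOTS - 1);
	}
	return (-1);		/* definitely not in the directory */
}

/* Find a directory block with at least `needed' free bytes for a create. */
static int
dh_findfreeblk(struct dirhash_sketch *dh, int needed)
{
	int i;

	for (i = 0; i < dh->dh_nblk; i++)
		if (dh->dh_blkfree[i] >= needed)
			return (i);
	return (-1);		/* no space anywhere; grow the directory */
}
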
> The in-core data structures have a memory requirement approximately
> equal to half of the on-disk directory size. Currently there are
> two sysctls that determine when directories get hashed:
> 
>  vfs.ufs.dirhashminsize	Minimum on-disk directory size for which
> 				hashing is used (default 2.5KB).
>  vfs.ufs.dirhashmaxmem	Maximum system-wide amount of memory to
> 				use for directory hashes (default 2MB).
> 
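(Presumably these can be inspected and tuned with sysctl(8) in the
usual way, e.g. "sysctl -w vfs.ufs.dirhashmaxmem=4194304" to allow 4MB
of hash tables, though I haven't checked the exact names against the
patch.)
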
> Even on a relatively slow machine (200MHz P5), I'm seeing a file
> creation speed that remains at around 1000 creations/second for
> directories with more than 100,000 entries. Without this patch, I
> get fewer than 20 creations per second on the same directory (soft
> updates were enabled in both cases).
> 
> To test, apply the patch, and add "options UFS_DIRHASH" to the
> kernel config.
> 
> Currently there are a number of features missing, and there is a
> lot of code for debugging and sanity checking that may affect
> performance. The main issues I'm aware of are:
> - There is no LRU mechanism for directory hash data structures. The
>   hash tables get freed when the in-core inode is recycled, but no
>   attempt is made to free existing memory when the dirhashmaxmem limit
>   is reached.
> - The lookup code does not optimise the case where successive
>   offsets from the hash table are in the same filesystem block.
> 
> Ian 
> 

