Date: Wed, 19 Nov 2008 14:29:24 +0100
From: "Nick Barkas" <nick.barkas@gmail.com>
To: dan-freebsd-fs@ourbrains.org
Cc: freebsd-fs@freebsd.org
Subject: Re: (no subject)
Message-ID: <cd41f5860811190529y365f876bn4613d77c4164597d@mail.gmail.com>
In-Reply-To: <20081119052428.GC4136@ourbrains.org>
References: <20081119052428.GC4136@ourbrains.org>

On Wed, Nov 19, 2008 at 06:24, Dan <dan-freebsd-fs@ourbrains.org> wrote:
> A recent question came up about huge numbers of files in one directory.
> Well, some people actually have to deal with it on the job:
>
> http://leaf.dragonflybsd.org/mailarchive/kernel/2008-11/msg00070.html
>
> An FS doesn't have to be designed such that file look-ups take a very
> long time to search when directories are large. When a nice hash is used
> as part of the FS design, the time to search for 1 in a 100 files or 2
> billion is the same. I view it as a feature. I can imagine a few cases
> where a large, non-human-readable directory is used to store many files.
> When developers know they have this feature at hand, they might as well
> use it. FS-based databases, image/sound editing, etc.

I'm not sure if this is what you're looking for, but FreeBSD does have
some provisions to avoid too much performance degradation with large
directories. The VFS name cache speeds up look-ups of specific files
that are searched for repeatedly, in a directory of any size, and it is
filesystem independent.

Specific to UFS2 there is dirhash, which was implemented by Ian Dowse
and David Malone. It speeds up more types of operations involving large
directories. They wrote a paper about it, which you can find here:

http://www.usenix.org/events/usenix02/tech/freenix/dowse.html

More recently I've done a little bit of work on dirhash as well that
might further speed things up. It's not committed to SVN yet, but is in
Perforce. I sent out patches to this list a little while back but have
not received any reports from testers. My patches might need to be
updated to apply to the latest -CURRENT, and I'll try to update the wiki
page (http://wiki.freebsd.org/DirhashDynamicMemory) if I find out that
is the case.

I am hoping to find the time in the next few months to start working on
on-disk directory indexing for UFS2 so that linear searching through
directory entries is never necessary.
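
To make the difference between a linear directory scan and a hashed
look-up concrete, here is a toy Python sketch. It is not dirhash's
actual code (dirhash is C code inside the kernel), and the function
names and the list-of-tuples model of a directory are made up purely
for illustration.

```python
# Toy model only: a "directory" is a list of (name, inode) entries.
# None of these names come from the FreeBSD sources.

def linear_lookup(entries, name):
    """Scan entries in order; return (inode, entries examined)."""
    for steps, (entry_name, inode) in enumerate(entries, start=1):
        if entry_name == name:
            return inode, steps
    return None, len(entries)


def build_index(entries):
    """Build an in-memory hash index from name to inode, once per directory."""
    return {entry_name: inode for entry_name, inode in entries}


def hashed_lookup(index, name):
    """Look the name up in the index without scanning the directory."""
    return index.get(name)


# A directory with 100,000 entries; the file we want is the last one.
entries = [("file%06d" % i, 1000 + i) for i in range(100000)]
index = build_index(entries)

inode, steps = linear_lookup(entries, "file099999")
print("linear:", inode, "after examining", steps, "entries")
print("hashed:", hashed_lookup(index, "file099999"))
```

The linear search has to examine every entry before it reaches the last
file, while the hashed look-up cost does not grow with the directory
size. Building the index still takes one full pass over the directory,
which is roughly the trade-off an in-memory hash makes compared with an
on-disk index.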

You are correct in that filesystems don't have to be designed such that
searches are slow for large directories, but UFS was designed quite a
long time ago. It is not trivial to change disk formats for directories
now, especially given that we want to remain backwards compatible and be
able to work properly with softupdates. I hope I can help make it happen
though :)

Nick