Date:      Sat, 16 Aug 1997 12:20:48 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        karpen@ocean.campus.luth.se (Mikael Karpberg)
Cc:        dg@root.com, hackers@FreeBSD.ORG
Subject:   Re: More info on slow "rm" times with 2.2.1+.
Message-ID:  <199708161920.MAA04387@phaeton.artisoft.com>
In-Reply-To: <199708161229.OAA01231@ocean.campus.luth.se> from "Mikael Karpberg" at Aug 16, 97 02:29:10 pm

> >    How many files are in the directory isn't important. What is important is
> > the size of the directory. You can have a 20MB directory and yet have only a
> > 100 files in it. There is code to free up unused space in directories, but
> > it only works if the free space is at the end. If the directory is large,
> > then it will take a large amount of time to search through it.
> 
> And this cuz it's slow? Or? Isn't there a command (which could be run in
> daily, or weekly, or something) that goes through a directory (or many) and
> optimize the space they take?
> 
> If there isn't... why? And would it be hard to write?
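
[The size-vs-count distinction above is visible from userland: stat()
reports the directory file's on-disk extent, and that extent does not
shrink with the live entry count.  A minimal sketch; the dir_size()
helper name is just for illustration:]

```c
#include <sys/stat.h>

/* A directory's on-disk size, in bytes.  This is the extent of the
 * directory file itself, not a count of live entries: after a mass
 * unlink the size stays put unless the freed space happened to be at
 * the end of the file. */
long long dir_size(const char *path)
{
	struct stat sb;

	if (stat(path, &sb) != 0)
		return -1;
	return (long long)sb.st_size;
}
```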

This particular optimization is not possible "in-band" because you
can't reorder the directory entries while someone has the directory
open without invalidating the offset for their next "getdents()".
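
[A userland illustration of that contract, using the portable
readdir()/telldir()/seekdir() interface rather than raw getdents();
the function name is just for the sketch.  A cookie saved mid-scan
must land a later seekdir() back on the very next entry, which is
exactly what in-place compaction would break:]

```c
#include <dirent.h>
#include <string.h>

/* telldir() hands userland a position cookie, and seekdir() must land
 * the scan back on the very next entry.  If the kernel compacted
 * (reordered) entries in a directory someone holds open, these saved
 * cookies would point at garbage.  Returns 1 if the resumed scan sees
 * the same entry again, 0 otherwise. */
int resume_scan_works(const char *dir)
{
	DIR *dp = opendir(dir);
	struct dirent *de;
	long cookie;
	char first[256], resumed[256];

	if (dp == NULL)
		return 0;
	cookie = telldir(dp);		/* position before the first entry */
	de = readdir(dp);
	if (de == NULL) { closedir(dp); return 0; }
	strncpy(first, de->d_name, sizeof(first) - 1);
	first[sizeof(first) - 1] = '\0';

	readdir(dp);			/* wander further into the scan */
	seekdir(dp, cookie);		/* jump back via the saved cookie */
	de = readdir(dp);
	if (de == NULL) { closedir(dp); return 0; }
	strncpy(resumed, de->d_name, sizeof(resumed) - 1);
	resumed[sizeof(resumed) - 1] = '\0';
	closedir(dp);

	return strcmp(first, resumed) == 0;	/* same entry both times */
}
```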

The closest you can get with this approach is a sparse directory
file (do not get me wrong; this is a not insignificant win).  But
even so, you will probably not be in the area of the previous
version's performance, unless you are right on the cusp of directory entry
pages being LRU'ed on you.  And if you were, you could speed it up
much more easily by adding RAM (best) or swap (good) to extend the
LRU period so that the directory entry traversal did not force
pages out.

Of course, doing that, you aren't going to get a real win: you are
just putting off the problem for a future recurrence when your
number of entries goes up, yet again.  Throwing hardware at a
problem is a piss-poor way to optimize.

One *very* nice possibility would be to separate, completely,
the directory and file entry operations (the VFS abstraction
fails to do this in a number of circumstances right now, and
namei() and the directory cache being per FS instead of in
the common VFS layer are in the middle of where the blame
should fall).  If you did this, you could provide a directory
entry function "iterator" operation.  If you had one of these,
you could add a system call to call the iterator with a delete
function (yes, the function would have globbing in the kernel)
and delete everything matching the criteria in a single, linear
pass of the directory, without kernel/user transitions.  Yet
another VFS layering issue, I'm afraid.  8-(.
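
[What that syscall would buy can be approximated in userland; the
helper below is only a sketch of the semantics (one linear pass,
glob-delete), not the proposed interface.  The real win would come
from running the same loop inside the kernel, with the globbing done
there and no kernel/user transition per entry:]

```c
#include <dirent.h>
#include <fnmatch.h>
#include <unistd.h>

/* Userland approximation of the proposed operation: one linear pass
 * over the directory, unlinking every entry whose name matches the
 * glob pattern.  Returns the number of entries removed, or -1 if the
 * directory can't be opened. */
int unlink_matching(const char *dir, const char *pattern)
{
	DIR *dp = opendir(dir);
	struct dirent *de;
	int removed = 0;

	if (dp == NULL)
		return -1;
	while ((de = readdir(dp)) != NULL) {
		if (fnmatch(pattern, de->d_name, 0) == 0) {
			/* unlinkat() keeps the pass relative to the
			 * already-open directory. */
			if (unlinkat(dirfd(dp), de->d_name, 0) == 0)
				removed++;
		}
	}
	closedir(dp);
	return removed;
}
```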


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


