Date:      Wed, 28 Mar 2001 20:48:44 -0600
From:      Mike Meyer <mwm@mired.org>
To:        Gabriel Ambuehl <gabriel_ambuehl@buz.ch>
Cc:        questions@freebsd.org
Subject:   Re: List of changed files
Message-ID:  <15042.41612.894010.794600@guru.mired.org>
In-Reply-To: <72115728@toto.iv>

Gabriel Ambuehl <gabriel_ambuehl@buz.ch> types:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hello,
> I'm currently working on some sort of filesystem replication for our
> webservers but got the following problem: scanning (i.e. recursively
> going through the whole filesystem and see what files have a new
> modtime) the whole filesystem for files with changed modtime just
> isn't fast enough.

Personally, I think you're trying to fix this problem at the wrong
place. Replication should be built into the release process, not done
as an afterthought. Good source control software will have the logging
information you're looking for, but you shouldn't need that log. After
all, loading current source into a workspace should be a basic
operation.  This also means the overhead for all this is happening in
your development environment, not on the production server.

> Given some hundred thousand files on one partition, this takes way too
> long as the system seems to be reading all the data from the disk
> and there appears to be no real caching that would speed up subsequent
> scans (which makes some sense, as the cache would have to be HUGE).

You shouldn't need to look at anything but inodes. My favorite Unix
tools for that - icheck and ncheck - don't seem to be in FreeBSD
anymore. You could roll a custom tool to scan the inodes on each file
system, which would give you the list of updated inodes pretty
quickly. It's not clear that you can turn that into a list of updated
files any faster than you could find the files by a recursive
directory scan.
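
For comparison, here is a minimal sketch of the recursive directory scan
in Python, assuming a Python interpreter is available on the machine. It
looks only at directory entries and inode metadata (via stat) and never
opens a file's contents. The function name changed_since and the
one-day cutoff are made up for illustration; this is not an existing
tool.

```python
import os
import time


def changed_since(root, cutoff):
    """Yield (path, mtime) for regular files under root modified after cutoff.

    cutoff is a Unix timestamp in seconds. Symbolic links are not followed.
    """
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            entries = os.scandir(directory)
        except OSError:
            continue  # unreadable or vanished directory; skip it
        with entries:
            for entry in entries:
                try:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        # lstat-style call: reads inode metadata only
                        mtime = entry.stat(follow_symlinks=False).st_mtime
                        if mtime > cutoff:
                            yield entry.path, mtime
                except OSError:
                    continue  # entry disappeared mid-scan


if __name__ == "__main__":
    # Hypothetical usage: list files changed in the last 24 hours.
    one_day_ago = time.time() - 24 * 60 * 60
    for path, mtime in changed_since(".", one_day_ago):
        print(f"{time.ctime(mtime)}  {path}")
```

A scan like this still has to touch every directory and every inode, so
it shows what the work is rather than making it cheaper.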

> So there has to be another possibility to get to the data I
> need (i.e. what files did change and when). Since the kernel is
> responsible for FS, it would make sense to have him log all writes to
> every file on a given filesystem so one could simply parse the logfile
> and act accordingly without having the need to scan the entire
> filesystem. Is there any facility for this sort of info (I thought
> spy:
> http://people.freebsd.org/~abial/spy/ might be an option as I could
> log all calls to open but couldn't get it to compile on 4.3 RC1)?

That would certainly do the job, but I suspect the log files would be
huge - just like the cache - and it would put extra load on your web
server. Spy also looks like it's trying to plug into the kernel, but
there don't seem to be instructions on how to go about doing that.

	<mike
--
Mike Meyer <mwm@mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
