From: Mike Meyer
Date: Wed, 28 Mar 2001 20:48:44 -0600
To: Gabriel Ambuehl
Cc: questions@freebsd.org
Subject: Re: List of changed files
Message-ID: <15042.41612.894010.794600@guru.mired.org>

Gabriel Ambuehl types:
> Hello,
> I'm currently working on some sort of filesystem replication for our
> webservers but got the following problem: scanning (i.e. recursively
> going through the whole filesystem and see what files have a new
> modtime) the whole filesystem for files with changed modtime just
> isn't fast enough.

Personally, I think you're trying to fix this problem in the wrong
place. Replication should be built into the release process, not done
as an afterthought. Good source control software will have the logging
information you're looking for, but you shouldn't need that log. After
all, loading current source into a workspace should be a basic
operation. This also means the overhead for all this is happening in
your development environment, not on the production server.

> Given some hundred thousand files on one partition, this takes way too
> long as the system seems to be reading all the data from the disk
> and there appears to be no real caching that would speed up subsequent
> scans (which makes some sense, as the cache would have to be HUGE).

You shouldn't need to look at anything but inodes. My favorite Unix
tools for that - icheck and ncheck - don't seem to be in FreeBSD
anymore. You could roll a custom tool to scan the inodes on each file
system, which would give you the list of updated inodes pretty
quickly. It's not clear that you can turn that into a list of updated
files any faster than you could find the files by a recursive
directory scan.

> So there has to be another possibility to get to the data I
> need (i.e. what files did change and when). Since the kernel is
> responsible for FS, it would make sense to have him log all writes to
> every file on a given filesystem so one could simply parse the logfile
> and act accordingly without having the need to scan the entire
> filesystem. Is there any facility for this sort of info (I thought
> spy:
> http://people.freebsd.org/~abial/spy/ might be an option as I could
> log all calls to open but couldn't get it to compile on 4.3 RC1)?

That would certainly do the job, but I suspect the log files would be
huge - just like the cache - and it would put extra load on your web
server. spy also looks like it's trying to plug into the kernel, but
there don't seem to be instructions on how to go about doing that.

http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
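
For what it's worth, here is a rough sketch of the recursive directory
scan mentioned above, written so that it only touches directory entries
and inode metadata and never opens a file. It is not an ncheck-style raw
inode scan, just an ordinary tree walk that stats each entry. The
function name changed_since and the cutoff argument are made up for
illustration, and it assumes a modification time is all you need to
decide that a file changed.

```python
import os
from typing import Iterator, Tuple


def changed_since(root: str, since: float) -> Iterator[Tuple[str, float]]:
    """Yield (path, mtime) for regular files under root modified after `since`.

    Hypothetical helper: only directory entries and inode metadata are
    read (stat without following symlinks); file contents are never opened.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as it:
                entries = list(it)
        except OSError:
            continue  # unreadable or vanished directory: skip it
        for entry in entries:
            try:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)  # descend later
                elif entry.is_file(follow_symlinks=False):
                    mtime = entry.stat(follow_symlinks=False).st_mtime
                    if mtime > since:
                        yield entry.path, mtime
            except OSError:
                continue  # entry disappeared mid-scan
```

Calling list(changed_since("/some/docroot", time.time() - 3600)) would
give the files touched in the last hour (the path is a placeholder).
Whether this is fast enough on a few hundred thousand files depends on
how many of the inodes are already cached, so treat it as a baseline to
measure against rather than a fix.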