Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 10 Feb 2002 21:09:43 -0800
From:      Kirk McKusick <mckusick@mckusick.com>
To:        Alfred Perlstein <bright@mu.org>
Cc:        Ian Dowse <iedowse@maths.tcd.ie>, fs@FreeBSD.ORG
Subject:   Re: fsck and predictive readahead? 
Message-ID:  <200202110509.g1B59hi06207@beastie.mckusick.com>
In-Reply-To: Your message of "Sat, 22 Dec 2001 15:08:14 CST." <20011222150814.Z48837@elvis.mu.org> 

next in thread | previous in thread | raw e-mail | index | archive | help
	Date: Sat, 22 Dec 2001 15:08:14 -0600
	From: Alfred Perlstein <bright@mu.org>
	To: Ian Dowse <iedowse@maths.tcd.ie>
	Cc: mckusick@freebsd.org, fs@freebsd.org
	Subject: Re: fsck and predictive readahead?

	* Ian Dowse <iedowse@maths.tcd.ie> [011222 07:53] wrote:
	> In message <20011222053306.Y48837@elvis.mu.org>, Alfred Perlstein writes:
	> >I'm wondering if fsck uses any sort of tricks to do read-ahead
	> >to prefect data for pass1 and pass2.
	> >
	> >If not does anyone thing it might speed things up?
	> 
	> I've wondered about this also. Since fsck spends virtually all of
	> its time waiting for disk reads, doing most kinds of speculative
	> disk reads would only slow things down. However, there is some
	> potential for re-ordering the reads to reduce seeking and to allow
	> data to be read in larger chunks.
	> 
	> Pass 1 involves quite a lot of disk seeking because it goes off and
	> retrieves all indirection blocks (blocks of block numbers) for any
	> inodes that have them. Otherwise pass 1 would be a simple linear
	> scan through all inodes. It would be possible to defer the reading
	> of indirection blocks and then read them in order (having 2nd- and
	> 3rd-level indirection blocks complicates this). I think I tried a
	> simple form of this a few years ago, but the speedup was only
	> marginal. I believe I also tried changing fsck's bread() to read
	> larger blocks when contiguous reads were detected, again with no
	> significant improvements.
	> 
	> For pass 2, the directories are sorted by the block number of their
	> first block, so there is very little seeking. Some speed improvement
	> might be possible by doing a larger read when a few directory blocks
	> are close together on the disk.
	> 
	> An interesting exercise would be to modify fsck to print out a list
	> of the offset and length for every disk read it performs. Then sort
	> that list, coalesce contiguous reads, and see how long it takes the
	> disk to read the new list as compared to the original. Such perfect
	> sorting is obviously not feasable in practice, but it would give
	> some idea of the potential for improvements.

	The problem you didn't address with all these changes was stalls
	due to disk IO.

	/usr/src/sbin/fsck_ffs # time ./fsck_ffs -d -n /vol/spare 
	** /dev/ad0s1g (NO WRITE)
	** Last Mounted on /vol/spare
	** Phase 1 - Check Blocks and Sizes
	** Phase 2 - Check Pathnames
	** Phase 3 - Check Connectivity
	** Phase 4 - Check Reference Counts
	** Phase 5 - Check Cyl groups
	303580 files, 12865338 used, 9183427 free (51827 frags, 1141450 blocks, 0.2% fragmentation)
	./fsck_ffs -d -n /vol/spare  24.50s user 4.72s system 19% cpu 2:30.73 total

	No matter how you order the IO, fsck is going to have to
	wait for read(2) to return.  If we can offload that waiting
	to a child process we may be able to fix this.

	Is there any detailed commenting on the sources available,
	they are quite readable, but still very terse.  A more in
	depth explanation of each function would really help.  Do
	you know of a paper, manpage or do you have the time to
	sprinkle some commentary into the code?

	-- 
	-Alfred Perlstein [alfred@freebsd.org]
	'Instead of asking why a piece of software is using "1970s technology,"
	 start asking why software is ignoring 30 years of accumulated wisdom.'
				   http://www.morons.org/rants/gpl-harmful.php3

There was apaper in the Usenix general conference in the early 1990's
which talked about several ways of speeding up fsck. I incorporated
most of those ideas into fsck at that time. The two most significant
were reading the inodes in big chunks, and collecting all the directory
block numbers and sorting them as Ian describes above. More recently
with the advent of soft updates, a further optimization to fsck is to
read only those inodes that are marked as allocated in the bit maps.

	Kirk McKusick

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200202110509.g1B59hi06207>