Date:      Sun, 2 Jun 1996 14:21:53 -0700 (PDT)
From:      Jake Hamby <jehamby@lightside.com>
To:        Terry Lambert <terry@lambert.org>
Cc:        Bruce Evans <bde@zeta.org.au>, dufault@hda, hackers@freebsd.org
Subject:   Re: Breaking ffs - speed enhancement?
Message-ID:  <Pine.AUX.3.91.960602135428.20289A-100000@covina.lightside.com>
In-Reply-To: <199606012308.QAA22341@phaeton.artisoft.com>

While looking through my copy of UNIX Internals (excellent book, BTW, see
below for ordering info), I found some interesting alternative FS
approaches, which seemed applicable here.  I am referring to the various
log-structured and journalling filesystems, including BSD-LFS which is
included in FreeBSD but is currently broken because of changes to the VM 
system.

BSD-LFS implements the entire filesystem (file data and metadata) as an
append-only log, and relies on a cleaner daemon (lfs_cleanerd) to
garbage-collect outdated entries.  Although read/write performance is
probably not any faster than our enhanced, extent-based FFS, LFS should
have a great performance advantage for metadata updates and for crash
recovery (no fsck).  On the downside, it is very complex to implement,
requires all new LFS-aware utilities, is slowed down by the cleaner
daemon, and requires a large buffer cache (therefore lots of RAM) in order
to achieve reasonable read performance.  Fortunately, most of the
implementation has already been done in 4.4BSD; we would only need to
integrate it into FreeBSD's VM system and fix any implementation bugs
(probably quite a few, since LFS is not in common use).  As for RAM 
usage, the original intention was use on NNTP servers, which typically 
have gobs of RAM, so this shouldn't be a problem.
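
To make the append-only idea concrete, here is a toy sketch in Python.
It is purely illustrative: the ToyLog class and its method names are my
own invention and have nothing to do with the real BSD-LFS code or its
on-disk format.  Every write appends a record, an in-memory index
remembers which record is live, and a cleaner pass copies live records
forward and drops the outdated ones.

```python
class ToyLog:
    """Append-only log: a write never overwrites, it appends a new record."""

    def __init__(self):
        self.log = []      # (key, value) records in write order
        self.index = {}    # key -> position of the live record in self.log

    def write(self, key, value):
        # The newest record for a key supersedes any older one.
        self.index[key] = len(self.log)
        self.log.append((key, value))

    def read(self, key):
        return self.log[self.index[key]][1]

    def clean(self):
        """Cleaner pass: keep live records, reclaim space held by dead ones."""
        live = [(k, v) for pos, (k, v) in enumerate(self.log)
                if self.index.get(k) == pos]
        reclaimed = len(self.log) - len(live)
        self.log = live
        self.index = {k: pos for pos, (k, _) in enumerate(live)}
        return reclaimed


fs = ToyLog()
fs.write("inode 7", "size=100")
fs.write("inode 9", "size=5")
fs.write("inode 7", "size=200")   # supersedes the first record
reclaimed = fs.clean()            # one dead record is garbage-collected
```

Even in this toy the cleaner has to walk the whole log, which hints at
why the real cleaner daemon costs performance.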

Another, perhaps simpler, solution would be journalled metadata updates, 
as in Cedar or Sun's DiskSuite.  This combines the FFS we already have 
with journalled metadata updates that guarantee consistency in the event 
of a crash, while allowing the actual updates to be cached longer and 
written optimally.  In addition, Sun's implementation provides other nice 
features like software RAID and the ability to add a new disk and "grow" 
an existing filesystem onto the new partition without taking the old 
partition offline.  Of course we can't hope for all the features of a 
commercial product like that, but the journalled metadata updating is 
probably a simpler solution than bringing LFS up to date, and has the 
benefit that an existing FFS partition can be upgraded quite easily (the 
log can be simply a fixed-size file near the center of the partition).
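
As I understand the journalling approach, it boils down to something
like the following toy sketch in Python.  The JournalledMeta class, its
method names, and the begin/commit record layout are assumptions of
mine for illustration, not how Cedar or DiskSuite actually lay out
their logs.  Each group of metadata changes is written to the journal
and committed before the in-place update is allowed to linger in the
cache; after a crash, committed groups are replayed and an uncommitted
tail is thrown away.

```python
class JournalledMeta:
    def __init__(self):
        self.journal = []   # stand-in for the fixed-size on-disk log
        self.disk = {}      # stand-in for metadata written in place
        self.cache = {}     # dirty metadata not yet written in place

    def update(self, txn):
        """txn: dict of metadata changes that must apply all-or-nothing."""
        self.journal.append(("begin", dict(txn)))
        self.journal.append(("commit", None))   # durable from this point
        self.cache.update(txn)                  # in-place write can wait

    def flush(self):
        """Write cached metadata in place, then the journal can be reused."""
        self.disk.update(self.cache)
        self.cache.clear()
        self.journal.clear()

    def recover(self):
        """After a crash: replay committed groups, drop an uncommitted one."""
        self.cache.clear()                      # memory is lost in a crash
        pending = None
        for kind, payload in self.journal:
            if kind == "begin":
                pending = payload
            elif kind == "commit" and pending is not None:
                self.disk.update(pending)
                pending = None
        self.journal.clear()


fs = JournalledMeta()
fs.update({"dir /tmp": "+entry foo", "inode 12": "links=1"})
# Simulate a crash halfway through logging a second group (no commit).
fs.journal.append(("begin", {"inode 13": "links=1"}))
fs.recover()
```

Recovery only has to scan the journal, not the whole partition, which
is where the fsck savings would come from.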

Of course either solution will be a lot more difficult to implement
initially, but should ultimately provide both faster and safer metadata 
updates, while simultaneously eliminating fsck delays during crash 
recovery!  At least that's how I read things.  I'm still a newbie kernel 
hacker, after all...  Comments?

---Jake

Unix Internals:  The New Frontier
Author: Uresh Vahalia
Publisher: Prentice Hall
ISBN: 0-13-101908-2

Great book, covers SVR4, 4.4BSD, Solaris, SunOS, Mach, Digital Unix, and 
others.  Great for comparing the various approaches, or if you need to 
support both BSD and SVR4 (as I do).


