Date:      Tue, 07 Jun 2005 21:19:52 -0400
From:      Richard Coleman <rcoleman@criticalmagic.com>
To:        Scott Long <scottl@samsco.org>
Cc:        Pawel Jakub Dawidek <pjd@FreeBSD.org>, scottl@FreeBSD.org, Ivan Voras <ivoras@fer.hr>, David Malone <dwmalone@maths.tcd.ie>, hackers@FreeBSD.org, phk@FreeBSD.org
Subject:   Re: Google SoC idea
Message-ID:  <42A647B8.30709@criticalmagic.com>
In-Reply-To: <42A6091C.40409@samsco.org>
References:  <42A475AB.6020808@fer.hr>	<20050607194005.GG837@darkness.comp.waw.pl>	<20050607201642.GA58346@walton.maths.tcd.ie> <42A6091C.40409@samsco.org>

Scott Long wrote:
> /me jumps up and down and waves his hands
> 
> The problem with journalling at the block layer is that you pretty much 
> become forced to journal metadata and data, since the block layer really 
> doesn't know the distinction, and definitely not in a 
> filesystem-independent way (yes, UFS does evil things to the buffer 
> cache by representing metadata with negative block numbers, but that is 
> just UFS).  Full journalling has many drawbacks from the viewpoint of 
> speed and complexity, of course.  So you really want to be able to do 
> just metadata journalling.
> 
> Another hard part of distinguishing between metadata and data is that 
> filesystems have a habit of migrating disk blocks from holding metadata 
> to holding data, and vice versa (think indirect pointer blocks, not 
> inode blocks).  If you are only replaying metadata, you want to make 
> sure that you don't smash data blocks with old metadata.
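
To make the replay hazard above concrete, here is a minimal sketch in Python. It assumes a hypothetical journal format in which the filesystem logs a "revoke" record whenever a block stops holding metadata. The names Write, Revoke, and replay are invented for illustration and are not taken from UFS, GEOM, or any existing journalling code.

```python
from dataclasses import dataclass


@dataclass
class Write:
    seq: int        # journal sequence number (hypothetical format)
    block: int      # disk block number
    payload: bytes  # metadata image to restore on replay


@dataclass
class Revoke:
    seq: int    # point at which the block stopped holding metadata
    block: int


def replay(journal, disk):
    """Replay metadata writes, skipping any superseded by a later revoke."""
    # First pass: record, per block, the latest sequence number at which
    # it was revoked.
    revoked_at = {}
    for rec in journal:
        if isinstance(rec, Revoke):
            revoked_at[rec.block] = max(rec.seq, revoked_at.get(rec.block, -1))
    # Second pass: apply only writes logged after the block's last revoke.
    for rec in journal:
        if isinstance(rec, Write) and rec.seq > revoked_at.get(rec.block, -1):
            disk[rec.block] = rec.payload
    return disk


# Block 7 used to be an indirect block, then was freed and reused for file data.
disk = {7: b"user file data", 9: b"stale inode block"}
journal = [
    Write(seq=1, block=7, payload=b"old indirect block"),
    Revoke(seq=2, block=7),
    Write(seq=3, block=9, payload=b"new inode block"),
]
replay(journal, disk)
```

Without the revoke record, replaying the first entry would overwrite the file data now living in block 7. The point of the sketch is that only the filesystem knows when to emit that record, which is why the block layer cannot do this on its own.
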
> 
> Coming up with a filesystem independent way to represent all of this for 
> the block layer is not easy.  Filesystems would have to be able to be 
> modified to provide proper metadata vs. data hints to the block layer. 
> And if you're going to do that, then why not just make it a library in 
> VFS, like what Darwin does?
> 
> The UFS Journalling work is already well underway, and I expect it to 
> follow the path of being a VFS library.  Note that I'm saying 'library' 
> here, not 'layer'.  There really is no way to make journalling work with 
> an arbitrary filesystem 'for free', whether as a VFS layer or a GEOM 
> transform, since journalling is 100% dependent on the filesystem working 
> with the buffer-cache to do sane operations in a defined order.
> 
> An alternate SoC project that would be very useful is block-level 
> snapshots.  I'm not sure if I'll be able to retain the filesystem 
> snapshot functionality in UFS with journalling enabled, so moving to 
> doing the snapshots in the block layer would be a good way to make up 
> for this.  Beware that while the GEOM transform would be pretty 
> straightforward to write, the real trick comes from being able to make 
> the consumer of a block device (a filesystem, maybe) flush itself to a 
> consistent state while the snapshot is being taken.  The infrastructure 
> for this is the part that is very interesting, but also the most work.
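
A rough sketch of the block-level snapshot idea, assuming a simple copy-on-write scheme. SnapshotDevice and the flush_consumer callback are hypothetical names used only for illustration; this is not the GEOM API, and a real transform would work on bio requests rather than Python lists.

```python
class SnapshotDevice:
    """Toy block device with a single copy-on-write snapshot."""

    def __init__(self, nblocks, blocksize=4):
        self.blocks = [bytes(blocksize) for _ in range(nblocks)]
        self.saved = None  # block number -> contents at snapshot time

    def take_snapshot(self, flush_consumer):
        # The hard part described above: the consumer (e.g. a filesystem)
        # must first flush itself to a consistent on-disk state.
        flush_consumer()
        self.saved = {}

    def write(self, blkno, data):
        # Preserve the original contents the first time a block changes
        # after the snapshot was taken.
        if self.saved is not None and blkno not in self.saved:
            self.saved[blkno] = self.blocks[blkno]
        self.blocks[blkno] = data

    def read(self, blkno):
        return self.blocks[blkno]

    def read_snapshot(self, blkno):
        if self.saved is None:
            raise RuntimeError("no snapshot taken")
        # Unmodified blocks are shared with the live device.
        return self.saved.get(blkno, self.blocks[blkno])


dev = SnapshotDevice(nblocks=4)
dev.write(0, b"AAAA")
flushed = []
dev.take_snapshot(lambda: flushed.append(True))
dev.write(0, b"BBBB")
dev.write(1, b"CCCC")
```

The copy-on-write bookkeeping is the easy part. Everything interesting hides behind flush_consumer, which here is just a stub standing in for the infrastructure that would quiesce the filesystem.
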
> 
> Scott

Scott,

Have you looked at the journaling layer that Matt has been adding to 
DragonFly BSD?  What you are talking about appears very similar.  Or am I 
misunderstanding something?

Richard Coleman
rcoleman@criticalmagic.com


