Date:      Mon, 12 Feb 2001 19:28:06 -0500 (EST)
From:      Zhiui Zhang <zzhang@cs.binghamton.edu>
To:        Terry Lambert <tlambert@primenet.com>
Cc:        Russell Cattelan <cattelan@thebarn.com>, freebsd-fs@FreeBSD.ORG
Subject:   Re: Design a journalled file system
Message-ID:  <Pine.SOL.4.21.0102121917080.7164-100000@onyx>
In-Reply-To: <200102122306.QAA11325@usr08.primenet.com>



On Mon, 12 Feb 2001, Terry Lambert wrote:

> > It seems to me that I have failed to explain my point again. So an example
> > may help. Suppose I have a bitmap block buffer.  One transaction allocates
> > some blocks from it, the other transaction frees some blocks into it. If
> > the bitmap block buffer is not locked for the duration of a transaction,
> > then it could contain modifications made by both transactions. The atomicity
> > is violated unless you can make the two transactions merge into one later.
> > On the other hand, if it is locked for a transaction and that transaction
> > blocks for some other I/O, then performance will suffer (no one can use
> > the bitmap block buffer for a while).
> 
> Russell is right, for XFS, and for most Journalled FS's, where the
> validity marking on the journal entry (as being the most recent)
> is the most important thing.  All transactions are written as if
> by way of a write-through cache of the modification data.
> 
> In other words, in his world, there's no such conflict between
> concurrent operations.

I think I have a better understanding this time. Each transaction's log entry
only logs changes *made by itself*, using logical logging instead of
physical logging.  (In physical logging, the entire bitmap block would be
logged, potentially including modifications made by others.) From time to
time, the filesystem forces a sync operation that writes the metadata
in place to free log space.  All transactions must be finished to reach
such a sync point. IBM JFS seems to do logging this way.
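
To make the difference concrete, here is a small sketch of the bitmap case
from my example. It is only a toy model: the 16-bit block size, the record
layout, and the names physical_record, logical_record, and replay are my own
assumptions, not taken from XFS or JFS.

```python
BLOCK_BITS = 16  # hypothetical, tiny bitmap block for illustration


def physical_record(bitmap):
    """Physical logging: snapshot the whole block image."""
    return list(bitmap)


def logical_record(txn_id, op, block_no):
    """Logical logging: record only this transaction's own operation."""
    return (txn_id, op, block_no)


def replay(bitmap, records):
    """Apply logical records to a bitmap image and return the new image."""
    out = list(bitmap)
    for _txn, op, block_no in records:
        out[block_no] = 1 if op == "alloc" else 0
    return out


# Bitmap as last written in place: block 3 is allocated.
on_disk = [0] * BLOCK_BITS
on_disk[3] = 1
in_core = list(on_disk)

# T1 allocates block 5 and T2 frees block 3, both in the same buffer.
in_core[5] = 1
t1_log = [logical_record("T1", "alloc", 5)]
in_core[3] = 0
t2_log = [logical_record("T2", "free", 3)]

# If T1 commits now, a physical record of the buffer also carries T2's free.
t1_physical = physical_record(in_core)

# Replaying only T1's logical record leaves T2's change out.
t1_only = replay(on_disk, t1_log)
```

With physical logging, T1's record shows block 3 as free even though T2 has
not committed. With logical logging, each record can be replayed on its own,
and replaying both reproduces the in-core buffer.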

-Zhihui


> Per a previous post, Soft Updates is all about "unless you can
> make the two transactions merge into one later".
> 
> Specifically, if you have a disk block, it's 512 bytes.  An inode on
> disk is 128 bytes.  This means 4 inodes per block.
> 
> Similarly, a directory entry block is 512 bytes.  A given block will
> contain between 1 and 16 directory entries, each of which may
> be in the process of being manipulated.
> 
> And so on.
> 
> Soft Updates keeps a list of modifications to conflicted blocks,
> in core, and actually makes a copy of the conflicted block, and
> backs out transaction state, when committing partial transactions.
> It does this by maintaining a state conflict domain dependency
> list (which is why Soft Updates are sometimes called Soft
> Dependencies instead).
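
As I read the paragraph above, the write path copies the block and backs out
whatever is not yet safe to commit. Below is a rough sketch of that idea only.
The Change class, the "ready" flag, and image_to_write are hypothetical names,
and the model assumes at most one pending change per inode slot; the real Soft
Updates dependency tracking is far more involved.

```python
INODES_PER_BLOCK = 512 // 128  # 4 inodes per block, per the sizes quoted above


class Change:
    """One pending modification to a single inode slot (hypothetical model)."""

    def __init__(self, slot, old, new, ready):
        self.slot = slot
        self.old = old      # value to restore when backing the change out
        self.new = new      # value already applied to the in-core block
        self.ready = ready  # True once its dependencies are satisfied


def image_to_write(in_core_block, pending):
    """Copy the block and back out every change that is not yet safe to write."""
    image = list(in_core_block)
    for ch in pending:
        if not ch.ready:
            image[ch.slot] = ch.old
    return image


# Four inode slots sharing one disk block.
in_core = ["a0", "b0", "c0", "d0"]
pending = [
    Change(0, "a0", "a1", ready=True),   # safe to commit
    Change(2, "c0", "c1", ready=False),  # still waiting on another block
]
for ch in pending:
    in_core[ch.slot] = ch.new

# The copy goes to disk; the in-core block keeps both changes.
disk_image = image_to_write(in_core, pending)
```

The point of the copy is that nobody has to wait: the in-core block keeps
slot 2's new value while the disk only sees state that is consistent on its
own.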
> 
> 
> Practically, for a design, you can generally reduce the domains
> of conflict by increasing your object sizes to 512 bytes.  This lets
> you have things like ACL and immediate file support in inodes,
> which you can then bill as a feature.
> 
> For the directory entry blocks, the conflict is already somewhat
> mitigated by the fact that, for anyone iterating the directory, you
> make a copy of the block: what is iterated is a snapshot, not the
> actual directory contents.  The NFS "cookie" code
> for iteration restart is really a kludge; it could have just as
> easily worked around the difference between on-disk, wire, and
> user space directory entries within a given block by separating
> the code into "copy FS-sided unit into snapshot" and "copy data
> from snapshot into representation buffer" VOPs (I've suggested
> this many times, and provided the code several times).
> 
> The bottom line is that bitmaps only matter if you implement
> using bitmaps.  For inescapable conflicts (like the "last
> modified" or "time of last update" in superblock data, which
> you must have for recovery following a crash), the easiest method
> to work around the problem is to log superblocks as well, and
> then iterate to the "most recent valid" one during recovery.
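
The "iterate to the most recent valid" step might look roughly like the sketch
below. The record layout (an 8-byte sequence number, a payload, and a CRC32
trailer) and the function names are assumptions made for illustration; the
paragraph above does not say how validity or recency is actually marked.

```python
import zlib


def make_record(seq, payload):
    """Build a logged superblock copy: sequence number, payload, CRC32 trailer."""
    body = seq.to_bytes(8, "big") + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


def parse_record(raw):
    """Return (seq, payload) if the checksum matches, else None."""
    if len(raw) < 12:
        return None
    body, crc = raw[:-4], int.from_bytes(raw[-4:], "big")
    if zlib.crc32(body) != crc:
        return None
    return int.from_bytes(body[:8], "big"), body[8:]


def most_recent_valid(records):
    """Scan every logged copy and keep the valid one with the highest sequence."""
    best = None
    for raw in records:
        parsed = parse_record(raw)
        if parsed is not None and (best is None or parsed[0] > best[0]):
            best = parsed
    return best


log = [
    make_record(1, b"mtime=100"),
    make_record(2, b"mtime=200"),
    make_record(3, b"mtime=300"),
]

# Simulate a crash that tore the newest copy: corrupt its last byte.
torn = bytearray(log[2])
torn[-1] ^= 0xFF
log[2] = bytes(torn)

recovered = most_recent_valid(log)
```

Here the torn third copy fails its checksum, so recovery falls back to the
second one instead of stalling on a single in-place superblock.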
> 
> Ideally, you probably _do_ want to incorporate Soft Updates
> technology, since it lets you avoid artificial stalls when you
> enter into an unavoidable conflict (XFS stalls and drains at
> those points), but it's not immediately necessary (just don't
> design against it as a future optimization).
> 
> I really, really urge FS designers to go back to first principles
> when examining problems, and to consider FSs as transactions to
> be applied to persistent state data as a result of events.  If
> you do that, then protecting the integrity of the persistent
> state becomes obvious and easy.
> 
> 
> Actually, this really brings home the license point for XFS,
> since it should be obvious that it could benefit from soft
> updates, which it won't get without paying something (like
> access to its sources in a useful fashion for the BSD community).
> 
> Yes, I'm still looking for a commercial license that prohibits
> making XFS a stand-alone product, but still allows it to be used
> in a commercial setting.  The Sun License on the original SLPv1
> comes close, but fails to grant in perpetuity.  It may be that SGI's
> lawyer will have to do some lawyering to work out one that satisfies them.
> 
> Hopefully SGI will learn the HP JetSend and the Sun JINI and the
> Net/1 & Net/2 TCP/IP lesson: if you want something to be standard,
> you can't control it, and if you control it, it won't be standard.
> 
> Note: my March 1st offer stands.  I have yet to hear how to get
> the unencumbered (SGI-only) GPL code... the clock's ticking.
> 
> 
> 					Terry Lambert
> 					terry@lambert.org






