From owner-freebsd-fs  Tue Oct 13 17:49:57 1998
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <199810140049.RAA20004@usr08.primenet.com>
Subject: Re: filesystem safety and SCSI disk write caching
To: gibbs@plutotech.com (Justin T. Gibbs)
Date: Wed, 14 Oct 1998 00:49:31 +0000 (GMT)
Cc: tlambert@primenet.com, gibbs@plutotech.com, Don.Lewis@tsc.tdk.com,
        julian@whistle.com, freebsd-fs@FreeBSD.ORG, freebsd-scsi@FreeBSD.ORG
In-Reply-To: <199810140007.SAA02391@pluto.plutotech.com> from "Justin T. Gibbs" at Oct 13, 98 06:00:52 pm

> >It doesn't, since "# of anomalies == 0" with write caching disabled.
> 
> This doesn't follow.  If the cache is disabled, it doesn't matter if
> the drive loses power due to hitting the reset button.  We already 
> know that losing power on a drive that cached data will not work.

We do?

If writes are committed in dependency order, and the write is cached
but there is no reordering of subsequent writes (i.e., writes occur in
tag order, even if they are cached), then I think this satisfies soft
updates.
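
A toy sketch of that claim, in Python.  The tags and the dependency
table are made up for illustration; nothing here is taken from the soft
updates code.  It assumes a reset can land after any number of commits,
and checks whether what is left on stable storage still honors every
dependency.

```python
# Hypothetical writes: tag -> set of tags that must be stable first.
DEPS = {
    1: set(),   # e.g. initialize a new inode
    2: {1},     # directory entry pointing at that inode
    3: {2},     # a later update that depends on the entry
    4: set(),   # unrelated write
}

def is_consistent(committed, deps=DEPS):
    """True if every committed write has all of its dependencies committed."""
    return all(deps[tag] <= committed for tag in committed)

def crash_states(commit_order):
    """Sets of writes on stable storage if the reset hits after 0..n commits."""
    return [set(commit_order[:k]) for k in range(len(commit_order) + 1)]

in_order = [1, 2, 3, 4]    # drive commits cached writes in tag order
reordered = [2, 1, 4, 3]   # drive reorders cached writes

# Every crash point is consistent when commits follow tag order.
print(all(is_consistent(s) for s in crash_states(in_order)))

# With reordering, at least one crash point violates a dependency.
print([sorted(s) for s in crash_states(reordered) if not is_consistent(s)])
```

With tag-order commits every crash point passes; with the reordered
sequence, the crash point where only write 2 is stable does not.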


> >> I'm still unclear as to whether Don was turning off power or hitting what I
> >> consider the reset button.  His comment about UPS use makes me think he
> >> was testing power outage scenarios.
> >
> >Well, I know that this might sound insane, but we could ask Don, and
> >I could get out of the middle of this whole thing... ;-).
> 
> Well, if you're offering, I'd be more than happy to take you up on your
> offer.

No need; he's spoken up: He was using the front panel reset, not
power loss.


> >> Since you were able to test 4 drives so quickly, I'd love to see well
> >> documented information on exactly how the file system was inconsistent
> >> in the failure cases.
> >
> >There were directory dependencies which were committed out of order
> >(the modified fsck reports these as soft dependency errors...).
> 
> Can you be more specific?  Are you positive that the transactions
> were committed out of order or could it be that some transactions
> were never committed at all?

Transactions were committed out of order.  If the transactions that
had been committed had been committed in order, then the loss of cached
transactions would only result in a rewind of state of the drive to
a previous state.  The previous state, by definition, must also be
consistent.
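
The distinction between "never committed" and "committed out of order"
can be sketched as follows, in Python.  The block names and values are
hypothetical.  If the on-disk state equals the state after the first k
issued writes, the drive only lost a tail of cached writes (a rewind);
if it matches no such state, something was committed out of order.

```python
def apply(state, write):
    """Return a new state with one (block, value) write applied."""
    block, value = write
    new = dict(state)
    new[block] = value
    return new

def history(initial, issued):
    """States after 0, 1, ..., n of the issued writes, in issue order."""
    states = [dict(initial)]
    for w in issued:
        states.append(apply(states[-1], w))
    return states

def rewind_point(initial, issued, on_disk):
    """Index k of the earliest matching previous state, or None if no match."""
    for k, state in enumerate(history(initial, issued)):
        if state == on_disk:
            return k
    return None

initial = {"inode": "free", "dirent": "empty"}
issued = [("inode", "allocated"), ("dirent", "points-to-inode")]

# Tail lost: only the first write reached the disk.
print(rewind_point(initial, issued, {"inode": "allocated", "dirent": "empty"}))

# Second write reached the disk without the first: matches no previous state.
print(rewind_point(initial, issued, {"inode": "free", "dirent": "points-to-inode"}))
```

The first case is a plain rewind to state 1; the second returns None,
which is the kind of state the modified fsck would have to flag.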

In other words, transaction dependency order as required for soft
updates must cause commits to the drives to occur in dependency
order.  If there is *ever* a case where the state is inconsistent,
the problem is the reordering of consecutive write requests by the
drive itself.

The only alternative is a dependency order bug, and such a bug
would be very easy to reproduce using a break-to-debugger or a
reset+fsck.  This is not the case, from all observable data, so
it *must* be that the soft updates code is being given the go-ahead
on another transaction before the previous transaction on which it
depends has been committed to stable storage, and the drive is
subsequently committing them out of order.
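
A minimal sketch of that scenario, in Python.  Everything in it is an
assumption made for illustration: the Drive class, the block numbers,
the lowest-block-first flush policy, and the idea that a reset discards
whatever is still cached.  It is not a model of any real drive's
firmware.

```python
class Drive:
    def __init__(self, write_cache):
        self.write_cache = write_cache
        self.cache = []     # acknowledged but not yet on stable storage
        self.platter = []   # stable storage, in commit order

    def write(self, block, tag):
        """Accept a write; returning from here stands for the acknowledgement."""
        if self.write_cache:
            self.cache.append((block, tag))    # acked before it is stable
        else:
            self.platter.append((block, tag))  # acked only once stable

    def flush_one(self):
        """Commit one cached write, lowest block number first (assumed policy)."""
        if self.cache:
            self.cache.sort()
            self.platter.append(self.cache.pop(0))

    def reset(self):
        """Assumed behavior: cached, uncommitted writes are lost."""
        self.cache.clear()

def run(write_cache, flushes_before_reset):
    drive = Drive(write_cache)
    drive.write(900, "inode")    # dependency, issued first
    drive.write(100, "dirent")   # dependent, issued after the ack
    for _ in range(flushes_before_reset):
        drive.flush_one()
    drive.reset()
    return [tag for _, tag in drive.platter]

print(run(write_cache=True, flushes_before_reset=1))   # dependent write only
print(run(write_cache=False, flushes_before_reset=1))  # both, in issue order
```

With the cache on, the dependent write is the only one that survives;
with it off, the acknowledgement itself enforces the order.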

There is no other explanation for a soft dependency inconsistency,
so long as dependencies are both correctly modelled and enqueued
(which we believe to be the case).


> What was the size of the directory?

For me, it was a large directory; it was the full X11R6 source tree
from the XFree86 distribution (this is what Julian has used as one
of his tests at Whistle, so I did the same at home).

> Was the failure in directory creation or destruction?  Which portion
> of the dependency graph was violated?

I would have to manually examine the data on the disk after getting
the fsck report to answer this.  Unfortunately, fsck is a tool for
returning the disk to a known stable state, and if the dependency
order is not enforced by synchronous writes and/or soft update
dependency order being honored by the drive, then I would need to
be able to intuit what the correct state should have been vs. what
the current state was (in other words, guess the outstanding cached
transactions remaining uncommitted, and determine if they would be
rolled forward or backward by fsck).

My *hunch* from what I was doing at the time (rm -rf) is "destruction",
but since I started this before all the creations were committed to
stable storage (only enqueued in the dependency queue), I can't
really say for sure what operation was in effect when I hit the reset
button.  Maybe Don can elaborate on what he was doing when he hit
reset?


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
