FreeBSD Mail Archives

Date:      Fri, 27 Sep 2002 11:45:03 +0200
From:      Alexander Leidinger <Alexander@Leidinger.net>
To:        Terry Lambert <tlambert2@mindspring.com>
Cc:        freebsd-current@FreeBSD.ORG
Subject:   Re: Journaled filesystem in CURRENT
Message-ID:  <20020927114503.7c839b9b.Alexander@Leidinger.net>
In-Reply-To: <3D9362C1.CFA66F90@mindspring.com>
References:  <200209251319.g8PDJYoD047918@ib.com.ua> <20020925111232.B3686@Odin.AC.HMC.Edu> <20020926111949.5c0da160.Alexander@Leidinger.net> <20020926090325.A24614@zardoc.esmtp.org> <3D93459B.E4405568@mindspring.com> <20020926210947.5d5fdd45.Alexander@Leidinger.net> <3D9362C1.CFA66F90@mindspring.com>

On Thu, 26 Sep 2002 12:40:49 -0700 Terry Lambert
<tlambert2@mindspring.com> wrote:

> > > Journalling has advantages that a non-journalling FS with soft
> > > updates does not -- can not -- have, particularly since it is
> > > not possible to distinguish a power failure from a hardware
> > > failure from (some) software failures, and those cases need to
> > 
> > Power failure:
> >    No problem for both.
> > Hardware failure (I assume you think about a HDD failure):
> >    Read failure: doesn't matter here
> >    Write failure: either the sector gets remapped (no problem
> >                   for both), or the disk is in self destruct
> >                   mode (both can't cope with this)
> > Software failure:
> >    Are you talking about bugs in the FS code? Or about a nasty
> >    person which writes some bad data into the FS structures?
> > 
> > > be treated differently for the purposes of recovery.  The soft
> > 
> > Sorry, I don't get it. Can you please be more verbose?
> 
> This has been discussed to death before, and Kirk McKusick has
> already posted the definitive post on the topic to FreeBSD-FS.

Keywords (besides SO and Kirk McKusick)/timeframe/message ID/URL?

> The upshot is that it is important to distinguish between an
> FS that had only bad cylinder group bitmap contents, and an FS
> that needs a more thorough consistency checking.
> 
> You can not do this if the failure reason for the system is not
> recorded in non-volatile memory somewhere.  For a power failure,
> this is practically impossible, unless you have AC loss notification
> with a sufficient DC holdup time (e.g. like in the InterJet II
> power supply).
> 
> Note that recent disk drives (I *will not* call them "modern")
> will potentially trash sectors, if a power failure occurs during
> writes.

They don't have a power reservoir large enough to write the entire
content of their cache to disk? Damn. But I shouldn't be surprised, the
current economy is the result of letting marketing people make decisions.

> One way to handle Scott Dodson's problem (for example) is to add
> a "softcheck started" flag in the superblock, so that if a crash
> occurs during the abbreviated check, then the full check is done

I asked Kirk a while ago what happens if we have a power failure while
we do a bg-fsck. He told me that this isn't harmful, the current code
DTRT.

> > > JFS that journals both data and metadata can recover from all
> > > three, to a consistent state, and one that journals only
> > > metadata can recover from two of them.
> > 
> > SO writes the data directly to free sectors in the target
> > filesystem. I don't see where journaled data is an improvement in
> > fs-consistency here.
> 
> The write occurs, or it does not.  The journal entry timestamp
> gets updated after the write completes, or it does not.
> 
> Thus, you can always recover a JFS to a consistent state almost
> instantaneously, simply by finding the most recent valid journal
> entry timestamp, and ignoring anything else -- as long as data is
> journalled, and not just metadata.
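
The recovery rule quoted above (find the most recent valid journal
entry and ignore anything after it) can be sketched roughly as follows.
This is only an illustration, not how any particular JFS does it: the
entry layout, the use of a CRC instead of a timestamp to decide
validity, and the names make_entry, is_valid and recover are all
assumptions made for the example.

```python
import zlib

def make_entry(seq, block, data):
    """Build a hypothetical journal entry; the CRC covers seq, block and data."""
    body = f"{seq}:{block}:".encode() + data
    return {"seq": seq, "block": block, "data": data, "crc": zlib.crc32(body)}

def is_valid(entry):
    """An entry is valid only if its stored CRC matches its contents."""
    body = f"{entry['seq']}:{entry['block']}:".encode() + entry["data"]
    return zlib.crc32(body) == entry["crc"]

def recover(journal):
    """Replay entries in order and stop at the first invalid (torn) one.

    Returns the reconstructed block map and the sequence number of the
    last valid entry (None if there was none).
    """
    disk = {}
    last_seq = None
    for entry in journal:
        if not is_valid(entry):
            break  # everything from here on is ignored
        disk[entry["block"]] = entry["data"]
        last_seq = entry["seq"]
    return disk, last_seq

# A journal whose third entry was torn by a crash during the write.
torn = make_entry(3, 0, b"c")
torn["data"] = b"x"  # contents no longer match the stored CRC
journal = [make_entry(1, 0, b"a"), make_entry(2, 1, b"b"), torn]
disk, last_seq = recover(journal)
```

Because data blocks are journalled as well as metadata in this sketch,
the state after recover() is whatever the last valid entry describes,
without scanning the rest of the filesystem.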

I'm with Matthias Schündehütte here. SO writes the data and then it
writes the metadata. So either the just-written blocks get referenced by
metadata or they do not. So we can recover to a consistent state almost
instantaneously too. The only problem is when you delete some files and
the metadata (directory entries) is written, but the free block
information isn't updated yet. Then you have to use (bg-)fsck to correct
the free block information. But if you need to go online as fast as
possible with a consistent FS, SO doesn't hold you back.
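
As a rough illustration of the delete case described above, the sketch
below models the free block information as a set of allocated block
numbers and the metadata as a map from file name to block list. The
model and the names leaked_blocks and reclaim are assumptions for the
example only; real (bg-)fsck works on cylinder group bitmaps and inodes.

```python
def leaked_blocks(allocated, files):
    """Blocks still marked allocated that no file references any more."""
    referenced = set()
    for blocks in files.values():
        referenced.update(blocks)
    return set(allocated) - referenced

def reclaim(allocated, files):
    """Return the corrected allocation set, as a background check might."""
    return set(allocated) - leaked_blocks(allocated, files)

# File "b" (blocks 3 and 4) was deleted: its directory entry is gone,
# but the crash happened before the free block information was updated.
allocated = {1, 2, 3, 4}
files = {"a": [1, 2]}
leaked = leaked_blocks(allocated, files)
fixed = reclaim(allocated, files)
```

In this model the leftover state only wastes space: every block a file
references is still marked allocated, so the FS can be used before the
leaked blocks are reclaimed.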

Bye,
Alexander.

-- 
               I believe the technical term is "Oops!"

http://www.Leidinger.net                       Alexander @ Leidinger.net
  GPG fingerprint = C518 BC70 E67F 143F BE91  3365 79E2 9C60 B006 3FE7

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20020927114503.7c839b9b.Alexander>
