Date:      Fri, 03 Oct 2003 16:29:10 -0700
From:      Kirk McKusick <mckusick@beastie.mckusick.com>
To:        Aaron Wohl <freebsd@soith.com>
Cc:        Robert Watson <rwatson@FreeBSD.org>
Subject:   Re: file system (UFS2) consistancy after -current crash? (fwd) 
Message-ID:  <200310032329.h93NTAfL057725@beastie.mckusick.com>
In-Reply-To: Your message of "Fri, 03 Oct 2003 16:42:00 EDT."

	Date: Fri, 03 Oct 2003 05:03:34 -0600
	From: Aaron Wohl <freebsd@soith.com>
	To: freebsd-current@freebsd.org
	Subject: file system (UFS2) consistancy after -current crash?

	After recent crashes I've been getting soft updates inconsistencies.
	Directories in which a file has recently been renamed have neither
	the old file nor the new file.  fsck -y recovers the inode and drops
	it in lost+found.

	I was under the impression that atomic rename() synced all the way
	to the disk before returning?

	Does having soft updates enabled or disabled have any bearing on this?

	The disks themselves are a RAID 5 on an Adaptec 5400S.  We have had
	some problems recently with aac (the 5400S driver) related crashes
	that we have been working on with Scott Long.  I was wondering if maybe
	rename is only syncing as far as the RAID controller memory?

The problem that we have been having with many of the RAID
systems is that they give an I/O completion interrupt after
they copy the change into their memory, but before the I/O
is completed to the disk. Since the filesystem uses the I/O
completion interrupt as an indication that the change is on
disk, it proceeds to the next step. If the RAID ultimately
fails to get the data to the disk, inconsistencies arise.
This problem can arise whether or not soft updates are being
used, but because soft updates makes individual changes over 
a longer time period (potentially up to a minute rather than
the few milliseconds of 2-3 synchronous writes), it is more
likely to be apparent after a crash. None of this is helped by
a journalling filesystem, as the RAID lies about writing the
log, so you may not have it available to do a rollback after
a crash. As we discovered with IDE disks, disabling the "write
cache enable" feature causes a massive performance hit, so in
practice that does not seem like a viable strategy. What does
work is to use tag-queueing. Unfortunately tag-queueing is
found primarily in SCSI systems, though it is starting to
show up in the high-end IDE disks.
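
As a rough illustration of the early-completion problem described
above, here is a small Python simulation. It is only a sketch under
stated assumptions: the classes Disk, WriteThrough and WriteBack are
hypothetical models, not FreeBSD or aac driver code, and it assumes a
rename is issued as two ordered directory writes (add the new name,
wait for completion, then remove the old name).

```python
class Disk:
    """Stable storage: the set of directory entries that survive a crash."""
    def __init__(self, names):
        self.names = set(names)

    def apply(self, op, name):
        if op == "add":
            self.names.add(name)
        else:
            self.names.discard(name)


class WriteThrough:
    """Hypothetical controller that reports completion only once the
    write is on stable storage."""
    def __init__(self, disk):
        self.disk = disk

    def write(self, op, name):
        self.disk.apply(op, name)
        return True  # completion is truthful

    def crash(self):
        pass  # nothing volatile to lose


class WriteBack:
    """Hypothetical controller that reports completion as soon as the
    write is in its volatile memory."""
    def __init__(self, disk):
        self.disk = disk
        self.pending = []  # acknowledged writes not yet on disk

    def write(self, op, name):
        self.pending.append((op, name))
        return True  # completion reported early

    def flush(self, index):
        # The controller is free to pick any cached write to push out.
        self.disk.apply(*self.pending.pop(index))

    def crash(self):
        self.pending.clear()  # cached writes are lost


def rename(ctrl, old, new):
    # Ordered update: the removal is issued only after the controller
    # reports that the new name has been written.
    if ctrl.write("add", new):
        ctrl.write("remove", old)


def demo():
    safe_disk = Disk({"a.txt"})
    safe = WriteThrough(safe_disk)
    rename(safe, "a.txt", "b.txt")
    safe.crash()

    lossy_disk = Disk({"a.txt"})
    lossy = WriteBack(lossy_disk)
    rename(lossy, "a.txt", "b.txt")
    lossy.flush(1)  # the removal happens to reach the disk first
    lossy.crash()   # the cached "add" never reaches the disk
    return safe_disk.names, lossy_disk.names
```

With the truthful controller the file is always reachable under at
least one name. With the early-completion controller the filesystem's
ordering is defeated, and after the crash the directory holds neither
name, which matches the symptom reported above.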

	Kirk McKusick


