Date:      Sat, 19 Oct 2002 08:12:02 +0300
From:      Maxim Sobolev <sobomax@FreeBSD.ORG>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        hackers@FreeBSD.ORG
Subject:   Re: Patch to allow a driver to report unrecoverable write errors to the buf layer
Message-ID:  <20021019051202.GB14922@vega.vega.com>
In-Reply-To: <200210181835.g9IIZsBX061970@apollo.backplane.com>
References:  <3DB048B5.21097613@FreeBSD.org> <200210181807.g9II7cBY024485@apollo.backplane.com> <3DB0516F.9BE00F57@FreeBSD.org> <200210181835.g9IIZsBX061970@apollo.backplane.com>

On Fri, Oct 18, 2002 at 11:35:54AM -0700, Matthew Dillon wrote:
> 
> :> :
> :> :There is a very easy way to trigger the problem: insert blank floppy
> :> :...
> :> 
> :>     Your patch looks slightly incomplete to me, but the concept is reasonable.
> :>     The BIO_NORETRY test that sets B_INVAL should probably be done in
> :>     brelse(), not in bufwait().  It is the code in brelse() that actually
> :>     does the re-dirtying of the buffer in case of a write-error.
> :
> :Ah, actually I've initially put it into brelse() but then reconsidered
> :a decision and moved it down into bufwait(). I'll move it back. ;)
> 
>     Heh heh.  Well, it seems to me that since it is the BUF abstraction
>     that has the error check / redirtying / retry code, then the BUF
>     abstraction should probably be responsible for the no-retry case as
>     well.  The BIO abstraction is really designed to hold an I/O operation,
>     not really to hold meta operations.  You could still specify a BIO
>     flag for it since it's a media hack of sorts, but the BUF code should
>     be responsible for processing it.

OK, thank you for the detailed explanation.
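
As a rough illustration of the division of labour described above, here is a
toy model of the release path: on a failed write the buffer is normally
re-dirtied so the write is retried later, but if the driver has flagged the
error as unrecoverable the buffer is invalidated instead. This is only a
sketch under stated assumptions, not the actual patch: struct toy_buf,
toy_brelse() and all TOY_* names are hypothetical stand-ins, and the real
struct buf and brelse() are considerably more involved.

```c
/*
 * Toy model only. None of these names are the real FreeBSD definitions;
 * they merely echo the flags mentioned in the thread.
 */
#define TOY_B_DELWRI    0x01u  /* dirty: contents still need to be written */
#define TOY_B_INVAL     0x02u  /* contents are to be thrown away */
#define TOY_B_ERROR     0x04u  /* the last I/O on this buffer failed */
#define TOY_BIO_NORETRY 0x08u  /* driver says: retrying cannot succeed */

enum toy_cmd { TOY_READ, TOY_WRITE };

struct toy_buf {
    enum toy_cmd cmd;    /* operation that was just attempted */
    unsigned int flags;  /* combination of the TOY_* flags above */
};

/*
 * Release a buffer after I/O completion.  The driver only sets the
 * no-retry hint; the decision about what to do with it is made here,
 * in the buffer layer.
 */
void toy_brelse(struct toy_buf *bp)
{
    if (bp->cmd == TOY_WRITE && (bp->flags & TOY_B_ERROR)) {
        if (bp->flags & TOY_BIO_NORETRY) {
            /* Unrecoverable (e.g. bad floppy): drop the buffer. */
            bp->flags |= TOY_B_INVAL;
            bp->flags &= ~TOY_B_DELWRI;
        } else {
            /* Default: keep the data and re-dirty it for a later retry. */
            bp->flags &= ~TOY_B_ERROR;
            bp->flags |= TOY_B_DELWRI;
        }
    }
}
```

With this model, a failed write without the hint leaves the buffer marked
TOY_B_DELWRI, while a failed write with TOY_BIO_NORETRY set leaves it marked
TOY_B_INVAL and no longer dirty.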

>     I dunno about a formal abstraction.  We need to differentiate between
>     media which can and cannot remap blocks.  A 'perfect' solution
>     would be far more complex.  File data blocks would have to be
>     remapped at the filesystem level and meta-data would have to be 
>     invalidated in-core (bitmap, inode blocks with write errors), and
>     the filesystem would have to be marked dirty on unmount.  Then unmount
>     could safely destroy the buffers representing the write-error'd meta
>     data. 
> 
>     The VFS layer would definitely need to be involved.  We have the
>     advantage in that the buffer cache is already logically mapped, but
>     it would still be a fairly sophisticated piece of work.
> 
> :>     This re-dirtying is necessary in most cases to prevent filesystem
> :>     corruption.  Otherwise the buffer may be thrown away and a re-read
> :>     may return the original pre-modified data, causing massive filesystem
> :>     corruption elsewhere (consider what that would mean for a bitmap block).
> :> 
> :>     I think it's perfectly reasonable to do away with the buffer in the
> :>     case of a floppy error, though.
> 
>     Just a bit of history.  Originally the buffer cache did not retry error'd
>     out writes.  I changed it several years ago because the mechanism
>     was producing massive filesystem corruption in the face of disk write
>     errors.  The floppy issue was a known issue at the time and I am quite
>     happy that someone is tackling the problem now!

Hmm, the current approach doesn't look quite right to me: we retry the
operation even though the upper-layer code that initiated it has already
been notified of the failure (e.g. it received EIO), and so should not
assume that the data was actually written successfully. Or am I missing
something?

-Maxim

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message



