Date:      Fri, 18 Oct 2002 20:45:25 +0300
From:      Maxim Sobolev <sobomax@FreeBSD.org>
To:        hackers@FreeBSD.org, dillon@FreeBSD.org
Subject:   Patch to allow a driver to report unrecoverable write errors to the buf layer
Message-ID:  <3DB048B5.21097613@FreeBSD.org>

This is a multi-part message in MIME format.
--------------23672A0561E832EE864612C2
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit

Hi folks,

I noticed that the FreeBSD buf/bio subsystem has one very annoying
problem: once a write request has been injected into it and the write
operation has failed, there is seemingly no valid way to tell the
layer to drop the buffer. Instead, it retries the write over and over
again until reboot, even though the originator of the request (usually
the vfs layer) has already been notified of the failure and has
propagated the error condition to the user-level program.

There is a very easy way to trigger the problem: insert a blank floppy
into your drive, format it with newfs_msdos, mount it, remove the disk
from the drive without unmounting, and do `touch /floppy/somefile'.
You'll see that touch(1) fails with an Input/output error and the
kernel reports a write failure on the console. However, after a couple
of seconds you'll notice that the kernel tries to write exactly the
same buffer again, and then again, ad infinitum. The same thing
happens if you mount a write-protected floppy in read/write mode.

Moreover, such a stale buffer prevents the fs from being unmounted
(even forcefully): before unmounting, the kernel wants to ensure that
all dirty buffers have been flushed, so umount(8) blocks forever in
the synchronization routine.

OK, you can say "well, don't do that!", and in this particular case
I'd probably agree, but there are at least a few other situations in
which such functionality would be very helpful. Consider a machine
that has several disk drives mounted when one of the drives suddenly
fails: it would be nice if the OS could at least try to survive that.
Another example is a RAID array that has been degraded into read-only
mode by the failure of some stripes, so that any write operation
causes the buf stall described above. Also, in the era of P-n-P
hardware (USB, FireWire etc.), it is no longer safe to assume that a
disk drive will stay connected until the OS lets it go.

The attached patch addresses the problem (for fd(4) only right now,
but it should be trivial to extend to other drivers) by allowing any
device driver to inform the buf layer that an unrecoverable error
condition occurred during a write operation, so that a retry is
meaningless. I would like to hear any comments or suggestions about my
approach.
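
As a rough illustration of the flow the patch is aiming for, here is a
small userland model in C. It is only a sketch, not kernel code:
struct model_buf, the model_* functions, the is_write field and the
numeric value of B_INVAL are made up for the example, and only the
BIO_ERROR and BIO_NORETRY values are taken from the patch.
model_will_retry() encodes the assumption that the buf layer
re-dirties a buffer after a failed write unless the buffer has been
marked B_INVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Values of the two bio flags as they appear in the patch. */
#define BIO_ERROR	0x00000001
#define BIO_NORETRY	0x00000008
/* Hypothetical value, chosen only for this model. */
#define B_INVAL		0x00002000

/* Hypothetical, heavily simplified stand-in for a kernel buffer. */
struct model_buf {
	int	b_ioflags;	/* BIO_* flags set by the driver */
	int	b_flags;	/* B_* flags owned by the buf layer */
	int	b_error;	/* errno reported by the driver */
	bool	is_write;	/* true if the request was a write */
};

/* Driver side: models the fd.c hunk for a failed request. */
void
model_driver_fail(struct model_buf *bp)
{
	assert(bp != NULL);
	bp->b_ioflags |= BIO_ERROR;
	if (bp->is_write)
		bp->b_ioflags |= BIO_NORETRY;
	bp->b_error = EIO;
}

/* Buf layer side: models the vfs_bio.c hunk. */
int
model_bufwait(struct model_buf *bp)
{
	assert(bp != NULL);
	if (bp->b_ioflags & BIO_ERROR) {
		if (bp->b_ioflags & BIO_NORETRY)
			bp->b_flags |= B_INVAL;
		return (bp->b_error ? bp->b_error : EIO);
	}
	return (0);
}

/*
 * Assumption of this model: a failed write is queued again unless
 * the buffer has been invalidated.
 */
bool
model_will_retry(const struct model_buf *bp)
{
	assert(bp != NULL);
	return (bp->is_write && (bp->b_ioflags & BIO_ERROR) != 0 &&
	    (bp->b_flags & B_INVAL) == 0);
}
```

In this model, a failed write that passes through model_driver_fail()
and model_bufwait() ends up with B_INVAL set, so model_will_retry()
returns false. A write buffer that carries only BIO_ERROR (the driver
behaviour without the patch) is still reported as one that would be
retried.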

Also, it would be very nice to devise some way to propagate such an
error condition to the vfs layer, so that the fs driver could act upon
it somehow (e.g. degrade the fs into read-only mode).

Thanks!

-Maxim
--------------23672A0561E832EE864612C2
Content-Type: text/plain; charset=koi8-r;
 name="buf.noretry.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="buf.noretry.diff"

Index: sys/bio.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/bio.h,v
retrieving revision 1.122
diff -d -u -r1.122 bio.h
--- sys/bio.h	9 Oct 2002 07:11:03 -0000	1.122
+++ sys/bio.h	18 Oct 2002 16:53:02 -0000
@@ -100,6 +100,15 @@
 /* bio_flags */
 #define BIO_ERROR	0x00000001
 #define BIO_DONE	0x00000004
+#define BIO_NORETRY	0x00000008	/* Don't attempt to retry failed   */
+					/* operation. Should be set when   */
+					/* the underlying driver detected  */
+					/* some unrecoverable condition    */
+					/* e.g. fatal hardware failure,	   */
+					/* forcefully ejected removable	   */
+					/* media, media that has been made */
+					/* write-protected, replaced with  */
+					/* another media etc.		   */
 #define BIO_FLAG2	0x40000000	/* Available for local hacks */
 #define BIO_FLAG1	0x80000000	/* Available for local hacks */
 
Index: kern/vfs_bio.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/vfs_bio.c,v
retrieving revision 1.338
diff -d -u -r1.338 vfs_bio.c
--- kern/vfs_bio.c	28 Sep 2002 17:46:30 -0000	1.338
+++ kern/vfs_bio.c	18 Oct 2002 16:53:05 -0000
@@ -2915,6 +2915,8 @@
 		return (EINTR);
 	}
 	if (bp->b_ioflags & BIO_ERROR) {
+		if (bp->b_ioflags & BIO_NORETRY)
+			bp->b_flags |= B_INVAL;
 		return (bp->b_error ? bp->b_error : EIO);
 	} else {
 		return (0);
Index: isa/fd.c
===================================================================
RCS file: /home/ncvs/src/sys/isa/fd.c,v
retrieving revision 1.241
diff -d -u -r1.241 fd.c
--- isa/fd.c	2 Oct 2002 20:29:54 -0000	1.241
+++ isa/fd.c	18 Oct 2002 16:53:13 -0000
@@ -2530,6 +2530,8 @@
 		}
 		if ((fd->options & FDOPT_NOERROR) == 0) {
 			bp->bio_flags |= BIO_ERROR;
+			if (bp->bio_cmd == BIO_WRITE)
+				bp->bio_flags |= BIO_NORETRY;
 			bp->bio_error = EIO;
 			bp->bio_resid = bp->bio_bcount - fdc->fd->skip;
 		} else

--------------23672A0561E832EE864612C2--





