Date: Tue, 11 Jan 2011 21:43:35 +0000 (UTC)
From: Warner Losh <imp@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-projects@freebsd.org
Subject: svn commit: r217287 - projects/graid/head/sys/geom/raid
Message-ID: <201101112143.p0BLhZEY036737@svn.freebsd.org>

Author: imp
Date: Tue Jan 11 21:43:35 2011
New Revision: 217287
URL: http://svn.freebsd.org/changeset/base/217287

Log:
  Fix a few problems with read error recovery:
  o We need to check the bp we're given in *done() not pbp since that's
    where the error is.
  o Just check bio_error and forget the BIO_ERROR flag.
  o bump the inbed count a little later in the processing.
  o Start to do write-remapping, but only detect when we need to, rather
    than actually doing anything (yet).
  o minor style cleanup
  o improve mirror breaking/degrading notes and add one.

  With these changes I can survive at a 10% error rate both raw
  operations, as well as file system operations...

Modified:
  projects/graid/head/sys/geom/raid/tr_raid1.c

Modified: projects/graid/head/sys/geom/raid/tr_raid1.c
==============================================================================
--- projects/graid/head/sys/geom/raid/tr_raid1.c	Tue Jan 11 21:18:29 2011	(r217286)
+++ projects/graid/head/sys/geom/raid/tr_raid1.c	Tue Jan 11 21:43:35 2011	(r217287)
@@ -291,21 +291,18 @@ static void
 g_raid_tr_iodone_raid1(struct g_raid_tr_object *tr,
     struct g_raid_subdisk *sd, struct bio *bp)
 {
+	struct bio *cbp;
+	struct g_raid_subdisk *nsd;
+	struct g_raid_volume *vol;
 	struct bio *pbp;
+	int i;
 
 	pbp = bp->bio_parent;
-	pbp->bio_inbed++;
-	if ((pbp->bio_flags & BIO_ERROR) && pbp->bio_cmd == BIO_READ &&
+	if (bp->bio_error != 0 && bp->bio_cmd == BIO_READ &&
 	    pbp->bio_children == 1) {
-		struct bio *cbp;
-		struct g_raid_subdisk *nsd;
-		struct g_raid_volume *vol;
-		int i;
-
 		/*
-		 * Retry the error on the other disk drive, if available,
-		 * before erroring out the read. Do we need to mark the
-		 * 'sd' disk as degraded somehow?
+		 * Retry the read error on the other disk drive, if
+		 * available, before erroring out the read.
 		 */
 		vol = tr->tro_volume;
 		sd->sd_read_errs++;
@@ -323,25 +320,31 @@ g_raid_tr_iodone_raid1(struct g_raid_tr_
 			if (cbp == NULL)
 				break;
 			g_raid_subdisk_iostart(nsd, cbp);
+			pbp->bio_inbed++;
 			return;
 		}
 		/*
 		 * something happened, so we can't retry. Return the
 		 * original error by falling through.
+		 *
+		 * XXX degrade/break the mirror?
+		 */
+	}
+	pbp->bio_inbed++;
+	if (pbp->bio_cmd == BIO_READ && pbp->bio_children == 2) {
+		/*
+		 * If it was a read, and bio_children is 2, then we just
+		 * recovered the data from the second drive. We should try to
+		 * write that data to the first drive if sector remapping is
+		 * enabled. A write should put the data in a new place on the
+		 * disk, remapping the bad sector. Do we need to do that by
+		 * queueing a request to the main worker thread? It doesn't
+		 * affect the return code of this current read, and can be
+		 * done at our liesure.
+		 *
+		 * XXX TODO
 		 */
 	}
-	/*
-	 * If it was a read, and bio_children is 2, then we just
-	 * recovered the data from the second drive. We should try to
-	 * write that data to the first drive if sector remapping is
-	 * enabled. A write should put the data in a new place on the
-	 * disk, remapping the bad sector. Do we need to do that by
-	 * queueing a request to the main worker thread? It doesn't
-	 * affect the return code of this current read, and can be
-	 * done at our liesure.
-	 *
-	 * XXX TODO
-	 */
 	if (pbp->bio_children == pbp->bio_inbed) {
 		pbp->bio_completed = pbp->bio_length;
 		g_raid_iodone(pbp, bp->bio_error);
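
The completion accounting that this change settles on can be modelled outside the kernel. The following is only a userland sketch, not the committed code: `model_parent`, `model_child`, `model_iodone`, the `remap_candidate` field and the `other_disk_ok` parameter are all hypothetical names invented for illustration, and the sketch assumes that cloning a request for the retry raises the parent's child count, as g_clone_bio() does for bio_children.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the struct bio fields touched by the diff. */
enum model_cmd { MODEL_READ, MODEL_WRITE };

struct model_parent {
	enum model_cmd cmd;
	int children;		/* child requests issued so far */
	int inbed;		/* child requests that have completed */
	bool done;		/* parent has been delivered upward */
	int error;		/* error delivered with the parent */
	bool remap_candidate;	/* a retried read reached completion */
};

struct model_child {
	struct model_parent *parent;
	enum model_cmd cmd;
	int error;		/* 0 on success, errno-style otherwise */
};

/*
 * Model of the completion handler. 'other_disk_ok' is an assumed input
 * that says whether a retry on the other mirror half can be issued.
 * Returns true when a retry was issued and the parent is still pending.
 */
static bool
model_iodone(struct model_child *cp, bool other_disk_ok)
{
	struct model_parent *pp = cp->parent;

	assert(pp != NULL && !pp->done);

	/* Inspect the child's error: that is where the failure is recorded. */
	if (cp->error != 0 && cp->cmd == MODEL_READ && pp->children == 1) {
		if (other_disk_ok) {
			pp->children++;	/* the clone is a second child */
			pp->inbed++;	/* account for the failed child */
			return (true);
		}
		/* No retry possible: fall through with the original error. */
	}
	pp->inbed++;
	/* Detection only, mirroring the XXX TODO in the diff. */
	if (pp->cmd == MODEL_READ && pp->children == 2)
		pp->remap_candidate = true;
	if (pp->children == pp->inbed) {
		pp->done = true;
		pp->error = cp->error;
	}
	return (false);
}
```

In this model a failed first read with a usable second disk leaves the parent pending with two children and one in bed; the retry's completion then finishes the parent and flags it as a remap candidate. Bumping the in-bed count before the early return is what keeps the final `children == inbed` comparison balanced once the retry completes.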