Date:      Sun, 24 Apr 2005 20:09:06 +0200
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Paul Mather <paul@gromit.dlib.vt.edu>
Cc:        freebsd-geom@freebsd.org
Subject:   Re: Is there a "disconnected" state for geom_mirror providers?
Message-ID:  <20050424180906.GE837@darkness.comp.waw.pl>
In-Reply-To: <1114364989.77743.14.camel@zappa.Chelsea-Ct.Org>
References:  <1114308801.71938.2.camel@zappa.Chelsea-Ct.Org> <20050424094148.GZ837@darkness.comp.waw.pl> <1114360313.77313.14.camel@zappa.Chelsea-Ct.Org> <20050424170415.GC837@darkness.comp.waw.pl> <1114364989.77743.14.camel@zappa.Chelsea-Ct.Org>

On Sun, Apr 24, 2005 at 01:49:49PM -0400, Paul Mather wrote:
+> > So you want me to count number of failures of every sector and mark
+> > component as broken if I've 2 failures related to the same sector or
+> > something like that?:)
+>
+> No, I was just pointing out that the "endless loop" scenario you gave
+> might well not hold for certain common classes of read-induced failures.

I have no way to detect what kind of failure an EIO is; that's why I need
a general solution.
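
To make the quoted counting idea concrete, here is a minimal Python sketch of
per-sector failure tracking. The class name, the method name and the
(component, sector) key are hypothetical and are not part of gmirror; the
threshold of 2 comes from the quoted suggestion. It only counts EIOs, so it
still cannot tell what kind of failure each one was.

```python
from collections import Counter


class SectorFailureTracker:
    """Hypothetical helper: counts EIOs per (component, sector) pair."""

    def __init__(self, threshold=2):
        # Number of failures on the same sector before giving up on
        # the component (2 in the quoted suggestion).
        self.threshold = threshold
        self.failures = Counter()

    def record_eio(self, component, sector):
        """Record one EIO; return True once the threshold is reached."""
        key = (component, sector)
        self.failures[key] += 1
        return self.failures[key] >= self.threshold


tracker = SectorFailureTracker()
first = tracker.record_eio("da1", 100)   # first failure on sector 100
other = tracker.record_eio("da1", 200)   # a different sector, counted apart
second = tracker.record_eio("da1", 100)  # second failure on sector 100
```

Only the second failure on the same sector reaches the threshold; failures on
different sectors are counted separately.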

+> > +> The shame about it being deleted from the mirror as opposed to marked as
+> > +> "broken" is you lose info (shown in "gmirror list") about the broken
+> > +> component priority, etc., which is useful for when you add a replacement
+> > +> device (or re-add the same one, as in my case).
+> >
+> > You can use 'gmirror dump /dev/<your_component>'.
+>
+> Thanks!  I guess I missed that in the man page.

Maybe because it wasn't documented :) I left this command out of the gmirror
manual page, but that is already fixed in HEAD and RELENG_5.

+> > +> If you marked a component as "broken" (but still listed as part of the
+> > +> mirror), you could add a "-f" option to "gmirror rebuild" to force
+> > +> rebuilding onto it a la RAIDframe. :-)
+> >
+> > This is not so simple. I don't store any info on broken component, that it
+> > is broken, because e.g. bad sector could be the sector with metadata.
+> > Other components are informed that something wrong is going on.
+> > How one can remove such broken component for good? Let's say you was able
+> > to read metadata from the component, but you cannot write there any more.
+> > How you can easily replace this component?
+>
+> If "gmirror rebuild -f" was used, it would imply autosynchronisation was
+> turned off.  So, if you had real hardware problems with the provider, it
+> would remain broken because the rebuild would fail, too, with some
+> hardware error.  Eventually, as an operator, you'd get the hint that the
+> provider really had lasting problems, and replace it with something that
+> really worked. ;-)

Imagine something like this:

- da1 is broken, but still connected to the mirror.
- gmirror rebuild -f da1 doesn't work.
- gmirror remove da1 also doesn't work, because we get an error when
  updating metadata.
- gmirror insert da4 (spare disk) also doesn't work, because I cannot
  update metadata on all components.

etc.
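
The scenario above can be modelled with a small Python sketch. It assumes, as
the list describes, that every configuration change must update metadata on
all connected components. The classes and function names are hypothetical and
do not reflect gmirror's actual code.

```python
import errno


class Component:
    """Hypothetical stand-in for a mirror component; not a GEOM structure."""

    def __init__(self, name, writable=True):
        self.name = name
        self.writable = writable

    def write_metadata(self):
        # A component that can no longer be written fails with EIO.
        if not self.writable:
            raise OSError(errno.EIO, "Input/output error", self.name)


class Mirror:
    def __init__(self, components):
        self.components = list(components)

    def _update_metadata_everywhere(self):
        # Assumption: a change is committed only if every connected
        # component accepts the new metadata.
        for component in self.components:
            component.write_metadata()

    def remove(self, name):
        self._update_metadata_everywhere()
        self.components = [c for c in self.components if c.name != name]

    def insert(self, component):
        self._update_metadata_everywhere()
        self.components.append(component)


def attempt(operation):
    """Run an operation; return "ok" or the errno name on failure."""
    try:
        operation()
        return "ok"
    except OSError as exc:
        return errno.errorcode[exc.errno]


# da1 is broken (unwritable) but still connected to the mirror.
mirror = Mirror([Component("da0"), Component("da1", writable=False)])
results = [
    attempt(lambda: mirror.remove("da1")),
    attempt(lambda: mirror.insert(Component("da4"))),
]
names = [c.name for c in mirror.components]
```

With da1 unwritable, both attempts fail with EIO and the component list stays
unchanged, which is the stuck state the list describes.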

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!



