Date:      Sat, 5 Aug 2017 19:51:44 +0200
From:      Fabian Keil <freebsd-listen@fabiankeil.de>
To:        "Eugene M. Zheganin" <emz@norma.perm.ru>
Cc:        freebsd-stable@FreeBSD.org, freebsd-fs@freebsd.org
Subject:   Re: a strange and terrible saga of the cursed iSCSI ZFS SAN
Message-ID:  <20170805195144.1caf98dc@fabiankeil.de>
In-Reply-To: <1d53f489-5135-7633-fef4-35d26e4969dc@norma.perm.ru>
References:  <1bd10b1e-0583-6f44-297e-3147f6daddc5@norma.perm.ru> <1d53f489-5135-7633-fef4-35d26e4969dc@norma.perm.ru>

"Eugene M. Zheganin" <emz@norma.perm.ru> wrote:

> On 05.08.2017 22:08, Eugene M. Zheganin wrote:
> >
> >   pool: userdata
> >  state: ONLINE
> > status: One or more devices has experienced an error resulting in data
> >         corruption.  Applications may be affected.
> > action: Restore the file in question if possible.  Otherwise restore the
> >         entire pool from backup.
> >    see: http://illumos.org/msg/ZFS-8000-8A
> >   scan: none requested
> > config:
> >
> >         NAME               STATE     READ WRITE CKSUM
> >         userdata           ONLINE       0     0  216K
> >           mirror-0         ONLINE       0     0  432K
> >             gpt/userdata0  ONLINE       0     0  432K
> >             gpt/userdata1  ONLINE       0     0  432K
> That would be funny if it weren't so sad, but while writing this message
> the pool started to look like the output below (I just ran zpool status
> twice in a row, to compare with what it was):
>
> [root@san1:~]# zpool status userdata
>    pool: userdata
>   state: ONLINE
> status: One or more devices has experienced an error resulting in data
>          corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>          entire pool from backup.
>     see: http://illumos.org/msg/ZFS-8000-8A
>    scan: none requested
> config:
>
>          NAME               STATE     READ WRITE CKSUM
>          userdata           ONLINE       0     0  728K
>            mirror-0         ONLINE       0     0 1,42M
>              gpt/userdata0  ONLINE       0     0 1,42M
>              gpt/userdata1  ONLINE       0     0 1,42M
>
> errors: 4 data errors, use '-v' for a list
> [root@san1:~]# zpool status userdata
>    pool: userdata
>   state: ONLINE
> status: One or more devices has experienced an error resulting in data
>          corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>          entire pool from backup.
>     see: http://illumos.org/msg/ZFS-8000-8A
>    scan: none requested
> config:
>
>          NAME               STATE     READ WRITE CKSUM
>          userdata           ONLINE       0     0  730K
>            mirror-0         ONLINE       0     0 1,43M
>              gpt/userdata0  ONLINE       0     0 1,43M
>              gpt/userdata1  ONLINE       0     0 1,43M
>
> errors: 4 data errors, use '-v' for a list
>
> So, you see, the error rate is growing at the speed of light. And I'm not
> sure the data access rate is that enormous; the counters look like they
> are increasing on their own.
> So maybe someone has an idea of what this really means.
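
As both outputs note, the affected files or objects can be listed with
the -v flag, and resetting the counters makes it easier to see whether
they really keep climbing on their own. A minimal check, reusing the
pool name from the output above:

[root@san1:~]# zpool status -v userdata   # list the datasets/objects with errors
[root@san1:~]# zpool clear userdata       # reset the READ/WRITE/CKSUM counters
[root@san1:~]# zpool status userdata      # see whether they start growing again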

Quoting a comment from sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:
/*
 * If destroy encounters an EIO while reading metadata (e.g. indirect
 * blocks), space referenced by the missing metadata can not be freed.
 * Normally this causes the background destroy to become "stalled", as
 * it is unable to make forward progress.  While in this stalled state,
 * all remaining space to free from the error-encountering filesystem is
 * "temporarily leaked".  Set this flag to cause it to ignore the EIO,
 * permanently leak the space from indirect blocks that can not be read,
 * and continue to free everything else that it can.
 *
 * The default, "stalling" behavior is useful if the storage partially
 * fails (i.e. some but not all i/os fail), and then later recovers.  In
 * this case, we will be able to continue pool operations while it is
 * partially failed, and when it recovers, we can continue to free the
 * space, with no leaks.  However, note that this case is actually
 * fairly rare.
 *
 * Typically pools either (a) fail completely (but perhaps temporarily,
 * e.g. a top-level vdev going offline), or (b) have localized,
 * permanent errors (e.g. disk returns the wrong data due to bit flip or
 * firmware bug).  In case (a), this setting does not matter because the
 * pool will be suspended and the sync thread will not be able to make
 * forward progress regardless.  In case (b), because the error is
 * permanent, the best we can do is leak the minimum amount of space,
 * which is what setting this flag will do.  Therefore, it is reasonable
 * for this flag to normally be set, but we chose the more conservative
 * approach of not setting it, so that there is no possibility of
 * leaking space in the "partial temporary" failure case.
 */
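
(The flag the comment documents is zfs_free_leak_on_eio; in the upstream
sources it is declared right below this comment and defaults to B_FALSE,
i.e. the conservative "stalling" behaviour.)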

In FreeBSD the "flag" currently isn't easily reachable due to the lack
of a powerful kernel debugger (like mdb in the Solaris offspring), but
it can be made reachable as a sysctl using the patch from:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218954
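
A usage sketch, assuming the patch exposes the flag under a sysctl name
along the lines of vfs.zfs.free_leak_on_eio (the actual name is defined
by the patch and may differ):

# Hypothetical sysctl name -- check the patch for the real one.
# Setting it to 1 makes destroy leak unreadable indirect-block space
# instead of stalling.
sysctl vfs.zfs.free_leak_on_eio=1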

Fabian
