Date:      Wed, 10 Aug 2016 13:48:53 -0400
From:      Allan Jude <allanjude@freebsd.org>
To:        freebsd-current@freebsd.org
Subject:   Re: Possible zpool online, resilvering issue
Message-ID:  <6fa613f2-6087-fa5d-c75b-d1a80ce9f06a@freebsd.org>
In-Reply-To: <CANJ8om5vTFfRjH+Od-Sfgy5hNtu2nSvZcMD11Ae4F1NYdW-Onw@mail.gmail.com>
References:  <CANJ8om5ddhu2brfHO2RUcjc4ctDbRNwzYwOKjZZm_wfL=1ibwg@mail.gmail.com> <196319fe-8113-bb2d-74b7-fbdd3369d988@freebsd.org> <CANJ8om5vTFfRjH+Od-Sfgy5hNtu2nSvZcMD11Ae4F1NYdW-Onw@mail.gmail.com>

On 2016-08-10 12:53, Ultima wrote:
> Hello,
>
>> I didn't see any reply on the list, so I thought I might let you know
>
> Sorry, never received this reply (till now) xD
>
>> what I assume is happening:
>
>> ZFS never updates data in place, which affects inode updates, e.g. if
>> a file has been read and access times must be updated. (For that reason,
>> many ZFS file systems are configured to ignore access time updates).
>
>> Even if there were only R/O accesses to files in the pool, there will
>> have been updates to the inodes, which were missed by the offlined
>> drives (unless you ignore atime updates).
>
>> But even if there are no access time updates, ZFS might have written
>> new uberblocks and other meta information. Check the pool history and
>> see if there were any TXGs created during the scrub.
>
>> If you scrub the pool while it is off-line, it should stay stable
>> (but if any information about the scrub, the offlining of drives etc.
>> is recorded in the pool's history log, differences are to be expected).
>
>> Just my $.02 ...
>
>> Regards, STefan
>
> Thanks for the reply, I'm not completely sure what would be considered a
> TXG. I maintained normal operations during most of this noise, and this
> pool has quite a bit of activity during normal operations. My zpool
> history looks like it goes on forever, and the last scrub is showing it
> repaired 9.48G. That was for all these access time updates? I guess that
> would be a little less than 2.5G per disk worth.
>
> The zpool history looks like it goes on forever (733373 lines). This pool
> has much of this activity with poudriere. All the entries I see are
> clone, destroy, rollback and snapshotting. I can't really say how much,
> but at least 500 (prob much more than that) entries between the last two
> scrubs. Atime is off on all datasets.
>
> So to be clear, this is expected behavior with atime=off + TXGs during
> offline time? I had thought that the resilver after onlining the disk
> would bring that disk up-to-date with the pool. I guess my understanding
> was a bit off.
>
> Ultima
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>

A new transaction group (TXG) is created at LEAST every
vfs.zfs.txg.timeout seconds (default: 5).
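That timeout gives a rough upper bound on how many TXGs an offlined drive
misses: the offline time divided by the timeout. A small POSIX-sh sketch
(the two-hour window is an arbitrary example, not from the thread):

```shell
#!/bin/sh
# Worst-case count of TXGs a drive misses while offline, assuming one
# TXG is committed every vfs.zfs.txg.timeout seconds (default 5).
# On FreeBSD, check the live value with: sysctl vfs.zfs.txg.timeout
offline_seconds=$((2 * 60 * 60))   # example: drive offline for 2 hours
txg_timeout=5                      # vfs.zfs.txg.timeout default
echo $((offline_seconds / txg_timeout))   # prints 1440
```

With a busy poudriere pool, each of those TXGs can carry new blocks the
offlined drive never saw.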

If you offline a drive for hours or more, then when it is brought back
online, ZFS must replay all blocks with a 'birth time' newer than the
last transaction recorded on that drive, to catch it up to the other
drives in the pool.
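A minimal transcript of the cycle described above (the pool name "tank"
and device "da3" are hypothetical; run on the FreeBSD host that owns the
pool):

```shell
# Take the drive offline; from this point on it misses every TXG.
zpool offline tank da3

# ... maintenance window ...

# Bring it back; ZFS resilvers only the blocks whose birth time is
# newer than the last TXG the drive recorded, not the whole disk.
zpool online tank da3

# Watch the catch-up resilver progress.
zpool status tank
```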

As long as you have enough redundancy, the checksum errors can be
corrected without concern.

In the end, the checksum errors can be written off as being caused by
the bad hardware. After the scrub finishes and everything is OK, run
'zpool clear poolname'; it resets all of the error and checksum counts
to 0, so you can tell whether any more ever show up.
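The post-scrub cleanup sketched above, again with the hypothetical pool
name "tank":

```shell
# Confirm the scrub finished and the pool reports healthy.
zpool status tank

# Reset the READ/WRITE/CKSUM counters to 0 so new errors stand out.
zpool clear tank

# Optionally scrub again later to verify the errors do not return.
zpool scrub tank
```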


-- 
Allan Jude

