Date:      Sat, 4 Sep 2021 11:56:44 +0200 (CEST)
From:      Jimmy Olgeni <olgeni@FreeBSD.org>
To:        freebsd-questions@freebsd.org
Subject:   Locating ZFS checksum errors
Message-ID:  <ce522b85-336c-98de-cc3-c091f3cdc463@FreeBSD.org>

Hi,

Short version of the story: due to a bad RAM stick I managed to collect some
checksum errors on a ZFS pool; they are not reported by a scrub, but show up
when running "zdb -bcsvL".

They look like this:

                            capacity   operations   bandwidth  ---- errors ----
description                used avail  read write  read write  read write cksum
rpool                      469G  451G 1.65K     0  146M     0     0     0     0
  mirror                   469G  451G 1.65K     0  146M     0     0     0     0
    /dev/gpt/pool1                      843     0 73.0M     0     0     0    98
    /dev/gpt/pool0                      842     0 73.1M     0     0     0    98

A few of them are logged during the scan:

zdb_blkptr_cb: Got error 97 reading <404, 0, 1, 17> DVA[0]=<0:e0956e000:6000> DVA[1]=<0:1200c25000:6000> [L1 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique double size=20000L/6000P birth=7322L/7322P fill=5419 cksum=743743d4a15:652404d7275bf1:349b01108bcc58b4:6eb5731a7332a4d1 -- skipping
zdb_blkptr_cb: Got error 97 reading <2271, 0, 5, 0> DVA[0]=<0:c2f30ab000:1000> DVA[1]=<0:c600436000:1000> [L5 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique double size=20000L/1000P birth=7289L/7289P fill=337300 cksum=85bc497d18:1eb5fc0b1421b:38938f1daa6522b:5ad4e58754321611 -- skipping
zdb_blkptr_cb: Got error 97 reading <3310, 4, 1, 0> DVA[0]=<0:e0956c000:2000> DVA[1]=<0:120086c000:2000> [L1 ZFS directory] fletcher4 lz4 unencrypted LE contiguous unique double size=20000L/2000P birth=7322L/7322P fill=129 cksum=288290d57d8:bc9ebda8906ed:200f1da7dabb56ec:4fcfb4af9ef377a4 -- skipping
zdb_blkptr_cb: Got error 97 reading <3722, 0, 0, 0> DVA[0]=<0:600a59000:1000> DVA[1]=<0:a000c5000:1000> [L0 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique double size=4000L/1000P birth=7302L/7302P fill=28 cksum=aa07fc9336:1ad004ec7af20:25a14bccd8cf7cf:61322f0ae33d86ad -- skipping
zdb_blkptr_cb: Got error 97 reading <3722, 0, 0, 2> DVA[0]=<0:601948000:1000> DVA[1]=<0:a05199000:1000> [L0 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique double size=4000L/1000P birth=7316L/7316P fill=20 cksum=67169bc899:13f93c35f3010:1ff78fe1b055272:31b6e7e44bb229c0 -- skipping
zdb_blkptr_cb: Got error 97 reading <3722, 139, 0, 0> DVA[0]=<0:ca000f6000:1000> [L0 ZFS plain file] fletcher4 uncompressed unencrypted LE contiguous unique single size=800L/800P birth=7298L/7298P fill=1 cksum=8af8a441c3:9476600aa3da:63e5ffe2b26478:3244cdb4fc8d9b34 -- skipping
zdb_blkptr_cb: Got error 97 reading <3722, 0, 0, 9> DVA[0]=<0:600881000:1000> DVA[1]=<0:a000ba000:1000> [L0 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique double size=4000L/1000P birth=7300L/7300P fill=11 cksum=4f4ecf4565:10844f3c60c42:1bdd551a4002c08:fb9b06c01f06226f -- skipping
zdb_blkptr_cb: Got error 97 reading <3722, 385, 0, 0> DVA[0]=<0:e09567000:2000> [L0 ZFS plain file] fletcher4 uncompressed unencrypted LE contiguous unique single size=1400L/1400P birth=7322L/7322P fill=1 cksum=d3768c188b:21a57dbb50d02:37b485d3c25dc20:5c04a9cd9a910a53 -- skipping
zdb_blkptr_cb: Got error 97 reading <3722, 760, 0, 0> DVA[0]=<0:ca00280000:1000> [L0 ZFS plain file] fletcher4 uncompressed unencrypted LE contiguous unique single size=400L/400P birth=7322L/7322P fill=1 cksum=97317b812:7d938af334d:364ea0c01c20e:1030d470306d731 -- skipping

Now, how do I find out which files (or whatever else) are affected, in order to
fix them? :)

I tried to get a detailed log from zdb with all the DVAs and checksums, but I
could not find any match.
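
One possible starting point, as a rough sketch rather than a tested recipe: as
far as I can tell, the four numbers in angle brackets on each zdb_blkptr_cb
line are the block's bookmark, <objset, object, level, blkid>, so the affected
objects can at least be grouped per dataset. The script below only parses the
log text; the names BadBlock, parse_zdb_errors and group_by_object are made up
for illustration, and blkid is kept as a string because I believe zdb prints
it in hex.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class BadBlock(NamedTuple):
    """One "Got error" line; field names are illustrative, not zdb's."""
    errno: int
    objset: int   # dataset (object set) ID
    obj: int      # object number inside that dataset
    level: int    # indirection level (0 = data block)
    blkid: str    # kept as text: assumed to be printed in hex
    kind: str     # e.g. "ZFS plain file", "DMU dnode"


# Assumed line layout: "... reading <objset, object, level, blkid> ... [L<n> <type>] ..."
LINE_RE = re.compile(
    r"Got error (\d+) reading "
    r"<(\d+), (\d+), (-?\d+), ([0-9a-fA-F]+)>"
    r".*?\[L\d+ ([^\]]+)\]"
)


def parse_zdb_errors(text):
    """Return a BadBlock for every zdb_blkptr_cb error line found in text."""
    return [
        BadBlock(int(m[1]), int(m[2]), int(m[3]), int(m[4]), m[5], m[6])
        for m in LINE_RE.finditer(text)
    ]


def group_by_object(blocks):
    """Group bad blocks by (objset, object) so each object is listed once."""
    grouped = defaultdict(list)
    for block in blocks:
        grouped[(block.objset, block.obj)].append(block)
    return dict(grouped)


# Two lines copied from the log above, used as sample input.
SAMPLE = """\
zdb_blkptr_cb: Got error 97 reading <3722, 139, 0, 0> DVA[0]=<0:ca000f6000:1000> [L0 ZFS plain file] fletcher4 uncompressed unencrypted LE contiguous unique single size=800L/800P birth=7298L/7298P fill=1 cksum=8af8a441c3:9476600aa3da:63e5ffe2b26478:3244cdb4fc8d9b34 -- skipping
zdb_blkptr_cb: Got error 97 reading <3722, 0, 0, 2> DVA[0]=<0:601948000:1000> DVA[1]=<0:a05199000:1000> [L0 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique double size=4000L/1000P birth=7316L/7316P fill=20 cksum=67169bc899:13f93c35f3010:1ff78fe1b055272:31b6e7e44bb229c0 -- skipping
"""

if __name__ == "__main__":
    for (objset, obj), blocks in sorted(group_by_object(parse_zdb_errors(SAMPLE)).items()):
        kinds = ", ".join(sorted({b.kind for b in blocks}))
        print(f"objset {objset} object {obj}: {len(blocks)} bad block(s) [{kinds}]")
```

With the (objset, object) pairs in hand, "zdb -d rpool" should list the
datasets together with their IDs, and "zdb -dddd <dataset> <object>" should
dump the object, including its path for plain files and directories; I have
not verified either against this pool. The "DMU dnode" entries on object 0
are blocks of the dnode array rather than files: assuming 512-byte dnodes, a
16K (size=4000L) L0 block holds 32 of them, so blkid 2 would cover objects
64-95 and blkid 9 objects 288-319.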

-- 
jimmy


