Date:      Sun, 10 Mar 2019 13:34:55 +1100
From:      Michelle Sullivan <michelle@sorbs.net>
To:        Ben RUBSON <ben.rubson@gmail.com>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>, Stefan Esser <se@freebsd.org>
Subject:   Re: ZFS pool faulted (corrupt metadata) but the disk data appears ok...
Message-ID:  <FF6E0ABD-48BB-4AB6-814C-925157876977@sorbs.net>
In-Reply-To: <3be04f0b-bded-9b77-896b-631824a14c4a@sorbs.net>
References:  <54D3E9F6.20702@sorbs.net> <54D41608.50306@delphij.net> <54D41AAA.6070303@sorbs.net> <54D41C52.1020003@delphij.net> <54D424F0.9080301@sorbs.net> <54D47F94.9020404@freebsd.org> <54D4A552.7050502@sorbs.net> <54D4BB5A.30409@freebsd.org> <54D8B3D8.6000804@sorbs.net> <54D8CECE.60909@freebsd.org> <54D8D4A1.9090106@sorbs.net> <54D8D5DE.4040906@sentex.net> <54D8D92C.6030705@sorbs.net> <54D8E189.40201@sorbs.net> <54D924DD.4000205@sorbs.net> <54DCAC29.8000301@sorbs.net> <9c995251-45f1-cf27-c4c8-30a4bd0f163c@sorbs.net> <8282375D-5DDC-4294-A69C-03E9450D9575@gmail.com> <73dd7026-534e-7212-a037-0cbf62a61acd@sorbs.net> <FAB7C3BA-057F-4AB4-96E1-5C3208BABBA7@gmail.com> <027070fb-f7b5-3862-3a52-c0f280ab46d1@sorbs.net> <42C31457-1A84-4CCA-BF14-357F1F3177DA@gmail.com> <5eb35692-37ab-33bf-aea1-9f4aa61bb7f7@sorbs.net> <3be04f0b-bded-9b77-896b-631824a14c4a@sorbs.net>

Turns out the cause of the fire is now known...

https://www.southcoastregister.com.au/story/5945663/homes-left-without-power-after-electrical-pole-destroyed-in-sanctuary-point-accident/
The UPSs couldn't deal with 11 kV down the 240 V line... (guess I'm lucky no one was killed..)

Anyhow..





Any clues on how to get the pool back would be greatly appreciated.. The "cannot open" disk was the faulted disk that mfid13 was replacing... being raidz2, all the data should still be there.
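
For what it's worth, this is the rough order I'm thinking of trying things in - only a sketch, and it assumes the pool is still the one called "storage" from the quoted message below (I haven't confirmed any of this on this particular pool):

  # try a read-only import first, so nothing further is written to the pool
  zpool import -o readonly=on -f storage

  # dry-run the recovery rewind to see what -F would discard, without changing anything
  zpool import -Fn storage

  # if the dry run looks sane, do the rewind for real, without mounting any datasets
  zpool import -F -N storage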



Michelle Sullivan
http://www.mhix.org/
Sent from my iPad

> On 10 Mar 2019, at 12:29, Michelle Sullivan <michelle@sorbs.net> wrote:
>
> Michelle Sullivan wrote:
>> Ben RUBSON wrote:
>>>> On 02 Feb 2018 21:48, Michelle Sullivan wrote:
>>>>
>>>> Ben RUBSON wrote:
>>>>
>>>>> So disks died because of the carrier, as I assume the second unscathed server was OK...
>>>>
>>>> Pretty much.
>>>>
>>>>> Heads must have scratched the platters, but they should have been parked, so... Really strange.
>>>>
>>>> You'd have thought... though 2 of the drives look like wear and tear issues (the 2 not showing red lights) just not picked up by the periodic scrub... Could be that the recovery showed that one up... you know - how you can have an array working fine, but one disk dies and then others fail during the rebuild because of the extra workload.
>>>
>>> Yes... To try to mitigate this, when I add a new vdev to a pool, I spread the new disks I have among the existing vdevs, and construct the new vdev with the remaining new disk(s) + other disks retrieved from the other vdevs. Thus, when possible, I avoid vdevs with all disks at the same runtime.
>>> However, I only use mirrors; applying this with raid-Z could be a little bit more tricky...
>>>
>> Believe it or not...
>>
>> # zpool status -v
>>  pool: VirtualDisks
>> state: ONLINE
>> status: One or more devices are configured to use a non-native block size.
>>    Expect reduced performance.
>> action: Replace affected devices with devices that support the
>>    configured block size, or migrate data to a properly configured
>>    pool.
>>  scan: none requested
>> config:
>>
>>    NAME                       STATE     READ WRITE CKSUM
>>    VirtualDisks               ONLINE       0     0     0
>>      zvol/sorbs/VirtualDisks  ONLINE       0     0     0  block size: 512B configured, 8192B native
>>
>> errors: No known data errors
>>
>>  pool: sorbs
>> state: ONLINE
>>  scan: resilvered 2.38T in 307445734561816429h29m with 0 errors on Sat Aug 26 09:26:53 2017
>> config:
>>
>>    NAME                  STATE     READ WRITE CKSUM
>>    sorbs                 ONLINE       0     0     0
>>      raidz2-0            ONLINE       0     0     0
>>        mfid0             ONLINE       0     0     0
>>        mfid1             ONLINE       0     0     0
>>        mfid7             ONLINE       0     0     0
>>        mfid8             ONLINE       0     0     0
>>        mfid12            ONLINE       0     0     0
>>        mfid10            ONLINE       0     0     0
>>        mfid14            ONLINE       0     0     0
>>        mfid11            ONLINE       0     0     0
>>        mfid6             ONLINE       0     0     0
>>        mfid15            ONLINE       0     0     0
>>        mfid2             ONLINE       0     0     0
>>        mfid3             ONLINE       0     0     0
>>        spare-12          ONLINE       0     0     3
>>          mfid13          ONLINE       0     0     0
>>          mfid9           ONLINE       0     0     0
>>        mfid4             ONLINE       0     0     0
>>        mfid5             ONLINE       0     0     0
>>    spares
>>      185579620420611382  INUSE     was /dev/mfid9
>>
>> errors: No known data errors
>>
>>
>> It would appear that when I replaced the damaged drives, it picked one of them up as being rebuilt from back in August (before it was packed up to go), and that was why it saw it as 'corrupted metadata' and spent the last 3 weeks importing it; it rebuilt it as it was importing it.. no data loss that I can determine. (Literally just finished in the middle of the night here.)
>>
>
> And back to this little nutmeg...
>
> We had a fire last night ... and it (the same pool) was resilvering again... Corrupted the metadata.. import -fFX worked and it started rebuilding, then during the early hours, when the pool was at 50%(ish) rebuilt/resilvered (one vdev), there was at least one more issue on the powerline... The UPSs went out after multiple hits and now I can't get it imported - the server was in single-user mode - on a FBSD-12 USB stick ... so it was only resilvering... "zdb -AAA -L -uhdi -FX -e storage" returns sanely...
>
> Anyone have any thoughts on how I might get the data back / the pool to import? (zpool import -fFX storage spends a long time working and eventually comes back with "unable to import as one or more of the vdevs are unavailable" - however they are all there as far as I can tell.)
>
> Thanks,
>
> -- 
> Michelle Sullivan
> http://www.mhix.org/
>
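
PS: for the "one or more of the vdevs are unavailable" message quoted above, a couple of read-only checks that might show whether all the member disks are actually being seen - again just a sketch, and mfid13 below is only one example member, any of the mfidN devices should do:

  # list pools visible for import and which device each vdev member was found on
  zpool import -d /dev

  # dump the ZFS labels from one member disk; every member should report
  # the same pool GUID and a consistent vdev tree
  zdb -l /dev/mfid13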


