Subject: Re: ZFS...
To: Michelle Sullivan, Paul Mather
Cc: freebsd-stable
From: Steven Hartland <killing@multiplay.co.uk>
Date: Wed, 1 May 2019 18:39:39 +0100

On 01/05/2019 15:53, Michelle Sullivan wrote:
> Paul Mather wrote:
>> On Apr 30, 2019, at 11:17 PM, Michelle Sullivan wrote:
>>
>>> Been there, done that, though with ext2 rather than UFS... still got
>>> all my data back... even though it was a nightmare.
>>
>> Is that an implication that had all your data been on UFS (or ext2)
>> this time around you would have got it all back?  (I've got that
>> impression through this thread from things you've written.)  That sort
>> of makes it sound like UFS is bulletproof to me.
>
> It's definitely not (and far from it) bulletproof - however, when the
> data on disk is not corrupt I have managed to recover it - even if it
> has been a nightmare - no structure - all files in lost+found, etc...
> or even resorting to R-Studio in the event of lost RAID information, etc.
Yes, but you seem to have done this with ZFS too, just not in this
particularly bad case. If you imagine that the in-memory update for the
metadata was corrupted and then written out to disk, which is what you
seem to have experienced with your ZFS pool, then you'd be in much the
same position.

> This case - from what my limited knowledge has managed to fathom - is
> a spacemap that has become corrupt due to a partial write during the
> hard power failure. This was the second hard outage during the
> resilver process following a drive platter failure (on a RAIDZ2 - so a
> single platter failure should be completely recoverable in all cases,
> except HBA failure or other corruption, which does not appear to be
> the case).. the spacemap fails checksum (no surprise there, given that
> it was part-written), however it cannot be repaired (for whatever
> reason)... Now, I get that this is an interesting case... one cannot
> just assume anything about the corrupt spacemap... it could be
> complete and just the checksum is wrong, or it could be completely
> corrupt and ignorable.. but from what I understand of ZFS (and please,
> watchers, chime in if I'm wrong) the spacemap is just the free-space
> map.. if it is corrupt or missing one cannot just 'fix it', because
> there is a very good chance that the fix would corrupt something that
> is actually allocated, and therefore the best solution (to "fix it")
> would be to consider it 100% full and therefore 'dead space'.. but zfs
> doesn't do that - probably a good thing - the result being that a
> drive that is supposed to be good (and zdb reports some 36m+ objects
> there) becomes completely unreadable... my thought (desire/want) on a
> 'walk' tool would be a last-resort tool that could walk the datasets
> and send them elsewhere (like zfs send) so that I could create a new
> pool elsewhere and send the data it knows about to another pool and
> then blow away the original - if there are corruptions or data
> missing, that's my problem; it's a last resort.. but in the case where
> the critical structures become corrupt it means a local recovery
> option is available.. it means that if the data is all there and the
> corruption is just a spacemap, one can transfer the entire drive/data
> to a new pool whilst the original host is rebuilt... this would
> *significantly* help most people with large pools that have to blow
> them away and re-create the pools because of errors/corruptions etc...
> and with the addition of 'rsync' (the checksumming of files) it would
> be trivial to just 'fix' the data corrupted or missing from a mirror
> host rather than transferring the entire pool from (possibly)
> offsite....

From what I've read that's not a partial write issue, as in that case
the pool would have just rolled back. It sounds more like the write was
successful, but the data in that write was trashed due to your power
incident, and that was replicated across ALL drives.

To be clear, this may or may not be what you're seeing, as you don't
seem to have covered any of the details of the issues you're seeing or
what steps, in detail, you have tried to recover with. I'm not saying
this is the case, but all may not be lost depending on the exact nature
of the corruption.

For more information on space maps see:
https://www.delphix.com/blog/delphix-engineering/openzfs-code-walk-metaslabs-and-space-maps
https://sdimitro.github.io/post/zfs-lsm-flushing/

A similar behavior turned out to be caused by a bug:
https://www.reddit.com/r/zfs/comments/97czae/zfs_zdb_space_map_errors_on_unmountable_zpool/

    Regards
    Steve
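
P.S. In case it's useful, here is a rough sketch of what I'd try first,
purely as an illustration and not a guaranteed procedure - "storage" is
a placeholder pool name, and exact option support varies with the ZFS
version in use:

    # Dry-run of the built-in recovery rewind: asks ZFS whether it could
    # discard the last few transaction groups and import from an older,
    # consistent state, without actually doing so yet.
    zpool import -F -n storage

    # Inspect the metaslabs/space maps of the (exported) pool with zdb,
    # which may show whether the damage really is limited to a space map.
    zdb -e -m storage

    # A read-only import avoids allocating (and so writing via the space
    # maps), and in some cases will succeed where a normal import fails,
    # letting you copy the data off.
    zpool import -o readonly=on -f -R /mnt storage

With a read-only import you can zfs send any existing snapshots to a new
pool, or simply rsync the mounted filesystems elsewhere - essentially the
"walk the datasets and send them elsewhere" you describe, using the tools
that already exist.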