Date:      Fri, 12 Feb 2021 11:37:17 -0500
From:      Karl Denninger <karl@denninger.net>
To:        freebsd-fs@freebsd.org
Subject:   Re: Reading a corrupted file on ZFS
Message-ID:  <2f82f113-9ca1-99a9-a433-89e3ae5edcbe@denninger.net>
In-Reply-To: <10977ffc-f806-69dd-0cef-d4fd4fc5f649@artem.ru>
References:  <da892eeb-233f-551f-2faa-62f42c3c1d5b@artem.ru> <0ca45adf-8f60-a4c3-6264-6122444a3ffd@denninger.net> <899c6b4f-2368-7ec2-4dfe-fa09fab35447@artem.ru> <20210212165216.2f613482@fabiankeil.de> <10977ffc-f806-69dd-0cef-d4fd4fc5f649@artem.ru>

On 2/12/2021 11:22, Artem Kuchin wrote:
> 12.02.2021 18:52, Fabian Keil wrote:
>> Artem Kuchin <artem@artem.ru> wrote on 2021-02-12:
>>
>>> 12.02.2021 18:06, Karl Denninger wrote:
>>>> Blocking the read forces you to get the good copy off backup media
>>>> and thus prevents that from happening.
>>>>
>>> I know what ZFS does, and I damaged the same file in the same place
>>> on purpose. The question is how to read what's left of it. Just for
>>> kicks: I don't have a backup, and I need to read what's left. It could
>>> be a 1GB file with only one byte damaged, and it is of crazy importance
>>> to me. So, how do I bypass all the checks and make it read the file no
>>> matter what?
>> The patch from this PR adds a sysctl that allows sending corrupted
>> data:
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221909
>>
>> Using the added sysctl you can send and receive the dataset and then
>> read the corrupted file from the received dataset. Note that ZFS
>> replaces corrupted blocks completely with the 0x'zfs badd bloc' pattern
>> instead of returning the corrupted data as is, thus increasing the
>> amount of corruption in case of simple bit flips to whole blocks.
>>
>> Fabian
>
> Arghh. That's not what I want. This is strange. With a stupid old FS
> like FAT, or even the newer UFS, I can dig into a damaged file and
> collect as much data as possible, while the newer ZFS does not provide
> tools to dig into the data. That was always my concern about ZFS. If
> something goes bad with FAT/NTFS or even UFS, there are tons of tools
> that can dissect the file system into bits so I can get as much as
> possible of what's left. For ZFS there are no tools that I know of, and
> even ZFS itself does not let me get what is left of the normal data.
>
> This is frustrating. Why... why...

You created a synthetic situation that almost never exists in the real
world: ONE byte modified in all copies in the same allocation block,
with all other data in that block intact and recoverable.

In almost all actual cases of "bit rot" it's exactly that: random, and
statistically extraordinarily unlikely to hit all copies at once in the
same allocation block. Therefore, ZFS can and does fix it from an intact
copy; UFS or FAT silently returns the corrupted data, propagates it, and
eventually screws you down the road.
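
As a rough illustration of that statistical point (a sketch, not a
measurement): if each copy of a block were corrupted independently with
some small probability p, the chance that all n copies are bad at once
would be p**n. The value of p below is purely hypothetical, and real
failures are not perfectly independent, so treat the numbers as an
order-of-magnitude picture only.

```python
# Illustrative only: p is a made-up per-copy corruption probability,
# and independence between copies is an assumption.


def all_copies_bad(p: float, copies: int) -> float:
    """Probability that every copy of one block is corrupt at once."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p ** copies


if __name__ == "__main__":
    p = 1e-6  # hypothetical value, not taken from any real drive
    for copies in (1, 2, 3):
        print(f"{copies} copies: {all_copies_bad(p, copies):.3e}")
```

Each additional copy multiplies the probability by p again, which is
why hitting every copy of the same block by chance is so unlikely.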

The nearly-every-case situation in the real world, where a disk goes
physically bad (I've had this happen *dozens* of times over my IT
career), results in the drive being unable to return the block at all.
You don't get all but the bad byte back; you get nothing for that block,
and any attempt to "touch" it results in either a hard error coming back
with no data in the buffer or (if not a TLER device) a wildly extended
timeout before an I/O error is returned with, again, no usable data in
the buffer. On "old" Winchester-style spinning media and even floppy
drives this resulted in an entire physical sector (usually 512 bytes)
being irretrievably lost. In the case of a "modern" zoned or
advanced-format hard drive or an SSD, the amount of data impacted and
unreadable is typically much larger than one sector. For an SSD it is
frequently *at least* a 4k block (which can and frequently does span
multiple files!), and for many instances of rotating rust it can be an
entire *track* if the servo data is where the fault lies, which can be a
*huge* amount of data.

The patch gives you all but one allocation block of data from ZFS, with
that one block effectively zeroed. This is no worse than the usual
actual (not your synthesized test) impact of such a failure in the real
world with other filesystems, in virtually every instance where it
happens "in the wild."
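
To see which part of a file recovered this way was replaced, one could
scan it for records that consist only of the filler pattern. The sketch
below makes assumptions that do not come from this thread: that the
filler is the 64-bit word 0x2f5baddb10c repeated across the block, that
it is stored little-endian, and that the dataset uses a 128 KiB
recordsize. Check all three against your own system before relying on
the result.

```python
import struct
import sys
from typing import List, Tuple

# Assumptions (not confirmed by this thread): the filler word, its byte
# order, and the recordsize. Adjust them for your pool and platform.
BAD_WORD = struct.pack("<Q", 0x2F5BADDB10C)
RECORD_SIZE = 128 * 1024


def find_filled_blocks(data: bytes, record_size: int = RECORD_SIZE,
                       word: bytes = BAD_WORD) -> List[Tuple[int, int]]:
    """Return (offset, length) for every record made up only of `word`."""
    hits = []
    for offset in range(0, len(data), record_size):
        chunk = data[offset:offset + record_size]
        whole, rest = divmod(len(chunk), len(word))
        # Skip tails shorter than one word to avoid false positives.
        if whole and chunk == word * whole + word[:rest]:
            hits.append((offset, len(chunk)))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Reads the whole file into memory; fine for a sketch.
    with open(sys.argv[1], "rb") as f:
        for offset, length in find_filled_blocks(f.read()):
            print(f"filler at offset {offset}, {length} bytes")
```

Everything outside the reported ranges is the data that survived; the
reported ranges are what has to come from somewhere else.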

In short, there are very, very few actual "in the wild" failures where
one byte is damaged and the rest surrounding that one byte is intact and
retrievable. In most cases where an actual failure occurs, the
unreadable data constitutes *at least* a physical sector.

-- 
Karl Denninger
karl@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/



