Date:      Wed, 01 May 2019 13:17:36 +1000
From:      Michelle Sullivan <michelle@sorbs.net>
To:        Karl Denninger <karl@denninger.net>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: ZFS...
Message-ID:  <CB86C16D-87D9-4D3F-9291-1E2586246E04@sorbs.net>
In-Reply-To: <bf630074-2e68-2f8f-b69f-adf99ac5d3de@denninger.net>
References:  <30506b3d-64fb-b327-94ae-d9da522f3a48@sorbs.net> <CAOtMX2gf3AZr1-QOX_6yYQoqE-H%2B8MjOWc=eK1tcwt5M3dCzdw@mail.gmail.com> <56833732-2945-4BD3-95A6-7AF55AB87674@sorbs.net> <3d0f6436-f3d7-6fee-ed81-a24d44223f2f@netfence.it> <17B373DA-4AFC-4D25-B776-0D0DED98B320@sorbs.net> <70fac2fe3f23f85dd442d93ffea368e1@ultra-secure.de> <70C87D93-D1F9-458E-9723-19F9777E6F12@sorbs.net> <CAGMYy3tYqvrKgk2c==WTwrH03uTN1xQifPRNxXccMsRE1spaRA@mail.gmail.com> <5ED8BADE-7B2C-4B73-93BC-70739911C5E3@sorbs.net> <d0118f7e-7cfc-8bf1-308c-823bce088039@denninger.net> <2e4941bf-999a-7f16-f4fe-1a520f2187c0@sorbs.net> <CAOtMX2gOwwZuGft2vPpR-LmTpMVRy6hM_dYy9cNiw%2Bg1kDYpXg@mail.gmail.com> <34539589-162B-4891-A68F-88F879B59650@sorbs.net> <CAOtMX2iB7xJszO8nT_KU%2BrFuSkTyiraMHddz1fVooe23bEZguA@mail.gmail.com> <576857a5-a5ab-eeb8-2391-992159d9c4f2@denninger.net> <A7928311-8F51-4C72-839C-C9C2BA62C66E@sorbs.net> <b0fa0f8e-dc45-9d66-cc48-c733cbb9645b@denninger.net> <FD9802E0-E2E4-464A-8ABD-83B0A21C08F2@sorbs.net> <bf63007@sorbs.net>



Michelle Sullivan
http://www.mhix.org/
Sent from my iPad

> On 01 May 2019, at 12:37, Karl Denninger <karl@denninger.net> wrote:
>
> On 4/30/2019 20:59, Michelle Sullivan wrote:
>>> On 01 May 2019, at 11:33, Karl Denninger <karl@denninger.net> wrote:
>>>
>>>> On 4/30/2019 19:14, Michelle Sullivan wrote:
>>>>
>>>> Michelle Sullivan
>>>> http://www.mhix.org/
>>>> Sent from my iPad
>>>>
>>> Nope.  I'd much rather *know* the data is corrupt and be forced to
>>> restore from backups than to have SILENT corruption occur and perhaps
>>> screw me 10 years down the road when the odds are my backups have
>>> long-since been recycled.
>> Ahh yes the be all and end all of ZFS.. stops the silent corruption of
>> data.. but don’t install it on anything unless it’s server grade with
>> backups and ECC RAM, but it’s good on laptops because it protects you
>> from silent corruption of your data when 10 years later the backups
>> have long-since been recycled...  umm is that not a circular argument?
>>
>> Don’t get me wrong here.. and I know you (and some others are) zfs in
>> the DC with 10s of thousands in redundant servers and/or backups to
>> keep your critical data corruption free = good thing.
>>
>> ZFS on everything is what some say (because it prevents silent
>> corruption) but then you have default policies to install it
>> everywhere .. including hardware not equipped to function safely with
>> it (in your own arguments) and yet it’s still good because it will
>> still prevent silent corruption even though it relies on hardware that
>> you can trust...  umm say what?
>>
>> Anyhow veered way way off (the original) topic...
>>
>> Modest (part consumer grade, part commercial) suffered irreversible
>> data loss because of a (very unusual, but not impossible) double power
>> outage.. and no tools to recover the data (or part data) unless you
>> have some form of backup because the file system deems the corruption
>> to be too dangerous to let you access any of it (even the known good
>> bits) ...
>>
>> Michelle
>
> IMHO you're dead wrong Michelle.  I respect your opinion but disagree
> vehemently.

I guess we’ll have to agree to disagree then, but I think pronouncing me
“dead wrong” is short-sighted, because it smacks of “I’m right because
ZFS is the answer to all problems.” .. I’ve been around in the industry
long enough to see a variety of issues... some disasters, some not so...

I also should know better than to run without backups, but financial
constraints precluded them.... as they will for many non-commercial
people.

>
> I run ZFS on both of my laptops under FreeBSD.  Both have
> non-power-protected SSDs in them.  Neither is mirrored or Raidz-anything.
>
> So why run ZFS instead of UFS?
>
> Because a scrub will detect data corruption that UFS cannot detect *at
> all.*

I get it, I really do, but that has to be balanced against this: if you
can’t rebuild it, make sure you have (tested and working) backups and be
prepared for downtime when such corruption does occur.

>
> It is a balance-of-harms test and you choose.  I can make a very clean
> argument that *greater information always wins*; that is, I prefer in
> every case to *know* I'm screwed rather than not.  I can defend against
> being screwed with some amount of diligence but in order for that
> diligence to be reasonable I have to know about the screwing in a
> reasonable amount of time after it happens.

Not disagreeing (and have not been.)

>
> You may have never had silent corruption bite you.

I have... but not with data on disks..  most of my silent corruption
issues have been a layer or two above the hardware... like Subversion
commits overwriting previous commits without notification (damn I wish I
could reliably reproduce it!)


>   I have had it happen
> several times over my IT career.  If that happens to you the odds are
> that it's absolutely unrecoverable and whatever gets corrupted is
> *gone.*

With every drive corruption I have suffered in my career I have been able
to recover all or part of the data, except where the hardware itself was
totally hosed (i.e. only clean-room options available)... even with
btrfs.. yuk.. puck.. yuk.. oh what a mess that was...  still get
nightmares on that one...  but I still managed to get most of the data
off... in fact I put it onto this machine I currently have problems with..
so after the nightmare of btrfs, looks like zfs eventually nailed me.


>   The defensive measures against silent corruption require
> retention of backup data *literally forever* for the entire useful life
> of the information because from the point of corruption forward *the
> backups are typically going to be complete and correct copies of the
> corrupt data and thus equally worthless to what's on the disk itself.*
> With non-ZFS filesystems quite a lot of thought and care has to go into
> defending against that, and said defense usually requires the active
> cooperation of whatever software wrote said file in the first place

Say what?

> (e.g. a database, etc.)

So dbs (any?) talk actively to the file systems (any?) to actively
prevent silent corruption?

Lol...

I’m guessing you are actually talking about internal checks and balances
of data in the DB to ensure that data retrieved from disk is not
corrupt/altered...  you know, like writing sha256 checksums of files you
might download from the internet to ensure you got what you asked for and
it wasn’t changed/altered in transit.
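
That kind of download check can be sketched in a few lines of Python. This is only an illustration of the idea, not anything a particular database does internally; the function names `sha256_of` and `matches` are made up for the example, and the expected digest is assumed to come from wherever the publisher lists it.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files do not have to fit in RAM.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True when the file's digest equals the published (expected) digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A mismatch only says the bytes differ from what was published; like a scrub on a non-redundant pool, it detects the damage but cannot repair it.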

>   If said software has no tools to "walk" said
> data or if it's impractical to have it do so you're at severe risk of
> being hosed.

Umm what?  I’m talking about a userland (libzfs) tool (i.e. one that
doesn’t need the pool imported), something like zfs send (which does
require the pool to be imported - hence me not calling it a userland
tool), that allows sending whatever data can be found to other places,
where it can either be blindly recovered (corruption might be present) or
be used to locate files/paths etc. that are known to be good (checksums
match etc.).. walk the structures, feed the data elsewhere where it can be
examined/recovered... don’t alter it.... it’s a last resort tool for when
you don’t have working backups..
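
The shape of such a tool (read only, verify, copy elsewhere, never write to the source) can be shown with a file-level analogy in Python. This sketch does not use libzfs and does not read pool structures; it walks an ordinary directory tree, and the `manifest` of expected digests is a hypothetical input standing in for the checksums the filesystem would already hold. The destination is assumed to be outside the source tree.

```python
import hashlib
import shutil
from pathlib import Path


def file_sha256(path):
    """Hex SHA-256 digest of one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def salvage(source, dest, manifest):
    """Copy files under source into dest/good or dest/suspect.

    manifest maps relative POSIX paths to expected hex digests
    (hypothetical input). The source tree is only read, never modified.
    """
    source, dest = Path(source), Path(dest)
    report = {"good": [], "suspect": [], "unreadable": []}
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(source).as_posix()
        try:
            actual = file_sha256(path)
            # Files whose digest matches are known good; the rest are
            # still copied, but kept apart for later examination.
            bucket = "good" if manifest.get(rel) == actual else "suspect"
            target = dest / bucket / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(path, target)
        except OSError:
            # Record the failure and keep walking instead of giving up.
            report["unreadable"].append(rel)
            continue
        report[bucket].append(rel)
    return report
```

The point of the design is that one bad file ends up in a report rather than blocking access to everything else.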

>   Prior to ZFS there really wasn't any comprehensive defense
> against this sort of event.  There are a whole host of applications that
> manipulate data that are absolutely reliant on that sort of thing not
> happening (e.g. anything using a btree data structure) and recovery if
> it *does* happen is a five-alarm nightmare if it's possible at all.  In
> the worst-case scenario you don't detect the corruption and the data
> that has the pointer to it that gets corrupted is overwritten and
> destroyed.
>
> A ZFS scrub on a volume that has no redundancy cannot *fix* that
> corruption but it can and will detect it.

So you’re advocating restore from backup for every corruption ... ok...


>   This puts a boundary on the
> backups that I must keep in order to *not* have that happen.  This is of
> very high value to me and is why, even on systems without ECC memory and
> without redundant disks, provided there is enough RAM to make it
> reasonable (e.g. not on embedded systems I do development on which are
> severely RAM-constrained) I run ZFS.
>
> BTW if you've never had a UFS volume unlink all the blocks within a file
> on an fsck and then recover them back into the free list after a crash
> you're a rare bird indeed.  If you think a corrupt ZFS volume is fun try
> to get your data back from said file after that happens.

Been there done that though with ext2 rather than UFS..  still got all my
data back... even though it was a nightmare..


>
> --
> Karl Denninger
> karl@denninger.net
> /The Market Ticker/
> /[S/MIME encrypted email preferred]/


