Date:      Tue, 30 Apr 2019 10:41:17 +1000
From:      Michelle Sullivan <michelle@sorbs.net>
To:        Alan Somers <asomers@freebsd.org>
Cc:        freebsd-stable <freebsd-stable@freebsd.org>
Subject:   Re: ZFS...
Message-ID:  <56833732-2945-4BD3-95A6-7AF55AB87674@sorbs.net>
In-Reply-To: <CAOtMX2gf3AZr1-QOX_6yYQoqE-H+8MjOWc=eK1tcwt5M3dCzdw@mail.gmail.com>
References:  <30506b3d-64fb-b327-94ae-d9da522f3a48@sorbs.net> <CAOtMX2gf3AZr1-QOX_6yYQoqE-H+8MjOWc=eK1tcwt5M3dCzdw@mail.gmail.com>

Comments inline...

Michelle Sullivan
http://www.mhix.org/
Sent from my iPad

> On 30 Apr 2019, at 03:06, Alan Somers <asomers@freebsd.org> wrote:
> 
>> On Mon, Apr 29, 2019 at 10:23 AM Michelle Sullivan <michelle@sorbs.net> wrote:
>> 
>> I know I'm not going to be popular for this, but I'll just drop it here
>> anyhow.
>> 
>> http://www.michellesullivan.org/blog/1726
>> 
>> Perhaps one should reconsider either:
>> 
>> 1. Looking at tools that may be able to recover corrupt ZFS metadata, or
>> 2. Defaulting to non-ZFS filesystems on install.
>> 
>> --
>> Michelle Sullivan
>> http://www.mhix.org/
> 
> Wow, losing multiple TB sucks for anybody.  I'm sorry for your loss.
> But I want to respond to a few points from the blog post.
> 
> 1) When ZFS says that "the data is always correct and there's no need
> for fsck", they mean metadata as well as data.  The spacemap is
> protected in exactly the same way as all other data and metadata (to
> be pedantically correct, the labels and uberblocks are protected in a
> different way, but still protected).  The only way to get metadata
> corruption is due to a disk failure (3-disk failure when using RAIDZ2),
> or due to a software bug.  Sadly, those do happen, and they're
> devilishly tricky to track down.  The difference between ZFS and older
> filesystems is that older filesystems experience corruption during
> power loss _by_design_, not merely due to software bugs.  A perfectly
> functioning UFS implementation will experience corruption during power
> loss, and that's why it needs to be fscked.  It's not just
> theoretical, either.  I use UFS on my development VMs, and they
> frequently experience corruption after a panic (which happens all the
> time because I'm working on kernel code).

I know, which is why I have ZVOLs with UFS filesystems in them for the
development VMs... In a perfect world the power would have been fine, the
UPSes would not have been damaged, and the generator would not have run out
of fuel because of the extended outage... in fact, if it were a perfect
world I wouldn't have my own mini DC at home.
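
For reference, that setup is roughly the following sketch (the pool name
"tank" and the volume size are hypothetical, not my actual layout):

    # carve out a 20G zvol to act as the VM's virtual disk
    zfs create -V 20G tank/vm-disk0
    # the guest formats it as UFS; done from the host it would be
    newfs -U /dev/zvol/tank/vm-disk0

That way the guest gets UFS semantics (and fsck) while ZFS provides
checksumming and snapshots underneath.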

> 
> 2) Backups are essential with any filesystem, not just ZFS.  After
> all, no amount of RAID will protect you from an accidental "rm -rf /".

You only do it once... I did it back in 1995... haven't ever done it again.
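
(For completeness, a minimal replication-style backup sketch, assuming a
local pool "tank" and a remote "backuphost" with a pool named "backup",
both hypothetical names:

    # snapshot everything, then send the full tree to another machine
    zfs snapshot -r tank@nightly
    zfs send -R tank@nightly | ssh backuphost zfs receive -du backup

Incremental runs would add "-i" with the previous snapshot to zfs send.)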

> 
> 3) ZFS hotspares can be swapped in automatically, though they aren't by
> default.  It sounds like you already figured out how to assign a spare
> to the pool.  To use it automatically, you must set the "autoreplace"
> pool property and enable zfsd.  The latter can be done with "sysrc
> zfsd_enable="YES"".

The system was originally built on 9.0, and got upgraded throughout the
years... zfsd was not available back then.  So I get your point, but maybe
you didn't realize this blog was a history of 8+ years?
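
For anyone setting this up today, the whole thing is roughly (a sketch,
assuming a pool named "tank" and a spare disk "da6", both hypothetical):

    # attach a hot spare and let zfsd swap it in on failure
    zpool add tank spare da6
    zpool set autoreplace=on tank
    sysrc zfsd_enable="YES"
    service zfsd start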

> 
> 4) It sounds like you're having a lot of power trouble.  Have you
> tried sysutils/apcupsd from ports?

I did... Malta was notorious for it.  Hence 6kVA UPSes in the bottom of
each rack (4 racks), cross-connected with the rack next to it, and a backup
generator...  Australia on the other hand is a lot more stable (at least
where I am)... 2 power issues in 2 years... both within 10 hours... one was
a transformer, the other when some idiot took out a power pole (and I mean
actually took it out, it was literally snapped in half... how they got out
of the car and did a runner before the police or ambos got there I'll never
know).

>  It's fairly handy.  It can talk to
> a wide range of UPSes, and can be configured to do stuff like send you
> an email on power loss, and power down the server if the battery gets
> too low.
> 

It could have helped with that... but all 4 UPSes are toast now.  One
caught fire, one no longer detects AC input, and the other two I'm not even
trying after the first caught fire... the lot are being replaced on
insurance.
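
Not that it matters now, but for the replacements, a minimal apcupsd setup
is roughly (a sketch; the cable type and thresholds are assumptions, check
the UPS model):

    # /usr/local/etc/apcupsd/apcupsd.conf (excerpt)
    UPSCABLE usb
    UPSTYPE usb
    DEVICE
    # shut down below 10% charge or under 5 minutes of runtime
    BATTERYLEVEL 10
    MINUTES 5

    # enable and start the daemon
    sysrc apcupsd_enable="YES"
    service apcupsd start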

It's a catalog of errors that most wouldn't normally experience.  However,
it does show (to me) that ZFS on everything is a really bad idea...
particularly for home users, where there is unknown hardware and you know
they will mistreat it... they certainly won't have ECC RAM in laptops,
etc... unknown caching facilities, etc... it's a recipe for losing the root
drive...

Regards,

Michelle


