Date:      Tue, 18 Oct 2016 00:32:20 +0100
From:      Steven Hartland <killing@multiplay.co.uk>
To:        freebsd-stable@freebsd.org
Subject:   Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE
Message-ID:  <4d4909b7-c44b-996e-90e1-ca446e8e4813@multiplay.co.uk>
In-Reply-To: <f9a4a12d-62df-482d-feeb-9d9f64de3e55@denninger.net>
References:  <3d4f25c9-a262-a373-ec7e-755325f8810b@denninger.net> <9adecd24-6659-0da5-5c05-d0d3957a2cb3@denninger.net> <CANCZdfq5QCDNhLY5GOpmBoh5ONYy2VPteuaMhQ2=3v%2B0vcoM0g@mail.gmail.com> <0f58b11f-0bca-bc08-6f90-4e6e530f9956@denninger.net> <43a67287-f4f8-5d3e-6c5e-b3599c6adb4d@multiplay.co.uk> <76551fd6-0565-ee6c-b0f2-7d472ad6a4b3@denninger.net> <25ff3a3e-77a9-063b-e491-8d10a06e6ae2@multiplay.co.uk> <26e092b2-17c6-8744-5035-d0853d733870@denninger.net> <d2afc0b0-0e7f-e7ac-fb21-fa4ffd1c1003@multiplay.co.uk> <f9a4a12d-62df-482d-feeb-9d9f64de3e55@denninger.net>



On 17/10/2016 22:50, Karl Denninger wrote:
> I will make some effort on the sandbox machine to see if I can come up
> with a way to replicate this.  I do have plenty of spare larger drives
> lying around that used to be in service and were retired due to
> capacity -- but what I don't know is whether the system will misbehave
> if the source is all spinning rust.
>
> In other words:
>
> 1. Root filesystem is mirrored spinning rust (production is mirrored SSDs)
>
> 2. Backup is mirrored spinning rust (of approx the same size)
>
> 3. Set up auto-snapshot exactly as the production system has it now
> (which the sandbox currently does NOT, since I don't care about
> incremental recovery on that machine; it's a sandbox!)
>
> 4. Run a bunch of build-somethings (e.g. buildworlds, cross-builds for
> the Pi2s I have here, etc.) to generate a LOT of filesystem entropy
> across lots of snapshots.
>
> 5. Back that up.
>
> 6. Export the backup pool.
>
> 7. Re-import it and "zfs destroy -r" the backup filesystem (a rough
> command sketch follows).
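>
> As a minimal sketch of those steps -- the device, pool, and snapshot
> names here are placeholders, not the actual ones:
>
>     # 2: create the mirrored spinning-rust backup pool
>     zpool create backup mirror /dev/ada2 /dev/ada3
>     # 3-4: enable periodic snapshots, then churn the source filesystems
>     zfs snapshot -r zroot@backup-20161017
>     # 5: back it up with a full replication stream
>     zfs send -R zroot@backup-20161017 | zfs recv -F backup/zroot
>     # 6-7: export, re-import, then destroy -- this is where it panics
>     zpool export backup
>     zpool import backup
>     zfs destroy -r backup/zroot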
>
> That is what got me into a reboot loop after the *first* panic; I was
> simply going to destroy the backup filesystem and re-run the backup, but
> as soon as I issued that zfs destroy the machine panic'd, and as soon as
> I re-attached it after a reboot it panic'd again.  Repeat until I set
> trim=0.
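>
> (For anyone following along: "trim=0" here is the ZFS TRIM loader
> tunable -- on stock FreeBSD 11 that's vfs.zfs.trim.enabled, set in
> /boot/loader.conf and effective after a reboot:
>
>     # /boot/loader.conf -- boot-time tunable
>     vfs.zfs.trim.enabled=0
>
> which keeps the kernel from queueing TRIM requests at all.)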
>
> But... even if I CAN replicate it, that still shouldn't be happening,
> and the system should *certainly* survive attempting to TRIM on a vdev
> that doesn't support TRIM, even if the removal frees a large amount of
> space and/or many files on the target, without blowing up.
>
> BTW I bet it isn't that rare -- it can hit anyone who takes timed
> snapshots on an active filesystem (with lots of entropy) and then
> removes those snapshots (as happens with a zfs destroy -r, or with a
> zfs recv of an incremental copy that has to sync against the source)
> on a pool that was imported before the system realized that TRIM is
> unavailable on those vdevs.
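>
> The incremental-replication case is the usual pattern -- a sketch with
> placeholder pool and snapshot names:
>
>     # send only the delta between two snapshots; on the receiving
>     # side, -F rolls the target back and destroys snapshots that no
>     # longer exist on the source, generating exactly the frees that
>     # get queued for TRIM
>     zfs send -R -i backup-20161016 zroot@backup-20161017 | \
>         zfs recv -F backup/zroot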
>
> Noting this:
>
>      Yes, I need to find some time to have a look at it, but given how
>      rare this is and with TRIM being re-implemented upstream in a
>      totally different manner I'm reluctant to spend any real time on it.
>
> What's in progress in this regard, if you happen to have a reference?
Looks like it may still be in review: https://reviews.csiden.org/r/263/



