From owner-freebsd-stable@FreeBSD.ORG Tue Jul  2 07:57:33 2013
Date: Tue, 2 Jul 2013 00:57:16 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Andriy Gapon
Cc: freebsd-stable List, Scott Sipe, Paul Mather
Subject: Re: ZFS Panic after freebsd-update
Message-ID: <20130702075716.GA79876@icarus.home.lan>
In-Reply-To: <51D26C5C.4000107@FreeBSD.org>
References: <20130701154925.GA64899@icarus.home.lan>
 <20130701170422.GA65858@icarus.home.lan>
 <51D1C625.1030401@FreeBSD.org>
 <20130701185033.GB67450@icarus.home.lan>
 <51D26C5C.4000107@FreeBSD.org>
List-Id: Production branch of FreeBSD source code

On Tue, Jul 02, 2013 at 08:59:56AM +0300, Andriy Gapon wrote:
> on 01/07/2013 21:50 Jeremy Chadwick said the following:
> > The issue is that ZFS on FreeBSD is still young compared to other
> > filesystems (specifically UFS).
>
> That's a fact.
>
> > Nothing is perfect, but FFS/UFS tends to have a significantly
> > larger number of bugs worked out of it to the point where people
> > can use it without losing sleep (barring the SUJ stuff, don't get
> > me started).
>
> That's subjective.
>
> > I have the same concerns over other things, like ext2fs and fusefs
> > for that matter -- but this thread is about a ZFS-related crash,
> > and that's why I'm "over-focused" on it.
>
> I have an impression that you seem to state your (negative) opinion
> of ZFS in every other thread about ZFS problems.

The OP in question ended his post with the line "Thoughts?", and I
have given those thoughts. My thoughts/opinions/experience may differ
from that of others. Diversity of thoughts/opinions/experiences is
good. I'm not some kind of "authoritative ZFS guru" -- far from it.
If I misunderstood what "Thoughts?" meant/implied, then draw and
quarter me for it; my actions/words = my responsibility.

I do not feel I have a "negative opinion" of ZFS. I still use it today
on FreeBSD, donated money to Pawel when the project was originally
announced (because I wanted to see something new and useful thrive on
FreeBSD), and try my best to assist with issues pertaining to it where
applicable. These are not the actions of someone with a negative
opinion; these are the actions of someone who is supportive while
simultaneously very cautious.

Is ZFS better today than it was when it was introduced? By a long
shot. For example, on my stable/9 system here I don't tune
/boot/loader.conf any longer.
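(For anyone who wasn't around for the earlier days: the tuning in
question was a handful of loader tunables. A sketch of what such a
/boot/loader.conf commonly looked like on a memory-constrained ZFS box
-- the values here are purely illustrative, not recommendations:

```
# /boot/loader.conf -- illustrative ZFS tunables from the 7.x/8.x era;
# values are examples only, not recommendations
vfs.zfs.arc_max="512M"          # cap the ARC so it can't starve everything else
vfs.zfs.prefetch_disable="1"    # disable file-level prefetch on low-RAM boxes
vm.kmem_size="1024M"            # historically needed on i386/older kernels
```

On a reasonably current stable/9 amd64 system none of this is needed,
which is part of my point about maturity.)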
But that doesn't change my viewpoint when it comes to using ZFS
exclusively on a FreeBSD box.

> > A heterogeneous (UFS+ZFS) setup, rather than homogeneous
> > (ZFS-only), results in a system where an admin can upgrade + boot
> > into single-user and perform some tasks to test/troubleshoot; if
> > the ZFS layer is broken, it doesn't mean an essentially useless
> > box. That isn't FUD, that's just the stage we're at right now. I'm
> > aware lots of people have working ZFS-exclusive setups; like I
> > said, "works great until it doesn't".
>
> Yeah, a heterogeneous setup can have its benefits, but it can have
> its drawbacks too. This is true for heterogeneous vs monoculture in
> general.
> But the sword cuts both ways: what if something is broken in "UFS
> layer" or god forbid in VFS layer and you have only UFS?
> Besides, without mentioning specific classes of problems "ZFS layer
> is broken" is too vague.

The likelihood of something being broken in UFS is significantly lower
given its established history. I have to go off of experience, both
personal and professional -- in my years of dealing with FreeBSD
(1997-present), I have only encountered issues with UFS a few times (I
can count them on one, maybe two hands), and I'm choosing to exclude
SU+J from the picture for what should be obvious reasons.

With ZFS, well... just look at the mailing lists and PR count. I don't
want to be a jerk about it, but you really have to look at the
quantity. It doesn't mean ZFS is crap; it just means that, for me, I
don't think we're quite "there" yet. And I will gladly admit --
because you are the one who taught me this -- that every incident
needs to be treated as unique. But one can't deny that a substantial
percentage (I would say a majority) of -fs and -stable posts relate
somehow to ZFS; I'm often thrilled when it turns out to be something
else.

Playing a strange devil's advocate, let me give you an interesting
example: softupdates.
When SU was introduced to FreeBSD back in the late 90s, there were
issues and concerns -- lots. As such, SU was chosen to be disabled by
default on root filesystems given the importance of that filesystem
(re: "we do not want to risk losing as much data in the case of a
crash" -- see the official FAQ, section 8.3). All other filesystems
defaulted to SU enabled. It's been like that up until 9.x, where it
now defaults to enabled. So that's what, 15 years?

You could say that my example could also apply to ZFS, i.e. the
reports are a part of its growth and maturity, and I'd agree. But I
don't feel it's reached the point where I'm willing to risk going
ZFS-only. Down the road, sure, but not now. That's just my take on it.

Please make sure to also consider, politely, that a lot of people who
have issues with ZFS have not been subscribed to the lists for long
periods of time. They sign up/post when they have a problem. Meaning:
they do not necessarily know of the history. If they did, I (again
politely) believe they're likely to use a UFS+ZFS mix, or maybe a
gmirror+UFS+ZFS mix (though the GPT/gmirror thing is... never
mind...).

> > So, how do you kernel guys debug a problem in this environment:
> >
> > - ZFS-only
> > - Running -RELEASE (i.e. no source, thus a kernel cannot be
> >   rebuilt with added debugging features, etc.)
> > - No swap configured
> > - No serial console
>
> I use boot environments and boot to a previous / known-good
> environment if I hit a loader bug, a kernel bug or a major userland
> problem in a new environment.
> I also use a mirrored setup and keep two copies of earlier boot
> chains.
> I am also not shy of live media in the case everything else fails.
>
> Now I wonder how you deal with the same kind of UFS-only environment.
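(For readers unfamiliar with the boot-environment workflow described
above: on ZFS-on-root it's commonly driven with beadm, a third-party
tool from ports. The session below is an illustrative sketch -- the
environment name is made up, and I'm not claiming this is Andriy's
exact procedure:

```
# Snapshot the current root into a named boot environment before upgrading
beadm create pre-upgrade

# ... run freebsd-update / installworld, reboot into the new environment ...

# If the new environment is broken, select the old one from the loader
# menu, or activate it explicitly from a rescued shell and reboot:
beadm activate pre-upgrade
reboot
```

The point is that recovery never requires reinstalling or restoring
from backup; you simply boot the ZFS dataset you had before.)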
The very few times I have had to deal with a system with "filesystem
oddities" with UFS, the disk was removed from the system and put into
a separate system (running the same kernel/world bits), which was then
booted into single-user and things manually dealt with. The points
were that the other system 1) was dedicated to this task, 2) had swap
set up, and 3) had serial console set up. That system could be rebuilt
(from source) to include kernel adjustments/etc. if further debugging
data was needed (kernel compile-time features, mainly). All of these
could apply to ZFS too, obviously.

But in the OP's case, the situation sounds dire given the limitations
-- limitations that someone (apparently not him) chose, which greatly
hinder debugging/troubleshooting. Had a heterogeneous setup been
chosen, the debugging/troubleshooting pains would be less (IMO).

When I see this, it makes me step back and ponder the decisions that
led to the ZFS-only setup. I work under the model that ZFS is young
and therefore will break/cause chaos for me in some way. It's a safety
net stemming from actual experiences, in addition to what I see on the
lists. I operate under the same pretense when it comes to things like
HAMMER on DragonflyBSD and Btrfs on Linux. I do not operate this way
when it comes to UFS, just like I do not operate this way when it
comes to ext2/ext3 on Linux.

I choose to use UFS for root/var/tmp/usr and ZFS for "other stuff"
because it allows me to get debugging assistance without having to
boot alternate media, play around with ISO/memstick images, set up a
PXE boot environment, worry about bootloaders, or other whatnots. I
just boot the system in single-user and go from there.

What about the fact that you do work on ZFS and have familiarity with
its code? Would you say your familiarity makes you more comfortable
with a ZFS-only setup than others who do not have this familiarity?
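(Concretely, the swap and serial console points above map onto a few
lines of standard configuration on the debug box. This is a sketch --
device names and speeds are examples, not a recipe:

```
# /etc/rc.conf -- enable kernel crash dumps to the swap device, so a
# panic leaves something for savecore(8) and kgdb to chew on
dumpdev="AUTO"          # dump to the configured swap partition
dumpdir="/var/crash"    # where savecore(8) writes the dump at boot

# /boot/loader.conf -- put the console on the first serial port so
# panic messages and DDB are reachable without a working video console
console="comconsole"
comconsole_speed="9600"
```

Without swap there is nowhere to write a crash dump, and without a
serial console the panic backtrace often scrolls away unrecorded --
which is exactly the bind the OP is in.)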
So with regards to "spreading FUD":

- Fear: I'm not afraid of ZFS; I am simply not willing to accept the
  present-day risks given the alternatives that have been solid for me
  historically and given my skill set,

- Uncertainty: true, I am always uncertain of youthful filesystems,

- Doubt: I have no doubts regarding ZFS and its capabilities,
  potential, usefulness (see above, re: my experience), nor the fact
  it can (in the binary (yes/no) sense) be used for a root filesystem
  and/or other critical filesystems.

"Spreading FUD" to me conjures the image of someone running around
trying to make people dislike or become afraid of something (I
consider this a form of trolling) -- the polar and extreme opposite of
advocacy. Such is not my intent, nor has it ever been.

While I do have "problems" with FreeBSD (as a whole, the direction
it's going, etc.), and it would be silly to deny that this influences
the tone I use in my mails, it is something quite separate and I would
rather not go into it here.

My intent is to make people think about their setup decisions given
what they've now experienced, and (hopefully) to get indirect answers
as to why they chose the path they did (not quite relevant in this
case, since the OP was not the one who deployed this setup). If you
feel that's FUD, one might say *that's* subjective, and understandably
so -- and I respect that.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |