Date: Sat, 19 Feb 2011 14:12:20 -0600 (CST)
From: Robert Bonomi <bonomi@mail.r-bonomi.com>
To: freebsd-questions@freebsd.org
Subject: Re: ZFS-only booting on FreeBSD
Message-ID: <201102192012.p1JKCKnP038248@mail.r-bonomi.com>
In-Reply-To: <AF8BFB811828E5E7EFD857A5@mac-pro.magehandbook.com>

> Date: Sat, 19 Feb 2011 10:35:35 -0500
> From: Daniel Staal <DStaal@usa.net>
> Subject: Re: ZFS-only booting on FreeBSD
> [[.. sneck ..]]
>
> Basically, if a ZFS boot drive fails, you are likely to get the following
> scenario:
> 1) 'What do I need to do to replace a disk in the ZFS pool?'
> 2) 'Oh, that's easy.'  Replaces disk.
> 3) System fails to boot at some later point.
> 4) 'Oh, right, you need to do this *as well* on the *boot* pool...'
>
> Where if a UFS boot drive fails on an otherwise ZFS system, you'll get:
> 1) 'What's this drive?'
> 2) 'Oh, so how do I set that up again?'
> 3) Set up replacement boot drive.
>
> The first situation hides that it's a special case, where the second one
> doesn't.

"For any foolproof system, there exists a _sufficiently-determined_ fool
capable of breaking it" applies.

> To avoid the first scenario you need to make sure your sysadmins are
> following *local* (and probably out-of-band) docs, and aware of potential
> problems.  And awake.  ;)  The scenario in the second situation presents
> its problem as a unified package, and you can rely on normal levels of
> alertness to be able to handle it correctly.  (The sysadmin will realize
> it needs to be set up as a boot device because it's the boot device.  ;)
> It may be complicated, but it's *obviously* complicated.)
>
> I'm still not clear on whether a ZFS-only system will boot with a failed
> drive in the root ZFS pool.  Once booted, of course a decent ZFS setup
> should be able to recover from the failed drive.  But the question is if
> the FreeBSD boot process will handle the redundancy or not.  At this
> point I'm actually guessing it will, which of course only exacerbates the
> above surprise problem: 'The easy ZFS disk replacement procedure *did*
> work in the past, why did it cause a problem now?'  (And conceivably it
> could cause *major* data problems at that point, as ZFS will *grow* a
> pool quite easily, but *shrinking* one is a problem.)

A non-ZFS boot drive results in immediate, _guaranteed_, down-time for
replacement if/when it fails.  A ZFS boot drive lets you replace the drive
and *schedule* the down-time (for a 'test' re-boot, to make *sure*
everything works) at a convenient time.

Failure to schedule the required down time is a management failure, not a
methodology issue.  One has located the requisite "sufficiently-determined"
fool, and the results thereof are to be expected.