Date:      Sat, 20 Jul 2013 18:14:20 +0100
From:      Frank Leonhardt <frank2@fjl.co.uk>
To:        freebsd-questions@freebsd.org
Subject:   Re: to gmirror or to ZFS
Message-ID:  <51EAC56C.4030801@fjl.co.uk>
In-Reply-To: <DCC017BE-A293-4C1B-8B6F-D9AF6F50125B@mac.com>
References:  <4DFBC539-3CCC-4B9B-AB62-7BB846F18530@gmail.com> <alpine.BSF.2.00.1307152211180.74094@wonkity.com> <976836C5-F790-4D55-A80C-5944E8BC2575@gmail.com> <51E51558.50302@ShaneWare.Biz> <51E52190.7020008@fjl.co.uk> <CAOaKuAVULVuZxtExp=mNi-J7kMNbsxbLJVsv8nKTA0-Ru6M3%2Bw@mail.gmail.com> <6CE5718E-2646-4D8C-AF98-37384B8851C5@mac.com> <CAOaKuAU8nhaoq%2B6hCVkB%2Bb-ppiBvYPKANdWJRnYcmKaPdecwZA@mail.gmail.com> <DCC017BE-A293-4C1B-8B6F-D9AF6F50125B@mac.com>

On 16/07/2013 20:48, Charles Swiger wrote:
> Hi--
>
> On Jul 16, 2013, at 11:27 AM, Johan Hendriks <joh.hendriks@gmail.com> wrote:
>>> Well, "don't do that".  :-)
>> When the server reboots because of a power failure at night, then it boots.
>> Then it starts to rebuild the mirror on its own, and later the fsck kicks in.
>>
>> Not much I can do about it.
>>
>> Maybe I should have done it without the automatic attachment for a new device.
> It's normally the case that getting a hot spare automatically attached should be
> fine, but not if you also have the box go down entirely and need to fsck.
>
> I'm more used to needing to explicitly physically swap out a failed mirror component,
> in which case one can make sure the system is OK before the replacement drive goes in.
>
Agreed. Blaming gmirror for this kind of thing overlooks the overall 
design and operating procedures of the system, and assuming ZFS would 
have been any better may be wishful thinking. I've had plenty of gmirror 
crashes over the years, and they have all been recoverable. One thing I 
never allow it to do is to rebuild automatically. That's something for a 
human to initiate once the problem has been identified, and if the cause 
is flaky power in the data centre, the job is postponed until I'm 
satisfied the power isn't going to drop during the rebuild. In my 
experience, one power failure is normally followed by several more.

It's worth noting, as a warning for anyone who hasn't been there, that 
a second drive in a RAID system fails during a rebuild more often than 
you would expect. During a rebuild the remaining drives get thrashed 
and run hot, and if they're on the edge, that's when they're going to 
go, at the most inconvenient time. Okay - obvious when you think about 
it, but people tend to think about it only when it's too late.
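
To put a rough shape on that, here is a minimal sketch in Python. It 
assumes a constant per-drive failure rate derived from an annual failure 
rate, scaled by a made-up "stress" multiplier while the rebuild runs. 
The function name, the multiplier, and every number below are 
hypothetical illustrations, not measurements from any real array.

```python
import math

HOURS_PER_YEAR = 24 * 365


def p_fail_during_rebuild(afr, rebuild_hours, stress=1.0, drives=1):
    """Chance that at least one surviving drive fails during the rebuild.

    Assumes a constant (exponential) failure rate per drive, derived from
    the annual failure rate `afr`, multiplied by a hypothetical `stress`
    factor for the duration of the rebuild. Drives are treated as
    independent, which is itself an optimistic assumption.
    """
    rate_per_hour = -math.log(1.0 - afr) / HOURS_PER_YEAR
    p_one = 1.0 - math.exp(-rate_per_hour * stress * rebuild_hours)
    return 1.0 - (1.0 - p_one) ** drives


# Illustrative inputs only: 3% AFR, 12-hour rebuild, 10x stress factor.
nominal = p_fail_during_rebuild(afr=0.03, rebuild_hours=12)
stressed = p_fail_during_rebuild(afr=0.03, rebuild_hours=12, stress=10)
print(f"nominal: {nominal:.5%}  under rebuild load: {stressed:.5%}")
```

The model deliberately ignores the fact that drives bought together and 
run together tend to wear out together, so if anything it understates 
the risk.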

Regards, Frank.



