Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 19 Feb 2012 08:28:41 -0500
From:      Michael Shuey <shuey@fmepnet.org>
To:        dg17@penx.com
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS size reduced, 100% full, on fbsd9 upgrade
Message-ID:  <CAELRr5k%2BvuN8G2BRigFT4%2BpmLergbcn_ybOV%2BSQj7KGDE-FEOw@mail.gmail.com>
In-Reply-To: <1329595563.42839.28.camel@btw.pki2.com>
References:  <CAELRr5kPXjqTooLbjPC1oPB3e2TfRC=eE%2Bzvsu-tW54Pz42xFg@mail.gmail.com> <1329595563.42839.28.camel@btw.pki2.com>

next in thread | previous in thread | raw e-mail | index | archive | help
Okay, today's lesson: When you replace a disk with a bigger drive, and
it increases your raidz2's pool capacity, ALWAYS run a zpool scrub
<pool> before doing anything else.

I rebooted back to 8.2p6, ran a (somewhat longer than normal) scrub,
rebooted, then booted back to 9.0.  Seems fine now, and is finishing
its freebsd-update.  Weird....but at least it works.


On Sat, Feb 18, 2012 at 3:06 PM, Dennis Glatting <dg17@penx.com> wrote:
> I'm not a ZFS wiz but...
>
>
> On Sat, 2012-02-18 at 10:25 -0500, Michael Shuey wrote:
>> I'm upgrading a server from 8.2p6 to 9.0-RELEASE, and I've tried both
>> make in the source tree and freebsd-update and I get the same strange
>> result. =A0As soon as I boot to the fbsd9 kernel, even booting into
>> single-user mode, the pool's size is greatly reduced. =A0All filesystems
>> show 100% full (0 bytes free space), nothing can be written to the
>> pool (probably a side-effect of being 100% full), and dmesg shows
>> several of "Solaris: WARNING: metaslab_free_dva(): bad DVA
>> 0:5978620460544" warnings (with different numbers). =A0Switching kernels
>> back to the 8.2p6 kernel restores things to normal, but I'd really
>> like to finish my fbsd9 upgrade.
>>
>> The system is a 64-bit Intel box with 4 GB of memory, and 8 disks in a
>> raidz2 pool called "pool". =A0It's booted to the 8.2p6 kernel now, and
>> scrubbing the pool, but last time I did this (roughly a week ago) it
>> was fine. =A0/ is a gmirror, but /usr, /tmp, and /var all come from the
>> pool. =A0Normally, the pool has 1.2 TB of free space, and is version 15
>> (zfs version 4). =A0Some disks are WD drives, with 4k native sectors,
>> but some time ago I rebuilt the pool to use a native 4k sector size
>> (ashift=3D12).
>>
>
> I believe 4GB of memory is the minimum. More is better. When you use the
> minimum of anything, expect dodginess.
>
> You should upgrade your pool -- bug fixes and all that.
>
> Are all the disks 4k sectors? I found that a mix of 512 and 4k work but
> performance is best when they are all the same. I have also found 512
> emulation isn't a believable choice when looking at performance (i.e.,
> set for 4k).
>
> Different people have different opinions but I personally do not use ZFS
> for the OS, rather I RAID1 the OS. The question you have to ask is
> if /usr goes kablewie whether you have he skills to put it back
> together. I do not, so "simple" (i.e., hardware RAID1) for the OS is
> good for me -- it isn't the OS that's being worked in my setups, rather
> the data areas.
>
>
>> Over time, I've been slowly replacing disks (1 at a time) to increase
>> the free space in the pool. =A0Also, the system experienced severe
>> failure recently; the power supply blew, and took out the memory (and
>> presumably motherboard). =A0I replaced these last week with known-good
>> board/memory/processor/PS, and it's been running fine since.
>>
>
> Expect mixed results with mixed disks, at least from my experience,
> particularly when it comes to performance.
>
> Is the MB the same? I have had mixed results. I find the Gigabyte boards
> work well but ASUS dodgy when it comes to high interrupt handling.
> Server boards with ECC memory are the most reliable.
>
>
>> Any suggestions? =A0Is it possible I've got some nasty pool corruption
>> going on - and if so, how do I go about fixing it? =A0Any advice would
>> be appreciated. =A0This is a backup server, so I could rebuild its
>> contents from the primary, but I'd rather fix it if possible (since I
>> want to do a fbsd9 upgrade on the primary next).
>
> I screw around with my set ups. What I found is rebuilding the pool
> (when I screw it up) is the least troublesome approach.
>
> Recently I found a tray bad on one of my servers. Drove me nuts for two
> weeks. It could be a loose cable, or bad cable, or crimped cable, but I
> am not yet in the position to open the case. Most of my ZFS weirdnesses
> have been hardware related.
>
> It could be your blowout impacted your disks or wiring. Do you SMART? I
> found, generally, SMART is goodness but I presently have a question mark
> when it comes to the Hitachi 4TB disks (I misbehaved on that system so
> then issue could be my own; however on another system there wasn't any
> errors).
>
> I have found, when I have multiple, identical controllers, that the same
> firmware across the controllers is a good approach, otherwise weirdness
> and different MBs manifest this problem in different ways. Also, make
> sure your MB's BIOS is recent.
>
> YMMV
>
>
>
>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAELRr5k%2BvuN8G2BRigFT4%2BpmLergbcn_ybOV%2BSQj7KGDE-FEOw>