Date:      Wed, 13 May 2015 13:00:36 +0200
From:      Jan Bramkamp <crest@rlwinm.de>
To:        freebsd-fs@freebsd.org
Subject:   Re: ZFS RAID 10 capacity expansion and uneven data distribution
Message-ID:  <55532ED4.5060401@rlwinm.de>
In-Reply-To: <5552071A.40709@free.de>
References:  <5552071A.40709@free.de>

On 12/05/15 15:58, Kai Gallasch wrote:
> Hello list.
>
> What is the preferred way to expand a mirrored or RAID 10 zpool with
> additional mirror pairs?
>
> On one server I am currently using a four disk RAID 10 zpool:
>
> 	zpool              ONLINE       0     0     0
> 	  mirror-0         ONLINE       0     0     0
> 	    gpt/zpool-da2  ONLINE       0     0     0
> 	    gpt/zpool-da3  ONLINE       0     0     0
> 	  mirror-1         ONLINE       0     0     0
> 	    gpt/zpool-da4  ONLINE       0     0     0
> 	    gpt/zpool-da5  ONLINE       0     0     0
>
> Originally the pool consisted of only one mirror (zpool-da2 and zpool-da3)
>
> I then used "zpool add" to add mirror-1 to the pool
>
> Directly after adding the new mirror I had all the old data physically
> sitting on the old mirror and no data on the new disks.
>
> So the data distribution across the RAID 10 is very unbalanced.
> The effect is that IOPS are not evenly distributed across all devices
> in the pool: when the server is very busy, "gstat -p" shows the old
> mirror pair maxing out at 100% I/O usage while the other one is
> almost idle.
>
> I also noted that the old mirror pair has a FRAG of about 50%, while
> the new one only has 3%.
>
> So is it generally not a good idea to expand a mirrored pool or RAID 10
> pool with new mirror pairs?
>
> Or is there a procedure by which the existing data in the pool can be
> evenly distributed across all devices inside the pool?
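
(For context, the expansion described above amounts to something along
these lines, with the pool and partition names taken from the status
output above:

	zpool add zpool mirror gpt/zpool-da4 gpt/zpool-da5

This adds mirror-1 as a second top-level VDEV, but it does not touch any
of the data that was already written to mirror-0.)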

ZFS will prefer the new VDEV for writes, so the VDEVs will slowly balance
out over time. If you don't want to wait, recreate the pool from a
snapshot. Rebalancing between VDEVs requires moving a lot of data, and
ZFS, being a COW filesystem, requires the user to rewrite data in order
to move it (between VDEVs). Hiding this fact from userspace would require
the magical block pointer rewrite. Don't hold your breath: nobody in the
OpenZFS community wants to implement full block pointer rewriting,
because it would introduce a lot of new dependencies between ZFS's
internal layers. It would probably be the last major feature ever
implemented in ZFS, because from what I gathered following the OpenZFS
office hours it would complicate working with the code base a lot.
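
In the meantime you can watch how far the balance has progressed. Both
commands below only read pool state; the pool name is taken from your
status output:

	# Per-VDEV size, allocation, free space, fragmentation and capacity.
	zpool list -v zpool

	# Per-VDEV read/write operations and bandwidth, sampled every 5 seconds.
	zpool iostat -v zpool 5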

You are stuck with either an unbalanced pool or the task of recreating 
your pool from a snapshot stream.
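
For reference, the recreate route would look roughly like this. It is
only a sketch: "newpool" and "@rebalance" are placeholder names, you
need a second pool (or backup target) large enough to hold the full
stream, and the receive and mountpoint flags may need adjusting for
your setup:

	# Snapshot the whole pool recursively.
	zfs snapshot -r zpool@rebalance

	# Replicate all datasets, properties and snapshots to the scratch
	# pool; -u avoids mounting the received filesystems.
	zfs send -R zpool@rebalance | zfs receive -uF newpool

	# After verifying the copy: destroy the old pool, recreate it with
	# both mirror pairs from the start, and send the data back.
	zpool destroy zpool
	zpool create zpool \
	    mirror gpt/zpool-da2 gpt/zpool-da3 \
	    mirror gpt/zpool-da4 gpt/zpool-da5
	zfs send -R newpool@rebalance | zfs receive -uF zpool

Data restored this way is spread evenly across both mirrors from the
start.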


