Date: Tue, 9 Feb 2016 18:28:22 +0100
From: Jan Bramkamp <crest@rlwinm.de>
To: freebsd-stable@freebsd.org
Subject: Re: Best practices for ZFS setup for a strictly SSD based system?
Message-ID: <56BA21B6.3070308@rlwinm.de>
In-Reply-To: <2D296837-3B06-4E72-B8B0-A33AE6CE48AE@punkt.de>
References: <2D296837-3B06-4E72-B8B0-A33AE6CE48AE@punkt.de>

On 09/02/16 16:54, Patrick M. Hausen wrote:
> Hi, all,
>
> while there is quite a bit of documentation on how to improve ZFS performance
> by using a combination of rotating disks and SSDs, I have not found much about
> an SSD only setup.
>
> We are planning to try a hosting server with 8 SATA SSDs with ZFS. Things I am
> not at all sure about:
>
> * Does the recommended limit of 6 disks for a RAIDZ2 still
>   hold? 2x 4 disks is quite a bit of overhead, could I use all 8
>   in one vdev and get away with it?
>   (The maximum of 6 recommendation is in some old Sun doc)

There are multiple reasons to limit the number of disks per RAID-Z VDEV.

* Resilver time: ZFS has to process all objects ordered by transaction
  id to resilver a RAID-Z. Resilvering is a torture test for the
  remaining disks of your degraded RAID-Z, and with the ratio of
  bandwidth to capacity of current hard disks resilvering takes too
  long. This isn't an issue for SSDs.

* For performance estimations think of the RAID-Z as one huge disk with
  larger blocks but the same IOPS as the slowest disk in the RAID-Z.
  Databases perform disk I/O in small blocks, limiting your RAID-Z to
  the performance of about one of its member disks.

* A ZFS pool can only grow by adding whole VDEVs or replacing all disks
  in a VDEV one at a time. Using mirrors allows the pool to grow in
  smaller increments.

> * Will e.g. MySQL still profit from residing on a mirror
>   instead of a RAIDZ2, even if all disks are SSDs?

Yes. OpenZFS schedules reads on mirrors to the disk with the shortest
queue, thus a mirror offers about the sum of its member disks in read
performance (IOPS and bandwidth) and the minimum of its member disks in
write performance (IOPS and bandwidth). A pool with as many mirrored
VDEVs as possible will offer the optimal performance for a given number
of disks.

For write heavy workloads the quality of the SSDs matters a lot as well.
Cheap consumer SSDs can't sustain high write rates for any length of
time. Even medium quality SSDs have a lot of jitter and suffer from
throughput degradation under sustained write loads. Optimized server
SSDs can sustain random write workloads with little jitter and bounded
latency. An NVMe SSD can offer an additional order of magnitude
performance increase over SATA SSDs, but at a significant increase in
price. With multiple NVMe SSDs you will run into the current
scalability limits of ZFS and GEOM.

> * Does a separate ZIL and/or ARC cache device still
>   make sense?

Most likely not.

Another optimization is splitting the log and table space and creating
a dedicated ZFS dataset for each. Create the dataset containing the
table space with its record size set to the fixed record size of your
MySQL backend. ZFS also offers a lot more consistency and atomicity
guarantees than required by a minimal POSIX file system. This allows
you to further reduce the syncing overhead by tuning MySQL to take
advantage of the ZFS guarantees.
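
To make the rules of thumb for RAID-Z and mirror VDEVs from earlier in
this message concrete, here is a small back-of-the-envelope sketch in
Python. It is not a benchmark. The per-disk IOPS figures are
hypothetical, the function names are made up for illustration, and
summing the estimates over all VDEVs of a pool is an assumption added
here (the message only gives per-VDEV rules).

```python
# Rule-of-thumb IOPS estimate for small-block (database-like) I/O.
# Per-VDEV rules are taken from the message; everything else is assumed.

# Hypothetical per-disk figures, chosen only to make the arithmetic visible.
DISK = (50_000, 20_000)  # (read IOPS, write IOPS)


def mirror_iops(disks):
    """Mirror: reads ~ sum of members, writes ~ slowest member."""
    return sum(r for r, _ in disks), min(w for _, w in disks)


def raidz_iops(disks):
    """RAID-Z: behaves like one big disk with the IOPS of its slowest member."""
    return min(r for r, _ in disks), min(w for _, w in disks)


def pool_iops(vdevs):
    """Assumption: a pool's IOPS is the sum over its VDEVs."""
    estimators = {"mirror": mirror_iops, "raidz": raidz_iops}
    reads = writes = 0
    for kind, disks in vdevs:
        r, w = estimators[kind](disks)
        reads += r
        writes += w
    return reads, writes


# Three ways to arrange the 8 SSDs from the original question.
layouts = {
    "4 x 2-way mirror": [("mirror", [DISK] * 2)] * 4,
    "2 x 4-disk RAID-Z2": [("raidz", [DISK] * 4)] * 2,
    "1 x 8-disk RAID-Z2": [("raidz", [DISK] * 8)],
}

for name, vdevs in layouts.items():
    reads, writes = pool_iops(vdevs)
    print(f"{name:20s} read ~{reads:>7,d} IOPS  write ~{writes:>7,d} IOPS")
```

Under these assumptions the single 8-disk RAID-Z2 ends up at roughly
the small-block IOPS of one SSD, while the four mirrors scale with the
number of VDEVs. Real numbers depend on the workload and the SSDs.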