From: Steven Hartland
Date: Thu, 24 Apr 2014 01:06:04 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r264850 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys
Message-Id: <201404240106.s3O164wo039714@svn.freebsd.org>

Author: smh
Date: Thu Apr 24 01:06:03 2014
New Revision: 264850
URL: http://svnweb.freebsd.org/changeset/base/264850

Log:
  Add the ability to set a minimum ashift size for ZFS pool creation or
  root level vdev addition.

  Change max_auto_ashift sysctl to error when an invalid value is requested
  instead of silently limiting it.

Modified:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h	Thu Apr 24 00:41:02 2014	(r264849)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h	Thu Apr 24 01:06:03 2014	(r264850)
@@ -106,7 +106,7 @@ _NOTE(CONSTCOND) } while (0)
 #define SPA_BLOCKSIZES (SPA_MAXBLOCKSHIFT - SPA_MINBLOCKSHIFT + 1)
 
 /*
- * Maximum supported logical ashift.
+ * Default maximum supported logical ashift.
  *
  * The current 8k allocation block size limit is due to the 8k
  * aligned/sized operations performed by vdev_probe() on
@@ -117,6 +117,11 @@ _NOTE(CONSTCOND) } while (0)
 #define SPA_MAXASHIFT 13
 
 /*
+ * Default minimum supported logical ashift.
+ */
+#define SPA_MINASHIFT SPA_MINBLOCKSHIFT
+
+/*
  * Size of block to hold the configuration data (a packed nvlist)
  */
 #define SPA_CONFIG_BLOCKSIZE (1ULL << 14)

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c	Thu Apr 24 00:41:02 2014	(r264849)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c	Thu Apr 24 01:06:03 2014	(r264850)
@@ -53,7 +53,7 @@ SYSCTL_NODE(_vfs_zfs, OID_AUTO, vdev, CT
  * Virtual device management.
  */
 
-/**
+/*
  * The limit for ZFS to automatically increase a top-level vdev's ashift
  * from logical ashift to physical ashift.
  *
@@ -61,19 +61,34 @@ SYSCTL_NODE(_vfs_zfs, OID_AUTO, vdev, CT
  * child->vdev_ashift = 9 (512 bytes)
  * child->vdev_physical_ashift = 12 (4096 bytes)
  * zfs_max_auto_ashift = 11 (2048 bytes)
+ * zfs_min_auto_ashift = 9 (512 bytes)
  *
- * On pool creation or the addition of a new top-leve vdev, ZFS will
- * bump the ashift of the top-level vdev to 2048.
+ * On pool creation or the addition of a new top-level vdev, ZFS will
+ * increase the ashift of the top-level vdev to 2048 as limited by
+ * zfs_max_auto_ashift.
  *
  * Example: one or more 512B emulation child vdevs
  * child->vdev_ashift = 9 (512 bytes)
  * child->vdev_physical_ashift = 12 (4096 bytes)
  * zfs_max_auto_ashift = 13 (8192 bytes)
+ * zfs_min_auto_ashift = 9 (512 bytes)
+ *
+ * On pool creation or the addition of a new top-level vdev, ZFS will
+ * increase the ashift of the top-level vdev to 4096 to match the
+ * max vdev_physical_ashift.
  *
- * On pool creation or the addition of a new top-leve vdev, ZFS will
- * bump the ashift of the top-level vdev to 4096.
+ * Example: one or more 512B emulation child vdevs
+ * child->vdev_ashift = 9 (512 bytes)
+ * child->vdev_physical_ashift = 9 (512 bytes)
+ * zfs_max_auto_ashift = 13 (8192 bytes)
+ * zfs_min_auto_ashift = 12 (4096 bytes)
+ *
+ * On pool creation or the addition of a new top-level vdev, ZFS will
+ * increase the ashift of the top-level vdev to 4096 to match the
+ * zfs_min_auto_ashift.
  */
 static uint64_t zfs_max_auto_ashift = SPA_MAXASHIFT;
+static uint64_t zfs_min_auto_ashift = SPA_MINASHIFT;
 
 static int
 sysctl_vfs_zfs_max_auto_ashift(SYSCTL_HANDLER_ARGS)
@@ -86,8 +101,8 @@ sysctl_vfs_zfs_max_auto_ashift(SYSCTL_HA
 	if (err != 0 || req->newptr == NULL)
 		return (err);
 
-	if (val > SPA_MAXASHIFT)
-		val = SPA_MAXASHIFT;
+	if (val > SPA_MAXASHIFT || val < zfs_min_auto_ashift)
+		return (EINVAL);
 
 	zfs_max_auto_ashift = val;
 
@@ -96,7 +111,31 @@ sysctl_vfs_zfs_max_auto_ashift(SYSCTL_HA
 SYSCTL_PROC(_vfs_zfs, OID_AUTO, max_auto_ashift,
     CTLTYPE_U64 | CTLFLAG_MPSAFE | CTLFLAG_RW, 0, sizeof(uint64_t),
     sysctl_vfs_zfs_max_auto_ashift, "QU",
-    "Cap on logical -> physical ashift adjustment on new top-level vdevs.");
+    "Max ashift used when optimising for logical -> physical sectors size on "
+    "new top-level vdevs.");
+
+static int
+sysctl_vfs_zfs_min_auto_ashift(SYSCTL_HANDLER_ARGS)
+{
+	uint64_t val;
+	int err;
+
+	val = zfs_min_auto_ashift;
+	err = sysctl_handle_64(oidp, &val, 0, req);
+	if (err != 0 || req->newptr == NULL)
+		return (err);
+
+	if (val < SPA_MINASHIFT || val > zfs_max_auto_ashift)
+		return (EINVAL);
+
+	zfs_min_auto_ashift = val;
+
+	return (0);
+}
+SYSCTL_PROC(_vfs_zfs, OID_AUTO, min_auto_ashift,
+    CTLTYPE_U64 | CTLFLAG_MPSAFE | CTLFLAG_RW, 0, sizeof(uint64_t),
+    sysctl_vfs_zfs_min_auto_ashift, "QU",
+    "Min ashift used when creating new top-level vdevs.");
 
 static vdev_ops_t *vdev_ops_table[] = {
 	&vdev_root_ops,
@@ -1631,19 +1670,30 @@ vdev_metaslab_set_size(vdev_t *vd)
 }
 
 /*
- * Maximize performance by inflating the configured ashift for
- * top level vdevs to be as close to the physical ashift as
- * possible without exceeding the administrator specified
- * limit.
+ * Maximize performance by inflating the configured ashift for top level
+ * vdevs to be as close to the physical ashift as possible while maintaining
+ * administrator defined limits and ensuring it doesn't go below the
+ * logical ashift.
  */
 void
 vdev_ashift_optimize(vdev_t *vd)
 {
-	if (vd == vd->vdev_top &&
-	    (vd->vdev_ashift < vd->vdev_physical_ashift) &&
-	    (vd->vdev_ashift < zfs_max_auto_ashift)) {
-		vd->vdev_ashift = MIN(zfs_max_auto_ashift,
-		    vd->vdev_physical_ashift);
+	if (vd == vd->vdev_top) {
+		if (vd->vdev_ashift < vd->vdev_physical_ashift) {
+			vd->vdev_ashift = MIN(
+			    MAX(zfs_max_auto_ashift, vd->vdev_ashift),
+			    MAX(zfs_min_auto_ashift, vd->vdev_physical_ashift));
+		} else {
+			/*
+			 * Unusual case where logical ashift > physical ashift
+			 * so we can't cap the calculated ashift based on max
+			 * ashift as that would cause failures.
+			 * We still check if we need to increase it to match
+			 * the min ashift.
+			 */
+			vd->vdev_ashift = MAX(zfs_min_auto_ashift,
+			    vd->vdev_ashift);
+		}
 	}
 }
 
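
For readers who want to check the new selection rule outside the kernel, the following is a minimal userland sketch of the arithmetic in vdev_ashift_optimize(). It is not part of the commit: pick_ashift(), ashift_to_bytes(), check_comment_examples() and the ASHIFT_MIN/ASHIFT_MAX macros are hypothetical names, and the macros are assumed to behave like the kernel's MIN and MAX. The top-level vdev check (vd == vd->vdev_top) is left out, so the helper only models the top-level case.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the kernel's MIN/MAX macros (assumed semantics). */
#define ASHIFT_MIN(a, b) ((a) < (b) ? (a) : (b))
#define ASHIFT_MAX(a, b) ((a) > (b) ? (a) : (b))

/*
 * Hypothetical helper: the ashift a top-level vdev ends up with, given
 * its logical and physical ashift and the two sysctl limits.
 */
uint64_t
pick_ashift(uint64_t logical, uint64_t physical, uint64_t max_auto,
    uint64_t min_auto)
{
	if (logical < physical) {
		/* Grow towards physical, bounded by the max limit. */
		return (ASHIFT_MIN(ASHIFT_MAX(max_auto, logical),
		    ASHIFT_MAX(min_auto, physical)));
	}
	/* logical >= physical: never go below logical, only raise to min. */
	return (ASHIFT_MAX(min_auto, logical));
}

/* Hypothetical helper: sector size in bytes for a given ashift. */
uint64_t
ashift_to_bytes(uint64_t ashift)
{
	return ((uint64_t)1 << ashift);
}

/* Replays the three examples from the vdev.c comment in this commit. */
void
check_comment_examples(void)
{
	/* max_auto_ashift = 11 caps the increase at 2048 bytes. */
	assert(ashift_to_bytes(pick_ashift(9, 12, 11, 9)) == 2048);
	/* max_auto_ashift = 13 allows the full physical size, 4096 bytes. */
	assert(ashift_to_bytes(pick_ashift(9, 12, 13, 9)) == 4096);
	/* min_auto_ashift = 12 raises a 512 byte device to 4096 bytes. */
	assert(ashift_to_bytes(pick_ashift(9, 9, 13, 12)) == 4096);
}
```

Calling check_comment_examples() from a small test program should pass without tripping an assertion. The "unusual case" from the diff, where logical ashift exceeds physical ashift, corresponds to pick_ashift(12, 9, 11, 9), which keeps the logical value of 12 even though it is above the max limit.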