From owner-svn-src-vendor@freebsd.org Mon Jul 18 07:03:40 2016 Return-Path: Delivered-To: svn-src-vendor@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8D596B9C909; Mon, 18 Jul 2016 07:03:40 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 512A51F7C; Mon, 18 Jul 2016 07:03:40 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id u6I73dGX012171; Mon, 18 Jul 2016 07:03:39 GMT (envelope-from avg@FreeBSD.org) Received: (from avg@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id u6I73dIi012166; Mon, 18 Jul 2016 07:03:39 GMT (envelope-from avg@FreeBSD.org) Message-Id: <201607180703.u6I73dIi012166@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: avg set sender to avg@FreeBSD.org using -f From: Andriy Gapon Date: Mon, 18 Jul 2016 07:03:39 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-vendor@freebsd.org Subject: svn commit: r302993 - in vendor-sys/illumos/dist/uts/common: fs/zfs fs/zfs/sys sys/fs X-SVN-Group: vendor-sys MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-vendor@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: SVN commit messages for the vendor work area tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2016 07:03:40 -0000 Author: avg Date: Mon Jul 18 07:03:39 2016 New Revision: 302993 URL: https://svnweb.freebsd.org/changeset/base/302993 Log: 7104 increase indirect block size illumos/illumos-gate@4b5c8e93cab28d3c65ba9d407fd8f46e3be1db1c https://github.com/illumos/illumos-gate/commit/4b5c8e93cab28d3c65ba9d407fd8f46e3be1db1c https://www.illumos.org/issues/7104 The current default indirect block size is 16KB. We can improve performance by increasing it to 128KB. This is especially helpful for any workload that needs to read most of the metadata, e.g. scrub/resilver, file deletion, filesystem deletion, and zfs send. We also need to fix a few space estimation errors to make the tests pass. Reviewed by: George Wilson Reviewed by: Paul Dagnelie Reviewed by: Dan McDonald Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: vendor-sys/illumos/dist/uts/common/fs/zfs/dmu_objset.c vendor-sys/illumos/dist/uts/common/fs/zfs/spa_misc.c vendor-sys/illumos/dist/uts/common/fs/zfs/sys/dnode.h vendor-sys/illumos/dist/uts/common/sys/fs/zfs.h Modified: vendor-sys/illumos/dist/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- vendor-sys/illumos/dist/uts/common/fs/zfs/dmu_objset.c Mon Jul 18 06:58:39 2016 (r302992) +++ vendor-sys/illumos/dist/uts/common/fs/zfs/dmu_objset.c Mon Jul 18 07:03:39 2016 (r302993) @@ -787,11 +787,17 @@ dmu_objset_create_impl(spa_t *spa, dsl_d /* * Determine the number of levels necessary for the meta-dnode - * to contain DN_MAX_OBJECT dnodes. + * to contain DN_MAX_OBJECT dnodes. Note that in order to + * ensure that we do not overflow 64 bits, there has to be + * a nlevels that gives us a number of blocks > DN_MAX_OBJECT + * but < 2^64. Therefore, + * (mdn->dn_indblkshift - SPA_BLKPTRSHIFT) (10) must be + * less than (64 - log2(DN_MAX_OBJECT)) (16). */ - while ((uint64_t)mdn->dn_nblkptr << (mdn->dn_datablkshift + + while ((uint64_t)mdn->dn_nblkptr << + (mdn->dn_datablkshift - DNODE_SHIFT + (levels - 1) * (mdn->dn_indblkshift - SPA_BLKPTRSHIFT)) < - DN_MAX_OBJECT * sizeof (dnode_phys_t)) + DN_MAX_OBJECT) levels++; mdn->dn_next_nlevels[tx->tx_txg & TXG_MASK] = Modified: vendor-sys/illumos/dist/uts/common/fs/zfs/spa_misc.c ============================================================================== --- vendor-sys/illumos/dist/uts/common/fs/zfs/spa_misc.c Mon Jul 18 06:58:39 2016 (r302992) +++ vendor-sys/illumos/dist/uts/common/fs/zfs/spa_misc.c Mon Jul 18 07:03:39 2016 (r302993) @@ -341,9 +341,14 @@ int spa_asize_inflation = 24; * it is possible to run the pool completely out of space, causing it to * be permanently read-only. * + * Note that on very small pools, the slop space will be larger than + * 3.2%, in an effort to have it be at least spa_min_slop (128MB), + * but we never allow it to be more than half the pool size. + * * See also the comments in zfs_space_check_t. */ int spa_slop_shift = 5; +uint64_t spa_min_slop = 128 * 1024 * 1024; /* * ========================================================================== @@ -1637,14 +1642,16 @@ spa_get_asize(spa_t *spa, uint64_t lsize /* * Return the amount of slop space in bytes. It is 1/32 of the pool (3.2%), - * or at least 32MB. + * or at least 128MB, unless that would cause it to be more than half the + * pool size. * * See the comment above spa_slop_shift for details. */ uint64_t -spa_get_slop_space(spa_t *spa) { +spa_get_slop_space(spa_t *spa) +{ uint64_t space = spa_get_dspace(spa); - return (MAX(space >> spa_slop_shift, SPA_MINDEVSIZE >> 1)); + return (MAX(space >> spa_slop_shift, MIN(space >> 1, spa_min_slop))); } uint64_t Modified: vendor-sys/illumos/dist/uts/common/fs/zfs/sys/dnode.h ============================================================================== --- vendor-sys/illumos/dist/uts/common/fs/zfs/sys/dnode.h Mon Jul 18 06:58:39 2016 (r302992) +++ vendor-sys/illumos/dist/uts/common/fs/zfs/sys/dnode.h Mon Jul 18 07:03:39 2016 (r302993) @@ -58,7 +58,7 @@ extern "C" { */ #define DNODE_SHIFT 9 /* 512 bytes */ #define DN_MIN_INDBLKSHIFT 12 /* 4k */ -#define DN_MAX_INDBLKSHIFT 14 /* 16k */ +#define DN_MAX_INDBLKSHIFT 17 /* 128k */ #define DNODE_BLOCK_SHIFT 14 /* 16k */ #define DNODE_CORE_SIZE 64 /* 64 bytes for dnode sans blkptrs */ #define DN_MAX_OBJECT_SHIFT 48 /* 256 trillion (zfs_fid_t limit) */ @@ -88,6 +88,11 @@ extern "C" { #define DNODES_PER_BLOCK_SHIFT (DNODE_BLOCK_SHIFT - DNODE_SHIFT) #define DNODES_PER_BLOCK (1ULL << DNODES_PER_BLOCK_SHIFT) + +/* + * This is inaccurate if the indblkshift of the particular object is not the + * max. But it's only used by userland to calculate the zvol reservation. + */ #define DNODES_PER_LEVEL_SHIFT (DN_MAX_INDBLKSHIFT - SPA_BLKPTRSHIFT) #define DNODES_PER_LEVEL (1ULL << DNODES_PER_LEVEL_SHIFT) Modified: vendor-sys/illumos/dist/uts/common/sys/fs/zfs.h ============================================================================== --- vendor-sys/illumos/dist/uts/common/sys/fs/zfs.h Mon Jul 18 06:58:39 2016 (r302992) +++ vendor-sys/illumos/dist/uts/common/sys/fs/zfs.h Mon Jul 18 07:03:39 2016 (r302993) @@ -600,6 +600,8 @@ typedef struct zpool_rewind_policy { /* * This is needed in userland to report the minimum necessary device size. + * + * Note that the zfs test suite uses 64MB vdevs. */ #define SPA_MINDEVSIZE (64ULL << 20)