Date:      Tue, 1 Oct 2019 20:09:25 +0000 (UTC)
From:      Alexander Motin <mav@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   svn commit: r352939 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Message-ID:  <201910012009.x91K9P94028362@repo.freebsd.org>

Author: mav
Date: Tue Oct  1 20:09:25 2019
New Revision: 352939
URL: https://svnweb.freebsd.org/changeset/base/352939

Log:
  Improve latency of synchronous 128KB writes.
  
  Before my ZIL space optimization a few years ago, 128KB writes were logged
  as two 64KB+ records in two 128KB log blocks.  After that change they became
  ~124KB+/4KB+ records in two 128KB log blocks, freeing space in the second
  block for another record.  Unfortunately, in the case of 128KB-only writes,
  where the space in the second block remained unused, that change increased
  write latency by unbalancing the checksum computation time between parallel
  threads.
  
  This change introduces a new 68KB log block size, used both for writes below
  67KB and for 128KB-sharp writes.  Writes of 68-127KB still use one 128KB
  block, to avoid increasing processing overhead.  Writes above 131KB still
  use full 128KB blocks, since the possible saving there is small.  Mixed
  loads will likely also fall back to the previous 128KB, since the code uses
  the maximum of the last 10 requested block sizes (a minimal sketch of the
  new bucket lookup follows this log message).
  
  On a simple 128KB write test with a queue depth of 1, this change
  demonstrates a ~15-20% performance improvement.
  
  MFC after:	2 weeks
  Sponsored by:	iXsystems, Inc.
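
A minimal standalone sketch of the new bucket lookup, for illustration only:
the table mirrors the committed zil_block_buckets[], but pick_blksz() and the
"needed" sizes in main() are illustrative stand-ins for the real zil.c
accounting (zl_cur_used + sizeof (zil_chain_t)).

#include <stdio.h>
#include <stdint.h>

#define	SPA_OLD_MAXBLOCKSIZE	(128 * 1024)

static const struct {
	uint64_t	limit;	/* use this bucket while needed <= limit */
	uint64_t	blksz;	/* log block size to allocate */
} buckets[] = {
    { 4096,		4096 },			/* non TX_WRITE */
    { 8192 + 4096,	8192 + 4096 },		/* database */
    { 32768 + 4096,	32768 + 4096 },		/* NFS writes */
    { 65536 + 4096,	65536 + 4096 },		/* 64KB writes */
    { 131072,		131072 },		/* < 128KB writes */
    { 131072 + 4096,	65536 + 4096 },		/* 128KB writes */
    { UINT64_MAX,	SPA_OLD_MAXBLOCKSIZE },	/* > 128KB writes */
};

static uint64_t
pick_blksz(uint64_t needed)
{
	int i;

	/* Walk the table until the needed size fits under a bucket limit. */
	for (i = 0; needed > buckets[i].limit; i++)
		continue;
	return (buckets[i].blksz);
}

int
main(void)
{
	/* ~100KB needed: still one full 128KB log block. */
	printf("%u\n", (unsigned)pick_blksz(100 * 1024));
	/* A 128KB-sharp write plus a little record overhead: the new 68KB size. */
	printf("%u\n", (unsigned)pick_blksz(131072 + 512));
	/* Well above 128KB: back to full 128KB blocks. */
	printf("%u\n", (unsigned)pick_blksz(200 * 1024));
	return (0);
}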

Modified:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c	Tue Oct  1 19:39:00 2019	(r352938)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c	Tue Oct  1 20:09:25 2019	(r352939)
@@ -1351,11 +1351,17 @@ zil_lwb_write_open(zilog_t *zilog, lwb_t *lwb)
  * aligned to 4KB) actually gets written. However, we can't always just
  * allocate SPA_OLD_MAXBLOCKSIZE as the slog space could be exhausted.
  */
-uint64_t zil_block_buckets[] = {
-    4096,		/* non TX_WRITE */
-    8192+4096,		/* data base */
-    32*1024 + 4096, 	/* NFS writes */
-    UINT64_MAX
+struct {
+	uint64_t	limit;
+	uint64_t	blksz;
+} zil_block_buckets[] = {
+    { 4096,		4096 },			/* non TX_WRITE */
+    { 8192 + 4096,	8192 + 4096 },		/* database */
+    { 32768 + 4096,	32768 + 4096 },		/* NFS writes */
+    { 65536 + 4096,	65536 + 4096 },		/* 64KB writes */
+    { 131072,		131072 },		/* < 128KB writes */
+    { 131072 + 4096,	65536 + 4096 },		/* 128KB writes */
+    { UINT64_MAX,	SPA_OLD_MAXBLOCKSIZE},	/* > 128KB writes */
 };
 
 /*
@@ -1432,11 +1438,9 @@ zil_lwb_write_issue(zilog_t *zilog, lwb_t *lwb)
 	 * pool log space.
 	 */
 	zil_blksz = zilog->zl_cur_used + sizeof (zil_chain_t);
-	for (i = 0; zil_blksz > zil_block_buckets[i]; i++)
+	for (i = 0; zil_blksz > zil_block_buckets[i].limit; i++)
 		continue;
-	zil_blksz = zil_block_buckets[i];
-	if (zil_blksz == UINT64_MAX)
-		zil_blksz = SPA_OLD_MAXBLOCKSIZE;
+	zil_blksz = zil_block_buckets[i].blksz;
 	zilog->zl_prev_blks[zilog->zl_prev_rotor] = zil_blksz;
 	for (i = 0; i < ZIL_PREV_BLKS; i++)
 		zil_blksz = MAX(zil_blksz, zilog->zl_prev_blks[i]);
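
For reference, a rough standalone sketch of the block-size smoothing that the
second hunk leaves in place: each chosen size is recorded in a small ring of
recent picks and the maximum over that history wins, which is why mixed loads
fall back to 128KB blocks.  HISTORY_DEPTH and the helper names below are
illustrative stand-ins for ZIL_PREV_BLKS, zl_prev_blks[], and zl_prev_rotor.

#include <stdio.h>
#include <stdint.h>

#define	HISTORY_DEPTH	8	/* illustrative; zil.c sizes this with ZIL_PREV_BLKS */
#define	MAX(a, b)	((a) > (b) ? (a) : (b))

static uint64_t prev_blks[HISTORY_DEPTH];	/* recent block-size picks */
static int prev_rotor;				/* next slot to overwrite */

static uint64_t
smooth_blksz(uint64_t blksz)
{
	int i;

	/* Record this pick, then return the maximum over the history. */
	prev_blks[prev_rotor] = blksz;
	for (i = 0; i < HISTORY_DEPTH; i++)
		blksz = MAX(blksz, prev_blks[i]);
	prev_rotor = (prev_rotor + 1) % HISTORY_DEPTH;
	return (blksz);
}

int
main(void)
{
	/* One 128KB pick keeps later 68KB picks at 128KB until it ages out. */
	printf("%u\n", (unsigned)smooth_blksz(131072));		/* 131072 */
	printf("%u\n", (unsigned)smooth_blksz(65536 + 4096));	/* still 131072 */
	return (0);
}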


