Date:      Fri, 14 Apr 2017 18:41:37 +0000 (UTC)
From:      Andriy Gapon <avg@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-vendor@freebsd.org
Subject:   svn commit: r316929 - vendor-sys/illumos/dist/uts/common/fs/zfs
Message-ID:  <201704141841.v3EIfb7g078398@repo.freebsd.org>

Author: avg
Date: Fri Apr 14 18:41:37 2017
New Revision: 316929
URL: https://svnweb.freebsd.org/changeset/base/316929

Log:
  6914 kernel virtual memory fragmentation leads to hang
  
  illumos/illumos-gate@af868f46a5b794687741d5424de9e3a2d684a84a
  https://github.com/illumos/illumos-gate/commit/af868f46a5b794687741d5424de9e3a2d684a84a
  
  https://www.illumos.org/issues/6914
    This change allows the kernel to use more virtual address space. This will
    allow us to devote 1.5x physmem for the zio arena, and an additional 1.5x
    physmem for the kernel heap.
    We saw a hang when the kernel was unable to find any 128K contiguous memory
    segments. Looking at the core file, we see many threads with stacks similar
    to this:
    > ffffff68c9c87c00::findstack -v
    stack pointer for thread ffffff68c9c87c00: ffffff02cd63d8b0
    [ ffffff02cd63d8b0 _resume_from_idle+0xf4() ]
      ffffff02cd63d8e0 swtch+0x141()
      ffffff02cd63d920 cv_wait+0x70(ffffff6009b1b01e, ffffff6009b1b020)
      ffffff02cd63da50 vmem_xalloc+0x640(ffffff6009b1b000, 20000, 1000, 0, 0, 0, 0, ffffff0200000004)
      ffffff02cd63dac0 vmem_alloc+0x135(ffffff6009b1b000, 20000, 4)
      ffffff02cd63db60 segkmem_xalloc+0x171(ffffff6009b1b000, 0, 20000, 4, 0, fffffffffb885fe0, fffffffffbcefa10)
      ffffff02cd63dbc0 segkmem_alloc_vn+0x4a(ffffff6009b1b000, 20000, 4, fffffffffbcefa10)
      ffffff02cd63dbf0 segkmem_zio_alloc+0x20(ffffff6009b1b000, 20000, 4)
      ffffff02cd63dd20 vmem_xalloc+0x5b1(ffffff6009b1c000, 20000, 1000, 0, 0, 0, 0, 4)
      ffffff02cd63dd90 vmem_alloc+0x135(ffffff6009b1c000, 20000, 4)
      ffffff02cd63de20 kmem_slab_create+0x8d(ffffff605fd37008, 4)
      ffffff02cd63de80 kmem_slab_alloc+0x11e(ffffff605fd37008, 4)
      ffffff02cd63dee0 kmem_cache_alloc+0x233(ffffff605fd37008, 4)
      ffffff02cd63df10 zio_data_buf_alloc+0x5b(20000)
      ffffff02cd63df70 arc_get_data_buf+0x92(ffffff6265a70588, 20000, ffffff901fd796f8)
      ffffff02cd63dfb0 arc_buf_alloc_impl+0x9c(ffffff6265a70588, ffffff6d233ab0b8)
  
  Reviewed by: George Wilson <george.wilson@delphix.com>
  Reviewed by: Adam Leventhal <ahl@delphix.com>
  Reviewed by: John Kennedy <john.kennedy@delphix.com>
  Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
  Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
  Approved by: Garrett D'Amore <garrett@damore.org>
  Author: Matthew Ahrens <mahrens@delphix.com>

Modified:
  vendor-sys/illumos/dist/uts/common/fs/zfs/arc.c

Modified: vendor-sys/illumos/dist/uts/common/fs/zfs/arc.c
==============================================================================
--- vendor-sys/illumos/dist/uts/common/fs/zfs/arc.c	Fri Apr 14 18:38:53 2017	(r316928)
+++ vendor-sys/illumos/dist/uts/common/fs/zfs/arc.c	Fri Apr 14 18:41:37 2017	(r316929)
@@ -5885,18 +5885,6 @@ arc_init(void)
 	/* Convert seconds to clock ticks */
 	arc_min_prefetch_lifespan = 1 * hz;
 
-	/* Start out with 1/8 of all memory */
-	arc_c = allmem / 8;
-
-#ifdef _KERNEL
-	/*
-	 * On architectures where the physical memory can be larger
-	 * than the addressable space (intel in 32-bit mode), we may
-	 * need to limit the cache to 1/8 of VM size.
-	 */
-	arc_c = MIN(arc_c, vmem_size(heap_arena, VMEM_ALLOC | VMEM_FREE) / 8);
-#endif
-
 	/* set min cache to 1/32 of all memory, or 64MB, whichever is more */
 	arc_c_min = MAX(allmem / 32, 64 << 20);
 	/* set max to 3/4 of all memory, or all but 1GB, whichever is more */
@@ -5934,6 +5922,15 @@ arc_init(void)
 	/* limit meta-data to 1/4 of the arc capacity */
 	arc_meta_limit = arc_c_max / 4;
 
+#ifdef _KERNEL
+	/*
+	 * Metadata is stored in the kernel's heap.  Don't let us
+	 * use more than half the heap for the ARC.
+	 */
+	arc_meta_limit = MIN(arc_meta_limit,
+	    vmem_size(heap_arena, VMEM_ALLOC | VMEM_FREE) / 2);
+#endif
+
 	/* Allow the tunable to override if it is reasonable */
 	if (zfs_arc_meta_limit > 0 && zfs_arc_meta_limit <= arc_c_max)
 		arc_meta_limit = zfs_arc_meta_limit;
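
The order of operations in the second hunk can be sketched as a standalone
function. This is only an illustration, not code from arc.c: the function
name sketch_arc_meta_limit and the heap_size parameter are hypothetical, with
heap_size standing in for vmem_size(heap_arena, VMEM_ALLOC | VMEM_FREE). It
assumes the surrounding arc_init() logic is exactly what the diff context
shows.

```c
#include <stdint.h>

#define	MIN(a, b)	((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper mirroring the arc_meta_limit computation shown in
 * the diff above.  heap_size is an assumed stand-in for
 * vmem_size(heap_arena, VMEM_ALLOC | VMEM_FREE); tunable plays the role
 * of zfs_arc_meta_limit.
 */
static uint64_t
sketch_arc_meta_limit(uint64_t arc_c_max, uint64_t heap_size,
    uint64_t tunable)
{
	/* limit meta-data to 1/4 of the arc capacity */
	uint64_t limit = arc_c_max / 4;

	/* new in this change: cap at half of the kernel heap */
	limit = MIN(limit, heap_size / 2);

	/* the tunable is applied last, so it can override the heap cap */
	if (tunable > 0 && tunable <= arc_c_max)
		limit = tunable;

	return (limit);
}
```

With a 12 GB arc_c_max and a 4 GB heap, the sketch yields 2 GB (half the heap)
rather than 3 GB (a quarter of arc_c_max). Because the tunable check comes
after the cap, a tunable value that is positive and no larger than arc_c_max
still takes precedence.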


