Date:      Tue, 31 Mar 2015 16:52:33 -0500
From:      Dustin Wenz <dustinwenz@ebureau.com>
To:        "<freebsd-fs@freebsd.org>" <freebsd-fs@freebsd.org>
Subject:   Re: All available memory used when deleting files from ZFS
Message-ID:  <712A53CA-7A54-420F-9721-592A39D9A717@ebureau.com>
In-Reply-To: <923828D6-503B-4FC3-89E8-1DC6DF0C9B6B@ebureau.com>
References:  <FD30147A-C7F7-4138-9F96-10024A6FE061@ebureau.com> <5519C329.3090001@denninger.net> <923828D6-503B-4FC3-89E8-1DC6DF0C9B6B@ebureau.com>

I was able to do a little regression testing on this, since I still
had about 10 hosts remaining on FreeBSD 9.2. They have the same
hardware and disk configuration, and held the same data files and
zpool configurations (mirrors of 3TB mechanical disks) as the machines
that blew up over the weekend; the only difference is that they are
still running 9.2.

Using the same rsync procedure as before, I was able to delete the
25 TB of data on all of the remaining hosts with no issues whatsoever.
I saw no reduction in free memory (if anything, it increased, since
the ARC was being freed up as well), no paging, and no hangs. One
other difference was that it took about twice as long to delete the
files on 9.2 as on 10.1 (20 minutes instead of 10).
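
Here is a minimal sketch of the kind of sampling that would capture
that comparison (Python; it assumes the stock FreeBSD sysctl OIDs
hw.pagesize, vm.stats.vm.v_free_count, vm.stats.vm.v_wire_count and
kstat.zfs.misc.arcstats.size, which should be double-checked on the
release being tested). It just logs free, wired and ARC memory once a
second while a delete runs:

#!/usr/bin/env python
# Sketch only: log free, wired and ARC memory once a second while a
# bulk delete is in progress, so 9.2 and 10.1 hosts can be compared.
import subprocess
import time

def sysctl(name):
    return int(subprocess.check_output(["sysctl", "-n", name]).strip())

PAGE_SIZE = sysctl("hw.pagesize")
MB = 1024 * 1024

while True:
    free_mb = sysctl("vm.stats.vm.v_free_count") * PAGE_SIZE // MB
    wired_mb = sysctl("vm.stats.vm.v_wire_count") * PAGE_SIZE // MB
    arc_mb = sysctl("kstat.zfs.misc.arcstats.size") // MB
    print("%s free=%dMB wired=%dMB arc=%dMB" %
          (time.strftime("%H:%M:%S"), free_mb, wired_mb, arc_mb))
    time.sleep(1)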

So, it would appear that there is some ZFS behavior in FreeBSD 10.1
that was not present in 9.2, and it's causing problems when freeing up
space. If I knew why it takes twice as long to delete files on 9.2,
that might shed some light on this. There is also the recent
background-destroy feature that might be suspect, but I'm not
destroying filesystems here. What other recent ZFS changes might apply
to deleting files?
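
One way to narrow that down would be to diff the delete/free-related
tunables between a surviving 9.2 host and a 10.1 host. A minimal
sketch (Python) is below; the OID list is just the free/delete-related
entries from the sysctl dump quoted further down, and several of them
will simply not exist on 9.2:

#!/usr/bin/env python
# Sketch only: print delete/free-related ZFS tunables so that a 9.2
# host and a 10.1 host can be diffed side by side.  OIDs that do not
# exist on a given release are reported as "absent".
import subprocess

TUNABLES = [
    "vfs.zfs.free_max_blocks",
    "vfs.zfs.free_min_time_ms",
    "vfs.zfs.txg.timeout",
    "vfs.zfs.dirty_data_max",
    "vfs.zfs.vdev.bio_delete_disable",
    "vfs.zfs.trim.enabled",
    "vfs.zfs.trim.txg_delay",
]

for oid in TUNABLES:
    try:
        value = subprocess.check_output(["sysctl", "-n", oid],
                                        stderr=subprocess.STDOUT).strip()
        print("%-36s %s" % (oid, value.decode()))
    except subprocess.CalledProcessError:
        print("%-36s absent" % oid)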

	- .Dustin


> On Mar 30, 2015, at 6:30 PM, Dustin Wenz <DustinWenz@ebureau.com> wrote:
>
> Unfortunately, I just spent the day recovering from this, so I have
> no way to easily get new memory stats now. I'm planning on doing a
> test with additional data in an effort to understand more about the
> issue, but it will take time to set something up.
>
> In the meantime, I'd advise anyone running ZFS on FreeBSD 10.x to be
> mindful when freeing up lots of space all at once.
>
> 	- .Dustin
>
>> On Mar 30, 2015, at 4:42 PM, Karl Denninger <karl@denninger.net> wrote:
>>
>> What does the UMA memory use look like on that machine when the
>> remove is initiated and progresses?  Look with vmstat -z and see
>> what the used and free counts look like for the zio allocations...
>>
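
A minimal sketch of that check (Python; it assumes the 10.x vmstat -z
line format of "NAME: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP", so
the field order may need adjusting on other releases):

#!/usr/bin/env python
# Sketch only: pull the USED and FREE counts for the zio_* UMA zones
# out of "vmstat -z" output.
import subprocess

out = subprocess.check_output(["vmstat", "-z"]).decode()
for line in out.splitlines():
    if ":" not in line:
        continue
    name, _, rest = line.partition(":")
    name = name.strip()
    if not name.startswith("zio_"):
        continue
    fields = [f.strip() for f in rest.split(",")]
    if len(fields) < 4:
        continue
    size, _limit, used, free = fields[:4]
    print("%-24s size=%-6s used=%-10s free=%s" % (name, size, used, free))
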
>> On 3/30/2015 4:14 PM, Dustin Wenz wrote:
>>> I had several systems panic or hang over the weekend while deleting
>>> some data off of their local ZFS filesystems. It looks like they
>>> ran out of physical memory (32GB), and hung when paging to
>>> swap-on-ZFS (which is not surprising, given that ZFS was likely
>>> using the memory). They were running 10.1-STABLE r277139M, which I
>>> built in the middle of January. The pools were about 35TB in size,
>>> and are a concatenation of 3TB mirrors. They were maybe 95% full. I
>>> deleted just over 1000 files, totaling 25TB on each system.
>>>
>>> It took roughly 10 minutes to remove that 25TB of data per host
>>> using a remote rsync, and immediately after that everything seemed
>>> fine. However, after several more minutes, every machine that had
>>> data removed became unresponsive. Some had numerous "swap_pager:
>>> indefinite wait buffer" errors followed by a panic, and some just
>>> died with no console messages. The same thing would happen after a
>>> reboot, when FreeBSD attempted to mount the local filesystem again.
>>>
>>> I was able to boot these systems after exporting the affected pool,
>>> but the problem would recur several minutes after initiating a
>>> "zpool import". Watching ZFS statistics didn't seem to reveal where
>>> the memory was going; ARC would only climb to about 4GB, but free
>>> memory would decline rapidly. Eventually, after enough
>>> export/reboot/import cycles, the pool would import successfully and
>>> everything would be fine from then on. Note that there is no L2ARC
>>> or compression being used.
>>>
>>> Has anyone else run into this when deleting files on ZFS? It seems
>>> to be a consistent problem under the versions of 10.1 I'm running.
>>>
>>> For reference, I've appended a zstat dump below that was taken five
>>> minutes after starting a zpool import, about three minutes before
>>> the machine became unresponsive. You can see that the ARC is only
>>> 4GB, but free memory was down to 471MB (and continued to drop).
>>>
>>> 	- .Dustin
>>>
>>>
>>> ------------------------------------------------------------------------
>>> ZFS Subsystem Report				Mon Mar 30 12:35:27 2015
>>> ------------------------------------------------------------------------
>>>
>>> System Information:
>>>
>>> 	Kernel Version:				1001506 (osreldate)
>>> 	Hardware Platform:			amd64
>>> 	Processor Architecture:			amd64
>>>
>>> 	ZFS Storage pool Version:		5000
>>> 	ZFS Filesystem Version:			5
>>>
>>> FreeBSD 10.1-STABLE #11 r277139M: Tue Jan 13 14:59:55 CST 2015 root
>>> 12:35PM  up 8 mins, 3 users, load averages: 7.23, 8.96, 4.87
>>>
>>> ------------------------------------------------------------------------
>>>
>>> System Memory:
>>>
>>> 	0.17%	55.40	MiB Active,	0.14%	46.11	MiB Inact
>>> 	98.34%	30.56	GiB Wired,	0.00%	0 Cache
>>> 	1.34%	425.46	MiB Free,	0.00%	4.00	KiB Gap
>>>
>>> 	Real Installed:				32.00	GiB
>>> 	Real Available:			99.82%	31.94	GiB
>>> 	Real Managed:			97.29%	31.08	GiB
>>>
>>> 	Logical Total:				32.00	GiB
>>> 	Logical Used:			98.56%	31.54	GiB
>>> 	Logical Free:			1.44%	471.57	MiB
>>>
>>> Kernel Memory:					3.17	GiB
>>> 	Data:				99.18%	3.14	GiB
>>> 	Text:				0.82%	26.68	MiB
>>>
>>> Kernel Memory Map:				31.08	GiB
>>> 	Size:				14.18%	4.41	GiB
>>> 	Free:				85.82%	26.67	GiB
>>>
>>> ------------------------------------------------------------------------
>>>
>>> ARC Summary: (HEALTHY)
>>> 	Memory Throttle Count:			0
>>>
>>> ARC Misc:
>>> 	Deleted:				145
>>> 	Recycle Misses:				0
>>> 	Mutex Misses:				0
>>> 	Evict Skips:				0
>>>
>>> ARC Size:				14.17%	4.26	GiB
>>> 	Target Size: (Adaptive)		100.00%	30.08	GiB
>>> 	Min Size (Hard Limit):		12.50%	3.76	GiB
>>> 	Max Size (High Water):		8:1	30.08	GiB
>>>
>>> ARC Size Breakdown:
>>> 	Recently Used Cache Size:	50.00%	15.04	GiB
>>> 	Frequently Used Cache Size:	50.00%	15.04	GiB
>>>
>>> ARC Hash Breakdown:
>>> 	Elements Max:				270.56k
>>> 	Elements Current:		100.00%	270.56k
>>> 	Collisions:				23.66k
>>> 	Chain Max:				3
>>> 	Chains:					8.28k
>>>
>>> ------------------------------------------------------------------------
>>>
>>> ARC Efficiency:					2.93m
>>> 	Cache Hit Ratio:		70.44%	2.06m
>>> 	Cache Miss Ratio:		29.56%	866.05k
>>> 	Actual Hit Ratio:		70.40%	2.06m
>>>
>>> 	Data Demand Efficiency:		97.47%	24.58k
>>> 	Data Prefetch Efficiency:	1.88%	479
>>>
>>> 	CACHE HITS BY CACHE LIST:
>>> 	  Anonymously Used:		0.05%	1.07k
>>> 	  Most Recently Used:		71.82%	1.48m
>>> 	  Most Frequently Used:		28.13%	580.49k
>>> 	  Most Recently Used Ghost:	0.00%	0
>>> 	  Most Frequently Used Ghost:	0.00%	0
>>>
>>> 	CACHE HITS BY DATA TYPE:
>>> 	  Demand Data:			1.16%	23.96k
>>> 	  Prefetch Data:		0.00%	9
>>> 	  Demand Metadata:		98.79%	2.04m
>>> 	  Prefetch Metadata:		0.05%	1.08k
>>>
>>> 	CACHE MISSES BY DATA TYPE:
>>> 	  Demand Data:			0.07%	621
>>> 	  Prefetch Data:		0.05%	470
>>> 	  Demand Metadata:		99.69%	863.35k
>>> 	  Prefetch Metadata:		0.19%	1.61k
>>>
>>> ------------------------------------------------------------------------
>>>
>>> L2ARC is disabled
>>>
>>> ------------------------------------------------------------------------
>>>
>>> File-Level Prefetch: (HEALTHY)
>>>
>>> DMU Efficiency:					72.95k
>>> 	Hit Ratio:			70.83%	51.66k
>>> 	Miss Ratio:			29.17%	21.28k
>>>
>>> 	Colinear:				21.28k
>>> 	  Hit Ratio:			0.01%	2
>>> 	  Miss Ratio:			99.99%	21.28k
>>>
>>> 	Stride:					50.45k
>>> 	  Hit Ratio:			99.98%	50.44k
>>> 	  Miss Ratio:			0.02%	9
>>>
>>> DMU Misc:
>>> 	Reclaim:				21.28k
>>> 	  Successes:			1.73%	368
>>> 	  Failures:			98.27%	20.91k
>>>
>>> 	Streams:				1.23k
>>> 	  +Resets:			0.16%	2
>>> 	  -Resets:			99.84%	1.23k
>>> 	  Bogus:				0
>>>
>>> ------------------------------------------------------------------------
>>>
>>> VDEV cache is disabled
>>>
>>> ------------------------------------------------------------------------
>>>
>>> ZFS Tunables (sysctl):
>>> 	kern.maxusers                           2380
>>> 	vm.kmem_size                            33367830528
>>> 	vm.kmem_size_scale                      1
>>> 	vm.kmem_size_min                        0
>>> 	vm.kmem_size_max                        1319413950874
>>> 	vfs.zfs.arc_max                         32294088704
>>> 	vfs.zfs.arc_min                         4036761088
>>> 	vfs.zfs.arc_average_blocksize           8192
>>> 	vfs.zfs.arc_shrink_shift                5
>>> 	vfs.zfs.arc_free_target                 56518
>>> 	vfs.zfs.arc_meta_used                   4534349216
>>> 	vfs.zfs.arc_meta_limit                  8073522176
>>> 	vfs.zfs.l2arc_write_max                 8388608
>>> 	vfs.zfs.l2arc_write_boost               8388608
>>> 	vfs.zfs.l2arc_headroom                  2
>>> 	vfs.zfs.l2arc_feed_secs                 1
>>> 	vfs.zfs.l2arc_feed_min_ms               200
>>> 	vfs.zfs.l2arc_noprefetch                1
>>> 	vfs.zfs.l2arc_feed_again                1
>>> 	vfs.zfs.l2arc_norw                      1
>>> 	vfs.zfs.anon_size                       1786368
>>> 	vfs.zfs.anon_metadata_lsize             0
>>> 	vfs.zfs.anon_data_lsize                 0
>>> 	vfs.zfs.mru_size                        504812032
>>> 	vfs.zfs.mru_metadata_lsize              415273472
>>> 	vfs.zfs.mru_data_lsize                  35227648
>>> 	vfs.zfs.mru_ghost_size                  0
>>> 	vfs.zfs.mru_ghost_metadata_lsize        0
>>> 	vfs.zfs.mru_ghost_data_lsize            0
>>> 	vfs.zfs.mfu_size                        3925990912
>>> 	vfs.zfs.mfu_metadata_lsize              3901947392
>>> 	vfs.zfs.mfu_data_lsize                  7000064
>>> 	vfs.zfs.mfu_ghost_size                  0
>>> 	vfs.zfs.mfu_ghost_metadata_lsize        0
>>> 	vfs.zfs.mfu_ghost_data_lsize            0
>>> 	vfs.zfs.l2c_only_size                   0
>>> 	vfs.zfs.dedup.prefetch                  1
>>> 	vfs.zfs.nopwrite_enabled                1
>>> 	vfs.zfs.mdcomp_disable                  0
>>> 	vfs.zfs.max_recordsize                  1048576
>>> 	vfs.zfs.dirty_data_max                  3429735628
>>> 	vfs.zfs.dirty_data_max_max              4294967296
>>> 	vfs.zfs.dirty_data_max_percent          10
>>> 	vfs.zfs.dirty_data_sync                 67108864
>>> 	vfs.zfs.delay_min_dirty_percent         60
>>> 	vfs.zfs.delay_scale                     500000
>>> 	vfs.zfs.prefetch_disable                0
>>> 	vfs.zfs.zfetch.max_streams              8
>>> 	vfs.zfs.zfetch.min_sec_reap             2
>>> 	vfs.zfs.zfetch.block_cap                256
>>> 	vfs.zfs.zfetch.array_rd_sz              1048576
>>> 	vfs.zfs.top_maxinflight                 32
>>> 	vfs.zfs.resilver_delay                  2
>>> 	vfs.zfs.scrub_delay                     4
>>> 	vfs.zfs.scan_idle                       50
>>> 	vfs.zfs.scan_min_time_ms                1000
>>> 	vfs.zfs.free_min_time_ms                1000
>>> 	vfs.zfs.resilver_min_time_ms            3000
>>> 	vfs.zfs.no_scrub_io                     0
>>> 	vfs.zfs.no_scrub_prefetch               0
>>> 	vfs.zfs.free_max_blocks                 -1
>>> 	vfs.zfs.metaslab.gang_bang              16777217
>>> 	vfs.zfs.metaslab.fragmentation_threshold 70
>>> 	vfs.zfs.metaslab.debug_load             0
>>> 	vfs.zfs.metaslab.debug_unload           0
>>> 	vfs.zfs.metaslab.df_alloc_threshold     131072
>>> 	vfs.zfs.metaslab.df_free_pct            4
>>> 	vfs.zfs.metaslab.min_alloc_size         33554432
>>> 	vfs.zfs.metaslab.load_pct               50
>>> 	vfs.zfs.metaslab.unload_delay           8
>>> 	vfs.zfs.metaslab.preload_limit          3
>>> 	vfs.zfs.metaslab.preload_enabled        1
>>> 	vfs.zfs.metaslab.fragmentation_factor_enabled 1
>>> 	vfs.zfs.metaslab.lba_weighting_enabled  1
>>> 	vfs.zfs.metaslab.bias_enabled           1
>>> 	vfs.zfs.condense_pct                    200
>>> 	vfs.zfs.mg_noalloc_threshold            0
>>> 	vfs.zfs.mg_fragmentation_threshold      85
>>> 	vfs.zfs.check_hostid                    1
>>> 	vfs.zfs.spa_load_verify_maxinflight     10000
>>> 	vfs.zfs.spa_load_verify_metadata        1
>>> 	vfs.zfs.spa_load_verify_data            1
>>> 	vfs.zfs.recover                         0
>>> 	vfs.zfs.deadman_synctime_ms             1000000
>>> 	vfs.zfs.deadman_checktime_ms            5000
>>> 	vfs.zfs.deadman_enabled                 1
>>> 	vfs.zfs.spa_asize_inflation             24
>>> 	vfs.zfs.spa_slop_shift                  5
>>> 	vfs.zfs.space_map_blksz                 4096
>>> 	vfs.zfs.txg.timeout                     5
>>> 	vfs.zfs.vdev.metaslabs_per_vdev         200
>>> 	vfs.zfs.vdev.cache.max                  16384
>>> 	vfs.zfs.vdev.cache.size                 0
>>> 	vfs.zfs.vdev.cache.bshift               16
>>> 	vfs.zfs.vdev.trim_on_init               1
>>> 	vfs.zfs.vdev.mirror.rotating_inc        0
>>> 	vfs.zfs.vdev.mirror.rotating_seek_inc   5
>>> 	vfs.zfs.vdev.mirror.rotating_seek_offset 1048576
>>> 	vfs.zfs.vdev.mirror.non_rotating_inc    0
>>> 	vfs.zfs.vdev.mirror.non_rotating_seek_inc 1
>>> 	vfs.zfs.vdev.async_write_active_min_dirty_percent 30
>>> 	vfs.zfs.vdev.async_write_active_max_dirty_percent 60
>>> 	vfs.zfs.vdev.max_active                 1000
>>> 	vfs.zfs.vdev.sync_read_min_active       10
>>> 	vfs.zfs.vdev.sync_read_max_active       10
>>> 	vfs.zfs.vdev.sync_write_min_active      10
>>> 	vfs.zfs.vdev.sync_write_max_active      10
>>> 	vfs.zfs.vdev.async_read_min_active      1
>>> 	vfs.zfs.vdev.async_read_max_active      3
>>> 	vfs.zfs.vdev.async_write_min_active     1
>>> 	vfs.zfs.vdev.async_write_max_active     10
>>> 	vfs.zfs.vdev.scrub_min_active           1
>>> 	vfs.zfs.vdev.scrub_max_active           2
>>> 	vfs.zfs.vdev.trim_min_active            1
>>> 	vfs.zfs.vdev.trim_max_active            64
>>> 	vfs.zfs.vdev.aggregation_limit          131072
>>> 	vfs.zfs.vdev.read_gap_limit             32768
>>> 	vfs.zfs.vdev.write_gap_limit            4096
>>> 	vfs.zfs.vdev.bio_flush_disable          0
>>> 	vfs.zfs.vdev.bio_delete_disable         0
>>> 	vfs.zfs.vdev.trim_max_bytes             2147483648
>>> 	vfs.zfs.vdev.trim_max_pending           64
>>> 	vfs.zfs.max_auto_ashift                 13
>>> 	vfs.zfs.min_auto_ashift                 9
>>> 	vfs.zfs.zil_replay_disable              0
>>> 	vfs.zfs.cache_flush_disable             0
>>> 	vfs.zfs.zio.use_uma                     1
>>> 	vfs.zfs.zio.exclude_metadata            0
>>> 	vfs.zfs.sync_pass_deferred_free         2
>>> 	vfs.zfs.sync_pass_dont_compress         5
>>> 	vfs.zfs.sync_pass_rewrite               2
>>> 	vfs.zfs.snapshot_list_prefetch          0
>>> 	vfs.zfs.super_owner                     0
>>> 	vfs.zfs.debug                           0
>>> 	vfs.zfs.version.ioctl                   4
>>> 	vfs.zfs.version.acl                     1
>>> 	vfs.zfs.version.spa                     5000
>>> 	vfs.zfs.version.zpl                     5
>>> 	vfs.zfs.vol.mode                        1
>>> 	vfs.zfs.vol.unmap_enabled               1
>>> 	vfs.zfs.trim.enabled                    1
>>> 	vfs.zfs.trim.txg_delay                  32
>>> 	vfs.zfs.trim.timeout                    30
>>> 	vfs.zfs.trim.max_interval               1
>>>
>>> ------------------------------------------------------------------------
>>>
>>
>> --
>> Karl Denninger
>> karl@denninger.net
>> /The Market Ticker/
>>
>>
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"



