From: Steven Hartland
Date: Tue, 31 Mar 2015 09:12:02 +0100
To: Dustin Wenz, freebsd-fs@freebsd.org
Subject: Re: All available memory used when deleting files from ZFS
Message-ID: <551A56D2.3050006@multiplay.co.uk>

Are your pools HDD or SSD based? If the latter, this may still be
relevant due to the TRIM support in FreeBSD; and TBH, with a large txg
of frees it may help even if the DDT isn't the main cause, so it may be
worth testing. A quick way to test is sketched at the end of this mail.

Regards
Steve

On 31/03/2015 04:52, Dustin Wenz wrote:
> Thanks, Steven! However, I have not enabled dedup on any of the
> affected filesystems. Unless it became a default at some point, I'm
> not sure how that tunable would help.
>
> - .Dustin
>
>> On Mar 30, 2015, at 7:07 PM, Steven Hartland wrote:
>>
>> Later versions have vfs.zfs.free_max_blocks, which is likely to be
>> the fix you're looking for.
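>>
>> A minimal sketch of using it, once on a build that has the sysctl
>> (the 100,000 figure is only the value suggested in the commit
>> message below, not a tested recommendation):
>>
>>     # show the current value (-1 means no limit)
>>     sysctl vfs.zfs.free_max_blocks
>>     # cap how many blocks may be freed in a single txg
>>     sysctl vfs.zfs.free_max_blocks=100000
>>     # to make it persistent, add to /etc/sysctl.conf:
>>     #   vfs.zfs.free_max_blocks=100000
>>
>> If the sysctl turns out to be read-only on a given build, set it as
>> a tunable in /boot/loader.conf instead.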
>>
>> It was added to head by r271532 and to stable/10 by:
>> https://svnweb.freebsd.org/base?view=revision&revision=272665
>>
>> Description being:
>>
>> Add a new tunable/sysctl, vfs.zfs.free_max_blocks, which can be used to
>> limit how many blocks can be freed before a new transaction group is
>> created. The default is no limit (infinite), but we should probably have
>> a lower default, e.g. 100,000.
>>
>> With this limit, we can guard against the case where ZFS could run out of
>> memory when destroying large numbers of blocks in a single transaction
>> group, as the entire DDT needs to be brought into memory.
>>
>> Illumos issue:
>> 5138 add tunable for maximum number of blocks freed in one txg
>>
>>> On 30/03/2015 22:14, Dustin Wenz wrote:
>>> I had several systems panic or hang over the weekend while deleting
>>> some data off of their local ZFS filesystems. It looks like they ran
>>> out of physical memory (32GB) and hung when paging to swap-on-zfs
>>> (which is not surprising, given that ZFS was likely using the
>>> memory). They were running 10.1-STABLE r277139M, which I built in
>>> the middle of January. The pools were about 35TB in size, built as a
>>> concatenation of 3TB mirrors, and were maybe 95% full. I deleted
>>> just over 1000 files, totaling 25TB, on each system.
>>>
>>> It took roughly 10 minutes to remove that 25TB of data per host
>>> using a remote rsync, and immediately afterwards everything seemed
>>> fine. However, after several more minutes, every machine that had
>>> data removed became unresponsive. Some had numerous "swap_pager:
>>> indefinite wait buffer" errors followed by a panic, and some just
>>> died with no console messages. The same thing would happen after a
>>> reboot, when FreeBSD attempted to mount the local filesystem again.
>>>
>>> I was able to boot these systems after exporting the affected pool,
>>> but the problem would recur several minutes after initiating a
>>> "zpool import". Watching ZFS statistics didn't seem to reveal where
>>> the memory was going; the ARC would only climb to about 4GB, but
>>> free memory would decline rapidly. Eventually, after enough
>>> export/reboot/import cycles, the pool would import successfully and
>>> everything would be fine from then on. Note that there is no L2ARC
>>> or compression in use.
>>>
>>> Has anyone else run into this when deleting files on ZFS? It seems
>>> to be a consistent problem under the versions of 10.1 I'm running.
>>>
>>> For reference, I've appended a zfs-stats dump below that was taken 5
>>> minutes after starting a zpool import, about three minutes before
>>> the machine became unresponsive. You can see that the ARC is only
>>> 4GB, but free memory was down to 471MB (and continued to drop).
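>>>
>>> (For anyone trying to reproduce this, a minimal loop along these
>>> lines shows the divergence between ARC size and free memory while
>>> the import runs; the commands are illustrative, not the exact ones
>>> used at the time:)
>>>
>>>     # sample ARC totals and free memory every 10 seconds
>>>     while :; do
>>>         date
>>>         sysctl kstat.zfs.misc.arcstats.size vfs.zfs.arc_meta_used
>>>         sysctl vm.stats.vm.v_free_count  # free pages; multiply by hw.pagesize for bytes
>>>         sleep 10
>>>     done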
>>>
>>> - .Dustin
>>>
>>> ------------------------------------------------------------------------
>>> ZFS Subsystem Report                    Mon Mar 30 12:35:27 2015
>>> ------------------------------------------------------------------------
>>>
>>> System Information:
>>>
>>>         Kernel Version:                 1001506 (osreldate)
>>>         Hardware Platform:              amd64
>>>         Processor Architecture:         amd64
>>>
>>>         ZFS Storage pool Version:       5000
>>>         ZFS Filesystem Version:         5
>>>
>>> FreeBSD 10.1-STABLE #11 r277139M: Tue Jan 13 14:59:55 CST 2015 root
>>> 12:35PM  up 8 mins, 3 users, load averages: 7.23, 8.96, 4.87
>>>
>>> ------------------------------------------------------------------------
>>>
>>> System Memory:
>>>
>>>         0.17%   55.40   MiB Active,     0.14%   46.11   MiB Inact
>>>         98.34%  30.56   GiB Wired,      0.00%   0       Cache
>>>         1.34%   425.46  MiB Free,       0.00%   4.00    KiB Gap
>>>
>>>         Real Installed:                 32.00   GiB
>>>         Real Available:         99.82%  31.94   GiB
>>>         Real Managed:           97.29%  31.08   GiB
>>>
>>>         Logical Total:                  32.00   GiB
>>>         Logical Used:           98.56%  31.54   GiB
>>>         Logical Free:           1.44%   471.57  MiB
>>>
>>> Kernel Memory:                          3.17    GiB
>>>         Data:                   99.18%  3.14    GiB
>>>         Text:                   0.82%   26.68   MiB
>>>
>>> Kernel Memory Map:                      31.08   GiB
>>>         Size:                   14.18%  4.41    GiB
>>>         Free:                   85.82%  26.67   GiB
>>>
>>> ------------------------------------------------------------------------
>>>
>>> ARC Summary: (HEALTHY)
>>>         Memory Throttle Count:          0
>>>
>>> ARC Misc:
>>>         Deleted:                        145
>>>         Recycle Misses:                 0
>>>         Mutex Misses:                   0
>>>         Evict Skips:                    0
>>>
>>> ARC Size:                       14.17%  4.26    GiB
>>>         Target Size: (Adaptive) 100.00% 30.08   GiB
>>>         Min Size (Hard Limit):  12.50%  3.76    GiB
>>>         Max Size (High Water):  8:1     30.08   GiB
>>>
>>> ARC Size Breakdown:
>>>         Recently Used Cache Size:       50.00%  15.04   GiB
>>>         Frequently Used Cache Size:     50.00%  15.04   GiB
>>>
>>> ARC Hash Breakdown:
>>>         Elements Max:                   270.56k
>>>         Elements Current:       100.00% 270.56k
>>>         Collisions:                     23.66k
>>>         Chain Max:                      3
>>>         Chains:                         8.28k
>>>
>>> ------------------------------------------------------------------------
>>>
>>> ARC Efficiency:                         2.93m
>>>         Cache Hit Ratio:        70.44%  2.06m
>>>         Cache Miss Ratio:       29.56%  866.05k
>>>         Actual Hit Ratio:       70.40%  2.06m
>>>
>>>         Data Demand Efficiency: 97.47%  24.58k
>>>         Data Prefetch Efficiency:       1.88%   479
>>>
>>>         CACHE HITS BY CACHE LIST:
>>>           Anonymously Used:             0.05%   1.07k
>>>           Most Recently Used:           71.82%  1.48m
>>>           Most Frequently Used:         28.13%  580.49k
>>>           Most Recently Used Ghost:     0.00%   0
>>>           Most Frequently Used Ghost:   0.00%   0
>>>
>>>         CACHE HITS BY DATA TYPE:
>>>           Demand Data:                  1.16%   23.96k
>>>           Prefetch Data:                0.00%   9
>>>           Demand Metadata:              98.79%  2.04m
>>>           Prefetch Metadata:            0.05%   1.08k
>>>
>>>         CACHE MISSES BY DATA TYPE:
>>>           Demand Data:                  0.07%   621
>>>           Prefetch Data:                0.05%   470
>>>           Demand Metadata:              99.69%  863.35k
>>>           Prefetch Metadata:            0.19%   1.61k
>>>
>>> ------------------------------------------------------------------------
>>>
>>> L2ARC is disabled
>>>
>>> ------------------------------------------------------------------------
>>>
>>> File-Level Prefetch: (HEALTHY)
>>>
>>> DMU Efficiency:                         72.95k
>>>         Hit Ratio:              70.83%  51.66k
>>>         Miss Ratio:             29.17%  21.28k
>>>
>>>         Colinear:                       21.28k
>>>           Hit Ratio:            0.01%   2
>>>           Miss Ratio:           99.99%  21.28k
>>>
>>>         Stride:                         50.45k
>>>           Hit Ratio:            99.98%  50.44k
>>>           Miss Ratio:           0.02%   9
>>>
>>> DMU Misc:
>>>         Reclaim:                        21.28k
>>>           Successes:            1.73%   368
>>>           Failures:             98.27%  20.91k
>>>
>>>         Streams:                        1.23k
>>>           +Resets:              0.16%   2
>>>           -Resets:              99.84%  1.23k
>>>           Bogus:                        0
>>>
>>> ------------------------------------------------------------------------
>>>
>>> VDEV cache is disabled
>>>
>>> ------------------------------------------------------------------------
>>>
>>> ZFS Tunables (sysctl):
>>>         kern.maxusers                           2380
>>>         vm.kmem_size                            33367830528
>>>         vm.kmem_size_scale                      1
>>>         vm.kmem_size_min                        0
>>>         vm.kmem_size_max                        1319413950874
>>>         vfs.zfs.arc_max                         32294088704
>>>         vfs.zfs.arc_min                         4036761088
>>>         vfs.zfs.arc_average_blocksize           8192
>>>         vfs.zfs.arc_shrink_shift                5
>>>         vfs.zfs.arc_free_target                 56518
>>>         vfs.zfs.arc_meta_used                   4534349216
>>>         vfs.zfs.arc_meta_limit                  8073522176
>>>         vfs.zfs.l2arc_write_max                 8388608
>>>         vfs.zfs.l2arc_write_boost               8388608
>>>         vfs.zfs.l2arc_headroom                  2
>>>         vfs.zfs.l2arc_feed_secs                 1
>>>         vfs.zfs.l2arc_feed_min_ms               200
>>>         vfs.zfs.l2arc_noprefetch                1
>>>         vfs.zfs.l2arc_feed_again                1
>>>         vfs.zfs.l2arc_norw                      1
>>>         vfs.zfs.anon_size                       1786368
>>>         vfs.zfs.anon_metadata_lsize             0
>>>         vfs.zfs.anon_data_lsize                 0
>>>         vfs.zfs.mru_size                        504812032
>>>         vfs.zfs.mru_metadata_lsize              415273472
>>>         vfs.zfs.mru_data_lsize                  35227648
>>>         vfs.zfs.mru_ghost_size                  0
>>>         vfs.zfs.mru_ghost_metadata_lsize        0
>>>         vfs.zfs.mru_ghost_data_lsize            0
>>>         vfs.zfs.mfu_size                        3925990912
>>>         vfs.zfs.mfu_metadata_lsize              3901947392
>>>         vfs.zfs.mfu_data_lsize                  7000064
>>>         vfs.zfs.mfu_ghost_size                  0
>>>         vfs.zfs.mfu_ghost_metadata_lsize        0
>>>         vfs.zfs.mfu_ghost_data_lsize            0
>>>         vfs.zfs.l2c_only_size                   0
>>>         vfs.zfs.dedup.prefetch                  1
>>>         vfs.zfs.nopwrite_enabled                1
>>>         vfs.zfs.mdcomp_disable                  0
>>>         vfs.zfs.max_recordsize                  1048576
>>>         vfs.zfs.dirty_data_max                  3429735628
>>>         vfs.zfs.dirty_data_max_max              4294967296
>>>         vfs.zfs.dirty_data_max_percent          10
>>>         vfs.zfs.dirty_data_sync                 67108864
>>>         vfs.zfs.delay_min_dirty_percent         60
>>>         vfs.zfs.delay_scale                     500000
>>>         vfs.zfs.prefetch_disable                0
>>>         vfs.zfs.zfetch.max_streams              8
>>>         vfs.zfs.zfetch.min_sec_reap             2
>>>         vfs.zfs.zfetch.block_cap                256
>>>         vfs.zfs.zfetch.array_rd_sz              1048576
>>>         vfs.zfs.top_maxinflight                 32
>>>         vfs.zfs.resilver_delay                  2
>>>         vfs.zfs.scrub_delay                     4
>>>         vfs.zfs.scan_idle                       50
>>>         vfs.zfs.scan_min_time_ms                1000
>>>         vfs.zfs.free_min_time_ms                1000
>>>         vfs.zfs.resilver_min_time_ms            3000
>>>         vfs.zfs.no_scrub_io                     0
>>>         vfs.zfs.no_scrub_prefetch               0
>>>         vfs.zfs.free_max_blocks                 -1
>>>         vfs.zfs.metaslab.gang_bang              16777217
>>>         vfs.zfs.metaslab.fragmentation_threshold 70
>>>         vfs.zfs.metaslab.debug_load             0
>>>         vfs.zfs.metaslab.debug_unload           0
>>>         vfs.zfs.metaslab.df_alloc_threshold     131072
>>>         vfs.zfs.metaslab.df_free_pct            4
>>>         vfs.zfs.metaslab.min_alloc_size         33554432
>>>         vfs.zfs.metaslab.load_pct               50
>>>         vfs.zfs.metaslab.unload_delay           8
>>>         vfs.zfs.metaslab.preload_limit          3
>>>         vfs.zfs.metaslab.preload_enabled        1
>>>         vfs.zfs.metaslab.fragmentation_factor_enabled 1
>>>         vfs.zfs.metaslab.lba_weighting_enabled  1
>>>         vfs.zfs.metaslab.bias_enabled           1
>>>         vfs.zfs.condense_pct                    200
>>>         vfs.zfs.mg_noalloc_threshold            0
>>>         vfs.zfs.mg_fragmentation_threshold      85
>>>         vfs.zfs.check_hostid                    1
>>>         vfs.zfs.spa_load_verify_maxinflight     10000
>>>         vfs.zfs.spa_load_verify_metadata        1
>>>         vfs.zfs.spa_load_verify_data            1
>>>         vfs.zfs.recover                         0
>>>         vfs.zfs.deadman_synctime_ms             1000000
>>>         vfs.zfs.deadman_checktime_ms            5000
>>>         vfs.zfs.deadman_enabled                 1
>>>         vfs.zfs.spa_asize_inflation             24
>>>         vfs.zfs.spa_slop_shift                  5
>>>         vfs.zfs.space_map_blksz                 4096
>>>         vfs.zfs.txg.timeout                     5
>>>         vfs.zfs.vdev.metaslabs_per_vdev         200
>>>         vfs.zfs.vdev.cache.max                  16384
>>>         vfs.zfs.vdev.cache.size                 0
>>>         vfs.zfs.vdev.cache.bshift               16
>>>         vfs.zfs.vdev.trim_on_init               1
>>>         vfs.zfs.vdev.mirror.rotating_inc        0
>>>         vfs.zfs.vdev.mirror.rotating_seek_inc   5
>>>         vfs.zfs.vdev.mirror.rotating_seek_offset 1048576
>>>         vfs.zfs.vdev.mirror.non_rotating_inc    0
>>>         vfs.zfs.vdev.mirror.non_rotating_seek_inc 1
>>>         vfs.zfs.vdev.async_write_active_min_dirty_percent 30
>>>         vfs.zfs.vdev.async_write_active_max_dirty_percent 60
>>>         vfs.zfs.vdev.max_active                 1000
>>>         vfs.zfs.vdev.sync_read_min_active       10
>>>         vfs.zfs.vdev.sync_read_max_active       10
>>>         vfs.zfs.vdev.sync_write_min_active      10
>>>         vfs.zfs.vdev.sync_write_max_active      10
>>>         vfs.zfs.vdev.async_read_min_active      1
>>>         vfs.zfs.vdev.async_read_max_active      3
>>>         vfs.zfs.vdev.async_write_min_active     1
>>>         vfs.zfs.vdev.async_write_max_active     10
>>>         vfs.zfs.vdev.scrub_min_active           1
>>>         vfs.zfs.vdev.scrub_max_active           2
>>>         vfs.zfs.vdev.trim_min_active            1
>>>         vfs.zfs.vdev.trim_max_active            64
>>>         vfs.zfs.vdev.aggregation_limit          131072
>>>         vfs.zfs.vdev.read_gap_limit             32768
>>>         vfs.zfs.vdev.write_gap_limit            4096
>>>         vfs.zfs.vdev.bio_flush_disable          0
>>>         vfs.zfs.vdev.bio_delete_disable         0
>>>         vfs.zfs.vdev.trim_max_bytes             2147483648
>>>         vfs.zfs.vdev.trim_max_pending           64
>>>         vfs.zfs.max_auto_ashift                 13
>>>         vfs.zfs.min_auto_ashift                 9
>>>         vfs.zfs.zil_replay_disable              0
>>>         vfs.zfs.cache_flush_disable             0
>>>         vfs.zfs.zio.use_uma                     1
>>>         vfs.zfs.zio.exclude_metadata            0
>>>         vfs.zfs.sync_pass_deferred_free         2
>>>         vfs.zfs.sync_pass_dont_compress         5
>>>         vfs.zfs.sync_pass_rewrite               2
>>>         vfs.zfs.snapshot_list_prefetch          0
>>>         vfs.zfs.super_owner                     0
>>>         vfs.zfs.debug                           0
>>>         vfs.zfs.version.ioctl                   4
>>>         vfs.zfs.version.acl                     1
>>>         vfs.zfs.version.spa                     5000
>>>         vfs.zfs.version.zpl                     5
>>>         vfs.zfs.vol.mode                        1
>>>         vfs.zfs.vol.unmap_enabled               1
>>>         vfs.zfs.trim.enabled                    1
>>>         vfs.zfs.trim.txg_delay                  32
>>>         vfs.zfs.trim.timeout                    30
>>>         vfs.zfs.trim.max_interval               1
>>>
>>> ------------------------------------------------------------------------
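
Regarding the TRIM test mentioned at the top: the quickest diagnostic
(a sketch only; running without TRIM long-term isn't the suggestion
here) is to disable ZFS TRIM for one boot and repeat the delete/import.
On 10.x vfs.zfs.trim.enabled is a boot-time tunable, so it goes in
/boot/loader.conf:

    # /boot/loader.conf
    vfs.zfs.trim.enabled=0    # temporarily disable ZFS TRIM (diagnostic only)

If the memory exhaustion stops with TRIM disabled, the vfs.zfs.trim.*
values in the dump above (e.g. vfs.zfs.trim.txg_delay) are the next
knobs worth experimenting with.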