From: Peter Maloney <peter.maloney@brockmann-consult.de>
Date: Wed, 15 Feb 2012 12:36:41 +0100
To: freebsd-fs@freebsd.org
Subject: Re: ZFS and mem management

Can you also post:

zpool get all

And does your indexing scan through the .zfs/snapshot directory? If so,
this is a known issue that totally eats your memory, resulting in swap
space errors.
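If it does, one way to check and hide the snapshot directory is roughly
the following (a sketch, not tested here; "disk1" is just the pool name
taken from your zpool status below, and the property can also be set
per dataset):

    # show whether .zfs is visible on each dataset
    zfs get -r snapdir disk1

    # keep .zfs/snapshot out of directory listings
    zfs set snapdir=hidden disk1

Note that snapdir=hidden only keeps .zfs out of directory listings; it
can still be entered by explicit path, so the indexer should also be
configured to skip it.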
On 02/15/2012 11:28 AM, Pavlo wrote:
> Hey George,
>
> thanks for the quick response.
>
> No, no dedup is used.
>
> zfs-stats -a :
>
> ------------------------------------------------------------------------
> ZFS Subsystem Report                            Wed Feb 15 12:26:18 2012
> ------------------------------------------------------------------------
>
> System Information:
>
>         Kernel Version:                         802516 (osreldate)
>         Hardware Platform:                      amd64
>         Processor Architecture:                 amd64
>
>         ZFS Storage pool Version:               28
>         ZFS Filesystem Version:                 5
>
> FreeBSD 8.2-STABLE #12: Thu Feb 9 11:35:23 EET 2012 root
> 12:26PM  up 2:29, 7 users, load averages: 0.02, 0.16, 0.16
>
> ------------------------------------------------------------------------
>
> System Memory:
>
>         19.78%  1.53 GiB Active,        0.95%   75.21 MiB Inact
>         36.64%  2.84 GiB Wired,         0.06%   4.83 MiB Cache
>         42.56%  3.30 GiB Free,          0.01%   696.00 KiB Gap
>
>         Real Installed:                         8.00 GiB
>         Real Available:                 99.84%  7.99 GiB
>         Real Managed:                   96.96%  7.74 GiB
>
>         Logical Total:                          8.00 GiB
>         Logical Used:                   57.82%  4.63 GiB
>         Logical Free:                   42.18%  3.37 GiB
>
> Kernel Memory:                                  2.43 GiB
>         Data:                           99.54%  2.42 GiB
>         Text:                           0.46%   11.50 MiB
>
> Kernel Memory Map:                              3.16 GiB
>         Size:                           69.69%  2.20 GiB
>         Free:                           30.31%  979.48 MiB
>
> ------------------------------------------------------------------------
>
> ARC Summary: (THROTTLED)
>         Memory Throttle Count:                  3.82k
>
> ARC Misc:
>         Deleted:                                874.34k
>         Recycle Misses:                         376.12k
>         Mutex Misses:                           4.74k
>         Evict Skips:                            4.74k
>
> ARC Size:                               68.53%  2.34 GiB
>         Target Size: (Adaptive)         68.54%  2.34 GiB
>         Min Size (Hard Limit):          12.50%  437.50 MiB
>         Max Size (High Water):          8:1     3.42 GiB
>
> ARC Size Breakdown:
>         Recently Used Cache Size:       92.95%  2.18 GiB
>         Frequently Used Cache Size:     7.05%   169.01 MiB
>
> ARC Hash Breakdown:
>         Elements Max:                           229.96k
>         Elements Current:               40.05%  92.10k
>         Collisions:                             705.52k
>         Chain Max:                              11
>         Chains:                                 20.64k
>
> ------------------------------------------------------------------------
>
> ARC Efficiency:                                 7.96m
>         Cache Hit Ratio:                84.92%  6.76m
>         Cache Miss Ratio:               15.08%  1.20m
>         Actual Hit Ratio:               76.29%  6.08m
>
>         Data Demand Efficiency:         91.32%  4.99m
>         Data Prefetch Efficiency:       19.57%  134.19k
>
>         CACHE HITS BY CACHE LIST:
>           Anonymously Used:             7.24%   489.41k
>           Most Recently Used:           25.29%  1.71m
>           Most Frequently Used:         64.54%  4.37m
>           Most Recently Used Ghost:     1.42%   95.77k
>           Most Frequently Used Ghost:   1.51%   102.33k
>
>         CACHE HITS BY DATA TYPE:
>           Demand Data:                  67.42%  4.56m
>           Prefetch Data:                0.39%   26.26k
>           Demand Metadata:              22.41%  1.52m
>           Prefetch Metadata:            9.78%   661.25k
>
>         CACHE MISSES BY DATA TYPE:
>           Demand Data:                  36.11%  433.60k
>           Prefetch Data:                8.99%   107.94k
>           Demand Metadata:              32.00%  384.29k
>           Prefetch Metadata:            22.91%  275.09k
>
> ------------------------------------------------------------------------
>
> L2ARC is disabled
>
> ------------------------------------------------------------------------
>
> File-Level Prefetch: (HEALTHY)
>
> DMU Efficiency:                                 26.49m
>         Hit Ratio:                      71.64%  18.98m
>         Miss Ratio:                     28.36%  7.51m
>
>         Colinear:                               7.51m
>           Hit Ratio:                    0.02%   1.42k
>           Miss Ratio:                   99.98%  7.51m
>
>         Stride:                                 18.85m
>           Hit Ratio:                    99.97%  18.85m
>           Miss Ratio:                   0.03%   5.73k
>
> DMU Misc:
>         Reclaim:                                7.51m
>           Successes:                    0.29%   21.58k
>           Failures:                     99.71%  7.49m
>
>         Streams:                                130.46k
>           +Resets:                      0.35%   461
>           -Resets:                      99.65%  130.00k
>           Bogus:                                0
>
> ------------------------------------------------------------------------
>
> VDEV cache is disabled
>
> ------------------------------------------------------------------------
>
> ZFS Tunables (sysctl):
>         kern.maxusers                           384
>         vm.kmem_size                            4718592000
>         vm.kmem_size_scale                      1
>         vm.kmem_size_min                        0
>         vm.kmem_size_max                        329853485875
>         vfs.zfs.l2c_only_size                   0
>         vfs.zfs.mfu_ghost_data_lsize            2705408
>         vfs.zfs.mfu_ghost_metadata_lsize        332861440
>         vfs.zfs.mfu_ghost_size                  335566848
>         vfs.zfs.mfu_data_lsize                  1641984
>         vfs.zfs.mfu_metadata_lsize              3048448
>         vfs.zfs.mfu_size                        28561920
>         vfs.zfs.mru_ghost_data_lsize            68477440
>         vfs.zfs.mru_ghost_metadata_lsize        62875648
>         vfs.zfs.mru_ghost_size                  131353088
>         vfs.zfs.mru_data_lsize                  1651216384
>         vfs.zfs.mru_metadata_lsize              278577152
>         vfs.zfs.mru_size                        2306510848
>         vfs.zfs.anon_data_lsize                 0
>         vfs.zfs.anon_metadata_lsize             0
>         vfs.zfs.anon_size                       12968960
>         vfs.zfs.l2arc_norw                      1
>         vfs.zfs.l2arc_feed_again                1
>         vfs.zfs.l2arc_noprefetch                1
>         vfs.zfs.l2arc_feed_min_ms               200
>         vfs.zfs.l2arc_feed_secs                 1
>         vfs.zfs.l2arc_headroom                  2
>         vfs.zfs.l2arc_write_boost               8388608
>         vfs.zfs.l2arc_write_max                 8388608
>         vfs.zfs.arc_meta_limit                  917504000
>         vfs.zfs.arc_meta_used                   851157616
>         vfs.zfs.arc_min                         458752000
>         vfs.zfs.arc_max                         3670016000
>         vfs.zfs.dedup.prefetch                  1
>         vfs.zfs.mdcomp_disable                  0
>         vfs.zfs.write_limit_override            1048576000
>         vfs.zfs.write_limit_inflated            25728073728
>         vfs.zfs.write_limit_max                 1072003072
>         vfs.zfs.write_limit_min                 33554432
>         vfs.zfs.write_limit_shift               3
>         vfs.zfs.no_write_throttle               0
>         vfs.zfs.zfetch.array_rd_sz              1048576
>         vfs.zfs.zfetch.block_cap                256
>         vfs.zfs.zfetch.min_sec_reap             2
>         vfs.zfs.zfetch.max_streams              8
>         vfs.zfs.prefetch_disable                0
>         vfs.zfs.mg_alloc_failures               8
>         vfs.zfs.check_hostid                    1
>         vfs.zfs.recover                         0
>         vfs.zfs.txg.synctime_ms                 1000
>         vfs.zfs.txg.timeout                     10
>         vfs.zfs.scrub_limit                     10
>         vfs.zfs.vdev.cache.bshift               16
>         vfs.zfs.vdev.cache.size                 0
>         vfs.zfs.vdev.cache.max                  16384
>         vfs.zfs.vdev.write_gap_limit            4096
>         vfs.zfs.vdev.read_gap_limit             32768
>         vfs.zfs.vdev.aggregation_limit          131072
>         vfs.zfs.vdev.ramp_rate                  2
>         vfs.zfs.vdev.time_shift                 6
>         vfs.zfs.vdev.min_pending                4
>         vfs.zfs.vdev.max_pending                10
>         vfs.zfs.vdev.bio_flush_disable          0
>         vfs.zfs.cache_flush_disable             0
>         vfs.zfs.zil_replay_disable              0
>         vfs.zfs.zio.use_uma                     0
>         vfs.zfs.version.zpl                     5
>         vfs.zfs.version.spa                     28
>         vfs.zfs.version.acl                     1
>         vfs.zfs.debug                           0
>         vfs.zfs.super_owner                     0
>
> ------------------------------------------------------------------------
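For what it's worth, the vfs.zfs.arc_max above is 3670016000 bytes,
which matches the 3.42 GiB "Max Size (High Water)" in the ARC summary.
If the ARC ever had to be capped harder to leave room for the indexer,
the usual knobs are loader tunables; a minimal /boot/loader.conf
sketch, with example values only (takes effect at the next boot):

    # /boot/loader.conf -- example values, not a recommendation
    vfs.zfs.arc_max="2G"
    vfs.zfs.arc_meta_limit="512M"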
>
> 2012/2/15 Pavlo :
>> Hello.
>>
>> We have an issue with memory management on FreeBSD, and I suspect it
>> is related to the FS. We are using ZFS; here are some quick stats:
>>
>> zpool status
>>   pool: disk1
>>  state: ONLINE
>>   scan: resilvered 657G in 8h30m with 0 errors on Tue Feb 14 21:17:37 2012
>> config:
>>
>>         NAME            STATE     READ WRITE CKSUM
>>         disk1           ONLINE       0     0     0
>>           mirror-0      ONLINE       0     0     0
>>             gpt/disk0   ONLINE       0     0     0
>>             gpt/disk1   ONLINE       0     0     0
>>             gpt/disk2   ONLINE       0     0     0
>>             gpt/disk4   ONLINE       0     0     0
>>             gpt/disk6   ONLINE       0     0     0
>>             gpt/disk8   ONLINE       0     0     0
>>             gpt/disk10  ONLINE       0     0     0
>>             gpt/disk12  ONLINE       0     0     0
>>           mirror-7      ONLINE       0     0     0
>>             gpt/disk14  ONLINE       0     0     0
>>             gpt/disk15  ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: zroot
>>  state: ONLINE
>>   scan: resilvered 34.9G in 0h11m with 0 errors on Tue Feb 14 12:57:52 2012
>> config:
>>
>>         NAME          STATE     READ WRITE CKSUM
>>         zroot         ONLINE       0     0     0
>>           mirror-0    ONLINE       0     0     0
>>             gpt/sys0  ONLINE       0     0     0
>>             gpt/sys1  ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> ------------------------------------------------------------------------
>>
>> System Memory:
>>
>>         0.95%   75.61 MiB Active,      0.24%   19.02 MiB Inact
>>         18.25%  1.41 GiB Wired,        0.01%   480.00 KiB Cache
>>         80.54%  6.24 GiB Free,         0.01%   604.00 KiB Gap
>>
>>         Real Installed:                         8.00 GiB
>>         Real Available:                 99.84%  7.99 GiB
>>         Real Managed:                   96.96%  7.74 GiB
>>
>>         Logical Total:                          8.00 GiB
>>         Logical Used:                   21.79%  1.74 GiB
>>         Logical Free:                   78.21%  6.26 GiB
>>
>> Kernel Memory:                                  1.18 GiB
>>         Data:                           99.05%  1.17 GiB
>>         Text:                           0.95%   11.50 MiB
>>
>> Kernel Memory Map:                              4.39 GiB
>>         Size:                           23.32%  1.02 GiB
>>         Free:                           76.68%  3.37 GiB
>>
>> ------------------------------------------------------------------------
>>
>> ------------------------------------------------------------------------
>> ZFS Subsystem Report                            Wed Feb 15 10:53:03 2012
>> ------------------------------------------------------------------------
>>
>> System Information:
>>
>>         Kernel Version:                         802516 (osreldate)
>>         Hardware Platform:                      amd64
>>         Processor Architecture:                 amd64
>>
>>         ZFS Storage pool Version:               28
>>         ZFS Filesystem Version:                 5
>>
>> FreeBSD 8.2-STABLE #12: Thu Feb 9 11:35:23 EET 2012 root
>> 10:53AM  up 56 mins, 6 users, load averages: 0.00, 0.00, 0.00
>>
>> ------------------------------------------------------------------------
>>
>> Background:
>> we are using a tool that indexes some data and then pushes it into a
>> database (currently bdb-5.2). Instances of the indexer run
>> continuously, one after another. The indexing time for one instance
>> varies between 2 seconds and 30 minutes, but is mostly below one
>> minute. Nothing else is running on the machine except system stuff and
>> daemons. After several hours of indexing I can see a lot of active
>> memory, which is OK. Then I check the number of vnodes, and it is
>> really huge: 300k+, even though nobody has that many open files.
>> Reading the docs and googling, I figured that is because of cached
>> pages residing in memory (unmounting the disk causes all of that
>> memory to be freed). I also noticed this happens only when files are
>> accessed via mmap().
>>
>> That looks like pretty legitimate behaviour, but the issue is: this
>> goes on (for approximately 12 hours) until indexers start getting
>> killed out of swap. As I wrote above, I then observe a lot of used
>> vnodes and around 7 GB of active memory. I made a tool that allocates
>> memory using malloc() to check how much memory can still be
>> allocated. It is several megabytes, sometimes more, unless that tool
>> itself gets killed out of swap as well.
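A quick way to watch the vnode growth you describe is the pair of
sysctls below; lowering kern.maxvnodes is only a blunt experiment to
force the vnode cache (and the pages it pins) to shrink, not a fix, and
the value shown is an arbitrary example:

    # current vnode count vs. configured maximum
    sysctl vfs.numvnodes kern.maxvnodes

    # experiment only: force the vnode cache to shrink
    sysctl kern.maxvnodes=100000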
>>
>> So here is how I see the issue: for some reason, after a process has
>> exited normally, its mapped pages do not get freed. I have read about
>> this, and I agree it is reasonable behaviour while there is spare
>> memory. But following that logic, those pages could be flushed back
>> to the file at any time when the system is under memory pressure. So
>> when I ask for a piece of RAM, the OS should do exactly that and give
>> me what I asked for. But that never happens. Those pages are
>> effectively frozen until I unmount the disk, even when not a single
>> instance of the indexer is running.
>>
>> I am sure all of this is caused by mmap(): BDB uses mmap() to access
>> its databases, and when we tested indexing without pushing data into
>> the DB, everything worked fine. You may suggest that something is
>> wrong with BDB, but we have more tools of our own that use mmap() as
>> well, and the behaviour is exactly the same.
>>
>> Thank you. Paul, Ukraine.
>
> Hi Paul,
>
> Are you using dedup anywhere on that pool?
>
> Also, could you please post the full zfs-stats -a
>

--
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------