Date:      Wed, 22 Apr 2009 00:34:00 +0200
From:      Kai Gallasch <gallasch@free.de>
To:        freebsd-fs@freebsd.org
Subject:   FreeBSD 7.2-RC1 - ZFS related kernel panic "kmem_map too small"
Message-ID:  <49EE49D8.7000902@free.de>

Hi.

Today I had a kernel panic on my server running FreeBSD 7.2-RC1 (amd64),
an Opteron with 4 cores and 16GB RAM, while benchmarking a raidz1 pool
with the bonnie++ benchmark.

# bonnie++ -d /zpool1/test/tmp -s 32408 -u kai

The server hosts about ten jails with webservers, mail, etc. - very low
load.

I used bonnie++ to deliberately provoke a panic, after the server had
several ZFS-related panics in the past week that ended with processes
stuck in state "zfs". The pattern was always the same: after booting,
the server kept running for about a day and then crashed or became
unusable.

Some sysctl values that I saved during such a "process stuck in zfs" state:

kern.maxvnodes: 120000
kern.minvnodes: 25000
vm.stats.vm.v_vnodepgsout: 48
vm.stats.vm.v_vnodepgsin: 33500
vm.stats.vm.v_vnodeout: 48
vm.stats.vm.v_vnodein: 27299
vfs.freevnodes: 25000
vfs.wantfreevnodes: 25000
vfs.numvnodes: 93765
debug.sizeof.vnode: 504

vfs.zfs.arc_min: 37545216
vfs.zfs.arc_max: 901085184
vfs.zfs.mdcomp_disable: 0
vfs.zfs.prefetch_disable: 0
vfs.zfs.zio.taskq_threads: 0
vfs.zfs.recover: 0
vfs.zfs.vdev.cache.size: 10485760
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.debug: 1
kstat.zfs.misc.arcstats.hits: 22067589
kstat.zfs.misc.arcstats.misses: 4824470
kstat.zfs.misc.arcstats.demand_data_hits: 5661546
kstat.zfs.misc.arcstats.demand_data_misses: 2512832
kstat.zfs.misc.arcstats.demand_metadata_hits: 13533858
kstat.zfs.misc.arcstats.demand_metadata_misses: 1606419
kstat.zfs.misc.arcstats.prefetch_data_hits: 157869
kstat.zfs.misc.arcstats.prefetch_data_misses: 252444
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 2714316
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 452775
kstat.zfs.misc.arcstats.mru_hits: 10229954
kstat.zfs.misc.arcstats.mru_ghost_hits: 19863
kstat.zfs.misc.arcstats.mfu_hits: 9008171
kstat.zfs.misc.arcstats.mfu_ghost_hits: 159664
kstat.zfs.misc.arcstats.deleted: 4570138
kstat.zfs.misc.arcstats.recycle_miss: 579604
kstat.zfs.misc.arcstats.mutex_miss: 37379
kstat.zfs.misc.arcstats.evict_skip: 90360
kstat.zfs.misc.arcstats.hash_elements: 87460
kstat.zfs.misc.arcstats.hash_elements_max: 248398
kstat.zfs.misc.arcstats.hash_collisions: 2006655
kstat.zfs.misc.arcstats.hash_chains: 11410
kstat.zfs.misc.arcstats.hash_chain_max: 7
kstat.zfs.misc.arcstats.p: 617419234
kstat.zfs.misc.arcstats.c: 746412403
kstat.zfs.misc.arcstats.c_min: 37545216
kstat.zfs.misc.arcstats.c_max: 901085184
kstat.zfs.misc.arcstats.size: 615520768


My sysctl.conf:

# 12328 (default) -> 18000
kern.maxfiles=18000

# 5547 (default) -> 2000
kern.maxprocperuid=2000

# 11095 (default) -> 5000
kern.maxfilesperproc=5000

# postgresql
kern.ipc.shmall=32768
kern.ipc.shmmax=134217728
kern.ipc.semmap=256
security.jail.sysvipc_allowed=1
kern.ipc.shm_use_phys=1

vfs.zfs.debug=1
# default 100000
kern.maxvnodes=120000


The crash today (while running bonnie++) gave me some new data:

vfs.freevnodes: 24973
vfs.numvnodes: 35789
kstat.zfs.misc.arcstats.hits: 7086527
kstat.zfs.misc.arcstats.misses: 193683
kstat.zfs.misc.arcstats.demand_data_hits: 5599886
kstat.zfs.misc.arcstats.demand_data_misses: 82250
kstat.zfs.misc.arcstats.demand_metadata_hits: 1159851
kstat.zfs.misc.arcstats.demand_metadata_misses: 29224
kstat.zfs.misc.arcstats.prefetch_data_hits: 156004
kstat.zfs.misc.arcstats.prefetch_data_misses: 39321
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 170786
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 42888
kstat.zfs.misc.arcstats.mru_hits: 717887
kstat.zfs.misc.arcstats.mru_ghost_hits: 16917
kstat.zfs.misc.arcstats.mfu_hits: 6089477
kstat.zfs.misc.arcstats.mfu_ghost_hits: 14084
kstat.zfs.misc.arcstats.deleted: 269579
kstat.zfs.misc.arcstats.recycle_miss: 32480
kstat.zfs.misc.arcstats.mutex_miss: 814
kstat.zfs.misc.arcstats.evict_skip: 1687376
kstat.zfs.misc.arcstats.hash_elements: 2263
kstat.zfs.misc.arcstats.hash_elements_max: 65758
kstat.zfs.misc.arcstats.hash_collisions: 51235
kstat.zfs.misc.arcstats.hash_chains: 9
kstat.zfs.misc.arcstats.hash_chain_max: 4
kstat.zfs.misc.arcstats.p: 29036496
kstat.zfs.misc.arcstats.c: 37545216
kstat.zfs.misc.arcstats.c_min: 37545216
kstat.zfs.misc.arcstats.c_max: 901085184
kstat.zfs.misc.arcstats.size: 401183744


On the console I found:

panic: kmem_malloc(131072): kmem_map too small: 1152401408 total allocated
cpuid = 1

In /usr/src/UPDATING I read:

[..]

20090207:
        ZFS users on amd64 machines with 4GB or more of RAM should
        reevaluate their need for setting vm.kmem_size_max and
        vm.kmem_size manually.  In fact, after recent changes to the
        kernel, the default value of vm.kmem_size is larger than the
        suggested manual setting in most ZFS/FreeBSD tuning guides.

So I understood this as: "vm.kmem_size is set unnecessarily large by
default. You should think about decreasing it to save some RAM."

On my amd64 server the default values of kmem_size are

vm.kmem_size_scale: 3
vm.kmem_size_max: 3865468109
vm.kmem_size_min: 0
vm.kmem_size: 1201446912
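
Comparing the "total allocated" figure from the panic message against the
default vm.kmem_size above suggests the map was nearly exhausted when the
128K allocation failed (this is my own arithmetic, not part of the panic
output):

```shell
# vm.kmem_size minus the "total allocated" value reported by the panic:
echo $((1201446912 - 1152401408))   # 49045504 bytes, roughly 46 MB of headroom
# The failed request was kmem_malloc(131072); fragmentation of kmem_map
# can make even a 128K allocation fail well before the map is literally full.
```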

Can someone give me a hint on how to debug this problem further, or how
to find reasonable values for vm.kmem_size_max and vm.kmem_size with 16G
of RAM?
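
For completeness: if manual tuning is still advisable despite the
UPDATING note, I assume the settings would go into /boot/loader.conf
along these lines (the values below are placeholders I made up to show
the format, not settings I am recommending):

```
# /boot/loader.conf -- hypothetical example values, not recommendations
vm.kmem_size="1536M"
vm.kmem_size_max="1536M"
vfs.zfs.arc_max="512M"
```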

Thanks!

Kai.
