Date:      Mon, 07 Oct 2013 21:09:06 +0300
From:      Dmitriy Makarov <supportme@ukr.net>
To:        freebsd-current@freebsd.org
Subject:   ZFS L2ARC - incorrect size and abnormal system load on r255173
Message-ID:  <1381166916.122992963.5h9ygiri@frv45.ukr.net>

Hi all,

On our production system running r255173 we have a problem with abnormally high system load, possibly caused (we are not sure) by the L2ARC, which is placed on a few SSDs with a total size of 490 GB.

After a fresh boot everything seems to be fine, with the load average below 5.00.
But after some time (about a day or two) the load average jumps to 10..20+ and the system becomes really slow; I/O operations on the ZFS pool go from milliseconds to whole seconds.

And the L2ARC sysctls look pretty disturbing:

[frv:~]$ sysctl -a | grep l2

vfs.zfs.l2arc_write_max: 25000000
vfs.zfs.l2arc_write_boost: 50000000
vfs.zfs.l2arc_headroom: 8
vfs.zfs.l2arc_feed_secs: 1
vfs.zfs.l2arc_feed_min_ms: 30
vfs.zfs.l2arc_noprefetch: 0
vfs.zfs.l2arc_feed_again: 1
vfs.zfs.l2arc_norw: 1
vfs.zfs.l2c_only_size: 1525206040064
vfs.cache.numfullpathfail2: 4
kstat.zfs.misc.arcstats.evict_l2_cached: 6592742547456
kstat.zfs.misc.arcstats.evict_l2_eligible: 734016778752
kstat.zfs.misc.arcstats.evict_l2_ineligible: 29462561417216
kstat.zfs.misc.arcstats.l2_hits: 576550808
kstat.zfs.misc.arcstats.l2_misses: 128158998
kstat.zfs.misc.arcstats.l2_feeds: 1524059
kstat.zfs.misc.arcstats.l2_rw_clash: 1429740
kstat.zfs.misc.arcstats.l2_read_bytes: 2896069043200
kstat.zfs.misc.arcstats.l2_write_bytes: 2405022640128
kstat.zfs.misc.arcstats.l2_writes_sent: 826642
kstat.zfs.misc.arcstats.l2_writes_done: 826642
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 1059415
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 1640
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 8580680
kstat.zfs.misc.arcstats.l2_abort_lowmem: 2096
kstat.zfs.misc.arcstats.l2_cksum_bad: 212832715
kstat.zfs.misc.arcstats.l2_io_error: 5501886
kstat.zfs.misc.arcstats.l2_size: 1587962307584
kstat.zfs.misc.arcstats.l2_asize: 1425666718720
kstat.zfs.misc.arcstats.l2_hdr_size: 82346948208
kstat.zfs.misc.arcstats.l2_compress_successes: 41707766
kstat.zfs.misc.arcstats.l2_compress_zeros: 0
kstat.zfs.misc.arcstats.l2_compress_failures: 0
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 8847701930
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 21220076
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 27619372107
kstat.zfs.misc.arcstats.l2_write_in_l2: 418007172085
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 29279
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 131001473113
kstat.zfs.misc.arcstats.l2_write_full: 63699
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 1524059
kstat.zfs.misc.arcstats.l2_write_pios: 826642
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 8433038008130560
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 96529899
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 9228464


Here is the output from zfs-stats for the L2ARC:

[frv:~]$ zfs-stats -L

------------------------------------------------------------------------
ZFS Subsystem Report                            Mon Oct  7 20:50:19 2013
------------------------------------------------------------------------

L2 ARC Summary: (DEGRADED)
        Passed Headroom:                        21.22m
        Tried Lock Failures:                    8.85b
        IO In Progress:                         29.32k
        Low Memory Aborts:                      2.10k
        Free on Write:                          8.59m
        Writes While Full:                      63.71k
        R/W Clashes:                            1.43m
        Bad Checksums:                          213.07m
        IO Errors:                              5.51m
        SPA Mismatch:                           27.62b

L2 ARC Size: (Adaptive)                         1.44    TiB
        Header Size:                    5.19%   76.70   GiB

L2 ARC Evicts:
        Lock Retries:                           1.64k
        Upon Reading:                           0

L2 ARC Breakdown:                               705.25m
        Hit Ratio:                      81.82%  577.01m
        Miss Ratio:                     18.18%  128.24m
        Feeds:                                  1.52m

L2 ARC Buffer:
        Bytes Scanned:                          7.49    PiB
        Buffer Iterations:                      1.52m
        List Iterations:                        96.55m
        NULL List Iterations:                   9.23m

L2 ARC Writes:
        Writes Sent:                    100.00% 826.96k

----------------------------------------------------------------------


In /boot/loader.conf (128GB RAM in system):

vm.kmem_size="110G"
vfs.zfs.arc_max="100G"
vfs.zfs.arc_min="80G"
vfs.zfs.vdev.cache.size=16M
vfs.zfs.vdev.cache.max="16384"

vfs.zfs.txg.timeout="10"
vfs.zfs.write_limit_min="134217728"

vfs.zfs.vdev.cache.bshift="14"
vfs.zfs.arc_meta_limit=53687091200

vfs.zfs.l2arc_write_max=25165824
vfs.zfs.l2arc_write_boost=50331648
vfs.zfs.l2arc_noprefetch=0

In /etc/sysctl.conf:

vfs.zfs.l2arc_write_max=25000000
vfs.zfs.l2arc_write_boost=50000000
vfs.zfs.l2arc_noprefetch=0
vfs.zfs.l2arc_headroom=8
vfs.zfs.l2arc_feed_min_ms=30
vfs.zfs.arc_meta_limit=53687091200
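
A side note on the two files above: some L2ARC tunables are set in both, with different values. Below is a minimal Python sketch (not part of our setup; the helper names `parse` and `conflicts` are made up for illustration) that lists keys defined in both files with differing values. Only the overlapping lines from this message are included as input.

```python
# Settings copied from this message; only keys relevant to the comparison.
LOADER_CONF = """
vfs.zfs.arc_meta_limit=53687091200
vfs.zfs.l2arc_write_max=25165824
vfs.zfs.l2arc_write_boost=50331648
vfs.zfs.l2arc_noprefetch=0
"""

SYSCTL_CONF = """
vfs.zfs.l2arc_write_max=25000000
vfs.zfs.l2arc_write_boost=50000000
vfs.zfs.l2arc_noprefetch=0
vfs.zfs.l2arc_headroom=8
vfs.zfs.l2arc_feed_min_ms=30
vfs.zfs.arc_meta_limit=53687091200
"""


def parse(text):
    """Parse key=value lines into a dict, ignoring blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def conflicts(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys set differently in both."""
    return {k: (a[k], b[k]) for k in sorted(a.keys() & b.keys()) if a[k] != b[k]}


for key, (loader_value, sysctl_value) in conflicts(
    parse(LOADER_CONF), parse(SYSCTL_CONF)
).items():
    print(f"{key}: loader.conf={loader_value} sysctl.conf={sysctl_value}")
```

The running values shown by sysctl earlier (25000000 and 50000000) match /etc/sysctl.conf, so the loader.conf values for these two keys do not appear to be the ones in effect.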




How can "L2 ARC Size: (Adaptive)" be 1.44 TiB (and up) when the total physical size of the L2ARC devices is 490 GB?
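
To make the mismatch concrete, here is a small Python sketch of the arithmetic, using the byte counters from the sysctl output above. It assumes that "490 GB" means decimal gigabytes (10^9 bytes); the function names are only illustrative.

```python
# Counters copied from the sysctl output in this message (bytes).
L2_SIZE = 1587962307584      # kstat.zfs.misc.arcstats.l2_size
L2_ASIZE = 1425666718720     # kstat.zfs.misc.arcstats.l2_asize
L2_HDR_SIZE = 82346948208    # kstat.zfs.misc.arcstats.l2_hdr_size

# Assumption: the 490 GB of cache devices is measured in decimal gigabytes.
CACHE_DEVICES_BYTES = 490 * 10**9

GIB = 2**30
TIB = 2**40


def to_tib(n_bytes):
    """Convert a byte count to binary terabytes (TiB)."""
    return n_bytes / TIB


def to_gib(n_bytes):
    """Convert a byte count to binary gigabytes (GiB)."""
    return n_bytes / GIB


def oversize_factor(reported_bytes, physical_bytes):
    """How many times larger the reported size is than the physical devices."""
    return reported_bytes / physical_bytes


print(f"l2_size     = {to_tib(L2_SIZE):.2f} TiB")
print(f"l2_asize    = {to_tib(L2_ASIZE):.2f} TiB")
print(f"l2_hdr_size = {to_gib(L2_HDR_SIZE):.2f} GiB")
print(f"l2_size  / devices = {oversize_factor(L2_SIZE, CACHE_DEVICES_BYTES):.2f}x")
print(f"l2_asize / devices = {oversize_factor(L2_ASIZE, CACHE_DEVICES_BYTES):.2f}x")
```

Even the allocated size (l2_asize) comes out at roughly 2.9 times the capacity of the cache devices, so the counter cannot reflect data that is physically stored on them.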

Why can these values grow and become so high?
kstat.zfs.misc.arcstats.l2_cksum_bad: 212832715
kstat.zfs.misc.arcstats.l2_io_error: 5501886
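
For scale, a rough Python sketch that relates these two counters to the number of L2ARC hits reported above. It assumes that each bad checksum or I/O error corresponds to one attempted L2ARC read, which we have not verified against the ZFS source; the `ratio` helper is just for illustration.

```python
# Counters copied from the sysctl output in this message.
L2_HITS = 576550808        # kstat.zfs.misc.arcstats.l2_hits
L2_MISSES = 128158998      # kstat.zfs.misc.arcstats.l2_misses
L2_CKSUM_BAD = 212832715   # kstat.zfs.misc.arcstats.l2_cksum_bad
L2_IO_ERROR = 5501886      # kstat.zfs.misc.arcstats.l2_io_error


def ratio(part, whole):
    """Return part/whole as a float, or 0.0 when the denominator is zero."""
    return part / whole if whole else 0.0


hit_ratio = ratio(L2_HITS, L2_HITS + L2_MISSES)
# Assumption: every bad checksum / I/O error belongs to one L2ARC read attempt.
cksum_bad_per_hit = ratio(L2_CKSUM_BAD, L2_HITS)
io_error_per_hit = ratio(L2_IO_ERROR, L2_HITS)

print(f"L2ARC hit ratio:        {hit_ratio:.2%}")
print(f"bad checksums per hit:  {cksum_bad_per_hit:.2%}")
print(f"I/O errors per hit:     {io_error_per_hit:.2%}")
```

Under that assumption, roughly 37% of the reads served from the L2ARC failed their checksum, which would mean a large share of "hits" had to be re-read from the pool.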


Thanks for any help and ideas!



