Date: Fri, 19 Feb 2016 12:07:03 +0100
From: Niccolò Corvini <n.corvini@gmail.com>
To: freebsd-fs@freebsd.org
Subject: Zfs heavy io writing | zfskern txg_thread_enter
Message-ID: <CAM1TVW-yOvU6VM19PadD5ygsv2-Vb-_8T7SKjcsP7Ov0Q5A5SQ@mail.gmail.com>
Hi, first time here!

We are having a problem with a server running FreeBSD 9.1 with ZFS on a
single SATA drive. For the past few days, the system has become very slow
in the morning because of very heavy write I/O. We investigated and think
it might start at night, possibly correlated with the standard daily cron
run, but we are not sure. After a few hours the situation returns to
normal. Any help is much appreciated.

The machine is an Intel Xeon E5-2620 with 36GB of RAM; the HDD is 2TB and
is half full.

gstat output:

 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
   13    135     21    641  256.7    108   6410   41.4  128.8| ada0
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p1
   13    135     21    641  256.7    108   6410   41.7  128.8| ada0p2
    0      0      0      0    0.0      0      0    0.0    0.0| cd0
    0      0      0      0    0.0      0      0    0.0    0.0| gptid/3c0de011-4f37-11e5-8217-3085a91c3292
    0      0      0      0    0.0      0      0    0.0    0.0| zvol/zroot/swap
   13    135     21    641  256.7    108   6410   41.7  128.9| gpt/disk1

top -m io shows that the process responsible is
[zfskern{txg_thread_enter}].

top -m io output:

  PID  JID USERNAME  VCSW IVCSW  READ WRITE FAULT TOTAL PERCENT COMMAND
    3    0 root        14     1     0    37     0    37  30.33% [zfskern{txg_thread_enter}]
49866  215 7070        26     2     0     5     0     5   4.10% postgres: stats collector process (postgres)
99901    5 70          42     0     0     4     0     4   3.28% postgres: promeditec promeditec.osr.test 192.168.0.246(278
24820  199 www         10     0     7     0     0     7   5.74% [jsvc{jsvc}]
33869  212 88          19     2     0     2     0     2   1.64% [mysqld{mysqld}]
93400    0 root        13     0    10     0     0    10   8.20% [find]
89407  215 7070        10     0     0     1     0     1   0.82% postgres: alfresco alfconservazione.dotcom.ts.it 192.168.0
15776    5 70          11     0     0     4     0     4   3.28% postgres: stats collector process (postgres)
33869  212 88          10     0     0     3     0     3   2.46% [mysqld{mysqld}]
33869  212 88           2     0     0    11     0    11   9.02% [mysqld{mysqld}]
18685  198 root         5     0     0     2     0     2   1.64% /usr/sbin/syslogd -s
15852  214 70           4     1     0     1     0     1   0.82% postgres: alfresco alfcomunets.dotcom.ts.it 192.168.0.212(
98335  120 root        11     0    29     0     0    29  23.77% find /var/log -name messages.* -mtime -2
16128  214 70           8     0     0     1     0     1   0.82% postgres: alfresco alfaxErre8 192.168.0.208(50558) (postg
 1116  198 root        10     0     0     1     0     1   0.82% sendmail: ./u1J9k90d001112 local: client DATA status (send
 1120  198 root         7     0     0     4     0     4   3.28% mail.local -l

Running procstat -kk on the zfskern PID shows:

  PID    TID COMM     TDNAME           KSTACK
    3 100129 zfskern  arc_reclaim_thre mi_switch sleepq_timedwait _cv_timedwait arc_reclaim_thread fork_exit fork_trampoline
    3 100130 zfskern  l2arc_feed_threa mi_switch sleepq_timedwait _cv_timedwait l2arc_feed_thread fork_exit fork_trampoline
    3 100504 zfskern  txg_thread_enter mi_switch sleepq_wait _cv_wait txg_thread_wait txg_quiesce_thread fork_exit fork_trampoline
    3 100505 zfskern  txg_thread_enter mi_switch sleepq_wait _cv_wait zio_wait dsl_pool_sync spa_sync txg_sync_thread fork_exit fork_trampoline
    3 100506 zfskern  zvol zroot/swap  mi_switch sleepq_wait _sleep zvol_geom_worker fork_exit fork_trampoline

systat -vmstat

    7 users    Load  0.50  0.62  1.46                  Feb 19 11:46

Mem:KB    REAL            VIRTUAL                       VN PAGER   SWAP PAGER
        Tot   Share      Tot    Share    Free           in   out     in   out
Act  41318k  199492 1251183k   588716 2254748  count
All  46844k  229048   -1968M   892088          pages
Proc:                                                            Interrupts
  r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt        cow    2836 total
             4k      8516  393  12k  236 1445  11k  11281 zfod        atkbd0 1
                                                          ozfod       acpi0 9
 4.4%Sys   0.0%Intr  0.3%User  0.0%Nice 95.4%Idle        %ozfod       ehci0 17
|    |    |    |    |    |    |    |    |    |    |       daefr       ehci1 23
==                                                  11171 prcfr    79 cpu0:timer
                                       dtbuf        11265 totfr       isci0 264
Namei     Name-cache   Dir-cache   1095774 desvn          react    24 em0:rx 0
   Calls    hits   %    hits   %    409282 numvn          pdwak    16 em0:tx 0
      78      41  53                273943 frevn          pdpgs       em0:link
                                                          intrn   196 ahci0 278
Disks  ada0   cd0 pass0 pass1                    20445132 wire    267 cpu21:time
KB/t   2.55  0.00  0.00  0.00                    37317552 act      55 cpu13:time
tps     223     0     0     0                     4948708 inact    86 cpu5:timer
MB/s   0.56  0.00  0.00  0.00                      884804 cache    24 cpu12:time
%busy    94     0     0     0                     1370012 free     63 cpu10:time
                                                          buf     306 cpu19:time
                                                                   63 cpu11:time
                                                                   55 cpu14:time
                                                                   86 cpu9:timer
                                                                   71 cpu18:time
                                                                   86 cpu3:timer
                                                                   47 cpu23:time
                                                                   55 cpu6:timer
                                                                   55 cpu22:time
                                                                   71 cpu2:timer
                                                                   39 cpu20:time

zpool status:

  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 3h46m with 0 errors on Wed Nov 4 21:54:44 2015
config:

        NAME         STATE     READ WRITE CKSUM
        zroot        ONLINE       0     0     0
          gpt/disk1  ONLINE       0     0     0

errors: No known data errors
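
Since it is not yet clear when the heavy writing starts, one way to narrow it down would be to save gstat rows overnight and filter them for saturated devices afterwards. Below is a minimal Python sketch of such a filter. It assumes the rows have exactly the column layout shown in the gstat output in this message (nine numeric columns, then "|", then the device name). The names GstatRow, parse_gstat_row and saturated, and the 90% threshold, are hypothetical and chosen only for illustration; the sample rows are copied from the output above.

```python
from typing import List, NamedTuple, Optional


class GstatRow(NamedTuple):
    """One gstat data row, in the column order shown in this message."""
    queue: int
    ops: int
    reads: int
    read_kbps: int
    ms_per_read: float
    writes: int
    write_kbps: int
    ms_per_write: float
    busy_pct: float
    name: str


def parse_gstat_row(line: str) -> Optional[GstatRow]:
    """Parse one data row; return None for headers, blanks or odd rows."""
    if "|" not in line:
        return None  # the column header has no "|" separator
    numbers, name = line.rsplit("|", 1)
    f = numbers.split()
    if len(f) != 9:
        return None  # assumption: exactly nine numeric columns
    try:
        return GstatRow(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                        float(f[4]), int(f[5]), int(f[6]), float(f[7]),
                        float(f[8]), name.strip())
    except ValueError:
        return None


def saturated(rows: List[GstatRow], busy_threshold: float = 90.0) -> List[GstatRow]:
    """Return rows whose %busy is at or above the (hypothetical) threshold."""
    return [r for r in rows if r.busy_pct >= busy_threshold]


# Sample rows copied from the gstat output in this message.
SAMPLE = """\
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
13 135 21 641 256.7 108 6410 41.4 128.8| ada0
0 0 0 0 0.0 0 0 0.0 0.0| ada0p1
13 135 21 641 256.7 108 6410 41.7 128.8| ada0p2
0 0 0 0 0.0 0 0 0.0 0.0| cd0
0 0 0 0 0.0 0 0 0.0 0.0| zvol/zroot/swap
13 135 21 641 256.7 108 6410 41.7 128.9| gpt/disk1
"""

if __name__ == "__main__":
    parsed = [r for r in map(parse_gstat_row, SAMPLE.splitlines()) if r]
    for r in saturated(parsed):
        print(f"{r.name}: {r.busy_pct}% busy, {r.write_kbps} kBps written")
```

If each saved sample is stored with a timestamp, the first sample in which ada0 crosses the threshold can be compared with the time the daily cron run starts.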
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAM1TVW-yOvU6VM19PadD5ygsv2-Vb-_8T7SKjcsP7Ov0Q5A5SQ>