Date:      Thu, 3 Jan 2019 11:34:25 +0100
From:      Borja Marcos <borjam@sarenet.es>
To:        freebsd-fs@freebsd.org
Subject:   Interesting: ZFS scrub prefetch hurting sequential scrub performance?
Message-ID:  <8ECF7513-9DFB-46EF-86BA-DB717D713792@sarenet.es>


Hi,

I have noticed that my scrubs have become painfully slow. I am wondering
whether I've just hit some worst case, or maybe there is some interaction
between the ZFS sequential scrub and scrub prefetch. I don't recall seeing
this behavior before the sequential scrub code was committed.

Did I hit some worst case, or should scrub prefetch be disabled with the
new sequential scrub code?


# zpool status
  pool: pool
 state: ONLINE
  scan: scrub in progress since Sat Dec 29 03:56:02 2018
	133G scanned at 309K/s, 129G issued at 300K/s, 619G total
	0 repaired, 20.80% done, no estimated completion time

When this happened last month I tried rebooting the server and restarting
the scrub, and everything improved.

The first graph shows the disk I/O bandwidth history for the last week.
When the scrub started, disk I/O "busy percent" reached almost 100%.
Curiously, the transfer rates looked rather healthy, at around 10 MBps of
read activity.

At first I suspected a misbehaving disk slowing down the whole process
with retries, but all the disks show a similar service time pattern. One
is attached for reference.
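
For reference, the per-disk numbers can also be checked interactively
(outside the graphing system) with something like:

# gstat -p -I 1s

or

# iostat -x -w 1

(gstat -p limits the output to physical providers and -I sets the refresh
interval; iostat -x shows the extended per-device statistics.)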

Looking at the rest of the stats for hints of misbehavior, I saw
arcstats_prefetch_metadata misses rising to about 2000 per second, with
arcstats_l2_misses following the same pattern.


Could it be prefetch spending a lot of time writing to the L2ARC, only to
have the data evicted due to misses?
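
In case it helps anyone reproduce the observation, the counters I was
graphing should correspond (if I remember the sysctl names correctly) to:

# sysctl kstat.zfs.misc.arcstats.prefetch_metadata_misses
# sysctl kstat.zfs.misc.arcstats.l2_misses

sampled periodically and differentiated to get a per-second rate.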

I have tried disabling scrub prefetch (vfs.zfs.no_scrub_prefetch=1) and,
voilà, everything picked up speed. Now with zpool iostat I see bursts of
100+ MBps of read activity and proper scrub progress.
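
For the record, this is all I changed at runtime (and what I would add to
/etc/sysctl.conf to make it persistent, if it keeps helping):

# sysctl vfs.zfs.no_scrub_prefetch=1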

Disk busy percent has gone down to around 50% and the cache stats have
become much better. It turns out that most of the I/O activity was just
pointless writes to the L2ARC.

Now, the hardware configuration.

The server has only 8 GB of memory, with a maximum configured ARC size of
4 GB.
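
(The ARC limit is just the usual loader tunable in /boot/loader.conf,
quoting it from memory:

vfs.zfs.arc_max="4G"
)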

It has an LSI2008 card with IR firmware. I didn't bother to cross-flash
it, but I am not using the RAID facilities anyway; it's just configured as
a plain HBA.

mps0: <Avago Technologies (LSI) SAS2008> port 0x9000-0x90ff mem 0xdfff0000-0xdfffffff,0xdff80000-0xdffbffff irq 17 at device 0.0 numa-domain 0 on pci4
mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fbsd
mps0: IOCCapabilities: 185c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,IR>

zpool status
  pool: pool
 state: ONLINE
  scan: scrub in progress since Sat Dec 29 03:56:02 2018
	323G scanned at 742K/s, 274G issued at 632K/s, 619G total
	0 repaired, 44.32% done, no estimated completion time
config:

	NAME        STATE     READ WRITE CKSUM
	pool        ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    da12    ONLINE       0     0     0
	    da13    ONLINE       0     0     0
	    da14    ONLINE       0     0     0
	    da9     ONLINE       0     0     0
	    da15    ONLINE       0     0     0
	    da3     ONLINE       0     0     0
	  raidz1-1  ONLINE       0     0     0
	    da10    ONLINE       0     0     0
	    da4     ONLINE       0     0     0
	    da5     ONLINE       0     0     0
	    da6     ONLINE       0     0     0
	    da7     ONLINE       0     0     0
	    da8     ONLINE       0     0     0
	logs
	  da11p2    ONLINE       0     0     0
	cache
	  da11p3    ONLINE       0     0     0

errors: No known data errors


Yes, both ZIL and L2ARC are on the same disk (an SSD). I know it's not
optimal, but I guess it's better than the high latency of conventional
disks.
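
(In case the layout matters for the analysis, the log and cache devices
are two partitions of that SSD, added roughly like this:

# zpool add pool log da11p2
# zpool add pool cache da11p3
)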

# camcontrol devlist
<SEAGATE ST914603SSUN146G 0868>    at scbus6 target 11 lun 0 (pass0,da0)
<SEAGATE ST914603SSUN146G 0868>    at scbus6 target 15 lun 0 (pass1,da1)
<SEAGATE ST9146803SS FS03>         at scbus6 target 17 lun 0 (pass2,da2)
<SEAGATE ST914603SSUN146G 0868>    at scbus6 target 18 lun 0 (pass3,da3)
<SEAGATE ST9146803SS FS03>         at scbus6 target 20 lun 0 (pass4,da4)
<SEAGATE ST914603SSUN146G 0868>    at scbus6 target 21 lun 0 (pass5,da5)
<SEAGATE ST9146803SS FS03>         at scbus6 target 22 lun 0 (pass6,da6)
<SEAGATE ST914603SSUN146G 0868>    at scbus6 target 23 lun 0 (pass7,da7)
<SEAGATE ST914603SSUN146G 0868>    at scbus6 target 24 lun 0 (pass8,da8)
<SEAGATE ST9146803SS FS03>         at scbus6 target 25 lun 0 (pass9,da9)
<SEAGATE ST9146803SS FS03>         at scbus6 target 26 lun 0 (pass10,da10)
<LSILOGIC SASX28 A.0 5021>         at scbus6 target 27 lun 0 (ses0,pass11)
<ATA Samsung SSD 850 2B6Q>         at scbus6 target 28 lun 0 (pass12,da11)
<SEAGATE ST9146803SS FS03>         at scbus6 target 29 lun 0 (pass13,da12)
<SEAGATE ST9146802SS S229>         at scbus6 target 30 lun 0 (pass14,da13)
<SEAGATE ST9146803SS FS03>         at scbus6 target 32 lun 0 (pass15,da14)
<SEAGATE ST9146802SS S22B>         at scbus6 target 33 lun 0 (pass16,da15)
<TSSTcorp CD/DVDW TS-T632A SR03>   at scbus13 target 0 lun 0 (pass17,cd0)


I hope the attachments reach the list; otherwise I will mail them to
anyone interested.


Cheers,

Borja.