Date:      Thu, 27 Mar 2014 06:52:38 -0500
From:      Karl Denninger <karl@denninger.net>
To:        freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: kern/187594: [zfs] [patch] ZFS ARC behavior problem and fix
Message-ID:  <53341106.4060101@denninger.net>
In-Reply-To: <8659e58b9fabd9f553c8be5da5dc61fd@mail.mikej.com>
References:  <201403261230.s2QCU3vI095105@freefall.freebsd.org> <8659e58b9fabd9f553c8be5da5dc61fd@mail.mikej.com>

On 3/27/2014 4:11 AM, mikej wrote:
> I've been running the latest patch now on r263711 and want to give it
> a +1
>
> No ZFS knobs set and I must go out of my way to have my system swap.
>
> I hope this patch gets a much wider review and can be put into the
> tree permanently.
>
> Karl, thanks for working on this.
>
> Regards,
>
> Michael Jung
No problem; I was being driven insane by the stalls and related bad
behavior... and there's that old saw about complaining about something
without proposing a fix for it (I've done it!) being "less than optimum"
so.... :-)

Hopefully wider review (and, if the general consensus is similar to what
I've seen here and what you're reporting as well, inclusion in the
codebase) will come.

On my sandbox system I have to get truly abusive before I can get the
system to swap now, but that load is synthetic and we all know what
sometimes happens when you try to extrapolate from synthetic loads to
real production ones.

What really has my attention is the impact on systems running live
production loads.

It has entirely changed the character of those machines, working
equally well for both pure ZFS machines and mixed UFS/ZFS systems. One
of these systems gets pounded on pretty hard and has a moderately large
configuration (~10TB of storage, 2 Xeon quad-core processors and 24GB of
RAM) serving a combination of internal Samba users, a decently large
Postgres installation supporting an externally-facing web forum and blog
application, email and similar things. It has been completely
transformed from being "frequently challenged" by its workload to
literally loafing 90%+ of the day. DBMS response times have seen their
standard deviation drop by an order of magnitude, with best-response
times for one of the most common query sequences (~30 separate ops)
down from ~180ms to ~140ms.
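
For anyone who wants to make the same kind of before/after comparison on
their own box, here is a minimal Python sketch. It is not how the numbers
above were gathered; the function names are made up for illustration, and
it assumes you already have a list of per-query response times in
milliseconds from your own logging.

```python
import statistics


def summarize_latencies(samples_ms):
    """Return (best, sample standard deviation) for response times in ms.

    Hypothetical helper: samples_ms is whatever your own instrumentation
    collected; nothing here is specific to Postgres or ZFS.
    """
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples for a standard deviation")
    return min(samples_ms), statistics.stdev(samples_ms)


def percent_reduction(before, after):
    """Percentage by which 'after' is lower than 'before'."""
    return 100.0 * (before - after) / before
```

Fed the two best-case figures quoted above, percent_reduction(180, 140)
works out to roughly a 22% improvement in best-case response time.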

This particular machine has a separate pool for the system itself (root,
usr and var), which was formerly UFS because it had to be in order to
avoid the worst of the "stall" bad behavior.  It also has two other
pools on it: one for read-nearly-only data sets that consist of very
large files that are almost archival in character, and a second that
has the system's "working set" on it.  The latter has a separate intent
log; I had a cache SSD on it as well but have recently dropped that,
as with these changes it no longer produces a material improvement in
performance.  I'm frankly not sure the intent log is helping any more
either, but I've yet to drop it and instrument the results -- it used to
be *necessary* to avoid nasty problems during busy periods.

I now have that machine set up booting from ZFS with the system on a
mirrored pool dedicated to system images, with lz4 *and* dedup on (for
that filesystem's root), which allows me to clone it almost instantly,
start a jail on the clone and then do a "buildworld buildkernel -j8"
while only allocating storage to actual changes. The dedup ratio on that
mirror set is 1.4x and lz4 is showing a net compression ratio of 2.01x.
Even better, I cannot provoke misbehavior by doing this sort of thing
during the middle of the day, where formerly that was just begging for
trouble; the impact on user-perceptible performance during it is zero,
although I can see the degradation (a modest increase in system
latency) in the stats.
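
As a rough way to read those two ratios together, the Python sketch below
assumes they simply compound (compression first, then dedup across the
compressed blocks). That is only an approximation: the compression ratio
is reported per dataset and the dedup ratio per pool, so the real figure
on any given system may differ. The function names are hypothetical.

```python
def combined_ratio(dedup_ratio, compress_ratio):
    """Approximate overall space saving, ASSUMING the ratios multiply."""
    return dedup_ratio * compress_ratio


def physical_fraction(dedup_ratio, compress_ratio):
    """Approximate fraction of logical data that lands on disk."""
    return 1.0 / combined_ratio(dedup_ratio, compress_ratio)
```

With the figures above (1.4x and 2.01x) this comes to about 2.81x
overall, i.e. somewhere around 36% of the logical size actually stored,
under that multiplicative assumption.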

Oh, did I mention that everything except the boot/root/usr/var
filesystems (including swap) is geli-encrypted on this machine as well,
and that the nightly PC backup jobs bury the GigE interface to which
they're attached -- and sustain that performance against the ZFS disks
for the duration?  (The machine does have AESNI loaded....)

Finally, swap allocation remains at zero throughout all of this.

At present, coming off the overnight period (which has an activity spike
from routine in-house backups of connected PCs but is otherwise the
"low point" of activity), the machine shows 1GB of free memory and an
"auto-tuned" 12.9GB of ARC cache (with a maximum size of 22.3GB), and
inactive pages have remained stable.  Wired memory is almost 19GB with
Postgres using a sizable chunk of it.  Cache efficiency is claimed to be
98.9% (!)  That'll go down somewhat over the day, but during the busiest
part of the day it remains well into the 90s, which I'm sure has a heck
of a lot to do with the performance improvements....
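
For those who want to check the same figures on their own systems, here
is a minimal Python sketch of the arithmetic. It assumes a FreeBSD host
where the ARC counters are exposed under kstat.zfs.misc.arcstats and read
with sysctl(8); the helper names are made up, and "efficiency" is taken
here to mean plain hits / (hits + misses), which may not be exactly what
any particular stats script reports.

```python
import subprocess


def read_sysctl_int(name):
    """Read one integer sysctl value via `sysctl -n` (FreeBSD assumed)."""
    out = subprocess.run(["sysctl", "-n", name],
                         capture_output=True, text=True, check=True).stdout
    return int(out.strip())


def arc_hit_ratio(hits, misses):
    """Cache hit percentage from raw hit and miss counters."""
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


def arc_fill_percent(size, max_size):
    """How much of the ARC's maximum size is currently in use."""
    return 100.0 * size / max_size


def read_arc_hit_ratio():
    """Hypothetical convenience wrapper; only works where these sysctls exist."""
    hits = read_sysctl_int("kstat.zfs.misc.arcstats.hits")
    misses = read_sysctl_int("kstat.zfs.misc.arcstats.misses")
    return arc_hit_ratio(hits, misses)
```

By this arithmetic, 12.9GB of ARC against a 22.3GB maximum is a bit under
58% of the ceiling, which fits the picture of the ARC sizing itself to
the load rather than sitting pinned at the maximum.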

Cross-posted over to -STABLE in the hope of expanding review and testing
by others.

-- 
-- Karl
karl@denninger.net


