Date:      Wed, 20 Feb 2013 19:28:28 +1100
From:      Peter Jeremy <peter@rulingia.com>
To:        Kevin Day <toasty@dragondata.com>
Cc:        FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject:   Re: Improving ZFS performance for large directories
Message-ID:  <20130220082828.GA44920@server.rulingia.com>
In-Reply-To: <19E0C908-79F1-43F8-899C-6B60F998D4A5@dragondata.com>
References:  <19DB8F4A-6788-44F6-9A2C-E01DEA01BED9@dragondata.com> <CAJjvXiE+8OMu_yvdRAsWugH7W=fhFW7bicOLLyjEn8YrgvCwiw@mail.gmail.com> <F4420A8C-FB92-4771-B261-6C47A736CF7F@dragondata.com> <20130201192416.GA76461@server.rulingia.com> <19E0C908-79F1-43F8-899C-6B60F998D4A5@dragondata.com>

On 2013-Feb-19 14:10:47 -0600, Kevin Day <toasty@dragondata.com> wrote:
>Timing doing an "ls" in large directories 20 times, the first is the
>slowest, then all subsequent listings are roughly the same.

OK.  My testing was on large files rather than large amounts of metadata.

>Thinking I'd make the primary cache metadata only, and the secondary
>cache "all" would improve things,

This won't work as expected.  L2ARC is only populated by blocks being
evicted from ARC, so with ARC set to cache metadata only there is never
any "data" in ARC and hence none is ever evicted from ARC to L2ARC.
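For reference, the dataset properties involved look like this ("tank"
is a placeholder pool name; primarycache/secondarycache are the
standard ZFS properties):

```shell
# This combination starves L2ARC: with only metadata in ARC,
# no data blocks are ever evicted into the L2ARC device.
zfs set primarycache=metadata tank
zfs set secondarycache=all tank

# Leaving both at "all" lets normal ARC eviction feed the L2ARC:
zfs set primarycache=all tank
zfs set secondarycache=all tank
```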

> I wiped the device (SATA secure erase to make sure)

That's not necessary.  L2ARC doesn't survive reboots because all the
L2ARC "metadata" is held in ARC only.  This does mean that it takes
quite a while for L2ARC to warm up following a reboot.

>Before adding the SSD, an "ls" in a directory with 65k files would
>take 10-30 seconds, it's now down to about 0.2 seconds.

That sounds quite good.

> There are roughly 29M files, growing at about 50k files/day. We
>recently upgraded, and are now at 96 3TB drives in the pool.

That number of files isn't really excessive but it sounds like your
workload has very low locality.  At this stage, my suggestions are:
1) Disable atime if you don't need it & haven't already.
   Otherwise file accesses are triggering metadata updates.
2) Increase vfs.zfs.arc_meta_limit
   You're still getting more metadata misses than data misses.
3) Increase your ARC size (more RAM)
   Your pool is quite large compared to your RAM.
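As commands, those three suggestions look roughly like this (the pool
name "tank" and the byte value are placeholders; pick arc_meta_limit
relative to your actual ARC size):

```shell
# 1) Stop file accesses from generating metadata updates:
zfs set atime=off tank

# 2) Let ARC hold more metadata (value is in bytes; 8 GB here):
sysctl vfs.zfs.arc_meta_limit=8589934592

# 3) More RAM is a hardware change, but check the current ARC cap:
sysctl vfs.zfs.arc_max
```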

>It's a 250G drive, and only 22G is being used, and there's still a
>~66% miss rate.

Note that L2ARC only sees requests that have already missed in ARC, so
that 66% is 66% of ARC misses, not 66% of all requests.
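The raw counters behind that figure live under the
kstat.zfs.misc.arcstats sysctl tree.  A minimal sketch of the
arithmetic, with made-up counter values:

```shell
# Made-up counters; on a live system read them with e.g.
#   sysctl -n kstat.zfs.misc.arcstats.l2_hits
#   sysctl -n kstat.zfs.misc.arcstats.l2_misses
l2_hits=1000000
l2_misses=2000000
# Both counters only cover requests that already missed in ARC:
rate=$(( l2_misses * 100 / (l2_hits + l2_misses) ))
echo "L2ARC miss rate: ${rate}%"   # prints "L2ARC miss rate: 66%"
```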

> Is there any way to tell why more metadata isn't
>being pushed to the L2ARC?

ZFS treats writing to L2ARC very much as an afterthought.  L2ARC writes
are rate limited by vfs.zfs.l2arc_write_{boost,max} and will be aborted
if they might interfere with a read.  I'm not sure how to improve it.
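Those tunables are plain sysctls, so one thing to try is raising them
to speed up L2ARC warm-up, at the cost of more write traffic to the
SSD (the values below are illustrative choices, not recommendations):

```shell
# Cap on how fast L2ARC is filled in steady state (bytes per interval):
sysctl vfs.zfs.l2arc_write_max=67108864
# Extra headroom applied while the L2ARC is still cold after boot:
sysctl vfs.zfs.l2arc_write_boost=134217728
```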

Since this is all generic ZFS, you might like to try asking on
zfs@lists.illumos.org as well.  Some of the experts there might have
some ideas.

-- 
Peter Jeremy
