Date: Wed, 29 Aug 2018 15:03:27 -0700
From: Richard Elling <richard.elling@richardelling.com>
To: openzfs-developer <developer@lists.open-zfs.org>
Cc: Paul <devgs@ukr.net>, freebsd-fs@freebsd.org, developer@open-zfs.org
Subject: Re: [developer] Re: Potential bug recently introduced in arc_adjust() that leads to unintended pressure on MFU eventually leading to dramatic reduction in its size
Message-ID: <6569BAF8-1AC4-4CB4-A384-E5C8EFF129D8@richardelling.com>
In-Reply-To: <20180829212207.GF2709@raichu>
References: <1535534257.46692673.obgqw5dr@frv33.fwdcdn.com> <20180829212207.GF2709@raichu>

Thanks for passing this along, Mark. Comments embedded

> On Aug 29, 2018, at 2:22 PM, Mark Johnston <markj@freebsd.org> wrote:
>
> On Wed, Aug 29, 2018 at 12:42:33PM +0300, Paul wrote:
>> Hello team,
>>
>>
>> It seems like a commit on Mar 23 introduced a bug: if during execution of arc_adjust()
>> target is reached after MRU is evicted current code continues evicting MFU. Before said
>> commit, on the step prior to MFU eviction, target value was recalculated as:

arc_size is hot, so it was broken up into per-cpu counters and asize is now a snapshot
of the sum of the counters...

>>
>> target = arc_size - arc_c;
>>
>> arc_size here is a global variable that was being updated accordingly, during MRU eviction,
>> hence this expression, resulted in zero or negative target if MRU eviction was enough
>> to reach the original goal.
>>
>> Modern version uses cached value of arc_size, called asize:
>>
>> target = asize - arc_c;
>>
>> Because asize stays constant during execution of whole body of arc_adjust() it means that
>> above expression will always be evaluated to value > 0, causing MFU to be evicted every
>> time, even if MRU eviction has reached the goal already. Because of the difference in
>> nature of MFU and MRU, globally it leads to eventual reduction of amount of MFU in ARC
>> to dramatic numbers.
>
> Hi Paul,
>
> Your analysis does seem right to me. I cc'ed the openzfs mailing list
> so that an actual ZFS expert can chime in; it looks like this behaviour
> is consistent between FreeBSD, illumos and ZoL.

Agree. In the pre-aggsum code, arc_size would have changed after the MRU adjustment.
Now it does not. I have at least one correlation to this occurring in a repeatable test
that I can run on my ZoL test machine (when it is finished punishing some other code).

>
> Have you already tried the obvious "fix" of subtracting total_evicted
> from the MFU target?

ameta also needs to be re-aggsummed after the MRU adjustments.

 -- richard

>
>> Servers that run the version of FreeBSD prior to the issue have this picture of ARC:
>>
>> ARC: 369G Total, 245G MFU, 97G MRU, 36M Anon, 3599M Header, 24G Other
>>
>> As you can see, MFU dominates. This is a nature of our workload: we have a considerably
>> small dataset that we use constantly and repeatedly; and a large dataset that we use
>> rarely.
>>
>> But on the modern version of FreeBSD picture is dramatically different:
>>
>> ARC: 360G Total, 50G MFU, 272G MRU, 211M Anon, 7108M Header, 30G Other
>>
>> This leads to a much heavier burden on the disk sub-system.
>>
>>
>> Commit that introduced a bug:
>> https://github.com/freebsd/freebsd/commit/555f9563c9dc217341d4bb5129f5d233cf1f92b8
>
> ------------------------------------------
> openzfs: openzfs-developer
> Permalink: https://openzfs.topicbox.com/groups/developer/T10a105c53bcce15c-M8152dc2430a5ea4e625ad564
> Delivery options: https://openzfs.topicbox.com/groups/developer/subscription
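
Illustrative note, not part of the original message: the behaviour discussed in this thread can be reproduced in a toy model. The sketch below is not the ZFS code. The struct and every toy_* name are hypothetical, the ARC is reduced to three byte counters, and the real arc_adjust() also weighs arc_p and metadata limits that are omitted here. It only contrasts reusing a stale size snapshot for the MFU target with subtracting what the MRU pass already evicted, which is the "obvious fix" Mark asks about.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the ARC: plain byte counts, not real ZFS structures. */
struct toy_arc {
    int64_t mru;    /* bytes held in the MRU list */
    int64_t mfu;    /* bytes held in the MFU list */
    int64_t other;  /* everything else counted in the ARC size */
    int64_t c;      /* target ARC size (arc_c in the thread) */
};

/* Current total size; stands in for summing the per-cpu counters. */
static int64_t toy_arc_size(const struct toy_arc *a)
{
    return a->mru + a->mfu + a->other;
}

/* Evict up to `target` bytes from *list and return the bytes evicted. */
static int64_t toy_evict(int64_t *list, int64_t target)
{
    assert(*list >= 0);
    if (target <= 0)
        return 0;               /* already at or below the goal */
    int64_t n = target < *list ? target : *list;
    *list -= n;
    return n;
}

/* Behaviour reported as buggy: the snapshot is reused for the MFU target. */
static int64_t toy_adjust_stale(struct toy_arc *a)
{
    int64_t asize = toy_arc_size(a);    /* sampled once, never refreshed */
    int64_t total_evicted = 0;

    total_evicted += toy_evict(&a->mru, asize - a->c);
    /* asize still includes the bytes just evicted from the MRU list. */
    total_evicted += toy_evict(&a->mfu, asize - a->c);
    return total_evicted;
}

/* Suggested fix: subtract what the MRU pass evicted from the MFU target. */
static int64_t toy_adjust_fixed(struct toy_arc *a)
{
    int64_t asize = toy_arc_size(a);
    int64_t total_evicted = 0;

    total_evicted += toy_evict(&a->mru, asize - a->c);
    total_evicted += toy_evict(&a->mfu, asize - total_evicted - a->c);
    return total_evicted;
}
```

With mru = 100, mfu = 200, other = 10 and c = 280, the cache is 30 bytes over target. The stale variant evicts 30 from each list (60 in total), while the fixed variant evicts 30 from the MRU list and leaves the MFU list untouched.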