Date:      Wed, 29 Aug 2018 15:03:27 -0700
From:      Richard Elling <richard.elling@richardelling.com>
To:        openzfs-developer <developer@lists.open-zfs.org>
Cc:        Paul <devgs@ukr.net>, freebsd-fs@freebsd.org, developer@open-zfs.org
Subject:   Re: [developer] Re: Potential bug recently introduced in arc_adjust() that leads to unintended pressure on MFU eventually leading to dramatic reduction in its size
Message-ID:  <6569BAF8-1AC4-4CB4-A384-E5C8EFF129D8@richardelling.com>
In-Reply-To: <20180829212207.GF2709@raichu>
References:  <1535534257.46692673.obgqw5dr@frv33.fwdcdn.com> <20180829212207.GF2709@raichu>

Thanks for passing this along, Mark.
Comments embedded

> On Aug 29, 2018, at 2:22 PM, Mark Johnston <markj@freebsd.org> wrote:
>
> On Wed, Aug 29, 2018 at 12:42:33PM +0300, Paul wrote:
>> Hello team,
>>
>>
>> It seems like a commit on Mar 23 introduced a bug: if during execution of arc_adjust()
>> target is reached after MRU is evicted current code continues evicting MFU. Before said
>> commit, on the step prior to MFU eviction, target value was recalculated as:

arc_size is hot, so it was broken up into per-cpu counters and asize is now a snapshot
of the sum of the counters...
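
To make the snapshot point concrete, here is a minimal sketch of a counter split
into per-CPU buckets. It is a toy model under stated assumptions, not the aggsum
code: toy_counter_t, toy_add(), toy_value() and NBUCKETS are hypothetical names
made up for this example, and locking is ignored entirely.

```c
#include <assert.h>
#include <stdint.h>

#define	NBUCKETS	4	/* hypothetical: one bucket per CPU */

/* Toy stand-in for a hot counter split into per-CPU buckets. */
typedef struct toy_counter {
	int64_t buckets[NBUCKETS];
} toy_counter_t;

/* Apply a delta to the bucket owned by the given CPU. */
static void
toy_add(toy_counter_t *c, unsigned cpu, int64_t delta)
{
	assert(cpu < NBUCKETS);
	c->buckets[cpu] += delta;
}

/* Sum all buckets. The result is only a snapshot of that moment. */
static int64_t
toy_value(const toy_counter_t *c)
{
	int64_t sum = 0;

	for (unsigned i = 0; i < NBUCKETS; i++)
		sum += c->buckets[i];
	return (sum);
}
```

A value read with toy_value() before an eviction keeps its old number afterwards;
only a second call to toy_value() sees the change.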

>>
>>  target = arc_size - arc_c;
>>
>> arc_size here is a global variable that was being updated accordingly, during MRU eviction,
>> hence this expression, resulted in zero or negative target if MRU eviction was enough
>> to reach the original goal.
>>
>> Modern version uses cached value of arc_size, called asize:
>>
>>  target = asize - arc_c;
>>
>> Because asize stays constant during execution of whole body of arc_adjust() it means that
>> above expression will always be evaluated to value > 0, causing MFU to be evicted every
>> time, even if MRU eviction has reached the goal already. Because of the difference in
>> nature of MFU and MRU, globally it leads to eventual reduction of amount of MFU in ARC
>> to dramatic numbers.
>
> Hi Paul,
>
> Your analysis does seem right to me.  I cc'ed the openzfs mailing list
> so that an actual ZFS expert can chime in; it looks like this behaviour
> is consistent between FreeBSD, illumos and ZoL.

Agree. In the pre-aggsum code, arc_size would have changed after the MRU adjustment.
Now it does not. I have at least one correlation to this occurring in a repeatable test that
I can run on my ZoL test machine (when it is finished punishing some other code).

>
> Have you already tried the obvious "fix" of subtracting total_evicted
> from the MFU target?

ameta also needs to be re-aggsummed after the MRU adjustments.
 -- richard
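
A rough sketch of the two options mentioned here, again as a toy model rather than
a patch: mfu_target_adjusted(), toy_live_t, toy_snap_t and toy_snapshot() are
hypothetical names, and nothing in this thread says which approach an eventual
fix would take.

```c
#include <assert.h>
#include <stdint.h>

/* Live counters (stand-ins for the aggregated size and metadata sums). */
typedef struct toy_live {
	int64_t size;
	int64_t meta;
} toy_live_t;

/* Values cached at one point in time (stand-ins for asize and ameta). */
typedef struct toy_snap {
	int64_t asize;
	int64_t ameta;
} toy_snap_t;

/* Option 1: correct the stale asize by what was already evicted. */
static int64_t
mfu_target_adjusted(int64_t asize, int64_t arc_c, int64_t total_evicted)
{
	assert(total_evicted >= 0);
	return (asize - total_evicted - arc_c);
}

/* Option 2: take a fresh snapshot of both values after the MRU pass. */
static toy_snap_t
toy_snapshot(const toy_live_t *live)
{
	toy_snap_t s = { live->size, live->meta };

	return (s);
}
```

Subtracting total_evicted repairs the size-based target only. The cached metadata
value stays stale unless it is read again, which is what a second toy_snapshot()
call after the MRU pass models.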

>
>> Servers that run the version of FreeBSD prior to the issue have this picture of ARC:
>>
>>   ARC: 369G Total, 245G MFU, 97G MRU, 36M Anon, 3599M Header, 24G Other
>>
>> As you can see, MFU dominates. This is a nature of our workload: we have a considerably
>> small dataset that we use constantly and repeatedly; and a large dataset that we use
>> rarely.
>>
>> But on the modern version of FreeBSD picture is dramatically different:
>>
>>   ARC: 360G Total, 50G MFU, 272G MRU, 211M Anon, 7108M Header, 30G Other
>>
>> This leads to a much heavier burden on the disk sub-system.
>>
>>
>> Commit that introduced a bug:
>> https://github.com/freebsd/freebsd/commit/555f9563c9dc217341d4bb5129f5d233cf1f92b8
>
> ------------------------------------------
> openzfs: openzfs-developer
> Permalink: https://openzfs.topicbox.com/groups/developer/T10a105c53bcce15c-M8152dc2430a5ea4e625ad564
> Delivery options: https://openzfs.topicbox.com/groups/developer/subscription



