Date:      Thu, 30 Aug 2018 10:40:04 -0400
From:      Mark Johnston <markj@freebsd.org>
To:        Paul <devgs@ukr.net>
Cc:        freebsd-fs@freebsd.org, developer@open-zfs.org
Subject:   Re: Potential bug recently introduced in arc_adjust() that leads to unintended pressure on MFU eventually leading to dramatic reduction in its size
Message-ID:  <20180830144004.GD15740@raichu>
In-Reply-To: <1535611885.624706680.h132ex07@frv33.fwdcdn.com>
References:  <1535534257.46692673.obgqw5dr@frv33.fwdcdn.com> <20180829212207.GF2709@raichu> <1535611885.624706680.h132ex07@frv33.fwdcdn.com>

On Thu, Aug 30, 2018 at 09:55:27AM +0300, Paul wrote:
> 30 August 2018, 00:22:14, by "Mark Johnston" <markj@freebsd.org>:
> 
> > On Wed, Aug 29, 2018 at 12:42:33PM +0300, Paul wrote:
> > > Hello team,
> > > 
> > > 
> > > It seems like a commit on Mar 23 introduced a bug: if, during execution of arc_adjust(),
> > > the target is reached after MRU eviction, the current code still goes on to evict MFU.
> > > Before said commit, in the step prior to MFU eviction, the target value was recalculated as:
> > > 
> > >   target = arc_size - arc_c;
> > > 
> > > arc_size here is a global variable that was updated during MRU eviction, so this
> > > expression yielded a zero or negative target if MRU eviction alone was enough
> > > to reach the original goal.
> > > 
> > > The current version uses a cached value of arc_size, called asize:
> > > 
> > >   target = asize - arc_c;
> > > 
> > > Because asize stays constant for the whole body of arc_adjust(), the above
> > > expression always evaluates to a value > 0, causing MFU to be evicted every
> > > time, even if MRU eviction has already reached the goal. Because MFU and MRU
> > > differ in nature, over time this shrinks the amount of MFU data in the ARC
> > > dramatically.
> > 
> > Hi Paul,
> > 
> > Your analysis does seem right to me.  I cc'ed the openzfs mailing list
> > so that an actual ZFS expert can chime in; it looks like this behaviour
> > is consistent between FreeBSD, illumos and ZoL.
> > 
> > Have you already tried the obvious "fix" of subtracting total_evicted
> > from the MFU target?
> 
> > We are going to apply the asize patch (plus the ameta one, as suggested by Richard) and
> > reboot one of our production servers tonight or tomorrow night.

Just to be explicit, are you testing something equivalent to the patch
at the end of this email?

> Then we have to wait a few days and observe the ARC behaviour.

Thanks!  Please let us know how it goes: we're preparing to release
FreeBSD 12.0 shortly and I'd like to get this fixed in head/ as soon as
possible.
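
To make the failure mode concrete, here is a toy user-space model of the
two-pass eviction. It is only a sketch, not the kernel code: toy_arc,
toy_evict() and toy_adjust() are names made up for illustration, and the
real arc_adjust() computes its targets from more terms than the single
"size minus target" difference assumed here.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for ARC state (hypothetical, arbitrary units). */
struct toy_arc {
	int64_t size;	/* live total, plays the role of arc_size */
	int64_t c;	/* target size, plays the role of arc_c */
	int64_t mru;	/* evictable amount held in MRU */
	int64_t mfu;	/* evictable amount held in MFU */
};

/* Evict up to `target` from *list; return the amount actually evicted. */
static int64_t
toy_evict(struct toy_arc *arc, int64_t *list, int64_t target)
{
	int64_t n;

	assert(*list >= 0);
	if (target <= 0)
		return (0);
	n = target < *list ? target : *list;
	*list -= n;
	arc->size -= n;		/* the live total shrinks as we evict */
	return (n);
}

/*
 * Two-pass adjust: MRU first, then MFU.  With resum == 0 the MFU
 * target is computed from the snapshot taken on entry (the reported
 * bug); with resum != 0 the snapshot is refreshed after the MRU pass
 * (the idea behind the patch).  Returns the amount evicted from MFU.
 */
static int64_t
toy_adjust(struct toy_arc *arc, int resum)
{
	int64_t asize = arc->size;	/* snapshot, like asize */

	(void)toy_evict(arc, &arc->mru, asize - arc->c);
	if (resum)
		asize = arc->size;	/* re-sum after the first pass */
	return (toy_evict(arc, &arc->mfu, asize - arc->c));
}
```

For example, starting from size 110, target 100, 50 in MRU and 60 in
MFU, toy_adjust(&arc, 0) evicts 10 from MFU even though the MRU pass
already brought the total down to the target, whereas
toy_adjust(&arc, 1) evicts nothing from MFU.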

diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
index 1387925c4607..882c04dba50a 100644
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
@@ -4446,6 +4446,12 @@ arc_adjust(void)
 		    arc_adjust_impl(arc_mru, 0, target, ARC_BUFC_METADATA);
 	}
 
+	/*
+	 * Re-sum ARC stats after the first round of evictions.
+	 */
+	asize = aggsum_value(&arc_size);
+	ameta = aggsum_value(&arc_meta_used);
+
 	/*
 	 * Adjust MFU size
 	 *


