Date:      Fri, 26 Mar 2021 13:29:45 +0100
From:      Michael Gmelin <freebsd@grem.de>
To:        Mathieu Chouquet-Stringer <me+freebsd@mathieu.digital>
Cc:        Matt Churchyard <matt.churchyard@userve.net>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>, current@freebsd.org
Subject:   Re: Scrub incredibly slow with 13.0-RC3 (as well as RC1 & 2)
Message-ID:  <20210326132945.3274687e@bsd64.grem.de>
In-Reply-To: <YF2raxOUeN8Y23eT@weirdfishes>
References:  <YFhuxr0qRzchA7x8@weirdfishes> <202103221515.12MFFHRK015188@higson.cam.lispworks.com> <YFi6Lwh3ISn8UMvS@weirdfishes> <YFk11A/j7URClN/l@weirdfishes> <YFm3BTK/J9XY/mCN@weirdfishes> <202103241230.12OCUqur030001@higson.cam.lispworks.com> <YFs3jFT7sEaGeQCe@weirdfishes> <33eb78e2de404a77b271880dbee4c22e@SERVER.ad.usd-group.com> <YF2raxOUeN8Y23eT@weirdfishes>



On Fri, 26 Mar 2021 10:37:47 +0100
Mathieu Chouquet-Stringer <me+freebsd@mathieu.digital> wrote:

> On Thu, Mar 25, 2021 at 08:55:12AM +0000, Matt Churchyard wrote:
> > Just as an aside, I did post a message a few weeks ago with a similar
> > problem on 13 (as well as snapshot issues). Scrub seemed ok for a
> > short while, but then ground to a halt. It would take 10+ minutes to
> > go 0.01%, with everything appearing fairly idle. I finally gave up
> > and stopped it after about 20 hours. Moving to 12.2 and rebuilding
> > the pool, the system scrubbed the same data in an hour, and I've
> > just scrubbed the same system after a month of use with about 4
> > times the data in 3 hours 20 minutes. As far as I'm aware, both should be
> > using effectively the same "new" scrub code.
> >
> > It will be interesting if you find a cause, as I didn't get any
> > response to what was, for me, a complete showstopper for moving to 13.  
> 
> Bear with me, I'm slowly resilvering now... But same thing, it's not
> even maxing out my slow drives... Looks like it'll take 2 days...
> 
> I did some flame graphs using dtrace. The first one is just the output
> of that:
> dtrace -x stackframes=100 -n 'profile-99 /arg0/ { @[stack()] =
> count(); } tick-60s { exit(0); }'
> 
> Clearly my machine is not busy at all.
> And the second is the output of pretty much the same thing, except I'm
> only capturing pid 31, which is the busy one.
> dtrace -x stackframes=100 -n 'profile-99 /arg0 && pid == 31/ {
> @[stack()] = count(); } tick-60s { exit(0); }'
> 
> One striking thing is how many times hpet_get_timecount is present...
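
(For the archive: assuming those graphs were made with Brendan Gregg's
FlameGraph scripts, the usual pipeline from that dtrace one-liner to an
SVG looks roughly like this; file names are just placeholders:)

  dtrace -x stackframes=100 \
      -n 'profile-99 /arg0/ { @[stack()] = count(); } tick-60s { exit(0); }' \
      -o out.kern_stacks
  stackcollapse.pl out.kern_stacks > out.kern_folded
  flamegraph.pl out.kern_folded > kernel_flame.svg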

Does tuning of

- vfs.zfs.scrub_delay
- vfs.zfs.resilver_min_time_ms
- vfs.zfs.resilver_delay

make a difference?
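
E.g., a quick sketch (these are the legacy FreeBSD names; some of them
may have been renamed or dropped with the switch to OpenZFS in 13.0, so
check what your kernel actually exposes first):

  # list the scrub/resilver-related tunables present on this kernel
  sysctl -a | grep -E 'vfs\.zfs\.(scrub|resilver|scan)'

  # example: give the resilver a larger time budget per txg
  sysctl vfs.zfs.resilver_min_time_ms=5000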

Best,
Michael

-- 
Michael Gmelin


