Date:      Fri, 19 Feb 2016 14:54:58 +0800 (AWST)
From:      David Adam <zanchey@ucc.gu.uwa.edu.au>
To:        Tom Curry <thomasrcurry@gmail.com>
Cc:        FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject:   Re: Poor ZFS+NFSv3 read/write performance and panic
Message-ID:  <alpine.DEB.2.11.1602191448430.1862@motsugo.ucc.gu.uwa.edu.au>
In-Reply-To: <CAGtEZUCBapbAEUQawVnFS%2BUuUYGSrhyk=i3VEkQaKV4zRQuhJA@mail.gmail.com>
References:  <alpine.DEB.2.11.1601292153420.26396@motsugo.ucc.gu.uwa.edu.au> <alpine.DEB.2.11.1602080056390.17583@motsugo.ucc.gu.uwa.edu.au> <CAGtEZUDCAENGcUjpZDjUBg93F_MWQO40Q4WScm1BogAOUjgEaA@mail.gmail.com> <alpine.DEB.2.11.1602141625300.1862@motsugo.ucc.gu.uwa.edu.au> <CAGtEZUCBapbAEUQawVnFS%2BUuUYGSrhyk=i3VEkQaKV4zRQuhJA@mail.gmail.com>

On Sun, 14 Feb 2016, Tom Curry wrote:
> On Sun, Feb 14, 2016 at 3:40 AM, David Adam <zanchey@ucc.gu.uwa.edu.au>
> wrote:
> 
> > On Mon, 8 Feb 2016, Tom Curry wrote:
> > > On Sun, Feb 7, 2016 at 11:58 AM, David Adam <zanchey@ucc.gu.uwa.edu.au>
> > > wrote:
> > >
> > > > Just wondering if anyone has any idea how to identify which devices are
> > > > implicated in ZFS' vdev_deadman(). I have updated the firmware on the
> > > > mps(4) card that has our disks attached but that hasn't helped.
> > >
> > > I too ran into this problem and spent quite some time troubleshooting
> > > hardware. For me it turns out it was not hardware at all, but software.
> > > Specifically the ZFS ARC. Looking at your stack I see some arc reclaim up
> > > top, it's possible you're running into the same issue. There is a monster
> > > of a PR that details this here
> > > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594
> > >
> > > > If you would like to test this theory out, the fastest way is to limit
> > > > the ARC by adding the following to /boot/loader.conf and rebooting
> > > vfs.zfs.arc_max="24G"
> > >
> > Thanks Tom - this certainly did sound promising, but setting the ARC to
> > 11G of our 16G of RAM didn't help. `zfs-stats` confirmed that the ARC was
> > the expected size and that there was still 461 MB of RAM free.
> 
> Did the system still panic or did it merely degrade in performance? When
> performance heads south are you swapping?

I had booted back into a GENERIC kernel, so it slowed down and then 
deadlocked - no network traffic and no response on the console. I've never 
actually managed to capture the panic with a GENERIC kernel, only with one 
built with DDB/WITNESS/DIAGNOSTIC. My colleagues tended to try and reboot 
the server before it got to that stage (and then ask who was going to 
install Linux).

It seems to be fixed now but I have committed a mortal sin and changed two 
things at once - upgraded to 10.3-BETA1 (as suggested by jwd@ off-list) 
but also dropped the ARC size further to 10G.
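
For reference, a sketch of the matching /boot/loader.conf entry, assuming 
the same tunable Tom suggested earlier in the thread, with only the value 
changed to the 10G mentioned above:

```
# /boot/loader.conf
# Cap the ZFS ARC at 10G (value from this message; tunable name as quoted above).
vfs.zfs.arc_max="10G"
```

As with the original suggestion, this is read at boot, so it needs a reboot 
to take effect.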

If I can make it happen again, I'll certainly be asking for more help and 
will see what the swap state is.

Thanks to everyone who replied on and off list.

David Adam
Wheel Group
University Computer Club, The University of Western Australia
zanchey@ucc.gu.uwa.edu.au