Date:      Fri, 25 Jul 2014 06:32:59 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Harald Schmalzbauer <h.schmalzbauer@omnilan.de>
Cc:        freebsd-stable <freebsd-stable@freebsd.org>
Subject:   Re: nfsd server cache flooded, try to increase nfsrc_floodlevel
Message-ID:  <1672084785.3195858.1406284379181.JavaMail.root@uoguelph.ca>
In-Reply-To: <53D20A49.5020803@omnilan.de>

Harald Schmalzbauer wrote:
> Regarding Rick Macklem's message of 25.07.2014 02:14 (localtime):
> > Harald Schmalzbauer wrote:
> >> Regarding Rick Macklem's message of 08.08.2013 14:20
> >> (localtime):
> >>> Lars Eggert wrote:
> >>>> Hi,
> >>>>
> >>>> every few days or so, my -STABLE NFS server (v3 and v4) gets
> >>>> wedged
> >>>> with a ton of messages about "nfsd server cache flooded, try to
> >>>> increase nfsrc_floodlevel" in the log, and nfsstat shows TCPPeak
> >>>> at
> >>>> 16385. It requires a reboot to unwedge, restarting the server
> >>>> does
> >>>> not help.
> >>>>
> >>>> The clients are (mostly) six -CURRENT nfsv4 boxes that netboot
> >>>> from
> >>>> the server and mount all drives from there.
> >>>>
> > Have you tried increasing vfs.nfsd.tcphighwater?
> > This needs to be increased to increase the flood level above 16384.
> >
> > Garrett Wollman sets:
> > vfs.nfsd.tcphighwater=100000
> > vfs.nfsd.tcpcachetimeo=300
> >
> > or something like that, if I recall correctly.
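The settings mentioned above can be staged and applied roughly as follows. This is a minimal sketch using the values reported in this thread (Garrett Wollman's recalled settings, not tuned recommendations for any particular server); the sketch writes to a temp file as a stand-in for `/etc/sysctl.conf`:

```shell
# Stage the persistent settings (values from this thread, not a tuned
# recommendation); using a temp file here as a stand-in for /etc/sysctl.conf.
conf=$(mktemp)
printf '%s\n' \
  'vfs.nfsd.tcphighwater=100000' \
  'vfs.nfsd.tcpcachetimeo=300' >> "$conf"
cat "$conf"
# On a live FreeBSD NFS server the same values take effect immediately with:
#   sysctl vfs.nfsd.tcphighwater=100000
#   sysctl vfs.nfsd.tcpcachetimeo=300
```

Appending the two lines to the real `/etc/sysctl.conf` makes them survive a reboot; the `sysctl` invocations (shown as comments) change the running kernel without one.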
>
> Thank you for your help!
>
> I read about tuning these sysctls, but I object to altering them
> individually, because I don't have hundreds of clients torturing a
> poor server or any other badly balanced setup.
> I run into this problem with a single client, connected via a 1GbE
> (not 10 or 40GbE) link, talking to a modern server with 10G of RAM -
> and this environment forces me to reboot the storage server every
> second day.
> IMHO such a setup shouldn't require manual tuning, and I consider
> this a really urgent problem!
Well, the default was what worked for the hardware I have to test on:
- single core i386 server with 256Mbytes of memory on 100Mbps network

Since I have nothing close to 10Gbps networking (100Mbps only), I can't
test even a situation like your server/single client, so all I can do
is set a default that works for me.

I don't think you can expect a "one size fits all" setting when you
have servers ranging from what I have (i386 with 256Mbytes of memory)
to 64 cores, Gbytes of RAM etc. Eventually, others (like iXsystems maybe)
who can test on a range of servers may be able to come up with a way to
tune it dynamically based on server size, but that requires a range of
hardware to test on.

Basically, although NFSv4 is now 10 years old, people are just starting
to use it, so experience with it is still pretty limited.

rick

> Whatever causes the server to lock up urgently needs to be fixed
> for the next release; otherwise the shipped NFS implementation is
> not really suitable for production environments and needs a warning
> message when enabled.
> The impact of this failure forces admins to change the operating
> system in order to get a core service back into operation.
> The important point is that I don't just suffer weaker performance
> or lags/delays: my server stops serving NFS completely, and only a
> reboot resolves the situation.
>
> Are there later modifications or other findings known to obsolete
> your noopen.patch (http://people.freebsd.org/~rmacklem/noopen.patch)?
>
> I'm testing this at the moment, but I'm having other panics on the
> same machine related to VFS locking, so results of the test won't be
> available soon.
>
> Thank you,
>
> -Harry
>
>
>


