Date:      Thu, 24 Nov 2016 11:08:11 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Alan Somers <asomers@freebsd.org>
Cc:        FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject:   Re: NFSv4 performance degradation with 12.0-CURRENT client
Message-ID:  <20161124090811.GO54029@kib.kiev.ua>
In-Reply-To: <CAOtMX2jJ2XoQyVG1c04QL7NTJn1pg38s=XEgecE38ea0QoFAOw@mail.gmail.com>
References:  <CAOtMX2jJ2XoQyVG1c04QL7NTJn1pg38s=XEgecE38ea0QoFAOw@mail.gmail.com>

On Wed, Nov 23, 2016 at 10:17:25PM -0700, Alan Somers wrote:
> I have a FreeBSD 10.3-RELEASE-p12 server exporting its home
> directories over both NFSv3 and NFSv4.  I have a TrueOS client (based
> on 12.0-CURRENT on the drm-next-4.7 branch, built on 28-October)
> mounting the home directories over NFSv4.  At first, everything is
> fine and performance is good.  But if the client does a buildworld
> using sources on NFS and locally stored objects, performance slowly
> degrades.  The degradation is most noticeable with metadata-heavy
> operations.  For example, "ls -l" in a directory with 153 files takes
> less than 0.1 seconds right after booting.  But the longer the
> buildworld goes on, the slower it gets.  Eventually that same "ls -l"
> takes 19 seconds.  When the home directories are mounted over NFSv3
> instead, I see no degradation.
> 
> top shows negligible CPU consumption on the server, and very high
> consumption on the client when using NFSv4 (nearly 100%).  The
> NFS-using process is spending almost all of its time in system mode,
> and dtrace shows that almost all of its time is spent in
> ncl_getpages().
> 
> I have delegations disabled on the server.  On the client, the home
> directories are nullfs mounted to two additional locations, and the
> buildworld was actually using one of those nullfs mounts, not the NFS
> mount directly.
> 
> Any ideas?

Try stock FreeBSD first.

If this is reproducible on stock HEAD, can you point to the lines of
ncl_getpages() where the time is spent?  Does reading the problematic
files, as opposed to mmapping them, also cause the behaviour?  E.g. try dd.
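
A rough way to run the read-versus-mmap comparison from userland is sketched
below. It is only an illustration, not a calibrated benchmark: the function
names time_read() and time_mmap() are made up for this example, and it assumes
that touching one byte per page of a mapping is enough to force page-ins,
whereas the sequential loop goes through plain read(2) calls much as dd would.
Cached pages will make a second run faster, so use a file that has not been
read since mount.

```python
import mmap
import os
import time


def time_read(path, chunk=64 * 1024):
    """Read the file sequentially with unbuffered reads; return (bytes, seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            total += len(buf)
    return total, time.perf_counter() - start


def time_mmap(path, page=mmap.PAGESIZE):
    """Map the file read-only and touch one byte per page; return (bytes, seconds)."""
    size = os.path.getsize(path)
    if size == 0:
        return 0, 0.0  # an empty file cannot be mapped
    start = time.perf_counter()
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for off in range(0, size, page):
                m[off]  # touching the byte faults the page in if needed
    return size, time.perf_counter() - start


def compare(path):
    """Run both access patterns on one file and return the timings in seconds."""
    nread, t_read = time_read(path)
    nmap, t_mmap = time_mmap(path)
    return {"bytes_read": nread, "bytes_mapped": nmap,
            "read_s": t_read, "mmap_s": t_mmap}
```

Calling compare() on one of the slow files on the NFS mount, once shortly
after boot and again after the buildworld has run for a while, would show
whether only the mmap path degrades or both do.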

There are really no time-unbounded loops in ncl_getpages() itself.
I could understand the situation if e.g. the time were spent in getpbuf() or
ncl_readrpc(), but not in ncl_getpages() directly.

Also, as an experiment, you could see if HEAD after r308980 demonstrates
any difference.



