Date:      Fri, 31 Jan 2014 12:58:15 -0500
From:      J David <j.david.lists@gmail.com>
To:        Garrett Wollman <wollman@freebsd.org>
Cc:        freebsd-net@freebsd.org
Subject:   Re: Terrible NFS performance under 9.2-RELEASE?
Message-ID:  <CABXB=RQ6LdeoNi4vNZGCaM2C_up_JCf2SpWPzm2S_M_+pTnzsQ@mail.gmail.com>
In-Reply-To: <201401310618.s0V6IVJv027167@hergotha.csail.mit.edu>
References:  <CABXB=RR1eDvdUAaZd73Vv99EJR=DFzwRvMTw3WFER3aQ+2+2zQ@mail.gmail.com> <87942875.478893.1391121843834.JavaMail.root@uoguelph.ca> <CABXB=RTx9_gE=0G9UAzwJ3LuYv8fy=sAOZp1e2D7cJ6_=kgd9A@mail.gmail.com> <201401310618.s0V6IVJv027167@hergotha.csail.mit.edu>

On Fri, Jan 31, 2014 at 1:18 AM,  <wollman@freebsd.org> wrote:
> This is almost entirely wrong in its description of the non-offload
> case.

Yes, you're quite right; I confused myself.  GSO works a little
differently, but FreeBSD doesn't use that.

> The whole mess is then passed on to the hardware for
> offload, if it fits.

That's the point: NFS creates a situation where it never fits.  It
can't shove 65k into 64k, so it loops back through the whole output
routine again for a tiny tail of data, and then the same happens for
the input routine on the other side.  Arguably that makes rsize/wsize
65536 negligibly different from rsize/wsize 32768 in the long run,
because the average data output per pass is about the same (64k + 1k
vs. 33k + 33k).  Except, of course, in the case where almost all files
are between 32k and 60k.
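To make the arithmetic concrete, here's a hypothetical sketch (not
FreeBSD code; the 1k RPC overhead is an invented, illustrative figure)
of how an RPC slightly over the 64k TSO limit splits into a full burst
plus a tiny tail, while two 32k RPCs split into two similar bursts:

```python
# Illustrative only: model splitting one NFS RPC into TSO-sized bursts.
TSO_LIMIT = 65535            # max bytes the NIC segments in one pass
RPC_OVERHEAD = 1024          # assumed header/record-mark overhead

def passes(rsize):
    """Split one rsize-byte RPC (plus overhead) into TSO-sized bursts."""
    total = rsize + RPC_OVERHEAD
    bursts = []
    while total > 0:
        chunk = min(total, TSO_LIMIT)
        bursts.append(chunk)
        total -= chunk
    return bursts

# A 64k rsize never fits: one full pass plus a tiny tail pass.
print(passes(65536))        # [65535, 1025]
# A 32k rsize is one pass per RPC; two RPCs make two comparable passes.
print(passes(32768) * 2)    # [33792, 33792]
```

Average bytes moved per pass through the output path come out nearly
identical either way, which is the "64k + 1k vs. 33k + 33k" point above.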

Please don't get me wrong: I'm not suggesting there's anything more
than a small CPU reduction to be gained by changing this.  That's not
nothing if the client is CPU-limited by the other work it's doing, but
it's not much.  Getting real speedups from NFS would require a change
to the punishing read-before-write behavior, which is pretty clearly
not going to happen.
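For readers unfamiliar with why read-before-write is punishing: a
partial write to a block the client hasn't cached forces a READ RPC
first, so the untouched bytes of the block aren't lost.  A minimal
sketch of that behavior (a toy model, not the FreeBSD NFS client; the
block size and class are invented):

```python
# Toy model of a block-caching NFS client's read-before-write cost.
BLOCK = 32768

class ToyClient:
    def __init__(self):
        self.cache = {}          # block number -> bytearray
        self.reads = 0           # READ RPCs issued over the wire

    def _fetch(self, bno):
        self.reads += 1          # one READ RPC round trip
        return bytearray(BLOCK)  # pretend the server returned the block

    def write(self, offset, data):
        bno = offset // BLOCK
        if bno not in self.cache:
            # Partial, uncached block: must read the whole block first
            # so the bytes around the write survive.  (A full, aligned
            # block write could skip this; partial writes cannot.)
            self.cache[bno] = self._fetch(bno)
        start = offset % BLOCK
        self.cache[bno][start:start + len(data)] = data

c = ToyClient()
c.write(100, b"x" * 10)     # a 10-byte write...
print(c.reads)              # ...cost a 32k READ first: prints 1
```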

> RPC responses will only get smushed together if
> tcp_output() wasn't able to schedule the transmit immediately, and if
> the network is working properly, that will only happen if there's more
> than one client-side-receive-window's-worth of data to be transmitted.

This is something I have seen live in tcpdump, but then I have had so
many problems with NFS and congestion control that the "network is
working properly" condition probably isn't satisfied.  Hopefully the
jumbo cluster changes will resolve that once and for all.
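What the tcpdump trace showed can be modeled simply: replies queued
while the receive window is closed sit in the socket buffer as one byte
stream, so when transmission resumes, segment boundaries no longer line
up with RPC boundaries.  A hypothetical sketch (not tcp_output(), just
the coalescing effect):

```python
# Toy model: two RPC replies buffered while the window was closed get
# "smushed together" into shared TCP segments once the window opens.
MSS = 1460

def drain(sndbuf, window):
    """Emit MSS-sized segments from the buffered stream."""
    stream = b"".join(sndbuf)    # RPC boundaries vanish here
    segs, sent = [], 0
    limit = min(len(stream), window)
    while sent < limit:
        seg = stream[sent:sent + MSS]
        segs.append(seg)
        sent += len(seg)
    return segs

sndbuf = []
sndbuf.append(b"A" * 1000)   # reply 1 queued: window was 0, no transmit
sndbuf.append(b"B" * 1000)   # reply 2 queued behind it
segs = drain(sndbuf, window=4000)
# The first segment carries the tail of reply 1 and the head of reply 2.
print(len(segs), segs[0].count(b"A"), segs[0].count(b"B"))   # 2 1000 460
```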

Thanks!
