Date: Fri, 31 Jan 2014 18:20:56 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: J David <j.david.lists@gmail.com>
Cc: freebsd-net@freebsd.org, Garrett Wollman <wollman@freebsd.org>
Subject: Re: Terrible NFS performance under 9.2-RELEASE?
Message-ID: <1609454808.1083115.1391210456671.JavaMail.root@uoguelph.ca>
In-Reply-To: <CABXB=RQ6LdeoNi4vNZGCaM2C_up_JCf2SpWPzm2S_M_%2BpTnzsQ@mail.gmail.com>

J David wrote:
> On Fri, Jan 31, 2014 at 1:18 AM, <wollman@freebsd.org> wrote:
> > This is almost entirely wrong in its description of the non-offload
> > case.
>
> Yes, you're quite right; I confused myself.  GSO works a little
> differently, but FreeBSD doesn't use that.
>
> > The whole mess is then passed on to the hardware for
> > offload, if it fits.
>
> That's the point, NFS is creating a situation where it never fits.
> It can't shove 65k into 64k, so it ends up looping back through the
> whole output routine again for a tiny tail of data, and then the same
> for the input routine on the other side.  Arguably that makes
> rsize/wsize 65536 negligibly different than rsize/wsize 32768 in the
> long run because the average data output per pass is about the same
> (64k + 1k vs 33k + 33k).  Except, of course, in the case where almost
> all files are between 32k and 60k.
>
Oh, and remember to try setting readahead=8 in your mounts, too.
NFS will do a read + N readaheads (where N == 1 by default) and then
wait for replies to those before continuing on. If the product of
rsize * readahead isn't enough bits to fill the pipe (bandwidth *
transit delay), then you won't be using the bandwidth your network
interface provides.

rick
ps: And you probably want your nfsd threads to be at least 16 instead
    of the default of 4.

> Please don't get me wrong, I'm not suggesting there's anything more
> than a small CPU reduction to be obtained by changing this.  Which is
> not nothing if the client is CPU-limited due to the other work it's
> doing, but it's not much.  To get real speedups from NFS would
> require a change to the punishing read-before-write behavior, which
> is pretty clearly not going to happen.
>
> > RPC responses will only get smushed together if
> > tcp_output() wasn't able to schedule the transmit immediately, and
> > if the network is working properly, that will only happen if
> > there's more than one client-side-receive-window's-worth of data to
> > be transmitted.
>
> This is something I have seen live in tcpdump, but then I have had so
> many problems with NFS and congestion control that the "network is
> working properly" condition probably isn't satisfied.  Hopefully the
> jumbo cluster changes will resolve that once and for all.
>
> Thanks!
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to
> "freebsd-net-unsubscribe@freebsd.org"
>
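
The readahead advice in the reply above comes down to a bandwidth-delay
calculation. What follows is a minimal sketch in Python and is not part
of the original message. The function names, the 1 Gbit/s link and the
4 ms round-trip time are illustrative assumptions. It also assumes the
initial read counts toward the data in flight alongside the N
readaheads, which is one reading of "a read + N readaheads"; the
message's own rule of thumb is the simpler rsize * readahead.

```python
import math


def bdp_bytes(bandwidth_bps, rtt_us):
    """Bandwidth-delay product in bytes.

    bandwidth_bps is in bits per second, rtt_us in microseconds
    (both hypothetical inputs chosen for this sketch).
    """
    return bandwidth_bps * rtt_us / 8_000_000


def in_flight_bytes(rsize, readahead):
    """Bytes outstanding for a sequential reader.

    Assumption: one read plus `readahead` readaheads, each of rsize bytes.
    """
    return rsize * (1 + readahead)


def min_readahead(bandwidth_bps, rtt_us, rsize):
    """Smallest readahead count whose in-flight data covers the BDP."""
    reads_needed = math.ceil(bdp_bytes(bandwidth_bps, rtt_us) / rsize)
    return max(0, reads_needed - 1)


if __name__ == "__main__":
    link_bps = 1_000_000_000  # assumed 1 Gbit/s link
    rtt_us = 4_000            # assumed 4 ms round trip
    rsize = 65536

    print("BDP bytes:", bdp_bytes(link_bps, rtt_us))
    for n in (1, 8):
        print("readahead", n, "in flight:", in_flight_bytes(rsize, n))
    print("minimum readahead:", min_readahead(link_bps, rtt_us, rsize))
```

Under these assumed numbers the pipe holds 500000 bytes. The default
readahead of 1 keeps 131072 bytes outstanding, which falls short, while
readahead=8 keeps 589824 bytes outstanding, which covers it. Real
links will differ, so measure the round-trip time before picking a
value.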