Date:      Fri, 24 Jan 2014 18:37:38 -0500
From:      J David <j.david.lists@gmail.com>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        freebsd-net@freebsd.org
Subject:   Re: Terrible NFS performance under 9.2-RELEASE?
Message-ID:  <CABXB=RTTCfxP_Ebp3aa4k9qr5QrGDVQQMr1R1w0wBTUBD1OtwA@mail.gmail.com>
In-Reply-To: <659117348.16015750.1390604069888.JavaMail.root@uoguelph.ca>
References:  <CABXB=RSebaWTD1LjQz__ZZ3EJwTpOMpxq0Q=bt4280dx+0auCw@mail.gmail.com> <659117348.16015750.1390604069888.JavaMail.root@uoguelph.ca>

On Fri, Jan 24, 2014 at 5:54 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> But disabling it will identify if that is causing the problem. And it
> is a workaround that often helps people get things to work. (With real
> hardware, there may be no way to "fix" such things, depending on the
> chipset, etc.)

There are two problems that are crippling NFS performance with large
block sizes.

One is the extraneous NFS read-on-write issue I documented earlier
today, which has nothing to do with network topology or packet size.
You might have more interest in that one.
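(For anyone who wants to reproduce that one: a rough way to catch the
extra reads is to zero the client counters and run a pure write
workload.  The mount point and transfer size here are arbitrary, and
on 9.x you may need nfsstat -e to see the new NFS client's counters.)

    # zero the NFS client counters, run a pure sequential write, then
    # check for a read count that a write-only workload shouldn't produce
    nfsstat -c -z
    dd if=/dev/zero of=/mnt/test/zeros bs=64k count=16384
    nfsstat -c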

This other thing is a five-way negative interaction between 64k NFS,
TSO, LRO, delayed ack, and congestion control.  Disabling *any* one of
the five is sufficient to see significant improvement, but that does
not identify which one is causing the problem, since the improvement
is not a unique characteristic.  (Even if it were, that would not
determine whether a problem lies with component X or with component
Y's ability to interact with component X.)  Figuring out what's really
happening has proven very difficult for me, largely due to my limited
knowledge of these areas, and the learning curve on the TCP code is
pretty steep.
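For reference, these are the knobs I have been flipping to take each
of the five out of the picture one at a time (the interface name,
mount options, and replacement congestion control algorithm are just
examples; most of these need root, and the block size needs a
remount):

    # 64k NFS -> drop the block size on the mount
    mount_nfs -o rsize=32768,wsize=32768 server:/export /mnt/test

    # TSO and LRO -> per-interface flags
    ifconfig em0 -tso
    ifconfig em0 -lro

    # delayed ack -> global sysctl
    sysctl net.inet.tcp.delayed_ack=0

    # congestion control -> swap in a different algorithm
    kldload cc_cubic
    sysctl net.inet.tcp.cc.algorithm=cubic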

The "simple" explanation appears to be that NFS generates two packets,
one just under 64k and one containing "the rest" and the alternating
sizes prevent the delayed ack code from ever seeing two full-size
segments in a row, so traffic gets pinned down to one packet per
net.inet.tcp.delacktime (100ms default), for 10pps, as observed
earlier.  But unfortunately, like a lot of simple explanations, this
one appears to have the disadvantage of being more or less completely
wrong.
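For what it's worth, the 10pps ceiling itself is easy to see on the
wire.  Something like the following (interface and server address are
placeholders) shows the inter-packet gaps parked at the delack timer
while the transfer crawls:

    # -ttt prints the time delta between packets; steady ~100ms gaps
    # line up with net.inet.tcp.delacktime
    tcpdump -ni em0 -ttt host 10.0.0.2 and port 2049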

> ps: If you had looked at the link I had in the email, you would have
>     seen that he gets very good performance once he disables TSO. As
>     they say, your mileage may vary.

Pretty much every word written on this subject has come across my
screens at this point.  "Very good performance" is relative.  Yes, you
can get roughly 10-20x better performance by disabling TSO, at the
expense of using vastly more CPU.  That is definitely a big
improvement, and it may be sufficient for many applications.  But in
absolute terms, the overall performance, and particularly the
efficiency, remain unsatisfactory.
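To put rough numbers on the CPU cost, running a large sequential write
and watching kernel time with and without TSO makes the gap obvious;
the path and sizes here are arbitrary:

    # watch system/interrupt CPU on the client while a big write runs
    dd if=/dev/zero of=/mnt/test/big bs=1m count=4096 &
    top -SH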

Thanks!


