Date:      Wed, 24 Jul 2002 11:09:11 -0500
From:      "Jaime Bozza" <jbozza@thinkburst.com>
To:        <stable@FreeBSD.ORG>
Cc:        "'Matthew Dillon'" <dillon@apollo.backplane.com>, "'Olaf R'" <olaf@keghouse.net>
Subject:   RE: Abominable NFSv3 read performance / FreeBSD server / Solaris client
Message-ID:  <022101c2332c$729a5610$6401010a@bozza>
In-Reply-To: <200207240457.g6O4vNd1025796@apollo.backplane.com>

Just to add my input, I've had the same problem with NFSv3 between
Solaris and FreeBSD.  Disabling rfc1323 doesn't seem to change anything
on my end.  Here's a scenario:

FreeBSD 4.6-STABLE server, default nfsd settings, tcp.sendspace=32768
Solaris 8 client (SparcStation 20), default settings, default options
for the mount.
bigfile is 64953143 bytes

(Solaris 8 with NFSv3 defaults to a 32768 read buffer size, rsize=32768)

(on Solaris system)
mount server:/export /mnt/server 
time dd if=/mnt/server/bigfile of=/dev/null bs=64k

returns:
991+1 records in
991+1 records out
0.05u 1.68s 2:29.76 1.1%


mount -o rsize=16384 server:/export /mnt/server
time dd if=/mnt/server/bigfile of=/dev/null bs=64k

returns:
991+1 records in
991+1 records out
0.03u 3.11s 0:19.61 16.0%


rsize=8192 (0.00u 2.50s 0:33.37 7.4%)
rsize=12288 (0.07u 3.00s 0:28.95 10.6%)
rsize=20480 (0.05u 3.58s 0:17.72 20.4%)
rsize=24576 (0.06u 3.24s 0:26.11 12.6%)
rsize=22528 (0.04u 3.56s 0:21.03 17.1%)

So, in my particular situation, 20480 (20K) appears to be the best
read buffer size.
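
To make the comparison easier to read, the elapsed times above can be converted to throughput. This is only a sketch that restates the figures already quoted (file size and the elapsed column of each `time` result); the names FILE_BYTES, ELAPSED, seconds and throughput are illustrative and not part of any tool.

```python
# File size and elapsed times copied from the measurements above.
FILE_BYTES = 64953143

# rsize -> elapsed wall-clock time as printed by `time` (m:ss.ss)
ELAPSED = {
    32768: "2:29.76",   # Solaris 8 NFSv3 default
    16384: "0:19.61",
    8192:  "0:33.37",
    12288: "0:28.95",
    20480: "0:17.72",
    24576: "0:26.11",
    22528: "0:21.03",
}


def seconds(elapsed: str) -> float:
    """Convert an 'm:ss.ss' string to seconds."""
    minutes, secs = elapsed.split(":")
    return int(minutes) * 60 + float(secs)


def throughput(rsize: int) -> float:
    """Average read throughput in bytes per second for a given rsize."""
    return FILE_BYTES / seconds(ELAPSED[rsize])


if __name__ == "__main__":
    for rsize in sorted(ELAPSED):
        mib_s = throughput(rsize) / (1024 * 1024)
        print(f"rsize={rsize:6d}  {seconds(ELAPSED[rsize]):7.2f} s  {mib_s:5.2f} MiB/s")
    print("fastest rsize:", max(ELAPSED, key=throughput))
```

With these figures the default rsize=32768 works out to roughly 0.4 MiB/s, against roughly 3.5 MiB/s for rsize=20480.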

Increasing tcp.sendspace on the server did not seem to change the times
all that much. (the default 32768 read buffer would have the same 780
bytes in the Send-Q you talked about below)
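
The 780-byte figure can be checked against the ACK numbers in the trace quoted below. This is a rough sketch, assuming 1460-byte segments (as the trace shows) and a per-reply overhead inferred from the ACK deltas rather than taken from any specification; applying the same overhead to other rsize values is also an assumption. The names below are illustrative.

```python
MSS = 1460  # payload of each full segment in the quoted trace

# ACK numbers that follow each 780-byte segment in the quoted trace;
# each one marks the end of a complete read reply.
REPLY_BOUNDARIES = [32900, 65800, 98700, 131600]

# Bytes on the wire per reply, taken from the spacing of those ACKs.
REPLY_LEN = REPLY_BOUNDARIES[1] - REPLY_BOUNDARIES[0]

# Inferred overhead per reply for a 32768-byte read (an assumption,
# derived only from the trace, not from the NFS or RPC specifications).
OVERHEAD = REPLY_LEN - 32768


def segments(rsize: int, overhead: int = OVERHEAD, mss: int = MSS):
    """Return (full_segments, tail_bytes) for one read reply."""
    return divmod(rsize + overhead, mss)


if __name__ == "__main__":
    for rsize in (8192, 16384, 20480, 32768):
        full, tail = segments(rsize)
        print(f"rsize={rsize:6d}: {full} x {MSS} + {tail}")
```

For rsize=32768 this gives 22 full segments plus a 780-byte tail, which matches the trace. Under the same assumed overhead an 8192-byte read would end in a 1024-byte tail instead.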

FreeBSD to FreeBSD transfers the file fine, Solaris to Solaris transfers
the file fine.

I've been using a forced rsize of 16384 for quite some time and it's
helped a lot.  (Both with FreeBSD client to a Solaris server and Solaris
client to a FreeBSD server.)


If you want me to run additional tests or provide more output, I'd be
happy to.



Jaime Bozza



-----Original Message-----
From: owner-freebsd-stable@FreeBSD.ORG
[mailto:owner-freebsd-stable@FreeBSD.ORG] On Behalf Of Matthew Dillon
Sent: Tuesday, July 23, 2002 11:57 PM
To: Olaf R
Cc: stable@FreeBSD.ORG
Subject: Re: Abominable NFSv3 read performance / FreeBSD server /
Solaris client



:I'm experiencing terrible NFSv3 read performance between Solaris 8
:clients and a FreeBSD 4.6-STABLE server, which ran CVSup/make world
:a day ago and thus includes the 'em' driver update (though that made
:no difference).
:
:Other FreeBSD clients and Mac G3 boxes running MacOS X 10.1.x can do
:reads at fairly respectable rate - several MB/s over 100 Mbps ethernet.

    Hi Olaf.

:And now a (partial) tcpdump trace of the 'dd if=foo of=/dev/null bs=64k'
:from the Sun's perspective.
:...
:11:50:14.388682 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.388769 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.388786 solaris.1022 > freebsd.nfsd: . ack 30660 win 24820 (DF)
(HERE)
:11:50:14.480613 solaris.1022 > freebsd.nfsd: . ack 32120 win 24820 (DF)
:11:50:14.480892 freebsd.nfs > solaris.0: reply ok 780 (DF)
...
:11:50:14.482482 solaris.3454009260 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000010000 (DF)
:11:50:14.482501 solaris.3454009261 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000018000 (DF)
:11:50:14.482520 solaris.3454009262 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000020000 (DF)
:11:50:14.482539 solaris.3454009263 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000028000 (DF)

    This isn't right.  A whole 1/10 second delay.

    I'm going to point out something here.  Note that the
    last packet FreeBSD sends is 780 bytes.  This corresponds
    approximately to the last fragment of an 8K NFS read
    operation.   Note that FreeBSD appears to be waiting
    for an ACK from the solaris box before pushing the
    last packet out.

    This tells me that there is a serious window sizing
    issue here.  Unfortunately the TCP trace is not detailed
    enough to tell for sure; I really need to see the sequence
    numbers for FreeBSD's transmissions as well as Solaris's
    acks.  There are several possibilities:

    (a) The FreeBSD box does not have a large enough send
	buffer.

    (b) The Solaris box does not have a large enough receive
	buffer.

	or

    (c) The buffers are large enough but the solaris box is
	not properly handling RFC1323 (window scaling).

    Please try the following:

    (1) Disable rfc1323 (net.inet.tcp.rfc1323) and change
	TCP's send buffer size to 65535 bytes (NOT 65536),
	aka net.inet.tcp.sendspace=65535.  

	Remember to killall -9 nfsd and restart nfsd on
	the FreeBSD box after making these changes.

    (2) Try specifying larger net.inet.tcp.sendspace's
	with window scaling disabled.  Again remember
	to killall -9 nfsd and restart it.

    (3) If any of the above fixed the problem, try reenabling
	window scaling (and restart nfsd's again).  If the
	problem now occurs again the issue is that Solaris
	is likely not doing window scaling properly.

    Also during the dd, on the FreeBSD side please do a
    'netstat -tn | fgrep tcp | fgrep <the_proper_port>' to
    double check that FreeBSD is filling the TCP connection's
    buffer as you expect for what you've set the buffer size
    to.

    The problem is these 1/10 second delays.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>
       
...
:11:50:14.485255 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.485378 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.485510 freebsd.nfs > solaris.0: reply ok 1460 (DF)
(HERE)
:11:50:14.580651 solaris.1022 > freebsd.nfsd: . ack 65020 win 24820 (DF)
:11:50:14.580910 freebsd.nfs > solaris.0: reply ok 780 (DF)
(HERE)
:11:50:14.680615 solaris.1022 > freebsd.nfsd: . ack 65800 win 24820 (DF)
...
:11:50:14.680974 freebsd.nfs > solaris.3454009260: reply ok 1460 read (DF)
:11:50:14.681089 freebsd.nfs > solaris.0: reply ok 1460 (DF)

     Another 1/10 second delay.  In fact two sets of 1/10 second
     delays.

....
:11:50:14.682917 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.683038 freebsd.nfs > solaris.0: reply ok 1460 (DF)
(HERE)
:11:50:14.780637 solaris.1022 > freebsd.nfsd: . ack 97920 win 24820 (DF)
:11:50:14.780918 freebsd.nfs > solaris.0: reply ok 780 (DF)
(HERE)
:11:50:14.880647 solaris.1022 > freebsd.nfsd: . ack 98700 win 24820 (DF)
:11:50:14.880994 freebsd.nfs > solaris.3454009261: reply ok 1460 read (DF)
:11:50:14.881110 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.881134 solaris.1022 > freebsd.nfsd: . ack 101620 win 24820 (DF)
...

    And again.

...
:11:50:14.883416 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.883543 freebsd.nfs > solaris.0: reply ok 1460 (DF)
(HERE)
:11:50:14.980685 solaris.1022 > freebsd.nfsd: . ack 130820 win 24820 (DF)
:11:50:14.980965 freebsd.nfs > solaris.0: reply ok 780 (DF)
(HERE)
:11:50:15.080663 solaris.1022 > freebsd.nfsd: . ack 131600 win 24820 (DF)
:11:50:15.081021 freebsd.nfs > solaris.3454009262: reply ok 1460 read (DF)
...

    And again.

:11:50:15.083251 solaris.1022 > freebsd.nfsd: . ack 159340 win 24820 (DF)
:11:50:15.083340 freebsd.nfs > solaris.0: reply ok 1460 (DF)
(HERE)
:11:50:15.180662 solaris.1022 > freebsd.nfsd: . ack 163720 win 24820 (DF)
:11:50:15.180945 freebsd.nfs > solaris.0: reply ok 780 (DF)
:11:50:15.182401 solaris.3454009264 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000030000 (DF)
:11:50:15.182422 solaris.3454009265 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000038000 (DF)
:11:50:15.182436 solaris.3454009266 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000040000 (DF)
:11:50:15.182456 solaris.3454009267 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000048000 (DF)

    And again.
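
The cost of these delays can be estimated from the timestamps in the trace itself. The sketch below assumes that the two batches of four read requests (at 11:50:14.482482 and 11:50:15.182401) bracket the transfer of 4 x 32768 bytes, and that each stall runs from the last packet FreeBSD sent to the next ACK from Solaris; both are readings of the partial trace, not measurements. Times are seconds past 11:50:00, and all names are illustrative.

```python
READ_SIZE = 32768
READS_PER_BATCH = 4

# First read request of two consecutive batches in the trace.
BATCH_START = 14.482482
NEXT_BATCH_START = 15.182401

# (last packet sent by FreeBSD, next ACK from Solaris) for each
# stall between the two batches, copied from the trace.
STALLS = [
    (14.485510, 14.580651),
    (14.580910, 14.680615),
    (14.683038, 14.780637),
    (14.780918, 14.880647),
    (14.883543, 14.980685),
    (14.980965, 15.080663),
    (15.083340, 15.180662),
]

window = NEXT_BATCH_START - BATCH_START
stalled = sum(ack - sent for sent, ack in STALLS)
rate = READ_SIZE * READS_PER_BATCH / window  # bytes per second

if __name__ == "__main__":
    print(f"time between batches: {window:.3f} s")
    print(f"time spent stalled:   {stalled:.3f} s ({stalled / window:.0%})")
    print(f"implied read rate:    {rate:,.0f} bytes/s")
```

Under those assumptions, nearly all of the roughly 0.7 seconds between batches is spent waiting for ACKs, and the implied read rate is a little under 190,000 bytes per second.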


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message







