Date:      Tue, 23 Jul 2002 21:57:23 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Olaf R <olaf@keghouse.net>
Cc:        stable@FreeBSD.ORG
Subject:   Re: Abominable NFSv3 read performance / FreeBSD server / Solaris client
Message-ID:  <200207240457.g6O4vNd1025796@apollo.backplane.com>
References:   <200207240358.g6O3w90G006348@bronkowitz.keghouse.net>


:I'm experiencing terrible NFSv3 read performance between Solaris 8
:clients and a FreeBSD 4.6-STABLE server, which ran CVSup/make world
:a day ago and thus includes the 'em' driver update (though that made
:no difference).
:
:Other FreeBSD clients and Mac G3 boxes running MacOS X 10.1.x can do
:reads at fairly respectable rate - several MB/s over 100 Mbps ethernet.

    Hi Olaf.

:And now a (partial) tcpdump trace of the 'dd if=foo of=/dev/null bs=64k'
:from the Sun's perspective.
:...
:11:50:14.388682 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.388769 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.388786 solaris.1022 > freebsd.nfsd: . ack 30660 win 24820 (DF)
(HERE)
:11:50:14.480613 solaris.1022 > freebsd.nfsd: . ack 32120 win 24820 (DF)
:11:50:14.480892 freebsd.nfs > solaris.0: reply ok 780 (DF)
...
:11:50:14.482482 solaris.3454009260 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000010000 (DF)
:11:50:14.482501 solaris.3454009261 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000018000 (DF)
:11:50:14.482520 solaris.3454009262 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000020000 (DF)
:11:50:14.482539 solaris.3454009263 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000028000 (DF)

    This isn't right.  A whole 1/10 second delay.
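
    The size of the stall can be read straight off the timestamps.
    A minimal sketch, assuming the two ACK lines on either side of
    the (HERE) marker are pasted in with the leading ':' removed;
    the helper names stamp() and gaps() are made up for this
    illustration. It reports a gap of roughly 92 ms.

```python
from datetime import datetime

# Two consecutive lines copied from the quoted trace (leading ':' removed).
TRACE = """\
11:50:14.388786 solaris.1022 > freebsd.nfsd: . ack 30660 win 24820 (DF)
11:50:14.480613 solaris.1022 > freebsd.nfsd: . ack 32120 win 24820 (DF)
"""


def stamp(line):
    """Parse the leading HH:MM:SS.ffffff timestamp of a tcpdump line."""
    return datetime.strptime(line.split()[0], "%H:%M:%S.%f")


def gaps(trace, threshold=0.05):
    """Return (seconds, line) for each line that arrives more than
    `threshold` seconds after the line before it."""
    lines = [l for l in trace.splitlines() if l.strip()]
    found = []
    for prev, cur in zip(lines, lines[1:]):
        delta = (stamp(cur) - stamp(prev)).total_seconds()
        if delta > threshold:
            found.append((delta, cur))
    return found


for delta, line in gaps(TRACE):
    print(f"{delta * 1000:.1f} ms stall before: {line}")
```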

    I'm going to point out something here.  Note that the
    last packet FreeBSD sends is 780 bytes.  This corresponds
    approximately to the last segment of a 32K NFS read
    reply (the read requests in the trace are 32768 bytes
    each).  Note that FreeBSD appears to be waiting
    for an ACK from the Solaris box before pushing the
    last packet out.

    This tells me that there is a serious window sizing
    issue here.  Unfortunately the TCP trace is not detailed
    enough to tell for sure; I really need to see the sequence
    numbers for FreeBSD's transmissions as well as Solaris's
    ACKs.  There are several possibilities:

    (a) The FreeBSD box does not have a large enough send
	buffer.

    (b) The Solaris box does not have a large enough receive
	buffer.

	or

    (c) The buffers are large enough but the Solaris box is
	not properly handling RFC1323 (window scaling).
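
    To make the window arithmetic behind these possibilities
    concrete, here is a small sketch.  It assumes the 'win 24820'
    printed in the trace is the raw, unscaled window field, which
    the trace alone cannot confirm; effective_window() is a
    hypothetical helper, not a system call.  Without RFC 1323
    scaling the 16-bit field tops out at 65535 bytes, and an
    unscaled 24820-byte window holds exactly 17 segments of 1460
    bytes, less than one 32768-byte read reply.

```python
MSS = 1460           # payload size of the full segments in the trace
ADVERTISED = 24820   # "win 24820" in the Solaris ACKs
READ_SIZE = 32768    # bytes requested per NFS read in the trace


def effective_window(field, shift=0):
    """Window in bytes for a 16-bit window field and an RFC 1323 shift."""
    if not 0 <= field <= 0xFFFF:
        raise ValueError("the TCP window field is 16 bits wide")
    if not 0 <= shift <= 14:
        raise ValueError("RFC 1323 limits the scale shift to 14")
    return field << shift


max_unscaled = effective_window(0xFFFF)   # largest window without scaling
window = effective_window(ADVERTISED)     # assumption: no scaling in effect

print("largest unscaled window:", max_unscaled)
print("segments per advertised window:", window // MSS)
print("one read reply fits in the window:", window >= READ_SIZE)
```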

    Please try the following:

    (1) Disable RFC 1323 window scaling (set
	net.inet.tcp.rfc1323=0) and change TCP's send buffer
	size to 65535 bytes (NOT 65536), i.e.
	net.inet.tcp.sendspace=65535.

	Remember to killall -9 nfsd and restart nfsd on
	the FreeBSD box after making these changes.

    (2) Try specifying larger net.inet.tcp.sendspace's
	with window scaling disabled.  Again remember
	to killall -9 nfsd and restart it.

    (3) If any of the above fixed the problem, try reenabling
	window scaling (and restart the nfsd's again).  If the
	problem then comes back, Solaris is likely not doing
	window scaling properly.

    Also during the dd, on the FreeBSD side please do a
    'netstat -tn | fgrep tcp | fgrep <the_proper_port>' to
    double check that FreeBSD is filling the TCP connection's
    buffer as you expect for what you've set the buffer size
    to.
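
    If you capture that netstat output to a file, the Send-Q column
    can be compared against the configured sendspace with something
    like the sketch below.  The sample output is invented for
    illustration (addresses and queue sizes are not from this
    thread), the column layout is assumed to be the usual BSD one,
    and 2049 is the standard NFS port.

```python
# Hypothetical netstat output; addresses and queue sizes are made up.
SAMPLE = """\
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp4       0  32120  10.0.0.1.2049          10.0.0.2.1022          ESTABLISHED
tcp4       0      0  10.0.0.1.22            10.0.0.3.51000         ESTABLISHED
"""


def send_queues(output, port):
    """Return (foreign_address, send_q) for TCP sockets on local `port`."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 5 or not fields[0].startswith("tcp"):
            continue  # skip the header and non-TCP lines
        local, foreign = fields[3], fields[4]
        # BSD netstat -n prints the port after the last dot of the address.
        if local.rsplit(".", 1)[-1] == str(port):
            rows.append((foreign, int(fields[2])))
    return rows


def fill_ratio(send_q, sendspace):
    """Fraction of the send buffer currently occupied."""
    return send_q / sendspace


for peer, queued in send_queues(SAMPLE, 2049):
    print(peer, queued, f"{fill_ratio(queued, 65535):.0%} of sendspace")
```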

    The problem is these 1/10 second delays.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>
       
...
:11:50:14.485255 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.485378 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.485510 freebsd.nfs > solaris.0: reply ok 1460 (DF)
(HERE)
:11:50:14.580651 solaris.1022 > freebsd.nfsd: . ack 65020 win 24820 (DF)
:11:50:14.580910 freebsd.nfs > solaris.0: reply ok 780 (DF)
(HERE)
:11:50:14.680615 solaris.1022 > freebsd.nfsd: . ack 65800 win 24820 (DF)
...
:11:50:14.680974 freebsd.nfs > solaris.3454009260: reply ok 1460 read (DF)
:11:50:14.681089 freebsd.nfs > solaris.0: reply ok 1460 (DF)

     Another 1/10 second delay.  In fact two sets of 1/10 second
     delays.

....
:11:50:14.682917 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.683038 freebsd.nfs > solaris.0: reply ok 1460 (DF)
(HERE)
:11:50:14.780637 solaris.1022 > freebsd.nfsd: . ack 97920 win 24820 (DF)
:11:50:14.780918 freebsd.nfs > solaris.0: reply ok 780 (DF)
(HERE)
:11:50:14.880647 solaris.1022 > freebsd.nfsd: . ack 98700 win 24820 (DF)
:11:50:14.880994 freebsd.nfs > solaris.3454009261: reply ok 1460 read (DF)
:11:50:14.881110 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.881134 solaris.1022 > freebsd.nfsd: . ack 101620 win 24820 (DF)
...

    And again.

...
:11:50:14.883416 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.883543 freebsd.nfs > solaris.0: reply ok 1460 (DF)
(HERE)
:11:50:14.980685 solaris.1022 > freebsd.nfsd: . ack 130820 win 24820 (DF)
:11:50:14.980965 freebsd.nfs > solaris.0: reply ok 780 (DF)
(HERE)
:11:50:15.080663 solaris.1022 > freebsd.nfsd: . ack 131600 win 24820 (DF)
:11:50:15.081021 freebsd.nfs > solaris.3454009262: reply ok 1460 read (DF)
...

    And again.

:11:50:15.083251 solaris.1022 > freebsd.nfsd: . ack 159340 win 24820 (DF)
:11:50:15.083340 freebsd.nfs > solaris.0: reply ok 1460 (DF)
(HERE)
:11:50:15.180662 solaris.1022 > freebsd.nfsd: . ack 163720 win 24820 (DF)
:11:50:15.180945 freebsd.nfs > solaris.0: reply ok 780 (DF)
:11:50:15.182401 solaris.3454009264 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000030000 (DF)
:11:50:15.182422 solaris.3454009265 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000038000 (DF)
:11:50:15.182436 solaris.3454009266 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000040000 (DF)
:11:50:15.182456 solaris.3454009267 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000048000 (DF)

    And again.
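
    As a rough cross-check of what these stalls cost, the sketch
    below takes the four Solaris ACKs that end the first stall of
    each cycle in the trace above and divides bytes acknowledged by
    elapsed time.  Each cycle advances the ACK by 32900 bytes in
    about 200 ms, which works out to roughly 164 KB/s.  The ACKS
    list and the throughput() helper are illustration only.

```python
# (seconds past 11:50:00, cumulative ACK number) from the quoted trace.
ACKS = [
    (14.580651, 65020),
    (14.780637, 97920),
    (14.980685, 130820),
    (15.180662, 163720),
]


def throughput(samples):
    """Average bytes per second between the first and last ACK sample."""
    (t0, a0), (t1, a1) = samples[0], samples[-1]
    return (a1 - a0) / (t1 - t0)


# Bytes acknowledged in each ~200 ms cycle.
per_cycle = [b[1] - a[1] for a, b in zip(ACKS, ACKS[1:])]
rate = throughput(ACKS)

print("bytes per cycle:", per_cycle)
print(f"average rate: {rate / 1000:.1f} KB/s")
```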





