From: "Jaime Bozza"
Cc: "'Matthew Dillon'", "'Olaf R'"
Subject: RE: Abominable NFSv3 read performance / FreeBSD server / Solaris client
Date: Wed, 24 Jul 2002 11:09:11 -0500

Just to add my input, I've had the same problem with NFSv3 between
Solaris and FreeBSD. Disabling rfc1323 doesn't seem to change anything
on my end.

Here's a scenario:

FreeBSD 4.6-STABLE server, default nfsd settings, tcp.sendspace=32768
Solaris 8 client (SparcStation 20), default settings, default options
for the mount.
bigfile is 64953143 bytes
(Solaris 8 with NFSv3 defaults to a 32768 read buffer size, rsize=32768)

(on Solaris system)
mount server:/export /mnt/server
time dd if=/mnt/server/bigfile of=/dev/null bs=64k
returns:
991+1 records in
991+1 records out
0.05u 1.68s 2:29.76 1.1%

mount -o rsize=16384 server:/export /mnt/server
time dd if=/mnt/server/bigfile of=/dev/null bs=64k
returns:
991+1 records in
991+1 records out
0.03u 3.11s 0:19.61 16.0%

rsize=8192  (0.00u 2.50s 0:33.37 7.4%)
rsize=12288 (0.07u 3.00s 0:28.95 10.6%)
rsize=20480 (0.05u 3.58s 0:17.72 20.4%)
rsize=24576 (0.06u 3.24s 0:26.11 12.6%)
rsize=22528 (0.04u 3.56s 0:21.03 17.1%)

So, in my particular situation, 20480 (20K) seems to be the best buffer
size. Increasing tcp.sendspace on the server did not seem to change the
times all that much. (The default 32768 read buffer would have the same
780 bytes in the Send-Q you talked about below.)

FreeBSD to FreeBSD transfers the file fine, Solaris to Solaris transfers
the file fine.

I've been using a forced rsize of 16384 for quite some time and it's
helped a lot. (Both with FreeBSD client to a Solaris server and Solaris
client to a FreeBSD server.)

If you want me to run additional tests or outputs, I'd be happy to.

Jaime Bozza

-----Original Message-----
From: owner-freebsd-stable@FreeBSD.ORG [mailto:owner-freebsd-stable@FreeBSD.ORG] On Behalf Of Matthew Dillon
Sent: Tuesday, July 23, 2002 11:57 PM
To: Olaf R
Cc: stable@FreeBSD.ORG
Subject: Re: Abominable NFSv3 read performance / FreeBSD server / Solaris client

:I'm experiencing terrible NFSv3 read performance between Solaris 8
:clients and a FreeBSD 4.6-STABLE server, which ran CVSup/make world
:a day ago and thus includes the 'em' driver update (though that made
:no difference).
:
:Other FreeBSD clients and Mac G3 boxes running MacOS X 10.1.x can do
:reads at fairly respectable rate - several MB/s over 100 Mbps ethernet.

Hi Olaf.
:And now a (partial) tcpdump trace of the 'dd if=foo of=/dev/null bs=64k'
:from the Sun's perspective.
:...
:11:50:14.388682 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.388769 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.388786 solaris.1022 > freebsd.nfsd: . ack 30660 win 24820 (DF)
    (HERE)
:11:50:14.480613 solaris.1022 > freebsd.nfsd: . ack 32120 win 24820 (DF)
:11:50:14.480892 freebsd.nfs > solaris.0: reply ok 780 (DF)
...
:11:50:14.482482 solaris.3454009260 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000010000 (DF)
:11:50:14.482501 solaris.3454009261 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000018000 (DF)
:11:50:14.482520 solaris.3454009262 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000020000 (DF)
:11:50:14.482539 solaris.3454009263 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000028000 (DF)

This isn't right. A whole 1/10 second delay.

I'm going to point out something here. Note that the last packet
FreeBSD sends is 780 bytes. This corresponds approximately to the
last fragment of an 8K NFS read operation. Note that FreeBSD appears
to be waiting for an ACK from the solaris box before pushing the last
packet out.

This tells me that there is a serious window sizing issue here.
Unfortunately the TCP trace is not detailed enough to tell for sure;
I really need to see the sequence numbers for FreeBSD's transmissions
as well as Solaris's acks, but there are several possibilities:

(a) The FreeBSD box does not have a large enough send buffer.

(b) The Solaris box does not have a large enough receive buffer.

or

(c) The buffers are large enough but the solaris box is not
    properly handling RFC1323 (window scaling).

Please try the following:

(1) Disable rfc1323 (net.inet.tcp.rfc1323) and change TCP's send
    buffer size to 65535 bytes (NOT 65536), aka
    net.inet.tcp.sendspace=65535. Remember to killall -9 nfsd and
    restart nfsd on the FreeBSD box after making these changes.
(2) Try specifying larger net.inet.tcp.sendspace's with window
    scaling disabled. Again remember to killall -9 nfsd and
    restart it.

(3) If any of the above fixed the problem, try reenabling window
    scaling (and restart nfsd's again). If the problem now occurs
    again the issue is that Solaris is likely not doing window
    scaling properly.

Also during the dd, on the FreeBSD side please do a
'netstat -tn | fgrep tcp | fgrep ' to double check that FreeBSD is
filling the TCP connection's buffer as you expect for what you've set
the buffer size to. The problem is these 1/10 second delays.

-Matt
Matthew Dillon

...
:11:50:14.485255 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.485378 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.485510 freebsd.nfs > solaris.0: reply ok 1460 (DF)
    (HERE)
:11:50:14.580651 solaris.1022 > freebsd.nfsd: . ack 65020 win 24820 (DF)
:11:50:14.580910 freebsd.nfs > solaris.0: reply ok 780 (DF)
    (HERE)
:11:50:14.680615 solaris.1022 > freebsd.nfsd: . ack 65800 win 24820 (DF)
...
:11:50:14.680974 freebsd.nfs > solaris.3454009260: reply ok 1460 read (DF)
:11:50:14.681089 freebsd.nfs > solaris.0: reply ok 1460 (DF)

Another 1/10 second delay. In fact two sets of 1/10 second delays.

....
:11:50:14.682917 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.683038 freebsd.nfs > solaris.0: reply ok 1460 (DF)
    (HERE)
:11:50:14.780637 solaris.1022 > freebsd.nfsd: . ack 97920 win 24820 (DF)
:11:50:14.780918 freebsd.nfs > solaris.0: reply ok 780 (DF)
    (HERE)
:11:50:14.880647 solaris.1022 > freebsd.nfsd: . ack 98700 win 24820 (DF)
:11:50:14.880994 freebsd.nfs > solaris.3454009261: reply ok 1460 read (DF)
:11:50:14.881110 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.881134 solaris.1022 > freebsd.nfsd: . ack 101620 win 24820 (DF)
...

And again.

...
:11:50:14.883416 freebsd.nfs > solaris.0: reply ok 1460 (DF)
:11:50:14.883543 freebsd.nfs > solaris.0: reply ok 1460 (DF)
    (HERE)
:11:50:14.980685 solaris.1022 > freebsd.nfsd: . ack 130820 win 24820 (DF)
:11:50:14.980965 freebsd.nfs > solaris.0: reply ok 780 (DF)
    (HERE)
:11:50:15.080663 solaris.1022 > freebsd.nfsd: . ack 131600 win 24820 (DF)
:11:50:15.081021 freebsd.nfs > solaris.3454009262: reply ok 1460 read (DF)
...

And again.

:11:50:15.083251 solaris.1022 > freebsd.nfsd: . ack 159340 win 24820 (DF)
:11:50:15.083340 freebsd.nfs > solaris.0: reply ok 1460 (DF)
    (HERE)
:11:50:15.180662 solaris.1022 > freebsd.nfsd: . ack 163720 win 24820 (DF)
:11:50:15.180945 freebsd.nfs > solaris.0: reply ok 780 (DF)
:11:50:15.182401 solaris.3454009264 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000030000 (DF)
:11:50:15.182422 solaris.3454009265 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000038000 (DF)
:11:50:15.182436 solaris.3454009266 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000040000 (DF)
:11:50:15.182456 solaris.3454009267 > freebsd.nfs: 180 read fh 979,451513/13278828 32768 bytes @ 0x000048000 (DF)

And again.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message
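
The dd timings and the ack numbers quoted in this message can be turned into throughput figures and segment counts with a few lines of Python. This is only a rough sketch: the function and variable names are made up for illustration, the 1460-byte segment size is read off the tcpdump trace, and treating the 132 bytes beyond the 32768-byte read as RPC/NFS reply overhead is an assumption, not something stated in the thread.

```python
# Figures copied from the message; all names here are illustrative only.
FILE_BYTES = 64953143   # size of bigfile
SEGMENT_BYTES = 1460    # full-sized TCP segments seen in the trace
READ_BYTES = 32768      # Solaris default rsize

# rsize -> elapsed wall-clock time as printed by `time dd ...`
ELAPSED = {
    8192: "0:33.37",
    12288: "0:28.95",
    16384: "0:19.61",
    20480: "0:17.72",
    22528: "0:21.03",
    24576: "0:26.11",
    32768: "2:29.76",
}

# Acks that follow two consecutive 780-byte segments in the trace.
# Their difference is the number of stream bytes used by one read reply.
REPLY_BYTES = 98700 - 65800


def parse_elapsed(text):
    """Convert an 'M:SS.ss' elapsed time into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def throughput_table():
    """Return (rsize, seconds, bytes_per_second) rows sorted by rsize."""
    rows = []
    for rsize in sorted(ELAPSED):
        secs = parse_elapsed(ELAPSED[rsize])
        rows.append((rsize, secs, FILE_BYTES / secs))
    return rows


def reply_segments(reply_bytes, segment_bytes):
    """Split one reply into (full segments, bytes in the short last segment)."""
    return divmod(reply_bytes, segment_bytes)


for rsize, secs, rate in throughput_table():
    print(f"rsize={rsize:6d}  {secs:7.2f} s  {rate / 1e6:5.2f} MB/s")

full, tail = reply_segments(REPLY_BYTES, SEGMENT_BYTES)
print(f"one reply = {REPLY_BYTES} bytes = {full} x {SEGMENT_BYTES} + {tail}")
# Assumed to be RPC/NFS header overhead on top of the read data.
print(f"bytes beyond the read data: {REPLY_BYTES - READ_BYTES}")
```

On these numbers rsize=20480 comes out fastest at roughly 3.7 MB/s, against roughly 0.43 MB/s for the default rsize=32768. Each 32768-byte read reply occupies 22 full segments plus one 780-byte segment, which matches the short packet Matt points out in the trace.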