From: Rick Macklem
Date: Sun, 29 Aug 2010 11:44:06 -0400 (EDT)
To: rick-freebsd2009@kiwi-computer.com
Cc: freebsd-stable@freebsd.org
Subject: Re: Why is NFSv4 so slow?
Message-ID: <2002105637.244211.1283096646412.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20100829032252.GA81736@rix.kiwi-computer.com>

> Hi. I'm still having problems with NFSv4 being very laggy on one client.
> When the NFSv4 server is at 50% idle CPU and the disks are < 1% busy, I am
> getting horrible throughput on an idle client. Using dd(1) with 1 MB block
> size, when I try to read a > 100 MB file from the client, I'm getting
> around 300-500 KiB/s. On another client, I see upwards of 20 MiB/s with
> the same test (on a different file). On the broken client:
>

Since the other client(s) are working well, that suggests a network-related
problem and not a bug in the NFS code.

First off, the obvious question: How does this client differ from the one
that performs much better? Do they both use the same "re" network interface
for the NFS traffic? (If the answer is "no", I'd be suspicious that the "re"
hardware or device driver is the culprit.)

Things that I might try in an effort to isolate the problem:
- Switch the NFS traffic to use the nfe0 net interface.
- Put a net interface identical to the one on the client that works well
  in the machine and use that for the NFS traffic.
- Turn off TXCSUM and RXCSUM on re0.
- Reduce the read/write data size, using rsize=N,wsize=N on the mount.
  (It will default to MAXBSIZE, and some net interfaces don't handle large
  bursts of received data well. If you drop it to rsize=8192,wsize=8192
  and things improve, then increase N until it screws up.)
- Check the port configuration on the switch end, to make sure it is also
  1000Mbps full duplex.
- Move the client to a different net port on the switch or even a
  different switch (and change the cable, while you're at it).
- Look at "netstat -s" and see if there are a lot of retransmits going on
  in TCP.

If none of the above seems to help, you could look at a packet trace and
see what is going on. Look for TCP reconnects (SYN, SYN-ACK...) or places
where there is a large time delay/retransmit of a TCP segment.

Hopefully others who are more familiar with the networking side can
suggest other things to try,
rick
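
As an illustration of the "netstat -s" check mentioned above, here is a minimal Python sketch that estimates what fraction of TCP data packets were retransmitted. It assumes the FreeBSD-style counter lines "N data packets (M bytes)" and "N data packets (M bytes) retransmitted"; the function name retransmit_ratio and the SAMPLE text are made up for illustration and are not output from the machines in this thread.

```python
import re

# Assumed FreeBSD-style "netstat -s" TCP counter lines (singular or plural).
DATA_RE = re.compile(r"^\s*(\d+) data packets? \((\d+) bytes?\)\s*$")
RETX_RE = re.compile(
    r"^\s*(\d+) data packets? \((\d+) bytes?\) retransmitted\s*$"
)

# Illustrative numbers only, not real output from the thread.
SAMPLE = (
    "tcp:\n"
    "\t200000 packets sent\n"
    "\t\t100000 data packets (120000000 bytes)\n"
    "\t\t2500 data packets (3000000 bytes) retransmitted\n"
)


def retransmit_ratio(netstat_text):
    """Return retransmitted/sent data packets, or None if not found."""
    sent = None
    retx = None
    for line in netstat_text.splitlines():
        m = RETX_RE.match(line)
        if m:
            # Keep only the first matching counter (the TCP section).
            if retx is None:
                retx = int(m.group(1))
            continue
        m = DATA_RE.match(line)
        if m and sent is None:
            sent = int(m.group(1))
    if not sent or retx is None:
        # Counters missing, or no data packets sent yet.
        return None
    return retx / sent


if __name__ == "__main__":
    ratio = retransmit_ratio(SAMPLE)
    print("retransmit ratio: %.2f%%" % (ratio * 100))
```

The counters are cumulative since boot, so it is more telling to capture the text before and after the dd(1) test and compare the two snapshots than to read a single ratio.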