From owner-freebsd-stable@FreeBSD.ORG  Sun Aug 29 15:44:08 2010
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 207DC1065670
	for <freebsd-stable@freebsd.org>; Sun, 29 Aug 2010 15:44:08 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca
	[131.104.91.36])
	by mx1.freebsd.org (Postfix) with ESMTP id D4E358FC21
	for <freebsd-stable@freebsd.org>; Sun, 29 Aug 2010 15:44:07 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Ah4FAKccekyDaFvO/2dsb2JhbACDFpAOjhenfJB5gSKBU4FPcwSKCQ
X-IronPort-AV: E=Sophos;i="4.56,287,1280721600"; d="scan'208";a="90068270"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-annu-pri.mail.uoguelph.ca with ESMTP; 29 Aug 2010 11:44:03 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 743FEB3E95;
	Sun, 29 Aug 2010 11:44:06 -0400 (EDT)
Date: Sun, 29 Aug 2010 11:44:06 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: rick-freebsd2009@kiwi-computer.com
Message-ID: <2002105637.244211.1283096646412.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20100829032252.GA81736@rix.kiwi-computer.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [24.65.230.102]
X-Mailer: Zimbra 6.0.7_GA_2476.RHEL4 (ZimbraWebClient - SAF3
	(Mac)/6.0.7_GA_2473.RHEL4_64)
Cc: freebsd-stable@freebsd.org
Subject: Re: Why is NFSv4 so slow?
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 29 Aug 2010 15:44:08 -0000

> Hi. I'm still having problems with NFSv4 being very laggy on one client.
> When the NFSv4 server is at 50% idle CPU and the disks are < 1% busy, I am
> getting horrible throughput on an idle client. Using dd(1) with 1 MB block
> size, when I try to read a > 100 MB file from the client, I'm getting
> around 300-500 KiB/s. On another client, I see upwards of 20 MiB/s with
> the same test (on a different file). On the broken client:
> 

Since other client(s) are working well, that seems to suggest that it
is a network related problem and not a bug in the NFS code.

First off, the obvious question: How does this client differ from the
one that performs much better?
Do they both use the same "re" network interface for the NFS traffic?
(If the answer is "no", I'd be suspicious that the "re" hardware or
device driver is the culprit.)

Things that I might try in an effort to isolate the problem:
- switch the NFS traffic to use the nfe0 net interface.
- put a net interface identical to the one on the client that
  works well in the machine and use that for the NFS traffic.
- turn off TXCSUM and RXCSUM on re0
- reduce the read/write data size, using rsize=N,wsize=N on the
  mount. (It will default to MAXBSIZE and some net interfaces don't
  handle large bursts of received data well. If you drop it to
  rsize=8192,wsize=8192 and things improve, then increase N until it
  screws up.)
- check the port configuration on the switch end, to make sure it
  is also 1000Mbps full duplex.
- move the client to a different net port on the switch or even a
  different switch (and change the cable, while you're at it).
- look at "netstat -s" and see if there are a lot of retransmits
  going on in TCP.
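
A rough sketch of how a few of the items above might be scripted. The
function names are made up for illustration, and the awk patterns assume
the usual FreeBSD "netstat -s -p tcp" wording ("N data packets (M bytes)"
and "... retransmitted"), so adjust them if your output looks different.

```shell
#!/bin/sh
# Sketch only. Function names are hypothetical helpers, not existing tools.

# Turn off TX/RX checksum offload on an interface (run as root).
# $1 = interface name, e.g. re0
disable_csum_offload() {
    ifconfig "$1" -txcsum -rxcsum
}

# Print the rsize/wsize mount options for a given I/O size in bytes.
# Example (the fs type name may differ by release, so treat it as an
# assumption):
#   mount -t nfs -o "nfsv4,$(nfs_size_opts 8192)" server:/export /mnt
nfs_size_opts() {
    echo "rsize=$1,wsize=$1"
}

# Read "netstat -s -p tcp" output on stdin and print retransmitted data
# packets as a percentage of data packets sent.
retrans_pct() {
    awk '
        /data packets? \([0-9]+ bytes?\) retransmitted/ { retx = $1; next }
        /data packets? \([0-9]+ bytes?\)[ \t]*$/        { sent = $1 }
        END {
            if (sent > 0) printf "%.2f\n", 100 * retx / sent
            else print "n/a"
        }
    '
}
```

Something like "netstat -s -p tcp | retrans_pct" before and after a dd(1)
run gives a quick feel for whether retransmits climb during the transfer.
The counters are cumulative since boot, so it is the change that matters.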

If none of the above seems to help, you could look at a packet trace
and see what is going on. Look for TCP reconnects (SYN, SYN-ACK...)
or places where there is a large time delay/retransmit of a TCP
segment.
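
For the packet trace, something along these lines might do as a starting
point. It assumes tcpdump(1) is available on the client and that the NFS
traffic uses TCP port 2049. The interface, server and file names are
placeholders, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Function names and arguments are hypothetical.

# pcap filter for NFS over TCP, and the same restricted to segments
# with the SYN flag set (connection attempts).
NFS_FILTER='tcp port 2049'
SYN_FILTER='tcp port 2049 and tcp[tcpflags] & tcp-syn != 0'

# Capture full packets to a file while the slow dd(1) read is running.
# $1 = interface (e.g. re0), $2 = NFS server host, $3 = output file
capture_nfs() {
    tcpdump -i "$1" -s 0 -w "$3" "host $2 and $NFS_FILTER"
}

# Count SYN segments in a saved capture. More than a handful during a
# single transfer would suggest the TCP connection is being re-established.
# $1 = capture file
count_syns() {
    tcpdump -n -r "$1" "$SYN_FILTER" 2>/dev/null | wc -l
}
```

Run capture_nfs in one terminal, repeat the read in another, stop the
capture with ^C and then run count_syns on the file. Loading the same file
into wireshark makes the large delays and retransmits easier to spot.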

Hopefully others who are more familiar with the networking side
can suggest other things to try, rick