From owner-freebsd-stable@FreeBSD.ORG  Mon Jun 28 03:14:02 2010
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id BE71A106566B
	for <freebsd-stable@freebsd.org>; Mon, 28 Jun 2010 03:14:02 +0000 (UTC)
	(envelope-from rick@svn.kiwi-computer.com)
Received: from svn.kiwi-computer.com (174-20-59-6.mpls.qwest.net [174.20.59.6])
	by mx1.freebsd.org (Postfix) with SMTP id 4E0048FC17
	for <freebsd-stable@freebsd.org>; Mon, 28 Jun 2010 03:14:01 +0000 (UTC)
Received: (qmail 45576 invoked by uid 2000); 28 Jun 2010 03:14:01 -0000
Date: Sun, 27 Jun 2010 22:14:01 -0500
From: "Rick C. Petty" <rick-freebsd2009@kiwi-computer.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Message-ID: <20100628031401.GA45282@kay.kiwi-computer.com>
References: <20100627221607.GA31646@kay.kiwi-computer.com>
	<Pine.GSO.4.63.1006271949220.3233@muncher.cs.uoguelph.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <Pine.GSO.4.63.1006271949220.3233@muncher.cs.uoguelph.ca>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-stable@freebsd.org
Subject: Re: Why is NFSv4 so slow?
Reply-To: rick-freebsd2009@kiwi-computer.com

On Sun, Jun 27, 2010 at 08:04:28PM -0400, Rick Macklem wrote:
> 
> Weird, I don't see that here. The only thing I can think of is that the
> experimental client/server will try to do I/O at the size of MAXBSIZE
> by default, which might be causing a burst of traffic your net interface
> can't keep up with. (This can be turned down to 32K via the
> rsize=32768,wsize=32768 mount options. I found this necessary to avoid
> abysmal performance on some Macs for the Mac OS X port.)

Hmm.  When I mounted the same filesystem with nfs3 from a different client,
everything started working at almost normal speed (still a little slower
though).

Now on that same host I saw a file get corrupted.  On the server, I see
the following:

% hd testfile | tail -4
00677fd0  2a 24 cc 43 03 90 ad e2  9a 4a 01 d9 c4 6a f7 14  |*$.C.....J...j..|
00677fe0  3f ba 01 77 28 4f 0f 58  1a 21 67 c5 73 1e 4f 54  |?..w(O.X.!g.s.OT|
00677ff0  bf 75 59 05 52 54 07 6f  db 62 d6 4a 78 e8 3e 2b  |.uY.RT.o.b.Jx.>+|
00678000

But on the client I see this:

% hd testfile | tail -4
00011ff0  1e af dc 8e d6 73 67 a2  cd 93 fe cb 7e a4 dd 83  |.....sg.....~...|
00012000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00678000

The only thing I could do to fix it was to copy the file on the server,
delete the original file on the client, and move the copied file back.
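
For anyone wanting to check their own files for the same kind of damage,
here is a rough Python sketch that reports block-aligned runs of zero bytes
in a file.  The function name and the 4096-byte block size are my own
choices for illustration, not anything taken from the NFS code.  Run against
the client copy above, it should report the run from 0x12000 to 0x678000;
legitimately sparse or zero-padded files will also show up, so treat a hit
as a hint rather than proof of corruption.

```python
import sys


def find_zero_runs(path, block_size=4096):
    """Return (start, end) byte offsets of runs of all-zero blocks.

    Detection is block-granular: a run is only reported when at least
    one whole block of block_size bytes contains nothing but zeros.
    """
    runs = []
    run_start = None  # offset where the current zero run began, if any
    offset = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            if not any(block):  # every byte in this block is zero
                if run_start is None:
                    run_start = offset
            elif run_start is not None:
                runs.append((run_start, offset))
                run_start = None
            offset += len(block)
    if run_start is not None:  # file ended inside a zero run
        runs.append((run_start, offset))
    return runs


if __name__ == "__main__" and len(sys.argv) > 1:
    for start, end in find_zero_runs(sys.argv[1]):
        print(f"{start:08x}-{end:08x} zero-filled ({end - start} bytes)")
```

Running it on both the server and the client copy and comparing the output
gives a quick view of where the two diverge.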

Not only is it affecting random file reads, but it has also started breaking
src and ports builds in random places.  In one situation, portmaster failed
because of a port checksum.  It then tried to refetch and failed with the
same checksum problem.  I manually deleted the file, tried again and it
built just fine.  The ports tree and distfiles are nfs4 mounted.
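
A simple way to confirm this sort of mismatch is to hash the same file on
the server and on the client and compare the results.  Below is a minimal
sketch using Python's hashlib; the helper name sha256_of and the 1 MB chunk
size are just assumptions for the example.

```python
import hashlib
import sys


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__" and len(sys.argv) > 1:
    for path in sys.argv[1:]:
        print(sha256_of(path), path)
```

If the digest printed on the client differs from the one printed on the
server for the same distfile, the data the client is reading is not what
the server has on disk.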

> The other thing that can really slow it down is if the uid<->login-name
> (and/or gid<->group-name) is messed up, but this would normally only
> show up for things like "ls -l". (Beware having multiple password database
> entries for the same uid, such as "root" and "toor".)

I use the same UIDs/GIDs on all my boxes, so that can't be it.  But thanks
for the idea.

> I don't recommend the use of "intr or soft" for NFSv4 mounts, but they
> wouldn't affect performance for trivial tests. You might want to try:
> "nfsv4,rsize=32768,wsize=32768" and see how that works.

I'm trying that right now (with rdirplus also) on one host.  If I start to
see the delays again, I'll compare between hosts.

> When you did the nfs3 mount did you specify "newnfs" or "nfs" for the
> file system type? (I'm wondering if you still saw the problem with the
> regular "nfs" client against the server? Others have had good luck using
> the server for NFSv3 mounts.)

I used "nfs" for FStype.  So I should be using "newnfs"?  This wasn't very
clear in the man pages.  In fact "newnfs" wasn't mentioned in
"man mount_newnfs".

> When I see abysmal NFS perf. it is usually an issue with the underlying
> transport. Looking at things like "netstat -i" or "netstat -s" might
> give you a hint?

I suspected it might be transport-related.  I didn't see anything out of
the ordinary from netstat, but then again I don't know what's "ordinary"
with NFS.  =)

~~

One other thing I noticed (unrelated to the delays or corruption), and I'm
not sure whether it's a bug or expected behavior.  I have the following
filesystems on the server:

/vol/a
/vol/a/b
/vol/a/c

I export all three volumes and set my NFSv4 root to "/".  On the client,
when I "mount ... server:vol /vol", the "b" and "c" directories show up,
but "ls /vol/a/b /vol/a/c" shows them as empty.  In dmesg I see:

	kernel: nfsv4 client/server protocol prob err=10020

After unmounting /vol, I discovered that my client already had /vol/a/b and
/vol/a/c directories (because pre-NFSv4, I had to mount each filesystem
separately).  Once I removed those empty dirs and remounted, the problem
went away.  But it did drive me crazy for a few hours.
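
To save someone else those hours, here is a small sketch that lists whatever
already exists locally beneath a mount point before anything is mounted
there.  The function name leftover_entries is made up for this example, and
it only reports what it finds; deciding what to remove is left to you.

```python
import os
import sys


def leftover_entries(mount_point):
    """List local files and directories found beneath mount_point.

    Refuses to run if something is already mounted there, since the
    listing would then show the mounted filesystem, not the local one.
    """
    if os.path.ismount(mount_point):
        raise RuntimeError(f"{mount_point} is a mount point; unmount it first")
    found = []
    for root, dirs, files in os.walk(mount_point):
        # Do not descend into (or report) anything mounted further down.
        dirs[:] = [d for d in dirs
                   if not os.path.ismount(os.path.join(root, d))]
        for name in dirs + files:
            found.append(os.path.join(root, name))
    return sorted(found)


if __name__ == "__main__" and len(sys.argv) > 1:
    for entry in leftover_entries(sys.argv[1]):
        print(entry)
```

On my client, running this against the unmounted /vol would have listed the
stale /vol/a/b and /vol/a/c directories.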

-- Rick C. Petty