Date:      Thu, 01 Apr 1999 23:06:55 -0500
From:      "David E. Cross" <crossd@cs.rpi.edu>
To:        "Kenneth D. Merry" <ken@plutotech.com>
Cc:        crossd@cs.rpi.edu (David E. Cross), dillon@apollo.backplane.com, freebsd-hackers@FreeBSD.ORG, schimken@cs.rpi.edu, crossd@cs.rpi.edu
Subject:   Re: More death to nfsiod (workarround) 
Message-ID:  <199904020406.XAA13920@cs.rpi.edu>
In-Reply-To: Message from "Kenneth D. Merry" <ken@plutotech.com>  of "Thu, 01 Apr 1999 15:17:41 MST." <199904012217.PAA04791@panzer.plutotech.com> 

> > >      AMD is a rather complex piece of software.  It's creating a situation
> > >      that the kernel isn't happy with but I really don't have time to delve
> > >      into it ( anyone else care to take a shot at it? ) on top of everything
> > >      else I'm doing.  If there is any way you can avoid using AMD, I would
> > >      avoid using AMD.
> > Late yesterday I was able to determine how amd was mounting the partitions, 
> > and I was able to replicate it with a hand-mounted filesystem.  I was in the
> > process of digging through NFS packets between 2 hosts when I made the
> > observation "Hey, this isn't UDP".  I then hand mounted a filesystem with
> > "mount_nfs -2T -r 8192 -w 8192 server:/path /mnt", ran my test, and it failed :)
> > 
> > Since then I have updated AMD to use vers3/UDP for mounts, and guess what, so
> > far the problem has not come back.  I have run the test 4 times now, not a
> > single failure, and it is *FAST*.  To tickle this you need to have a relatively
> > fast connect between the NFS client and server (switched half duplex 10M
> > segment was enough to do it, although the primary machine that was having the
> > problem was dedicated 100M full-duplex).  I am able to reproduce this with
> > relative ease here.
> > 
> > (mount_nfs -3T -r 8192 -w 8192 ... also seemed to work)
> 
> Yeah, I upgraded a number of machines (30+) to -stable about 10 days ago,
> and was irritated to discover that AMD defaulted to TCP mounts.
> 
> I put proto=udp in the amd configuration files, and things got back to
> normal.  I had some hang problems with TCP mounts under amd, but that may
> have also been because of a routing problem I fixed at the same time that
> I changed from TCP mounts to UDP.
> 
> And, I'll have to say that NFS is much better than it has been in the past.

We had hang problems as well, and I can reproduce those too.  They are not
really hangs so much as pauses: the machine stalls for ~45 minutes in what I
am guessing is a deadlock condition, then seems to come out of it.  A wee bit
annoying.  I am fairly sure all of these problems are not with AMD per se but
with the kernel NFS/TCP implementation... mind you, I have *never* seen a
working NFS/TCP implementation (Solaris/SGI/Linux).  Irritated is not the
correct word to describe me finding out that AMD had switched from UDP to TCP
as the default.  (I am also on the amd-dev list, and there was discussion of
amd being modified to not use TCP on machines that didn't handle it well, so
I had assumed that it wasn't using TCP.. *bad*.)  I have a packet trace of
221 packets that cause a spurious NFS problem... not sure which one anymore.
If people think it would be of benefit I can continue to analyze it, or
hand it off to someone to analyze.  I feel the problem is entirely client
side though, so it won't show up in what I have.

I am also going to email amd-dev asking Erez to have FreeBSD default to
NFSv3/UDP.
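
For anyone wanting to repeat the comparison, the mount variants mentioned in
this thread can be laid out side by side.  This is only a sketch: the helper
name nfs_mount_cmd is made up for illustration, server:/path and /mnt are the
same placeholders used above, and it assumes that mount_nfs uses UDP when -T
is omitted, as it appeared to on the systems discussed here.  It only prints
the command lines; it does not mount anything.

```sh
#!/bin/sh
# Hypothetical dry-run helper (not part of mount_nfs or amd): prints the
# mount_nfs command line for a given NFS version and transport.
# Assumption: leaving out -T means UDP on the systems discussed here.
nfs_mount_cmd() {
    vers="$1"     # NFS version: 2 or 3
    proto="$2"    # transport: tcp or udp
    flags="-$vers"
    if [ "$proto" = "tcp" ]; then
        flags="${flags}T"
    fi
    echo "mount_nfs $flags -r 8192 -w 8192 server:/path /mnt"
}

nfs_mount_cmd 2 tcp   # the hand mount that reproduced the failure
nfs_mount_cmd 3 tcp   # also seemed to work
nfs_mount_cmd 3 udp   # vers3/UDP, what amd was switched to; no failures so far
```

Running the test once per printed command keeps the only variables the NFS
version and the transport, since the read and write sizes stay at 8192.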

--
David Cross 

