Date:      Tue, 4 May 1999 14:06:10 -0500 (CDT)
From:      Kevin Day <toasty@home.dragondata.com>
To:        dillon@apollo.backplane.com (Matthew Dillon)
Cc:        dfr@nlsystems.com, jso@research.att.com, hackers@freebsd.org
Subject:   Re: kern/11470: V3 NFS problem (fwd)
Message-ID:  <199905041906.OAA07822@home.dragondata.com>
In-Reply-To: <199905041842.LAA18532@apollo.backplane.com> from Matthew Dillon at "May 4, 1999 11:42:56 am"

> :From: Kevin Day <toasty@home.dragondata.com>
> :
> :Ok, I've been playing with your last patches (just before they were
> :committed).
> :
> :1) if I 'sysctl -w vfs.nfs.async=1' on the server, the client will
> :eventually get deadlocked, with most processes stuck in 'nfsrcvlk' or
> :'nfsinval' (I think)
> 
>     Yes, I'm sure there are still a couple of lockup situations that
>     we need to fix in this area.  I need to know whether this is via
>     NFSV2 or NFSV3 and whether this is a UDP or TCP mount.  And, if it is
>     a TCP mount, whether the problem occurs with a UDP mount.  A similar
>     situation occurred with TCP when I was doing makes that turned out to be
>     a data corruption bug related to multiple RPCs winding up in the same
>     mbuf.
> 
>     Note:  If your *SERVER* is not running the latest -current, you have to
>     upgrade it.  If your server is running FreeBSD-stable, the TCP fix (which
>     is a server-side bug) has NOT yet been committed to FreeBSD-stable.

We're mounting with NFS V3 over UDP. The NFS server in most of my testing was a
2.2.5 server, but this also occurred with a 4.0 NFS server.


> 
> :2) If I set a CPU time limit for a process whose executable file is being
> :run over NFS, and it exceeds the CPU limit, I get flooded with "vm_fault: pager
> :error" messages
> 
>     This is definitely a bug.  I'll bet you are using an 'intr' or 'soft'
>     mount, yes?  There are still some serious bugs with 'intr' mounts 
>     interacting badly with the VM system, but they should be relatively easy
>     to fix.

No, I ended up taking out soft and intr a while ago, since they created
instability when the NFS server became unreachable.

> 
> :3) See PR 7728. NFS server is also a web server, dumping logs into users'
> :home directories. Our FTP server is an NFS client. When clients try to
> :download their log files, the ftpd process gets stuck (kill -9 won't kill
> :it). This also happens when they try to upload over the top of a file they
> :just viewed on the web server.
> :
> :Processes seem to get stuck in 'sbwait' (which really doesn't seem like it's
> :stuck), or 'nfsrcv'
> 
>     What is occurring is that existing VM cache pages are being ripped out from
>     under the client and the client is getting confused.  I'll need to work
>     up a reliable way to reproduce the problem between a client and server
>     in order to squash it.  If someone else can come up with a simple script
>     to run on the client & the server that reproduces the problem, we will
>     be able to squash it more quickly.
> 

If my time permits, I'll try to find an easily reproducible instance, but I
think my weekend is going to go to upgrading the NFS server from 2.2.5 to
something more current.

Kevin


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message



