Date: Tue, 4 May 1999 14:06:10 -0500 (CDT)
From: Kevin Day <toasty@home.dragondata.com>
To: dillon@apollo.backplane.com (Matthew Dillon)
Cc: dfr@nlsystems.com, jso@research.att.com, hackers@freebsd.org
Subject: Re: kern/11470: V3 NFS problem (fwd)
Message-ID: <199905041906.OAA07822@home.dragondata.com>
In-Reply-To: <199905041842.LAA18532@apollo.backplane.com> from Matthew Dillon at "May 4, 1999 11:42:56 am"

> :From: Kevin Day <toasty@home.dragondata.com>
> :
> :Ok, I've been playing with your last patches (just before they were
> :committed).
> :
> :1) if I 'sysctl -w vfs.nfs.async=1' on the server, the client will
> :eventually get deadlocked, with most processes stuck in 'nfsrcvlk' or
> :'nfsinval' (I think)
>
> Yes, I'm sure there are still a couple of lockup situations that
> we need to fix in this area. I need to know whether this is via
> NFSV2 or NFSV3 and whether this is a UDP or TCP mount. And, if it is
> a TCP mount, whether the problem occurs with a UDP mount. A similar
> situation occurred with TCP when I was doing makes that turned out to be
> a data corruption bug related to multiple RPCs winding up in the same
> mbuf.
>
> Note: If your *SERVER* is not running the latest -current, you have to
> upgrade it. If your server is running FreeBSD-stable, the TCP fix (which
> is a server-side bug) has NOT yet been committed to FreeBSD-stable.

We're mounting with NFS V3 and UDP. The NFS server in most of my testing
was a 2.2.5 server, but this also occurred with a 4.0 NFS server.

>
> :2) If I set a cpu time limit for a process, and the executable file is being
> :run over NFS, if it exceeds the CPU limit, I get flooded with "vm_fault: pager
> :error"'s
>
> This is definitely a bug. I'll bet you are using an 'intr' or 'soft'
> mount, yes? There are still some serious bugs with 'intr' mounts
> interacting badly with the VM system, but they should be relatively easy
> to fix.

No, I ended up taking out soft and intr a while ago, since they created
instability when the NFS server became unreachable.

>
> :3) See PR 7728. NFS server is also a web server, dumping logs into users'
> :home directories. Our FTP server is an NFS client. When clients try to
> :download their log files, the ftpd process gets stuck (kill -9 won't kill
> :it). This also happens when they try to upload over top of a file they just
> :viewed on the web server.
> :
> :Processes seem to get stuck in 'sbwait' (which really doesn't seem like it's
> :stuck), or 'nfsrcv'
>
> What is occurring is that existing VM cache pages are being ripped out from
> under the client and the client is getting confused. I'll need to work
> up a reliable way to reproduce the problem between a client and server
> in order to squash it. If someone else can come up with a simple script
> to run on the client & the server that reproduces the problem, we will
> be able to squash it more quickly.
>

If my time permits, I'll try to find an easily reproducible instance, but I
think my weekend is going to go to upgrading the NFS server from 2.2.5 to
something more current.

Kevin

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message