Date:      Tue, 23 Dec 2014 10:25:27 +1100
From:      Richard Perini <rpp@ci.com.au>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: NFS negative name caching and amd
Message-ID:  <20141222232527.GA52306@odi.ci.com.au>
In-Reply-To: <201412221004.48504.jhb@freebsd.org>
References:  <20141221102746.GA11278@odi.ci.com.au> <201412221004.48504.jhb@freebsd.org>

On Mon, Dec 22, 2014 at 10:04:48AM -0500, John Baldwin wrote:
> On Sunday, December 21, 2014 5:27:46 am Richard Perini wrote:
> > 
> > We're struggling with an NFS negative name caching issue that results in
> > a file created by an NFS client 'A' being invisible on client 'B' for up
> > to client A's negnametimeo value.  In our scenario, a process on client
> > A creates a file, and passes a message to another process which may 
> > run on client B.  The second process expects the file created by A to
> > be available.
> 
> Which NFS server are you using?  If it is a FreeBSD NFS server, try changing
> vfs.timestamp_precision to 2 (or 3) and seeing if that reduces the amount of
> time you have to wait until the directory's ac timeout.

	Yes, we are running FreeBSD on the server machines. Unfortunately, our 
	process really can't tolerate a delay of any length - either the file 
	is present or it's not.

> Another possible fix is to be careful not to open the file until you know
> it exists if you still want to keep the reduced LOOKUP RPC load from caching
> negative lookups.

	We have coded around the most common failure points with retry logic,
	but this is a hack, and there are some third party libraries involved
	that are not practical to fix in this manner.
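
	For illustration only (this is not the code actually in use), a
	retry wrapper of that kind might look like the sketch below.  The
	name open_with_retry, the attempt count and the delay are all
	hypothetical, and it assumes ENOENT is the only error worth
	retrying:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>

/*
 * Hypothetical helper: retry open(2) while it fails with ENOENT,
 * sleeping delay_ms milliseconds between attempts.  Returns a file
 * descriptor, or -1 with errno set by the last failed open().
 */
int
open_with_retry(const char *path, int flags, int attempts, long delay_ms)
{
	struct timespec ts;
	int fd = -1;

	assert(path != NULL);
	if (attempts <= 0 || delay_ms < 0) {
		errno = EINVAL;
		return (-1);
	}

	ts.tv_sec = delay_ms / 1000;
	ts.tv_nsec = (delay_ms % 1000) * 1000000L;

	for (int i = 0; i < attempts; i++) {
		fd = open(path, flags);
		if (fd >= 0 || errno != ENOENT)
			break;	/* success, or an error a retry will not fix */
		if (i + 1 < attempts)
			nanosleep(&ts, NULL);	/* no sleep after the last try */
	}
	return (fd);
}
```

	A caller would use something like open_with_retry(path, O_RDONLY,
	10, 500).  A wrapper like this only hides the delay; the file can
	still stay invisible for as long as the negative cache entry lives,
	which is why it is no more than a workaround.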

> > We're running a mix of 9-stable and 10-stable machines, and the problem is 
> > common to both.
> > 
> > The obvious fix is to set the nfs mount option 'negnametimeo' to 0, but 
> > unfortunately we also have 'amd' in the picture (which we also need in our 
> > environment). Amd doesn't understand negnametimeo and ignores it, leaving
> > it set to the system default of 60 seconds (as shown by nfsstat -m).
> 
> Have you tried autofs for 10-stable?  Is it able to pass this option to NFS
> if you use it?  If that works, I would prefer that to be the long term
> solution for this.  I'm not a huge fan of adding kernel options to override
> each NFS default mount option if we can help it.

	I just ran up autofs and automountd on 10-stable, set the negnametimeo
	option in auto_master and it works a treat.  However it will be quite 
	some time before we're able to shift off 9 which leaves us with the 
	kernel option as the easiest path.  

	I'd point out that the nfs client code in
	/usr/src/sys/fs/nfsclient/nfsmount.h is already coded to allow override:

#ifndef NFS_DEFAULT_NEGNAMETIMEO
#define NFS_DEFAULT_NEGNAMETIMEO        60
#endif

	so all that is required is the entry in the "options" file.  Naturally
	we can add that ourselves (the beauty of open source :-) but it would
	be the only change to the native FreeBSD code for us, so of course
	we'd prefer to see it in the tree.
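
	To show how that guard behaves, here is a small stand-alone sketch,
	assuming the value is supplied at build time.  The accessor
	default_negnametimeo and the static_assert are illustrative
	additions and are not part of nfsmount.h:

```c
#include <assert.h>

/* Same guard pattern as above: keep 60 unless the build overrides it. */
#ifndef NFS_DEFAULT_NEGNAMETIMEO
#define NFS_DEFAULT_NEGNAMETIMEO	60
#endif

/* Illustrative sanity check: a negative timeout makes no sense. */
static_assert(NFS_DEFAULT_NEGNAMETIMEO >= 0,
    "NFS_DEFAULT_NEGNAMETIMEO must not be negative");

/* Hypothetical accessor so the effective default can be inspected. */
int
default_negnametimeo(void)
{
	return (NFS_DEFAULT_NEGNAMETIMEO);
}
```

	Compiled as is, default_negnametimeo() returns 60; compiled with
	-DNFS_DEFAULT_NEGNAMETIMEO=0 it returns 0.  A kernel option would
	supply the same definition, as long as the header generated for it
	is included ahead of the guard.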

Regards, and compliments of the season.

--R


