Date:      Sun, 18 Sep 2005 18:57:52 -0400
From:      Erez Zadok <ezk@cs.sunysb.edu>
To:        freebsd-fs@freebsd.org
Cc:        christos@zoulas.com
Subject:   turning off the NFS attribute cache
Message-ID:  <200509182257.j8IMvqgb008168@agora.fsl.cs.sunysb.edu>

Summary:

FreeBSD doesn't seem to have a way to turn off the NFS attribute cache,
which breaks the Amd automounter so badly that I do not recommend using Amd
on FreeBSD for heavy use, not until this is fixed.


Details:

I'm the lead maintainer of the am-utils package (www.am-utils.org), the
so-called Amd Berkeley Automounter.  Amd is a user-level NFSv2 server that
manages automounts of all other file systems.  The kernel contacts Amd via
RPCs, and Amd in turn performs the actual mounts, and then responds back to
the kernel's RPCs.  Every kernel caches attributes of files, in a Directory
Name Lookup Cache (DNLC).

Amd manages its namespace in the user level, but the kernel caches names
itself.  So the two must coordinate to ensure that both namespaces are in
sync.  If the kernel uses a cached entry from the DNLC, without consulting
Amd, users may see corruption of the automounter namespace (symlinks
pointing to the wrong places, ESTALE errors, and more).  For example,
suppose Amd timed out an entry and removed the entry from Amd's namespace.
Amd has to tell the kernel to purge its corresponding DNLC entry too.  The
way Amd often does that is by incrementing the mtime of the parent
directory.  This is the most common method for kernels to check if their
DNLC entries are stale: if the parent dir mtime is newer, the kernel will
discard all cached entries for that directory, and will re-issue lookup
methods.  Those lookups will result in NFS_GETATTR/NFS_LOOKUP calls sent to
Amd, and Amd can then properly inform the kernel of the new state of
automounted entries.
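
The mtime check described above can be sketched as a toy model.  This is
only an illustration, not kernel or Amd code: the class and method names
(AmdServer, KernelNameCache, getattr_mtime) are hypothetical, and the model
assumes the kernel always fetches the parent dir mtime fresh, i.e. that the
attribute cache is effectively off.

```python
class AmdServer:
    """Hypothetical stand-in for Amd: owns the namespace and the dir mtime."""

    def __init__(self):
        self.entries = {}    # name -> symlink target
        self.dir_mtime = 0   # mtime of the automount (parent) directory
        self.lookups = 0     # number of lookup requests received

    def set_entry(self, name, target):
        self.entries[name] = target
        self.dir_mtime += 1  # bump the mtime so the kernel notices the change

    def remove_entry(self, name):
        self.entries.pop(name, None)
        self.dir_mtime += 1

    def getattr_mtime(self):
        return self.dir_mtime

    def lookup(self, name):
        self.lookups += 1
        return self.entries.get(name)


class KernelNameCache:
    """Hypothetical stand-in for the kernel's name cache of one directory."""

    def __init__(self, server):
        self.server = server
        self.cached = {}
        self.seen_mtime = server.getattr_mtime()

    def lookup(self, name):
        current = self.server.getattr_mtime()
        if current > self.seen_mtime:
            # Parent dir mtime is newer: discard every cached entry.
            self.cached.clear()
            self.seen_mtime = current
        if name not in self.cached:
            # Cache miss: re-issue the lookup to the server.
            self.cached[name] = self.server.lookup(name)
        return self.cached[name]
```

In this model, a second lookup of an unchanged name is served from the
cache, while any change on the server side forces a fresh lookup.  If the
mtime itself were served from a stale attribute cache, the purge would
never trigger, which is the failure mode this message is about.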

In order to ensure that Amd is "in charge" of its namespace w/o interference
from the kernel, Amd will try to turn off the NFS attribute cache.  It does
so by using the NFSMNT_NOAC flag, if it exists, or setting various "cache
timeout" fields in struct nfs_args to 0.
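
As a rough sketch of that decision (not the actual am-utils code, which is
C): the dict below stands in for struct nfs_args, the function name is
made up, and the bit value given to NFSMNT_NOAC is a placeholder, since the
real value is platform-specific.

```python
NFSMNT_NOAC = 0x1  # placeholder bit value, for illustration only

AC_FIELDS = ("acregmin", "acregmax", "acdirmin", "acdirmax")


def disable_attrcache(nfs_args, have_noac):
    """Return a copy of nfs_args with attribute caching turned off.

    nfs_args is a plain dict standing in for struct nfs_args; have_noac
    says whether the platform defines the NFSMNT_NOAC flag.
    """
    args = dict(nfs_args)
    if have_noac:
        # Preferred: ask the kernel explicitly for no attribute caching.
        args["flags"] = args.get("flags", 0) | NFSMNT_NOAC
    else:
        # Fallback: zero all four cache timeout fields.
        for field in AC_FIELDS:
            args[field] = 0
    return args
```

The fallback branch only helps on kernels that treat a zero timeout as
"do not cache".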

We released a major new version of am-utils, version 6.1, a few months
ago.  Since then, a lot of people have experimented with Amd, in
anticipation of migrating from the very old am-utils 6.0 to the new 6.1.
For a couple of months now, we have received reports of problems with Amd,
especially under heavy use.  Users reported getting ESTALE errors from time
to time, or seeing automounted entries whose symlinks don't point where
they should.  After much debugging, we traced it to a few places in Amd
where it wasn't updating the parent dir mtime as it should have.  So we
fixed it and verified that it was working (on Solaris and Linux, where the
actual user bug reports came from).

After fixing this in Amd, I went on to verify that things work for other
OSs.  When I got to FreeBSD 4.6, I found that it always caches directory
entries, and there is no way to turn it off completely.  Specifically, if I
set the ac{reg,dir}{min,max} fields in struct nfs_args all to zero, the
kernel seems to cache the entries for a default number of seconds (I counted
something like 5 seconds).  On some OSs, setting these four fields to 0
turns off the attribute cache, but not on FreeBSD 4.6.  I was able to verify
this using Amd and a script that exercises the interaction of the kernel's
attrcache and Amd.  I didn't look at the kernel sources (yet) but I'm pretty
certain of this behavior.

I then experimented by setting the ac{reg,dir}{min,max} fields in struct
nfs_args all to 1, the smallest non-zero value I could.  When I ran my Amd
exercising script, I found that the value of 1 reduced the race between the
DNLC and Amd, and the script took a little longer to run before it detected
an incoherency.  That makes sense: the smaller the DNLC cache interval is,
the shorter the window of vulnerability is.  (BTW, the mount_nfs man page
says that the ac{reg,dir}{min,max} fields use a 1 second resolution, but my
experimentation indicated it was in 0.1 second units -- is that right?)

Clearly, setting the ac{reg,dir}{min,max} fields to 0 is worse than setting
them to 1 on FreeBSD.  So the current (ugly) workaround I've used in am-utils
is to set the global parameter auto_attrcache=1 in the /etc/amd.conf file.
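
To illustrate why 0 ends up worse than 1 here, below is a toy model of the
window of vulnerability.  It assumes, based only on the rough count above
and not on the kernel sources, that a configured value of 0 falls back to a
default of about 5 time units; the function names and the time units are
made up for the sketch.

```python
def effective_timeout(configured, default=5.0):
    """Assumed behaviour: 0 means 'use the default', not 'no cache'."""
    return default if configured == 0 else float(configured)


def count_stale_reads(timeout, change_time, read_times):
    """Count reads that see the old mtime after the server changed it.

    The server's dir mtime is 0 before change_time and 1 from then on.
    The client refreshes its cached copy once it is at least `timeout` old.
    """
    cached_at = None
    cached_mtime = None
    stale = 0
    for now in read_times:
        server_mtime = 1 if now >= change_time else 0
        if cached_at is None or now - cached_at >= timeout:
            cached_at, cached_mtime = now, server_mtime  # refresh the cache
        if cached_mtime != server_mtime:
            stale += 1
    return stale


reads = [i * 0.5 for i in range(20)]  # one read every 0.5 units, 0.0 to 9.5
stale_with_0 = count_stale_reads(effective_timeout(0), 0.25, reads)
stale_with_1 = count_stale_reads(effective_timeout(1), 0.25, reads)
```

With these inputs the model gives 9 stale reads for a setting of 0 and 1
stale read for a setting of 1.  The window shrinks with the timeout but
never closes, which matches the script still detecting an incoherency with
the value set to 1.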

The near term solution is for FreeBSD to support a true 'noac' flag, which
can be added fairly easily.  This'd make Amd work reliably.

The long term solution is to implement Autofs support for all OSs and to
support it in Amd.  Luckily, there is autofs support for FreeBSD now, but
Amd doesn't support autofs yet on FreeBSD (we support the Solaris and Linux
autofs as of now).  Still, we found that even with autofs support, many
sysadmins still prefer to use the good ol' non-autofs mode.

For what it's worth, I've confirmed that OpenBSD 3.7 and NetBSD 2.0.2 also
do not have a way to turn off the attribute cache and both suffer from the
same problem.  I've reported it to Christos Zoulas (NetBSD) who said that
they will fix this.  It is my hope that FreeBSD will implement a "noac" flag
ASAP; it's crucial if you want people to use FreeBSD+Amd in heavy-use
production-level environments.

I also plan to test this attrcache behavior on all OSs I have access to and
report the same to each.

Finally, I'll be happy to work with anyone to provide more details, scripts
to exercise these bugs, submit official bug reports, and even implement a
"noac" flag for any BSD kernel.

Sincerely,
Erez.


