Date: Sun, 18 Sep 2005 18:57:52 -0400
From: Erez Zadok <ezk@cs.sunysb.edu>
To: freebsd-fs@freebsd.org
Cc: christos@zoulas.com
Subject: turning off the NFS attribute cache
Message-ID: <200509182257.j8IMvqgb008168@agora.fsl.cs.sunysb.edu>

Summary: FreeBSD doesn't seem to have a way to turn off the NFS attribute cache, which breaks the Amd automounter so badly that I do not recommend using Amd on FreeBSD for heavy use until this is fixed.

Details:

I'm the lead maintainer of the am-utils package (www.am-utils.org), the so-called Amd Berkeley Automounter. Amd is a user-level NFSv2 server that manages automounts of all other file systems. The kernel contacts Amd via RPCs; Amd in turn performs the actual mounts and then responds to the kernel's RPCs.

Every kernel caches attributes of files in a Directory Name Lookup Cache (DNLC). Amd manages its namespace at the user level, but the kernel caches names itself, so the two must coordinate to keep both namespaces in sync. If the kernel uses a cached entry from the DNLC without consulting Amd, users may see corruption of the automounter namespace (symlinks pointing to the wrong places, ESTALE errors, and more).

For example, suppose Amd timed out an entry and removed it from Amd's namespace. Amd has to tell the kernel to purge its corresponding DNLC entry too. The way Amd often does that is by incrementing the mtime of the parent directory. This is the most common method kernels use to check whether their DNLC entries are stale: if the parent dir mtime is newer, the kernel discards all cached entries for that directory and re-issues lookups. Those lookups result in NFS_GETATTR/NFS_LOOKUP calls sent to Amd, and Amd can then properly inform the kernel of the new state of the automounted entries.

To ensure that Amd is "in charge" of its namespace without interference from the kernel, Amd tries to turn off the NFS attribute cache. It does so by using the NFSMNT_NOAC flag, if it exists, or by setting the various "cache timeout" fields in struct nfs_args to 0.

We released a major new version of am-utils, version 6.1, a few months ago.
Since then, a lot of people have experimented with Amd, in anticipation of migrating from the very old am-utils 6.0 to the new 6.1. For a couple of months now, we have received reports of problems with Amd, especially under heavy use. Users reported getting ESTALE errors from time to time, or seeing automounted entries whose symlinks don't point to where they should. After much debugging, we traced it to a few places in Amd where it wasn't updating the parent dir mtime as it should have. So we fixed it and verified that it was working (on Solaris and Linux, where the actual user bug reports came from).

After fixing this in Amd, I went on to verify that things work on other OSs. When I got to FreeBSD 4.6, I found that it always caches directory entries, and there is no way to turn that off completely. Specifically, if I set the ac{reg,dir}{min,max} fields in struct nfs_args all to zero, the kernel seems to cache the entries for a default number of seconds (I counted something like 5 seconds). On some OSs, setting these four fields to 0 turns off the attribute cache, but not on FreeBSD 4.6. I was able to verify this using Amd and a script that exercises the interaction of the kernel's attrcache and Amd. I didn't look at the kernel sources (yet), but I'm pretty certain of this behavior.

I then experimented by setting the ac{reg,dir}{min,max} fields in struct nfs_args all to 1, the smallest non-zero value I could. When I ran my Amd exercising script, I found that the value of 1 reduced the race between the DNLC and Amd, and the script took a little longer to run before it detected an incoherency. That makes sense: the smaller the DNLC cache interval is, the shorter the window of vulnerability is. (BTW, the mount_nfs man page says that the ac{reg,dir}{min,max} fields use a 1 second resolution, but my experimentation indicated it was in 0.1 second units -- is that right?)

Clearly, setting the ac{reg,dir}{min,max} fields to 0 is worse than setting them to 1 on FreeBSD.
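
To make the "window of vulnerability" concrete, here is a toy model in Python. It is only a sketch: it is not Amd or kernel code, and every name in it (ToyAutomounter, ToyKernel, stale_answers, the /net/old and /net/new targets) is made up for illustration. It assumes a client that trusts its cached copy of the parent directory's mtime for attr_timeout seconds and drops its cached names only when a fresh mtime differs.

```python
"""Toy model of a name cache validated by the parent directory's mtime.

Hypothetical names throughout; this illustrates the race described above
and does not reproduce any real kernel or Amd behavior.
"""


class ToyAutomounter:
    """Stands in for Amd: owns the namespace and the parent dir mtime."""

    def __init__(self):
        self.entries = {}
        self.mtime = 0

    def set_entry(self, name, target):
        self.entries[name] = target
        self.mtime += 1  # bump the parent mtime so clients can notice


class ToyKernel:
    """Stands in for the NFS client: caches names and the parent's mtime."""

    def __init__(self, server, attr_timeout):
        self.server = server
        self.attr_timeout = attr_timeout  # seconds the cached mtime is trusted
        self.names = {}
        self.cached_mtime = None
        self.attr_fetched_at = None

    def lookup(self, name, now):
        # Re-fetch the parent's mtime only once the cached copy has expired.
        expired = (self.attr_fetched_at is None
                   or now - self.attr_fetched_at >= self.attr_timeout)
        if expired:
            fresh = self.server.mtime  # models a GETATTR call
            if fresh != self.cached_mtime:
                self.names.clear()  # parent changed: drop cached names
            self.cached_mtime = fresh
            self.attr_fetched_at = now
        if name not in self.names:
            # models a LOOKUP call answered by the automounter
            self.names[name] = self.server.entries.get(name)
        return self.names[name]


def stale_answers(attr_timeout, step=0.5, duration=10.0):
    """Count lookups that still return the old target after a change."""
    amd = ToyAutomounter()
    amd.set_entry("home", "/net/old")
    kernel = ToyKernel(amd, attr_timeout)
    kernel.lookup("home", now=0.0)     # prime the client's cache
    amd.set_entry("home", "/net/new")  # namespace changes right afterwards
    stale = 0
    t = step
    while t <= duration:
        if kernel.lookup("home", now=t) != "/net/new":
            stale += 1
        t += step
    return stale
```

In this model, a timeout of 0 behaves like a true "noac": every lookup revalidates and no stale answer is ever returned. A timeout of 1 still leaves a short window, and a timeout of 5 (roughly the default I observed when the fields were set to 0) leaves a much longer one.
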
So the current (ugly) workaround I've used in am-utils is to set the global parameter auto_attrcache=1 in the /etc/amd.conf file.

The near-term solution is for FreeBSD to support a true 'noac' flag, which can be added fairly easily. This would make Amd work reliably.

The long-term solution is to implement Autofs support for all OSs and to support it in Amd. Luckily, there is autofs support for FreeBSD now, but Amd doesn't support autofs on FreeBSD yet (we support the Solaris and Linux autofs as of now). Still, we found that even with autofs support, many sysadmins prefer to use the good ol' non-autofs mode.

For what it's worth, I've confirmed that OpenBSD 3.7 and NetBSD 2.0.2 also do not have a way to turn off the attribute cache, and both suffer from the same problem. I've reported it to Christos Zoulas (NetBSD), who said that they will fix this.

It is my hope that FreeBSD will implement a "noac" flag ASAP; it's crucial if you want people to use FreeBSD+Amd in heavy-use, production-level environments.

I also plan to test this attrcache behavior on all OSs I have access to and report the same to each.

Finally, I'll be happy to work with anyone to provide more details, provide scripts to exercise these bugs, submit official bug reports, and even implement a "noac" flag for any BSD kernel.

Sincerely,
Erez.