Date: Tue, 2 Dec 2008 16:15:38 -0800
From: David Wolfskill
To: hackers@freebsd.org
Message-ID: <20081203001538.GC96383@bunrab.catwhisker.org>
Subject: NFS (& amd?)
 dysfunction descending a hierarchy
List-Id: Technical Discussions relating to FreeBSD

I seem to have a fairly (though not deterministically) reproducible
mode of failure with an NFS-mounted directory hierarchy: an attempt to
traverse a "sufficiently large" hierarchy (e.g., via "tar zcpf" or
"rm -fr") will fail to "visit" some subdirectories, typically acting as
if the subdirectories in question do not actually exist (despite the
names having been returned in the output of a previous readdir()).

The file system is mounted read-write, courtesy of amd(8); none of the
files has any non-default flags; there are no ACLs involved; and I own
the lot (that is, I am the "owning user" of the files).

An example of "sufficiently large" has been demonstrated to be a recent
copy of a FreeBSD ports tree. (The problem was discovered using a
hierarchy that had some proprietary content; I tried a copy of the
ports tree to see if I could replicate the issue with something a
FreeBSD hacker would more likely have handy. And avoid NDA issues. :-})

Now, before I go further: I'm not pointing the finger at FreeBSD, here
(yet). At minimum, the fault could lie with FreeBSD (as the NFS
client); with amd(8); with the NetApp Filer (as the NFS server); or
with the network -- or with the configuration(s) of any of them.

But I just tried this, using the same NFS server, but a machine running
Solaris 8 as an NFS client, and was unable to re-create the problem.
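
The symptom described above (names returned by a directory listing
that then behave as if they do not exist) can be looked for directly.
What follows is only a minimal sketch, assuming a POSIX shell;
check_dir is a hypothetical helper and was not part of the original
test, which used tar(1) and rm(1):

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): walk a tree and
# print every name that a directory listing returns but that cannot
# then be looked up. Names containing newlines are not handled.

# The function body is a subshell "( ... )" so that each recursive
# call works on its own copy of $dir.
check_dir() (
    dir=$1
    ls -A "$dir" 2>/dev/null | while IFS= read -r name; do
        path=$dir/$name
        if [ ! -e "$path" ] && [ ! -L "$path" ]; then
            # Listed by the parent directory, but the lookup fails.
            printf 'listed but missing: %s\n' "$path"
        elif [ -d "$path" ] && [ ! -L "$path" ]; then
            # Descend into real subdirectories only, not symlinks.
            check_dir "$path"
        fi
    done
)
```

Something like "check_dir ~/NFS/ports" would print nothing on a healthy
tree. Since the failure is not deterministic, an empty result from a
single run would not show that the problem is absent.
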

And I found a way to avoid having the problem occur using a FreeBSD NFS
client: whack amd(8)'s config so that the dismount_interval is 12 hours
instead of the default 2 minutes, thus effectively preventing amd(8)
from making its normal attempts to unmount file systems.

Please note that I don't consider this a fix -- or even an acceptable
circumvention, in the long term. Rather, it's a diagnostic change, in
an attempt to better understand the nature of the problem.

Here are step-by-step instructions to recreate the problem;
unfortunately, I believe I don't have the resources to test this
anywhere but at work, though I will try it at home, to the extent that
I can:

* Set up the environment.
  * The failing environment uses NetApp filers as NFS servers. I don't
    know what kind or how recent the software is on them, but can find
    out. (I expect they're fairly well-maintained.)
  * Ensure that the NFS space available is at least 10 GB. I will
    refer to this as "~/NFS/", as I tend to create such symlinks to
    keep track of things.
  * I used a dual, quad-core machine running FreeBSD RELENG_7_1 as of
    yesterday morning as an NFS client. It also had a recently-updated
    /usr/ports tree, which was a CVS working directory (so each "real"
    subdirectory also had a CVS subdirectory within it).
  * Set up amd(8) so that ~/NFS is mounted on demand when it's
    referenced, and only via amd(8). Ensure that the dismount_interval
    has the default value of 120 seconds.
* Create a reference tarball.
  * cd /usr && tar zcpf ~/NFS/ports.tgz ports/
* Create the test directory hierarchy.
  * cd ~/NFS && tar zxpf ports.tgz
* Clear any cache.
  * Unmount ~/NFS, then re-mount it. Or just reboot the NFS client
    machine. Or arrange to have done all of the above set-up stuff
    from a different NFS client.
* Set up for information capture (optional).
  * Use ps(1) or your favorite alternative tool to determine the PID
    for amd(8). Note that `cat /var/run/amd.pid` won't do the trick.
    :-{
  * Run ktrace(1) to capture activity from amd(8) and its descendants,
    e.g.:
      sudo ktrace -dip ${amd_pid} -f ktrace_amd.out
  * Start a packet-capture for NFS traffic, e.g.:
      sudo tcpdump -s 0 -n -w nfs.bpf host ${nfs_server}
* Start the test.
  * Do this under ktrace(1), if you did the above optional step:
      rm -fr ~/NFS/ports; echo $?
    As soon as rm(1) issues a whine, you might as well interrupt it
    (^C).
* Stop the information capture, if you started it.
  * ^C for the tcpdump(1) process.
  * sudo ktrace -C

If the packet capture file is too big for your preferred analysis
program to digest as a unit, see the net/tcpslice port for a bit of
relief. (Wireshark seems to want to read an entire packet capture file
into main memory.)

I have performed the above, with the "information-gathering" step; I
can *probably* make that information available, but I'll need to check
-- some organizations get paranoid about things like host names. I
don't expect that my current employer is, but I don't know yet, so I
won't promise.

In the meantime, I should be able to extract somewhat-relevant
information from what I've collected, if that would be useful.

While I wouldn't mind sharing the results, I strongly suspect that
blow-by-blow analysis wouldn't be ideal for this (or any other) mailing
list; I would be very happy to work with others to figure out what's
gone wrong (or is misconfigured) and get things working properly.

If someone(s) would be willing to help, I'd appreciate it very much. If
(enough) folks would actually prefer that the details stay in the list
(or some other list), I'm willing to do that, too.

Thanks!

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
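
The capture and test steps listed above can be collected into a
dry-run checklist. This is a minimal sketch, assuming a POSIX shell;
print_repro_cmds is a hypothetical helper that only prints the commands
given in the steps (it executes nothing), and the amd(8) PID and NFS
server name are supplied by the caller:

```shell
#!/bin/sh
# Hypothetical helper: print, in order, the commands from the
# information-capture and test steps. Nothing is executed here.

# usage: print_repro_cmds AMD_PID NFS_SERVER
print_repro_cmds() {
    amd_pid=$1
    nfs_server=$2
    # Single-quoted lines are printed literally, so "~" and "$?" are
    # left for the shell in which the commands are eventually run.
    printf '%s\n' \
        '# 1. Trace amd(8) and its descendants.' \
        "sudo ktrace -dip ${amd_pid} -f ktrace_amd.out" \
        '# 2. In a second terminal, capture NFS traffic (stop with ^C).' \
        "sudo tcpdump -s 0 -n -w nfs.bpf host ${nfs_server}" \
        '# 3. Run the test; interrupt rm(1) once it complains.' \
        'rm -fr ~/NFS/ports; echo $?' \
        '# 4. Stop all ktrace(1) tracing.' \
        'sudo ktrace -C'
}
```

For example, "print_repro_cmds 1234 filer1" (both values made up)
prints the four commands with the PID and host name filled in, ready to
be pasted into the appropriate terminals.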