From owner-freebsd-current Thu Jul 29 17: 5:16 1999
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2])
	by hub.freebsd.org (Postfix) with ESMTP id ACDD415742
	for ; Thu, 29 Jul 1999 17:05:11 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id RAA80452;
	Thu, 29 Jul 1999 17:05:02 -0700 (PDT)
	(envelope-from dillon)
Date: Thu, 29 Jul 1999 17:05:02 -0700 (PDT)
From: Matthew Dillon
Message-Id: <199907300005.RAA80452@apollo.backplane.com>
To: Bill Paul
Cc: peter@netplex.com.au, crossd@cs.rpi.edu, current@FreeBSD.ORG
Subject: readdirplus client side fix (was Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm)
References: <199907292322.TAA17429@skynet.ctr.columbia.edu>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:And here is something even scarier: readdirplus from the client side
:doesn't appear to work correctly either. This time, you don't need an
:IRIX machine to trigger the problem (though it helps :). Do the following
:
:client# mount -o nfsv3,tcp,rdirplus server:/somefs /mnt
:client# ls /mnt; du /mnt; etc...
:
:Seems okay so far, right? Ah, but now try to unmount the filesystem:
:
:# umount /mnt
:
:...
:-Bill

    But, on the bright side, readdirplus is somewhat experimental in that
    it is not used by default, so very little testing of it has been done
    to date.  Thus the bug is not unexpected :-).

    At least the bugs we are getting now tend to be in the 'outlying areas'
    of NFS and not so much with the core code.  Another area that is
    probably full of bugs: nqleasing.

					--

    Ok, I was able to reproduce the above bug and fix it.

    The problem on the FreeBSD client is in nfs_readdirplusrpc() in
    nfs/nfs_vnops.c.  It can obtain the vnode being used to populate the
    additional directory info in one of two ways.  When it gets the vnode
    via nfs_nget(), the returned vnode is referenced and locked.  When it
    gets it via a hit against NFS_CMPFH() (which I presume is for '.'), it
    simply VREF()'s the vnode, which takes a reference but no new lock.

    So in the one case the vnode is returned locked, and in the other it
    is not.  However, the internal loop vrele()'s the vnode rather than
    vput()'s it in both cases, so the vnodes obtained via nfs_nget() during
    the directory scan are never unlocked.  This leads to the lockup at
    unmount time.

    If you could test and then commit this patch (w/ me as the submitter),
    I would appreciate it!  It seems to fix the problem for me.  This patch
    is relative to CURRENT.  The fix ought to be MFCable to STABLE.

    The funny thing is that the error termination code actually got it
    right and the loop got it wrong.  Usually it's the other way around.

					--

    Presumably this will not fix the SGI client.  I've no idea what the
    problem there is.  There may be a bug in the SGI client or there may
    be a bug in the client & server implementation of the protocol in
    FreeBSD.

					-Matt
					Matthew Dillon

Index: nfs_vnops.c
===================================================================
RCS file: /home/ncvs/src/sys/nfs/nfs_vnops.c,v
retrieving revision 1.135
diff -u -r1.135 nfs_vnops.c
--- nfs_vnops.c	1999/07/01 13:32:54	1.135
+++ nfs_vnops.c	1999/07/29 23:57:06
@@ -2367,7 +2367,10 @@
 				nfsm_adv(nfsm_rndup(i));
 			    }
 			    if (newvp != NULLVP) {
-				vrele(newvp);
+				if (newvp == vp)
+					vrele(newvp);
+				else
+					vput(newvp);
 				newvp = NULLVP;
 			    }
 			    nfsm_dissect(tl, u_int32_t *, NFSX_UNSIGNED);
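
    For anyone who wants to see the reference/lock mismatch in isolation,
    here is a rough userland sketch.  It is NOT kernel code: struct
    toy_vnode, toy_vref(), toy_nget(), toy_vrele(), toy_vput() and
    toy_scan() are all made-up names that only model the behaviour
    described above (nfs_nget() hands back a locked vnode, VREF() does not
    lock, vrele() drops a reference, vput() unlocks and drops a reference).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vnode: just a reference count and a lock flag. */
struct toy_vnode {
	int	refcount;
	bool	locked;
};

/* Models VREF(): take a reference, leave the lock state alone. */
static void
toy_vref(struct toy_vnode *vp)
{
	vp->refcount++;
}

/* Models nfs_nget() as described in the mail: referenced AND locked. */
static void
toy_nget(struct toy_vnode *vp)
{
	vp->refcount++;
	vp->locked = true;
}

/* Models vrele(): drop the reference only. */
static void
toy_vrele(struct toy_vnode *vp)
{
	assert(vp->refcount > 0);
	vp->refcount--;
}

/* Models vput(): unlock, then drop the reference. */
static void
toy_vput(struct toy_vnode *vp)
{
	assert(vp->refcount > 0);
	vp->locked = false;
	vp->refcount--;
}

/*
 * Models the per-entry acquire/release in the directory scan.
 * dvp is the directory itself; entries[] are the vnodes seen in the scan.
 * With patched == false every entry is released with toy_vrele() (the old
 * loop); with patched == true non-directory entries use toy_vput().
 * Returns how many entries other than dvp are still locked afterwards.
 */
int
toy_scan(struct toy_vnode *dvp, struct toy_vnode **entries, size_t n,
    bool patched)
{
	int leaked = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_vnode *newvp = entries[i];

		if (newvp == dvp)
			toy_vref(newvp);	/* '.' case: reference only */
		else
			toy_nget(newvp);	/* referenced and locked */

		if (patched && newvp != dvp)
			toy_vput(newvp);	/* undo both lock and ref */
		else
			toy_vrele(newvp);	/* undo the reference only */

		if (newvp != dvp && newvp->locked)
			leaked++;
	}
	return (leaked);
}
```

    With a directory and two other entries, toy_scan(..., false) leaves
    both of the other entries locked with a zero reference count, which is
    the state that wedges the unmount.  toy_scan(..., true) leaves nothing
    locked, and the directory's own lock and reference count are untouched
    in either case.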