From: John Baldwin <jhb@freebsd.org>
To: Steve Polyack
Cc: freebsd-fs@freebsd.org, Rick Macklem, User Questions, bseklecki@noc.cfi.pgh.pa.us
Date: Mon, 22 Mar 2010 13:39:37 -0400
Subject: Re: FreeBSD NFS client goes into infinite retry loop
Message-Id: <201003221339.37169.jhb@freebsd.org>
In-Reply-To: <4BA79E54.5030504@comcast.net>
References: <4BA3613F.4070606@comcast.net> <201003221200.41607.jhb@freebsd.org> <4BA79E54.5030504@comcast.net>
List-Id: User questions
On Monday 22 March 2010 12:44:04 pm Steve Polyack wrote:
> On 03/22/10 12:00, John Baldwin wrote:
> > On Monday 22 March 2010 11:47:43 am Steve Polyack wrote:
> >> On 03/22/10 10:52, Steve Polyack wrote:
> >>> On 3/19/2010 11:27 PM, Rick Macklem wrote:
> >>>> On Fri, 19 Mar 2010, Steve Polyack wrote:
> >>>>
> >>>> [good stuff snipped]
> >>>>
> >>>>> This makes sense. According to Wireshark, the server is indeed
> >>>>> transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be STALE
> >>>>> instead; it sounds more correct than marking it a general I/O error.
> >>>>> Also, the NFS server is serving its share off of a ZFS filesystem,
> >>>>> if it makes any difference. I suppose ZFS could be talking to the
> >>>>> NFS server threads with some mismatched language, but I doubt it.
> >>>>>
> >>>> Ok, now I think we're making progress. If VFS_FHTOVP() doesn't return
> >>>> ESTALE when the file no longer exists, the NFS server returns whatever
> >>>> error it has returned.
> >>>>
> >>>> So, either VFS_FHTOVP() succeeds after the file has been deleted, which
> >>>> would be a problem that needs to be fixed within ZFS,
> >>>> OR
> >>>> ZFS returns an error other than ESTALE when the file doesn't exist.
> >>>>
> >>>> Try the following patch on the server (which just turns any error
> >>>> returned by VFS_FHTOVP() into ESTALE) and see if that helps.
> >>>>
> >>>> --- nfsserver/nfs_srvsubs.c.sav	2010-03-19 22:06:43.000000000 -0400
> >>>> +++ nfsserver/nfs_srvsubs.c	2010-03-19 22:07:22.000000000 -0400
> >>>> @@ -1127,6 +1127,8 @@
> >>>>  		}
> >>>>  	}
> >>>>  	error = VFS_FHTOVP(mp, &fhp->fh_fid, vpp);
> >>>> +	if (error != 0)
> >>>> +		error = ESTALE;
> >>>>  	vfs_unbusy(mp);
> >>>>  	if (error)
> >>>>  		goto out;
> >>>>
> >>>> Please let me know if the patch helps, rick
> >>>>
> >>> The patch seems to fix the bad behavior.
> >>> Running with the patch, I see the following output from my patch
> >>> (the return code of nfs_doio from within nfsiod):
> >>>
> >>> nfssvc_iod: iod 0 nfs_doio returned errno: 70
> >>>
> >>> Furthermore, when inspecting the transaction with Wireshark, after
> >>> deleting the file on the NFS server it looks like there is only a
> >>> single error. This time it is a reply to a V3 Lookup call that
> >>> contains a status of "NFS3ERR_NOENT (2)" coming from the NFS server.
> >>> The client also does not repeatedly try to complete the failed request.
> >>>
> >>> Any suggestions on the next step here? Based on what you said, it
> >>> looks like ZFS is falsely reporting an I/O error to VFS instead of
> >>> ESTALE / NOENT. I tried looking around zfs_fhtovp() and only saw
> >>> returns of EINVAL, but I'm not even sure I'm looking in the right place.
> >>>
> >> Further on down the rabbit hole... here's the piece in zfs_fhtovp()
> >> where it kicks out EINVAL instead of ESTALE. The following patch
> >> corrects the behavior, but of course it also suggests further digging
> >> within zfs_zget() to ensure that _it_ is returning the correct thing,
> >> and whether the error needs to be handled there or within zfs_fhtovp().
> >>
> >> --- src-orig/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c	2010-03-22 11:41:21.000000000 -0400
> >> +++ src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c	2010-03-22 16:25:21.000000000 -0400
> >> @@ -1246,7 +1246,7 @@
> >>  	dprintf("getting %llu [%u mask %llx]\n", object, fid_gen, gen_mask);
> >>  	if (err = zfs_zget(zfsvfs, object, &zp)) {
> >>  		ZFS_EXIT(zfsvfs);
> >> -		return (err);
> >> +		return (ESTALE);
> >>  	}
> >>  	zp_gen = zp->z_phys->zp_gen & gen_mask;
> >>  	if (zp_gen == 0)
> >>
> > So the odd thing here is that ffs_fhtovp() doesn't return ESTALE if
> > VFS_VGET() (which calls ffs_vget()) fails; it only returns ESTALE if
> > the generation count doesn't match.
> It looks like it also returns ESTALE when the inode is invalid
> (< ROOTINO || > max inodes?) - would an unlinked file in FFS referenced
> at a later time report an invalid inode?
>
> But back to your point, zfs_zget() seems to be failing and returning the
> EINVAL before zfs_fhtovp() even has a chance to set and check zp_gen.
> I'm trying to get some more details through the use of gratuitous
> dprintf()s, but they don't seem to be making it to any logs or the
> console, even with vfs.zfs.debug=1 set. Any pointers on how to get these
> dprintf() calls working?

That I have no idea on. Maybe Rick can chime in?

I'm actually not sure why we would want to treat a FHTOVP failure as
anything but an ESTALE error in the NFS server, to be honest.

-- 
John Baldwin