From: John Baldwin
To: Mike Gelfand
Cc: Konstantin Belousov, "freebsd-hackers@freebsd.org"
Subject: Re: [BUG] Getting path to program binary sometimes fails
Date: Fri, 12 Dec 2014 10:33:59 -0500
Message-ID: <3715296.8JkIjC2VMR@ralph.baldwin.cx>
In-Reply-To: <27C465FC-E8C7-44CB-A812-65213BB8AC9F@logicnow.com>

On Friday, December 05, 2014 03:52:41 PM Mike Gelfand wrote:
> On Dec 5, 2014, at 6:19 PM, John Baldwin wrote:
> >> No, not NFS but ZFS. Could that be an issue? The FreeBSD 8 machine I
> >> mentioned before has UFS.
> >>
> >> Also, as you can see from the video I recorded (and from the code I
> >> provided), path resolution succeeds and fails within fractions of a
> >> second after process startup.
> >
> > Are you seeing vnodes being actively recycled?  In particular, do you see
> > vfs.numvnodes close to kern.maxvnodes?  You can try raising
> > kern.maxvnodes.
> > If vfs.numvnodes grows up to the limit then as long as you can stomach the
> > RAM of having more vnodes around that would increase the chances of your
> > paths remaining valid.
>
> When the call works, sysctl returns:
>     vfs.numvnodes: 59638
>     kern.maxvnodes: 204723
> The times it doesn't, the output is:
>     vfs.numvnodes: 60017
>     kern.maxvnodes: 204723
> I've selected maximum numbers. Monitoring was made with
>     while sysctl vfs.numvnodes kern.maxvnodes; do sleep 0.1; done
>
> So it seems that's not related, correct? 60K is much less than 200K.

Yes, that looks unrelated to the kern.maxvnodes limit.  Unfortunately, we
don't expose a raw counter for vnode recycling (I should really add one).
I think we might also purge unused vnodes if we have too many "free"
vnodes.  Note that directories that aren't currently open by a process
(meaning via opendir()) count as "unused", so they are subject to purging.
You can try increasing "vfs.wantfreevnodes".

Also, please try this patch.  It just adds a counter for recycled vnodes.
If this value increases during your test then it does show that recycling
is occurring.  If it doesn't, then that rules it out.

Index: vfs_subr.c
===================================================================
--- vfs_subr.c	(revision 275512)
+++ vfs_subr.c	(working copy)
@@ -156,6 +156,10 @@ static int vlru_allow_cache_src;
 SYSCTL_INT(_vfs, OID_AUTO, vlru_allow_cache_src, CTLFLAG_RW,
     &vlru_allow_cache_src, 0, "Allow vlru to reclaim source vnode");
 
+static u_long recycles_count;
+SYSCTL_ULONG(_vfs, OID_AUTO, recycles, CTLFLAG_RW, &recycles_count, 0,
+    "Number of vnodes recycled");
+
 /*
  * Various variables used for debugging the new implementation of
  * reassignbuf().
@@ -988,6 +992,7 @@ vtryrecycle(struct vnode *vp)
 		    __func__, vp);
 		return (EBUSY);
 	}
+	atomic_add_long(&recycles_count, 1);
 	if ((vp->v_iflag & VI_DOOMED) == 0)
 		vgonel(vp);
 	VOP_UNLOCK(vp, LK_INTERLOCK);

-- 
John Baldwin
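
With a kernel built from the patch above, the new counter should appear as
vfs.recycles (the name follows from the SYSCTL_ULONG declaration in the
patch).  Below is a minimal userland sketch of the "did it increase during
the test" check.  Both counter_increased() and read_recycles() are
hypothetical helper names used only for illustration, not part of the
patch, and the sysctlbyname() part is assumed to be built on FreeBSD only.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: report whether a sampled counter grew between
 * any two consecutive samples.
 */
bool
counter_increased(const unsigned long *samples, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (samples[i] > samples[i - 1])
			return (true);
	}
	return (false);
}

#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>

/*
 * Hypothetical helper: read vfs.recycles, which is assumed to exist only
 * on a kernel built with the patch.  Returns 0 on success, or -1 with
 * errno set (for example when the sysctl is missing).
 */
int
read_recycles(unsigned long *out)
{
	size_t len = sizeof(*out);

	return (sysctlbyname("vfs.recycles", out, &len, NULL, 0));
}
#endif
```

One way to use it: call read_recycles() before, during, and after a test
run, store the values in an array, and pass that to counter_increased().
A true result would suggest that vnodes were recycled during the run; a
false result would point away from recycling as the cause.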