From: John Baldwin
To: Mike Gelfand
Cc: Konstantin Belousov, "freebsd-hackers@freebsd.org", "hackers@freebsd.org"
Subject: Re: [BUG] Getting path to program binary sometimes fails
Date: Fri, 05 Dec 2014 10:19:01 -0500
Message-ID: <2066750.N3TZpYSHCy@ralph.baldwin.cx>

On Friday, December 05, 2014 12:01:15 PM Mike Gelfand wrote:
> John,
>
> Sorry for late reply.
>
> On Nov 20, 2014, at 7:25 PM, John Baldwin wrote:
> >> Since you’re saying that current behavior is not a defect, maybe
> >> documentation is wrong (incomplete, misleading) then? I will readily
> >> accept the “not a defect” explanation, but only if one wouldn’t have to
> >> ask you every time this oddity is met. If this is the expected error
> >> condition, what should I do to get the path reliably? Should I retry
> >> (and how many times)? You’re saying cache is being purged; does it mean
> >> that when I ask for path then cache is populated again? Does it
> >> guarantee then that I’ll be able to get the path on next call? Could you
> >> guarantee that I’ll be able to get the path at all if I fail two or more
> >> times? Should I rely on ENOENT specifically when retrying?
> >
> > Is this over NFS?  NFS is more aggressive than local filesystems in
> > purging name cache entries because there are inherent races in NFS with
> > certain fileservers (ones that don't use sub-second timestamps), so by
> > default entries always expire after about a minute.  You can change that
> > via the 'nametimeo' mount option (takes a count in seconds).
>
> No, not NFS but ZFS. Could that be an issue? The FreeBSD 8 machine I
> mentioned before has UFS.
>
> Also, as you can see from the video I recorded (and from the code I
> provided), path resolution succeeds and fails within fractions of a second
> after process startup.

Are you seeing vnodes being actively recycled?  In particular, do you see
vfs.numvnodes close to kern.maxvnodes?  You can try raising kern.maxvnodes.
If vfs.numvnodes grows up to the limit, then as long as you can stomach the
RAM cost of having more vnodes around, raising the limit would increase the
chances of your paths remaining valid.

-- 
John Baldwin
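
A minimal sketch of the vnode check suggested above, assuming a FreeBSD host
where `sysctl -n` prints the bare value of vfs.numvnodes and kern.maxvnodes.
The helper names and the 0.9 "close to the limit" threshold are illustrative
assumptions, not anything taken from the thread or from FreeBSD itself.

```python
import subprocess


def read_sysctl(name):
    """Return a sysctl's integer value using `sysctl -n NAME` (FreeBSD)."""
    result = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())


def vnode_pressure(numvnodes, maxvnodes, threshold=0.9):
    """Return (ratio, near_limit) for the given vnode counters.

    `threshold` is an arbitrary cut-off chosen for this sketch; pick
    whatever margin makes sense for the workload.
    """
    if maxvnodes <= 0:
        raise ValueError("maxvnodes must be positive")
    ratio = numvnodes / maxvnodes
    return ratio, ratio >= threshold


def report():
    """Print current vnode usage; intended to be run on the affected host."""
    num = read_sysctl("vfs.numvnodes")
    limit = read_sysctl("kern.maxvnodes")
    ratio, near = vnode_pressure(num, limit)
    print(f"vfs.numvnodes={num} kern.maxvnodes={limit} ({ratio:.0%} in use)")
    if near:
        print("numvnodes is close to the limit; raising kern.maxvnodes may help")
```

Calling `report()` a few times while the failing program runs should show
whether vfs.numvnodes sits near kern.maxvnodes. If it does, the limit can be
raised at runtime with `sysctl kern.maxvnodes=<new value>`, at the cost of
more RAM held by vnodes.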