From owner-freebsd-hackers@FreeBSD.ORG  Thu Nov 20 17:57:11 2014
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: hackers@freebsd.org
From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Subject: Re: [BUG] Getting path to program binary sometimes fails
Date: Thu, 20 Nov 2014 11:25:29 -0500
User-Agent: KMail/1.13.5 (FreeBSD/8.4-CBSD-20140415; KDE/4.5.5; amd64; ; )
References: <91809230-5E81-4A6E-BFD6-BE8815A06BB2@logicnow.com>
 <20141113170758.GY17068@kib.kiev.ua>
 <B655709E-0D6F-4DE1-A746-9A20B897BEA8@logicnow.com>
In-Reply-To: <B655709E-0D6F-4DE1-A746-9A20B897BEA8@logicnow.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="windows-1252"
Content-Transfer-Encoding: quoted-printable
Message-Id: <201411201125.30087.jhb@freebsd.org>
Cc: Konstantin Belousov <kostikbel@gmail.com>,
 Mike Gelfand <Mike.Gelfand@logicnow.com>,
 "hackers@freebsd.org" <hackers@freebsd.org>
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>

On Friday, November 14, 2014 4:54:18 am Mike Gelfand wrote:
> On Nov 13, 2014, at 8:07 PM, Konstantin Belousov <kostikbel@gmail.com> wrote:
>
> > This is not a defect.  The vnode->path translation uses namecache, which
> > could be purged at any time.  The behaviour is typical for most unix
> > implementations.  Linux and new Solaris have 'rigid' namecache, where
> > name entry lifetime is the same as the vnode lifetime it is attached to.
> > I am not aware of any useful consequences of such design, except
> > vn_fullpath() working more reliable, but at the cost of increased
> > memory usage.
>
> The man page for sysctl(3) states that "Unless explicitly noted below,
> sysctl() returns a consistent snapshot of the data requested" (surely we
> don't expect half the path being returned; I'm just trying to read
> thoroughly). Later on there are no special notes on {CTL_KERN, KERN_PROC,
> KERN_PROC_PATHNAME}; at least no notes on the unstable behavior being
> observed, and no funny details of internal implementation you describe.
> ERRORS section only describes ENOENT condition as "The name array specifies
> a value that is unknown," which certainly is not the case here.

Note that sysctl(3) is describing a generic interface that mostly returns
integers.  The language is trying to state that when you read the values you
get a consistent snapshot of whatever logical values a node provides (e.g.,
for a 64-bit int on a 32-bit system it will try to return a consistent value
rather than one which mixes 32-bit halves from different values of the
associated variable, and a node like the kern.cp_times sysctl (for the
cp_times[] array) will return a consistent snapshot of the entire array of
ints).  It is not saying that a node is not permitted to say "I have no valid
data at this time."  If anything, I think that a node is obligated to return
that instead of partial data (as you somewhat noted).
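
For the KERN_PROC_PATHNAME case discussed in this thread, that reading might
be sketched in C roughly as follows.  This is only an illustration, assuming
(as reported earlier in the thread) that the failure surfaces as ENOENT; the
names path_status, classify_path_result and self_path are hypothetical, and
the sysctl(3) call itself is compiled only on FreeBSD.

```c
#include <errno.h>
#include <stddef.h>

#if defined(__FreeBSD__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical result type: a full path, no data right now, or a real error. */
enum path_status { PATH_OK, PATH_UNAVAILABLE, PATH_ERROR };

/*
 * Map the return value and errno of a sysctl(3) call to a status.
 * Assumption: ENOENT from this node means "no valid data at this time"
 * (for example, the name cache entry was purged), not a bad MIB.
 */
enum path_status
classify_path_result(int rc, int err)
{
	if (rc == 0)
		return (PATH_OK);
	return (err == ENOENT ? PATH_UNAVAILABLE : PATH_ERROR);
}

#if defined(__FreeBSD__)
/* Ask the kernel for the path of the running binary (pid -1 = this process). */
enum path_status
self_path(char *buf, size_t buflen)
{
	int mib[4] = { CTL_KERN, KERN_PROC, KERN_PROC_PATHNAME, -1 };
	size_t len = buflen;
	int rc;

	rc = sysctl(mib, 4, buf, &len, NULL, 0);
	/* On success buf holds the whole path; on failure nothing is used. */
	return (classify_path_result(rc, errno));
}
#endif
```

A caller would treat PATH_UNAVAILABLE as "no valid data right now" rather
than as a malformed request.  Nothing in this sketch implies that a later
call is guaranteed to succeed.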

> Since you're saying that current behavior is not a defect, maybe
> documentation is wrong (incomplete, misleading) then? I will readily accept
> the "not a defect" explanation, but only if one wouldn't have to ask you
> every time this oddity is met. If this is the expected error condition, what
> should I do to get the path reliably? Should I retry (and how many times)?
> You're saying cache is being purged; does it mean that when I ask for path
> then cache is populated again? Does it guarantee then that I'll be able to
> get the path on next call? Could you guarantee that I'll be able to get the
> path at all if I fail two or more times? Should I rely on ENOENT
> specifically when retrying?

Is this over NFS?  NFS is more aggressive than local filesystems in purging
name cache entries because there are inherent races in NFS with certain
fileservers (ones that don't use sub-second timestamps), so by default entries
always expire after about a minute.  You can change that via the 'nametimeo'
mount option (takes a count in seconds).

-- 
John Baldwin