Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 6 Apr 2011 05:03:29 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Knob to turn off _POSIX_NO_TRUNC
Message-ID:  <20110406034131.L2689@besplex.bde.org>
In-Reply-To: <20110405141631.GA78089@deviant.kiev.zoral.com.ua>
References:  <20110405141631.GA78089@deviant.kiev.zoral.com.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, 5 Apr 2011, Kostik Belousov wrote:

> From very old and gloomy SysV times I remembered filesystem behaviour
> that silently truncated the file name components to the NAME_MAX limit,
> that was, AFAIR, 14. To much of my dismay, I met some usermode software
> recently that blindly tried to create the file from externally provided
> name, and sometimes failed with ENAMETOOLONG in similar situation.
> The authors are not cooperative.

Apply rmrf.

> I ended up with the following hack, which almost turns off the
> _POSIX_NO_TRUNC behaviour, globally on the system. Patch allowed me
> to proceed. The cost in the default case is a single check, which is
> performed only on ENAMETOOLONG path.

Please don't commit this.  This is already handled correctly for gloomy
file systems by returning 0 from pathconf(path, _PC_NO_TRUNC) for ones
that have short names, although the global definitions of _POSIX_NO_TRUNC
and NAME_MAX have always been broken in FreeBSD.  msdosfs does this
even when it has long names.  The main other file systems that do it
are hpfs and ntfs, which I think shouldn't do it since they always have
long names.  Perhaps they do this for compatibility with WinDOS.

FreeBSD defines _POSIX_NO_TRUNC as 1, which means that file names are
_never_ truncated and is inconsistent what actual file systems like
msdsofs do.  FreeBSD defines NAME_MAX as a certain value, which means
that the name length limit is _always_ this value and is inconsistent
with the limit that actual file systems like msdosfs have.

> diff --git a/sys/kern/vfs_lookup.c b/sys/kern/vfs_lookup.c
> index 50a2570..e9e7697 100644
> --- a/sys/kern/vfs_lookup.c
> +++ b/sys/kern/vfs_lookup.c
> @@ -99,6 +99,11 @@ SYSCTL_INT(_vfs, OID_AUTO, lookup_shared, CTLFLAG_RW, &lookup_shared, 0,
>     "Enables/Disables shared locks for path name translation");
> TUNABLE_INT("vfs.lookup_shared", &lookup_shared);
>
> +static int lookup_trim;
> +SYSCTL_INT(_vfs, OID_AUTO, lookup_trim, CTLFLAG_RW, &lookup_trim, 0,
> +    "Enables/Disables trim of the long path component instead of ENAMETOOLONG");

Enabling this is another way of breaking the guarantee that the compile-time
constant _POSIX_NO_TRUNC gives the run-time behaviour.

> +TUNABLE_INT("vfs.lookup_trim", &lookup_trim);

It is a style bug (bloat) to have a tunable for something that doesn't
need to be set for booting.

> +
> /*
>  * Convert a pathname into a pointer to a locked vnode.
>  *
> @@ -514,8 +519,14 @@ dirloop:
> 		continue;
> 	cnp->cn_namelen = cp - cnp->cn_nameptr;
> 	if (cnp->cn_namelen > NAME_MAX) {

This hard coding of NAME_MAX gives various bugs which are now larger
than before.  Before, the only bug was that file systems that could
support a {NAME_MAX} (= pathconf(path, _PC_NAME_MAX) longer than the
compile-time constant NAME_MAX can't do it, since the vfs level rejects
such names here.  Some check is needed here to avoid overrunning the
buffer, but the limit on the buffer size shouldn't the one for a
broken implementation of the user API.  With a non-broken implementation,
NAME_MAX cannot be defined (except possibly visibly only in the kernel)
and the limit here should be the maximum needed by all file systems.
This can then be increased if a new file system needs it without breaking
the user API, or it can be made larger than necessary to allow for
expansion with incomplete coordination with new file systems.

> -		error = ENAMETOOLONG;
> -		goto bad;

Even if the name length is not > NAME_MAX, the ENAMETOOLONG error or
truncation may still occur in inidividualy file systems.  I doubt that
all file systems handle this correctly.  For example, msdosfs never
returns ENAMETOOLONG, and this is correct since msdosfs advertises
{_PC_NO_TRUNC} = 0.  But {_PC_NO_TRUNC} = 0 is wrong since the above
implements _PC_NO_TRUNC = 1 unconditionally if the the name length
exceeds NAME_MAX.

> +		if (!lookup_trim) {
> +			error = ENAMETOOLONG;
> +			goto bad;
> +		}
> +		ndp->ni_pathlen -= cnp->cn_namelen - NAME_MAX;
> +		cnp->cn_namelen = NAME_MAX;
> +		strcpy(cnp->cn_nameptr + cnp->cn_namelen, cp);
> +		cp = cnp->cn_nameptr + cnp->cn_namelen;

This makes {_PC_NO_TRUNC} as perfectly broken as _POSIX_NO_TRUNC for
file systems that implement _POSIX_NO_TRUNC, since the patch doesn't
touch individual file systems so all the ones that return 1 for
pathconf(path, _PC_NO_TRUNC) become inconsistent with their actual
behavior.  If you fix this, then the patch becomes correct but uglier.

> 	}
> #ifdef NAMEI_DIAGNOSTIC
> 	{ char c = *cp;
>

The code for the pathconf() vop in individual file systems has
considerable ugliness related to this.  Most file systems want the
defaults for everything related to pathnames, but the implementation
provides no way to get the defaults from another layer, so most vops
duplicate a not-so-big case statement.  OTOH vop_stdpathconf() provides
a "default" for the whole vop which is very nonstandard and unsuitable
as a default for almost everything in it that is not related to
pathnames.  It also has large style bugs (indentation errors) which
were not present in the place it was copied from.  It is essentially
devfs's patconf vop, and it now seems to be used only by coda, devfs,
fdescfs, portalfs and zfs.  The use of it in coda, devfs and zfs shows
how it can be be used as a default after all: start with an fs-specific
vop and only use vop_stdpathconf() for cases that you don't want to
override.  But vop_stdpathconf() is unsuitable for this, since it
provides many cases that are correct for at most devfs or only for a
few file systems.  These are:

     _PC_LINK_MAX = LINK_MAX:  wrong because most file systems don't support
 			      hard links
     _PC_MAX_CANON = MAX_CANON
     _PC_MAX_INPUT = MAX_INPUT
     _PC_VDISABLE = POSIX_VDISABLE:
 			     wrong because only devfs supports device files
 			     wrong for most devfs files too, since most
 			     device files are not ttys
 			     MAX_CANON and MAX_INPUT are also wrong
     _PC_PIPE_BUF = PIPE_BUF: wrong because most file systems don't support
 			     named pipes or sockets
 			     PIPE_BUF may also be wrong for sockets (I
 			     think it is a variable depending on watermarks
 			     which can be set by ioctls)
After removing these, only the pathname-related cases _PC_NAME_MAX and
_PC_NO_TRUNC, and _PC_CHOWN_RESTRICTED are left.  Perhaps it's not worth
having a default for just these.  ufs_pathconf() is now quite large and
ugly, with many ufs-specific cases that can't be defaulted, and almost
total disorder for the cases.  It wouldn't benefit from defaulting just
3 cases.  It has complications related to fifos.  This seems to result
in the pathconf selectors for pathnames failing for pathnames that refer
to a fifo.  Whether they work is implementation-defined (they are only
required to work for pathnames that refer to directories), but FreeBSD
normally makes them work for all types of files.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110406034131.L2689>