Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 16 Nov 2013 20:31:29 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Doug Ambrisko <ambrisko@ambrisko.com>
Cc:        freebsd-hackers@freebsd.org, Dirk Engling <erdgeist@erdgeist.org>, Jase Thew <jase@freebsd.org>, mdf@freebsd.org
Subject:   Re: Re: Fix MNAMELEN or reimplement struct statfs
Message-ID:  <20131116183129.GD59496@kib.kiev.ua>
In-Reply-To: <20131115010854.GA76106@ambrisko.com>
References:  <51B3B59B.8050903@erdgeist.org> <CAMBSHm8GMWffuuEcSpuNu26Mv4N2yAa2iEdw5koiXx0w30zPRQ@mail.gmail.com> <201306101152.17966.jhb@freebsd.org> <52854161.6080104@FreeBSD.org> <20131115010854.GA76106@ambrisko.com>

next in thread | previous in thread | raw e-mail | index | archive | help

--0vFzRikYu/73UBJt
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Nov 14, 2013 at 05:08:54PM -0800, Doug Ambrisko wrote:
> On Thu, Nov 14, 2013 at 09:32:17PM +0000, Jase Thew wrote:
> | On 10/06/2013 16:52, John Baldwin wrote:
> | > On Saturday, June 08, 2013 9:36:27 pm mdf@freebsd.org wrote:
> | >> On Sat, Jun 8, 2013 at 3:52 PM, Dirk Engling <erdgeist@erdgeist.org>=
 wrote:
> | >>
> | >>> The arbitrary value
> | >>>
> | >>> #define MNAMELEN        88              /* size of on/from name buf=
s */
> | >>>
> | >>> struct statfs {
> | >>> [...]
> | >>>         char    f_mntfromname[MNAMELEN];/* mounted filesystem */
> | >>>         char    f_mntonname[MNAMELEN];  /* directory on which mount=
ed */
> | >>> };
> | >>>
> | >>> currently bites us when trying to use poudriere with errors like
> | >>>
> | >>> 'mount: tmpfs: File name too long'
> | >>>
> | >>>
> | >>> /poudriere/data/build/91_RELEASE_amd64-REALLY-REALLY-LONG-
> | > JAILNAME/ref/wrkdirs
> | >>>
> | >>> The topic has been discussed several times since 2004 and has been
> | >>> postponed each time, the last time when it hit zfs users:
> | >>>
> | >>> http://lists.freebsd.org/pipermail/freebsd-fs/2010-March/007974.html
> | >>>
> | >>> So I'd like to point to the calendar, it's 2013 already and there's
> | >>> still a static arbitrary (and way too low) limit in one of the core
> | >>> areas of the vfs code.
> | >>>
> | >>> So I'd like to bump the issue and propose either making f_mntfromna=
me a
> | >>> dynamic allocation or just increase MNAMELEN, using 10.0 as water s=
hed.
> | >>>
> | >>
> | >> Gleb Kurtsou did this along with the ino64 GSoC project.  Unfortunat=
ely,
> | >>  both he and I hit ENOTIME due to the job that pays the bills and it=
's
> | >> never made it back to the main repository.
> | >>
> | >> IIRC, though, the only reason for doing it with 64-bit ino_t is that=
 he'd
> | >> already finished changing the stat/dirent ABI so what was one more. =
 I
> | >> think he went with 1024 bytes, which also necessitated not allocating
> | >> statfs on the stack for the kernel.
> | >=20
> | > He also fixed a few other things since changing this ABI is so invasi=
ve
> | > IIRC.  This really is the right fix for this.  Is it in an svn branch=
=20
> | > that can be updated and a new patch generated?
> | >=20
> |=20
> | Hi folks,
> |=20
> | Has there been any progress on addressing this issue? With the advent of
> | pkgng / poudriere, this limitation is really becoming a frustrating pro=
blem.
>=20
> I looked at NetBSD and what they did with statvfs.  The mount paths
> lengths are bigger in NetBSD defines so that helps.  However, when
> testing it out via a script that keep on doing a nullfs mount in=20
> every increasing directory depth I found that NetBSD would allow the
> mount to exceed the value in statvfs.  When NetBSD populates the path
> in statvfs they truncate it to what fits in statvfs.  So I looked at
> what that might be like in FreeBSD.  So I came up with this simple patch:
>=20
> --- /sys/kern/vfs_mount.c	2013-10-01 14:27:35.000000000 -0700
> +++ vfs_mount.c	2013-10-21 14:20:19.000000000 -0700
> @@ -656,7 +656,7 @@ vfs_donmount(struct thread *td, uint64_t
>  	 * variables will fit in our mp buffers, including the
>  	 * terminating NUL.
>  	 */
> -	if (fstypelen >=3D MFSNAMELEN - 1 || fspathlen >=3D MNAMELEN - 1) {
> +	if (fstypelen >=3D MFSNAMELEN - 1 || fspathlen >=3D MAXPATHLEN - 1) {
>  		error =3D ENAMETOOLONG;
>  		goto bail;
>  	}
> @@ -748,8 +748,8 @@ sys_mount(td, uap)
>  		return (EOPNOTSUPP);
>  	}
> =20
> -	ma =3D mount_argsu(ma, "fstype", uap->type, MNAMELEN);
> -	ma =3D mount_argsu(ma, "fspath", uap->path, MNAMELEN);
> +	ma =3D mount_argsu(ma, "fstype", uap->type, MFSNAMELEN);
> +	ma =3D mount_argsu(ma, "fspath", uap->path, MAXPATHLEN);
>  	ma =3D mount_argb(ma, flags & MNT_RDONLY, "noro");
>  	ma =3D mount_argb(ma, !(flags & MNT_NOSUID), "nosuid");
>  	ma =3D mount_argb(ma, !(flags & MNT_NOEXEC), "noexec");
> @@ -1039,7 +1039,7 @@ vfs_domount(
>  	 * variables will fit in our mp buffers, including the
>  	 * terminating NUL.
>  	 */
> -	if (strlen(fstype) >=3D MFSNAMELEN || strlen(fspath) >=3D MNAMELEN)
> +	if (strlen(fstype) >=3D MFSNAMELEN || strlen(fspath) >=3D MAXPATHLEN)
>  		return (ENAMETOOLONG);
> =20
>  	if (jailed(td->td_ucred) || usermount =3D=3D 0) {
> @@ -1095,9 +1095,9 @@ vfs_domount(
>  	NDFREE(&nd, NDF_ONLY_PNBUF);
>  	vp =3D nd.ni_vp;
>  	if ((fsflags & MNT_UPDATE) =3D=3D 0) {
> -		pathbuf =3D malloc(MNAMELEN, M_TEMP, M_WAITOK);
> +		pathbuf =3D malloc(MAXPATHLEN, M_TEMP, M_WAITOK);
>  		strcpy(pathbuf, fspath);
> -		error =3D vn_path_to_global_path(td, vp, pathbuf, MNAMELEN);
> +		error =3D vn_path_to_global_path(td, vp, pathbuf, MAXPATHLEN);
>  		/* debug.disablefullpath =3D=3D 1 results in ENODEV */
>  		if (error =3D=3D 0 || error =3D=3D ENODEV) {
>  			error =3D vfs_domount_first(td, vfsp, pathbuf, vp,
> @@ -1147,8 +1147,8 @@ sys_unmount(td, uap)
>  			return (error);
>  	}
> =20
> -	pathbuf =3D malloc(MNAMELEN, M_TEMP, M_WAITOK);
> -	error =3D copyinstr(uap->path, pathbuf, MNAMELEN, NULL);
> +	pathbuf =3D malloc(MAXPATHLEN, M_TEMP, M_WAITOK);
> +	error =3D copyinstr(uap->path, pathbuf, MAXPATHLEN, NULL);
>  	if (error) {
>  		free(pathbuf, M_TEMP);
>  		return (error);
> @@ -1181,7 +1181,7 @@ sys_unmount(td, uap)
>  			vfslocked =3D NDHASGIANT(&nd);
>  			NDFREE(&nd, NDF_ONLY_PNBUF);
>  			error =3D vn_path_to_global_path(td, nd.ni_vp, pathbuf,
> -			    MNAMELEN);
> +			    MAXPATHLEN);
>  			if (error =3D=3D 0 || error =3D=3D ENODEV)
>  				vput(nd.ni_vp);
>  			VFS_UNLOCK_GIANT(vfslocked);
>=20
> I seemed to have found a typo bug in an instance in which MFSNAMELEN
> wasn't used in the fstype when I did this change.
>=20
> With this patch things in general seem to work.  You can do a
> mount and umount of a long path.  The umount of the long path works
> by failing on the exact match but then passing when via the FSID.
> df/mount looks a little strange since it shows a truncated path=20
> but has valid contents (FS type, space etc.).  umount via the truncated
> path works if there is only one truncated path that matches.  If there
> are multiple then it fails.
>=20
> This doesn't change and kernel ABI's so then it is safe to apply to the
> kernel without rebuilding user-land.
>=20
> Future work could be to implement statvfs to return a longer path but
> only do it for df/umount etc.  The rest of the system could continue
> with the existing statfs.  mount works because it passed a string into
> the kernel so it can be long.
>=20
> I'd propose this as a current solution to this problem.  It appears to
> work well and shouldn't drastically break things.  Doing df via the
> full path, stat etc. work since the associated path access the vnode.
> So things that do a mount, df of the mount point etc. should continue
> to work.  Scripts that try to figure out the mount points vi df and mount
> displaying all mount points would fail.  That is probably good enough for
> now.
>=20
> Comments welcomed.

Generally, I agree with the approach, but what is done seems to be too
simple to be usable.

One obvious and important thing which is broken with the patch is
the unmounts from jails. In other words, now it is possible to mount
something from jail with appropriate privileges set up, and after that
corresponding mount cannot be unmounted, since vfs_mount_alloc() copies
trimmed path into f_mntonname, and sys_unmount() matches full path with
pathbuf.  Hmm, this should be broken in the same way for non-jailed
mounts with pathes which do not fit into f_mntonname.

I think that struct mount should have a const char * field where the
non-trimmed path is stored and used for match at unmount. f_mntonname
truncation would be only unfortunate user interface glitch.

--0vFzRikYu/73UBJt
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (FreeBSD)

iQIcBAEBAgAGBQJSh7oAAAoJEJDCuSvBvK1B/8QP/izzxCeGUxj94MJEfRZhG08X
xNy2X5ZxLNO9m2iCa72648bzJxpnJvKiLijj4ZsVTC/vX5O3vOCtP1B6Kve86Xpe
6QqXLLQfSRQFVeAVBA2caH+eGveBbr/71ED1AnVgqY2a69FNhSVPj5YgHOjVcplr
o17tv2/f4pUaKVTCH5Qlkypbcrz6CmlTjwj7Z/qRFddf9tObl75JGGfptaHg+Uhy
ur/R+6vFVF+iKpjMTZmtMQT4pA8FqwfbCicgz+baIO19ie0h0eWZOLqe9rL4nK4/
3B9Ux2PBd0y+CxbHg+TbEJfuWKish6FWJrT47Nzk4Wufs8y7R58Rm5EvGGRKokzZ
P+HUUTWox1HayOpE4glQYhy1B2J/ABVmVD2iscFmyZHxFFzKNem1Q2PpZkpj4EaH
yuj0B4RNqm/p9N0AKnyIzzd92Ouhf8Kx+k9HIbIpjBgQ1fGwk+ANnuBtIA65e/Af
l1R4DtbglqNipcXxT49ahaI71uAQZCwgkoDSyyOXYWJHpqhRxblQsRdD4rYVreIb
VQxATOJDAWvxZ8+ariz8vc9Tf+AFhsjGexvtC6bgVoKZqrh9WgbEjIo3zf1UuM6Q
ynQGThMhDyTTXdxpXq1C6rvDSRzvflUMHQJn3ByjpWZlndki8nFX95R9CzzP21Vt
/jfbVyPcUIWrkrNZ/aeA
=FE3x
-----END PGP SIGNATURE-----

--0vFzRikYu/73UBJt--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20131116183129.GD59496>