Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 1 Jul 2014 15:30:58 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Mateusz Guzik <mjguzik@gmail.com>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Mateusz Guzik <mjg@FreeBSD.org>
Subject:   Re: svn commit: r268087 - head/sys/kern
Message-ID:  <20140701123058.GP93733@kib.kiev.ua>
In-Reply-To: <20140701115612.GA26696@dft-labs.eu>
References:  <201407010921.s619LXHL063077@svn.freebsd.org> <20140701114245.GO93733@kib.kiev.ua> <20140701115612.GA26696@dft-labs.eu>

next in thread | previous in thread | raw e-mail | index | archive | help

--rK9cUJk4ANtwQkTL
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Jul 01, 2014 at 01:56:12PM +0200, Mateusz Guzik wrote:
> On Tue, Jul 01, 2014 at 02:42:45PM +0300, Konstantin Belousov wrote:
> > Old code did the malloc(M_WAITOK) call in crget() before the text vnode
> > was locked.  After your change, the crdup() is called with the vnode lo=
cked.
> > Witness would not tell you that anything is wrong there, but the new
> > code is worse than the previous structure, even if malloc() was sometim=
es
> > done when not needed.
> >=20
> > To satisfy the memory request from malloc(), pagedaemon or laundry may
> > need to lock the vnode, which creates a circular dependency.  Pagedaemon
> > locks vnodes with timeout, which just means that it would not be able
> > to clean pages while execve() is stuck in malloc(M_WAITOK), while
> > laundry takes the vnode lock without timeout, hanging until the malloc
> > request is satisfied.
> >=20
> > The rule is, do not allocate memory while vnodes are locked.  It is not
> > always followed, but it makes no sense to change existing correct code
> > to broke the pattern.
>=20
> Right, my bad. This was intended to be a minor cleanup, I'm happy to
> revert if you want.
>=20
> Note that current code relocks the vnode already, so there should be no
> harm doing the same in 'else' case. (Although LK_RETRY looks somewhat
> fishy in here.)
>=20
> That said I propose the following:
> diff --git a/sys/kern/kern_exec.c b/sys/kern/kern_exec.c
> index cce687b..9b3a99d 100644
> --- a/sys/kern/kern_exec.c
> +++ b/sys/kern/kern_exec.c
> @@ -716,11 +716,11 @@ interpret:
>  		VOP_UNLOCK(imgp->vp, 0);
>  		setugidsafety(td);
>  		error =3D fdcheckstd(td);
> -		vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
>  		if (error !=3D 0)
>  			goto done1;
>  		newcred =3D crdup(oldcred);
>  		euip =3D uifind(attr.va_uid);
> +		vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
>  		PROC_LOCK(p);
>  		/*
>  		 * Set the new credentials.
This is definitely fine.

> @@ -764,7 +764,9 @@ interpret:
>  		if (oldcred->cr_svuid !=3D oldcred->cr_uid ||
>  		    oldcred->cr_svgid !=3D oldcred->cr_gid) {
>  			PROC_UNLOCK(p);
> +			VOP_UNLOCK(imgp->vp, 0);
>  			newcred =3D crdup(oldcred);
> +			vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
>  			PROC_LOCK(p);
>  			change_svuid(newcred, newcred->cr_uid);
>  			change_svgid(newcred, newcred->cr_gid);
Use of LK_RETRY is fine as far errors from  VOPs which actually perform
accesses to the vnode are checked.  It means that reclaimed vnode would
be detected later.

In fact, could the vnode unlock moved much earlier, in particular,
to avoid the same unlock/lock in the pmc hook call ?  The only use
for the vnode after the VREF() is done, as I see, is to check
for MNT_NOSUID.  Can we test this earlier, and cache the result ?
I do not think that the possible race with flag changing under us
matter.

> @@ -841,6 +843,7 @@ interpret:
> =20
>  	SDT_PROBE(proc, kernel, , exec__success, args->fname, 0, 0, 0, 0);
> =20
> +	VOP_UNLOCK(imgp->vp, 0);
>  done1:
>  	/*
>  	 * Free any resources malloc'd earlier that we didn't use.
This change is fine but unrelated.  There is no harm of calling free()
while holding vnode lock.

> @@ -849,7 +852,6 @@ done1:
>  		uifree(euip);
>  	if (newcred !=3D NULL)
>  		crfree(oldcred);
> -	VOP_UNLOCK(imgp->vp, 0);
> =20
>  	/*
>  	 * Handle deferred decrement of ref counts.
> --=20
> Mateusz Guzik <mjguzik gmail.com>

--rK9cUJk4ANtwQkTL
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBAgAGBQJTsqoCAAoJEJDCuSvBvK1B+xAP/0qnA6avxqIff/MDGGfJwUZz
ycHLv1oTdrz74llaIkqYri6coP+a7uywANU5+pfZBW0Vr+1qVrMAsAnOH8eUriQJ
3EhpQT0Flg70ctWQE8i/L6onrDSyamLUqPkTHLLITC9gSkR5APp3Qhbp5r11bbI7
nemko+8T83modpQPrdmQE3beBlhsRg+C/Is4iQMfct3xYuHmzBDpPYqegn3/HGFH
IPq8q5je+8vyyskGErRBh5E2R3BEKfllvQm9LRjcc8PeW4LKkfnDLjPD77JZThi0
Ag98Nzy6ME+cpdHHiFLiZ8dh2ckYjuGbj4nIt+A/7F07ayEuhzLHZLKD2s/WIOnD
EEJMPSxeXLIid7mv2JSsNXjnQWisoq0Myj8N8D0wEb4bs/WfuDyUKxn9zDfXXQ71
4l9xbGjszKxdg156SJiB1zGIuAgT+q85nr79ZcvfPo/o6xEondu/IzIqdtKJVcqo
9qQ5qkJcIHdVz2KGdTZKFnfl4dm+9v6WR+PnMiidKS31e/f8ksOK/x6mInZFk6cp
GGvzJVRuhFgzpftEYfRFauSqklLl7URIMRHZy/1ZdPmiSheYihVhKRZzUyzyZDQz
4My5/GTjfUOGZ9gonuBr7z2IW+wS54hFg8cp64j3psVMzhlWPX4m5ZlmdTpWkEro
OX8rSQQjc4lDrzVqfqQH
=8+zG
-----END PGP SIGNATURE-----

--rK9cUJk4ANtwQkTL--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140701123058.GP93733>