Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 09 Feb 2007 14:03:37 +0800
From:      LI Xin <delphij@delphij.net>
To:        freebsd-arch@freebsd.org
Subject:   Patch for review: resolve a race condition in [sg]etpriority()
Message-ID:  <45CC0EB9.7030400@delphij.net>

next in thread | raw e-mail | index | archive | help
This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig1C02D17C056105FC2139BFB9
Content-Type: multipart/mixed; boundary="------------070209090508050504040207"

This is a multi-part message in MIME format.
--------------070209090508050504040207
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hi,

Here is a patch which resolves a race condition in [sg]etpriority(), but
I think there might be some better idea to resolve the problem, so I
would appreciate if someone would shed me some light :-)

Description of the problem:

In PRIO_USER of both [sg]etpriority(), p_ucred is expected to be a valid
reference to a valid user credential.  However, for PRS_NEW processes,
there is chances where it is not properly initialized, causing a fatal
trap 12 [1].

On the other hand, just determining whether a process is in PRS_NEW
state and skip those who are newly born is not enough for semantical
correctness.  A process could either have p_{start,end}copy section
copied or not, which needs to be handled with care.

The attached solution:

What I did in the attached patch is basically:

 - Before allproc_lock is sx_xunlock'ed in fork1(), lock both the parent
and the child.
 - After releasing allproc_lock, do process p_{start,end}copy,
p_{start,end}zero and crhold() immediately, and release parent and
child's p_mtx as soon as possible.
 - In getpriority(), simply skip PRS_NEW processes because they does not
affect PRIO_USER path's result.  This part is not really necessary for
correctness, though.

The pros for this is that it does not require an extensive API change,
and in theory it does not require that the allproc lock to be held for a
extended period; the cons is that this adds some overhead because it
adds two pairs of mutex lock/unlock.

In a private discussion, there was also some other ideas.  One is to
just move the unlock to where the process is completely initialized,
another is to introduce an event handler and msleep() p_state when
necessary, and do a wakeup once we bumped the state, but I think these
solution have their own limitations just like mine.

[1] Reproducing script can be obtained from kern/108071.

Cheers,
--=20
Xin LI <delphij@delphij.net>	http://www.delphij.net/
FreeBSD - The Power to Serve!

--------------070209090508050504040207
Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0";
	name="patch-priority.20070124"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline;
 filename="patch-priority.20070124"

Index: kern_fork.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /home/ncvs/src/sys/kern/kern_fork.c,v
retrieving revision 1.266
diff -u -p -r1.266 kern_fork.c
--- kern_fork.c	23 Jan 2007 08:46:50 -0000	1.266
+++ kern_fork.c	24 Jan 2007 08:35:31 -0000
@@ -423,8 +423,22 @@ again:
 	AUDIT_ARG(pid, p2->p_pid);
 	LIST_INSERT_HEAD(&allproc, p2, p_list);
 	LIST_INSERT_HEAD(PIDHASH(p2->p_pid), p2, p_hash);
+
+	PROC_LOCK(p2);
+	PROC_LOCK(p1);
+
 	sx_xunlock(&allproc_lock);
=20
+	bcopy(&p1->p_startcopy, &p2->p_startcopy,
+	    __rangeof(struct proc, p_startcopy, p_endcopy));
+	PROC_UNLOCK(p1);
+
+	bzero(&p2->p_startzero,
+	    __rangeof(struct proc, p_startzero, p_endzero));
+
+	p2->p_ucred =3D crhold(td->td_ucred);
+	PROC_UNLOCK(p2);
+
 	/*
 	 * Malloc things while we don't hold any locks.
 	 */
@@ -482,13 +496,9 @@ again:
 	PROC_LOCK(p2);
 	PROC_LOCK(p1);
=20
-	bzero(&p2->p_startzero,
-	    __rangeof(struct proc, p_startzero, p_endzero));
 	bzero(&td2->td_startzero,
 	    __rangeof(struct thread, td_startzero, td_endzero));
=20
-	bcopy(&p1->p_startcopy, &p2->p_startcopy,
-	    __rangeof(struct proc, p_startcopy, p_endcopy));
 	bcopy(&td->td_startcopy, &td2->td_startcopy,
 	    __rangeof(struct thread, td_startcopy, td_endcopy));
=20
@@ -511,7 +521,6 @@ again:
 	sched_fork(td, td2);
=20
 	mtx_unlock_spin(&sched_lock);
-	p2->p_ucred =3D crhold(td->td_ucred);
 	td2->td_ucred =3D crhold(p2->p_ucred);
 #ifdef AUDIT
 	audit_proc_fork(p1, p2);
Index: kern_resource.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /home/ncvs/src/sys/kern/kern_resource.c,v
retrieving revision 1.165
diff -u -p -r1.165 kern_resource.c
--- kern_resource.c	17 Jan 2007 14:58:53 -0000	1.165
+++ kern_resource.c	24 Jan 2007 02:06:13 -0000
@@ -143,6 +143,9 @@ getpriority(td, uap)
 			uap->who =3D td->td_ucred->cr_uid;
 		sx_slock(&allproc_lock);
 		FOREACH_PROC_IN_SYSTEM(p) {
+			/* Do not bother to check PRS_NEW processes */
+			if (p->p_state =3D=3D PRS_NEW)
+				continue;
 			PROC_LOCK(p);
 			if (!p_cansee(td, p) &&
 			    p->p_ucred->cr_uid =3D=3D uap->who) {

--------------070209090508050504040207--

--------------enig1C02D17C056105FC2139BFB9
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFzA65OfuToMruuMARA7cHAJsGWVZlMcoepLM7h1+zT014Cj6u7gCgge6e
RL6f6wX1VMZHkO8l84269xQ=
=fkK6
-----END PGP SIGNATURE-----

--------------enig1C02D17C056105FC2139BFB9--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?45CC0EB9.7030400>