Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 18 Mar 2009 18:32:22 +0200
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Daniel Eischen <deischen@freebsd.org>
Cc:        David Schultz <das@freebsd.org>, hackers@freebsd.org, davidxu@freebsd.org, Jason Evans <jasone@freebsd.org>, Julian Elischer <julian@elischer.org>
Subject:   Re: threaded, forked, rethreaded processes will deadlock
Message-ID:  <20090318163222.GE7716@deviant.kiev.zoral.com.ua>
In-Reply-To: <Pine.GSO.4.64.0901220021320.4150@sea.ntplx.net>
References:  <4966F81C.3070406@elischer.org> <20090109163426.GC2825@green.homeunix.org> <49678BBC.8050306@elischer.org> <20090116211959.GA12007@green.homeunix.org> <49710BD6.7040705@FreeBSD.org> <20090120004135.GB12007@green.homeunix.org> <20090121230033.GC12007@green.homeunix.org> <Pine.GSO.4.64.0901211831210.4150@sea.ntplx.net> <20090122045637.GA61058@zim.MIT.EDU> <Pine.GSO.4.64.0901220021320.4150@sea.ntplx.net>

next in thread | previous in thread | raw e-mail | index | archive | help

--N1GIdlSm9i+YlY4t
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Jan 22, 2009 at 12:42:56AM -0500, Daniel Eischen wrote:
> On Wed, 21 Jan 2009, David Schultz wrote:
>=20
> >I think there *is* a real bug here, but there's two distinct ways
> >to fix it. When a threaded process forks, malloc acquires all its
> >locks so that its state is consistent after a fork. However, the
> >post-fork hook that's supposed to release these locks fails to do
> >so in the child because the child process isn't threaded, and
> >malloc_mutex_unlock() is optimized to be a no-op in
> >single-threaded processes. If the child *stays* single-threaded,
> >malloc() works by accident even with all the locks held because
> >malloc_mutex_lock() is also a no-op in single-threaded processes.
> >But if the child goes multi-threaded, then things break.
> >
> >Solution 1 is to actually unlock the locks in the child process,
> >which is what Brian is proposing.
> >
> >Solution 2 is to take the position that all of this pre- and
> >post-fork bloat in the fork() path is gratuitous and should be
> >removed. The rationale here is that if you fork with multiple
> >running threads, there's scads of ways in which the child's heap
> >could be inconsistent; fork hooks would be needed not just in
> >malloc(), but in stdio, third party libraries, etc. Why should
> >malloc() be special? It's the programmer's job to quiesce all the
> >threads before calling fork(), and if the programmer doesn't do
> >this, then POSIX only guarantees that async-signal-safe functions
> >will work.
> >
> >Note that Solution 2 also fixes Brian's problem if he quiesces all
> >of his worker threads before forking (as he should!) With the
> >pre-fork hook removed, all the locks will start out free in the
> >child.  So that's what I vote for...
>=20
> The problem is that our own libraries (libthr included)
> need to malloc() for themselves, even after a fork() in
> the child.  After a fork(), the malloc locks should be
> reinitialized in the child if it was threaded, so that
> our implementation actually works for all the async
> signal calls, fork(), exec(), etc.  I forget the exact
> failure modes for very common cases, but if you remove
> the re-initialization of the malloc locks, I'm sure
> you will have problems.
>=20
> Perhaps much of this malloc() stuff goes away when we
> move to pthread locks that are not pointers to allocated
> objects, but instead are actual objects/structures.
> This needs to be done in order for mutexes/CVs/etc
> to be PTHREAD_PROCESS_SHARED (placed in shared memory
> and used by multiple processes).  In other words,
> pthread_mutex_t goes from this:
>=20
> 	typedef struct pthread_mutex *pthread_mutex_t;
>=20
> to something like this:
>=20
> 	struct __pthread_mutex {
> 		uint32_t	lock;
> 		...
> 	}
> 	typedef struct __pthread_mutex pthread_mutex_t;
>=20
> Same thing for CVs, and we probably should convert any other
> locks used internally by libc/libpthread (spinlocks).
>=20
> So after a fork(), there is no need to reallocate anything,
> it can just be reinitialized if necessary.
>=20

I looked at the issue once more recently, and I propose the following
much less intrusive patch. It is somewhat hackish, but I think that
it would be good to have this working. Most other Unixes do have
working thread library after the fork. Any objections ?

diff --git a/lib/libthr/thread/thr_fork.c b/lib/libthr/thread/thr_fork.c
index bc410d1..ae6b9ad 100644
--- a/lib/libthr/thread/thr_fork.c
+++ b/lib/libthr/thread/thr_fork.c
@@ -173,14 +173,19 @@ _fork(void)
 		/* Ready to continue, unblock signals. */=20
 		_thr_signal_unblock(curthread);
=20
-		if (unlock_malloc)
+		if (unlock_malloc) {
+			__isthreaded =3D 1;
 			_malloc_postfork();
+			__isthreaded =3D 0;
+		}
=20
 		/* Run down atfork child handlers. */
 		TAILQ_FOREACH(af, &_thr_atfork_list, qe) {
 			if (af->child !=3D NULL)
 				af->child();
 		}
+
+		THR_UMUTEX_UNLOCK(curthread, &_thr_atfork_lock);
 	} else {
 		/* Parent process */
 		errsave =3D errno;

--N1GIdlSm9i+YlY4t
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (FreeBSD)

iEYEARECAAYFAknBIhUACgkQC3+MBN1Mb4gSTwCeIIAdoAw9tSKhJ1ttiGe8LNwo
5zoAoOx5my0Upyo9shFZ1P/irQ60mREW
=y+gs
-----END PGP SIGNATURE-----

--N1GIdlSm9i+YlY4t--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20090318163222.GE7716>