Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 22 Nov 2013 14:35:53 +0100
From:      Jilles Tjoelker <jilles@stack.nl>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-hackers@freebsd.org, threads@freebsd.org, Vitaly Magerya <vmagerya@gmail.com>, davidxu@freebsd.org
Subject:   Re: Problem with signal 0 being delivered to SIGUSR1 handler
Message-ID:  <20131122133553.GA28457@stack.nl>
In-Reply-To: <20131121211546.GQ59496@kib.kiev.ua>
References:  <528DFEE6.6020504@gmail.com> <20131121211546.GQ59496@kib.kiev.ua>

Next in thread | Previous in thread | Raw E-Mail | Index | Archive | Help
On Thu, Nov 21, 2013 at 11:15:46PM +0200, Konstantin Belousov wrote:
> On Thu, Nov 21, 2013 at 02:39:02PM +0200, Vitaly Magerya wrote:
> > Hi, folks. I'm investigating a test case failure that devel/boehm-gc
> > has on recent FreeBSD releases. The problem is that a signal
> > handler registered for SIGUSR1 is sometimes called with signum=0,
> > which should not be possible under any conditions.

> > Here's a simple test case that demonstrates this behavior:

> > /* Compile with 'c99 -o example example.c -pthread'
> >  */
> > #include <pthread.h>
> > #include <signal.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> > 
> > void signal_handler(int signum, siginfo_t *si, void *context) {
> >     if (signum != SIGUSR1) {
> >         printf("bad signal, signum=%d\n", signum);
> >         exit(1);
> >     }
> > }
> > 
> > void *thread_func(void *arg) {
> >     return arg;
> > }
> > 
> > int main(void) {
> >     struct sigaction sa = { 0 };
> >     sa.sa_flags = SA_SIGINFO;
> >     sa.sa_sigaction = signal_handler;
> >     if (sigfillset(&sa.sa_mask) != 0) abort();
> >     if (sigaction(SIGUSR1, &sa, NULL) != 0) abort();
> >     for (int i = 0; i < 10000; i++) {
> >         pthread_t t;
> >         pthread_create(&t, NULL, thread_func, NULL);
> >         pthread_kill(t, SIGUSR1);
> Side note.  pthread_kill(3) call behaviour is undefined if pthread_create(3)
> in the line before failed.
> 
> >     }
> >     return 0;
> > }

> > Under FreeBSD 9.2-RELEASE amd64 I pretty consistently get
> > "signum=0" from this program, but you may need to run it a few
> > times or increase the number of iterations to see the same.

> > Interestingly enough, I don't see this behavior under 9.0-RELEASE.

This is because the bug was introduced with AVX support. (It also occurs
on systems without AVX.)

> > So, any ideas what the problem here is?

> It happens when libthr deferred signal handling path is taken for signal
> delivery and for some reason the code inside the deferred path called
> into rtld for symbol binding. Than, rtld lock is locked, some code in
> rtld is executed, and rtld lock is unlocked. Unlock causes _thr_ast()
> run, which results in the nested check_deferred_signal() execution.
> The check_deferred_signal() clearks si_signo, so on return the same
> signal is delivered one more time, but is advertized as signo zero.

> The _thr_rtld_init() approach of doing dummy calls does not really work,
> since it is not practically possible to enumerate the symbols needed
> during signal delivery.

> My first attempt to fix this was to increment curthread->critical_count
> around the calls to check_* functions in the _thr_ast(), but it causes
> reverse problem of losing _thr_ast() runs on unlock.

> I ended up with the flag to indicate that deferred delivery is running,
> so check_deferred_signal() should avoid doing anything. A delicate
> moment is that user signal handler is allowed to modify the passed
> machine context to result the return from the signal handler to cause
> arbitrary jump, or just do longjmp(). For this case, I also clear the
> flag in thr_sighandler(), since kernel signal delivery means that nested
> delivery code should not run right now.

This analysis suggests an easier approach: just move the check for
deferred_siginfo.si_signo == 0 downward. If __fillcontextx2 or sysarch
need to be looked up by rtld, the resulting _thr_ast() will invoke the
signal handler and the original call to check_deferred_signal() will do
nothing.

This patch fixes the problem for me on stable/9 and head.

Index: lib/libthr/thread/thr_sig.c
===================================================================
--- lib/libthr/thread/thr_sig.c	(revision 258178)
+++ lib/libthr/thread/thr_sig.c	(working copy)
@@ -326,12 +326,12 @@ check_deferred_signal(struct pthread *curthread)
 	uc_len = __getcontextx_size();
 	uc = alloca(uc_len);
 	getcontext(uc);
-	if (curthread->deferred_siginfo.si_signo == 0)
-		return;
 	__fillcontextx2((char *)uc);
 	act = curthread->deferred_sigact;
 	uc->uc_sigmask = curthread->deferred_sigmask;
 	memcpy(&info, &curthread->deferred_siginfo, sizeof(siginfo_t));
+	if (curthread->deferred_siginfo.si_signo == 0)
+		return;
 	/* remove signal */
 	curthread->deferred_siginfo.si_signo = 0;
 	handle_signal(&act, info.si_signo, &info, uc);

-- 
Jilles Tjoelker



Want to link to this message? Use this URL: <http://docs.FreeBSD.org/cgi/mid.cgi?20131122133553.GA28457>