From owner-freebsd-threads@FreeBSD.ORG Fri Nov 22 03:55:55 2013 Return-Path: Delivered-To: threads@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 90A6B92A; Fri, 22 Nov 2013 03:55:55 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 7085324B5; Fri, 22 Nov 2013 03:55:55 +0000 (UTC) Received: from xyf.my.dom (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id rAM3tskd068692; Fri, 22 Nov 2013 03:55:54 GMT (envelope-from davidxu@freebsd.org) Message-ID: <528ED5D3.1030906@freebsd.org> Date: Fri, 22 Nov 2013 11:56:03 +0800 From: David Xu User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:17.0) Gecko/20130416 Thunderbird/17.0.5 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: Problem with signal 0 being delivered to SIGUSR1 handler References: <528DFEE6.6020504@gmail.com> <20131121211546.GQ59496@kib.kiev.ua> In-Reply-To: <20131121211546.GQ59496@kib.kiev.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@freebsd.org, Vitaly Magerya , threads@freebsd.org X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.16 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Nov 2013 03:55:55 -0000 On 2013/11/22 05:15, Konstantin Belousov wrote: > On Thu, Nov 21, 2013 at 02:39:02PM +0200, Vitaly Magerya wrote: >> Hi, folks. I'm investigating a test case failure that devel/boehm-gc >> has on recent FreeBSD releases. The problem is that a signal >> handler registered for SIGUSR1 is sometimes called with signum=0, >> which should not be possible under any conditions. >> >> Here's a simple test case that demonstrates this behavior: >> >> /* Compile with 'c99 -o example example.c -pthread' >> */ >> #include >> #include >> #include >> #include >> >> void signal_handler(int signum, siginfo_t *si, void *context) { >> if (signum != SIGUSR1) { >> printf("bad signal, signum=%d\n", signum); >> exit(1); >> } >> } >> >> void *thread_func(void *arg) { >> return arg; >> } >> >> int main(void) { >> struct sigaction sa = { 0 }; >> sa.sa_flags = SA_SIGINFO; >> sa.sa_sigaction = signal_handler; >> if (sigfillset(&sa.sa_mask) != 0) abort(); >> if (sigaction(SIGUSR1, &sa, NULL) != 0) abort(); >> for (int i = 0; i < 10000; i++) { >> pthread_t t; >> pthread_create(&t, NULL, thread_func, NULL); >> pthread_kill(t, SIGUSR1); > Side note. pthread_kill(3) call behaviour is undefined if pthread_create(3) > in the line before failed. > >> } >> return 0; >> } >> >> Under FreeBSD 9.2-RELEASE amd64 I pretty consistently get >> "signum=0" from this program, but you may need to run it a few >> times or increase the number of iterations to see the same. >> >> Interestingly enough, I don't see this behavior under 9.0-RELEASE. >> >> So, any ideas what the problem here is? > > It happens when libthr deferred signal handling path is taken for signal > delivery and for some reason the code inside the deferred path called > into rtld for symbol binding. Than, rtld lock is locked, some code in > rtld is executed, and rtld lock is unlocked. Unlock causes _thr_ast() > run, which results in the nested check_deferred_signal() execution. > The check_deferred_signal() clearks si_signo, so on return the same > signal is delivered one more time, but is advertized as signo zero. > > The _thr_rtld_init() approach of doing dummy calls does not really work, > since it is not practically possible to enumerate the symbols needed > during signal delivery. > > My first attempt to fix this was to increment curthread->critical_count > around the calls to check_* functions in the _thr_ast(), but it causes > reverse problem of losing _thr_ast() runs on unlock. > > I ended up with the flag to indicate that deferred delivery is running, > so check_deferred_signal() should avoid doing anything. A delicate > moment is that user signal handler is allowed to modify the passed > machine context to result the return from the signal handler to cause > arbitrary jump, or just do longjmp(). For this case, I also clear the > flag in thr_sighandler(), since kernel signal delivery means that nested > delivery code should not run right now. > > Please try this. > > diff --git a/lib/libthr/thread/thr_private.h b/lib/libthr/thread/thr_private.h > index 83a02b5..c6651cd 100644 > --- a/lib/libthr/thread/thr_private.h > +++ b/lib/libthr/thread/thr_private.h > @@ -433,6 +433,9 @@ struct pthread { > /* the sigaction should be used for deferred signal. */ > struct sigaction deferred_sigact; > > + /* deferred signal delivery is performed, do not reenter. */ > + int deferred_run; > + > /* Force new thread to exit. */ > int force_exit; > > diff --git a/lib/libthr/thread/thr_sig.c b/lib/libthr/thread/thr_sig.c > index 415ddb0..57c9406 100644 > --- a/lib/libthr/thread/thr_sig.c > +++ b/lib/libthr/thread/thr_sig.c > @@ -162,6 +162,7 @@ thr_sighandler(int sig, siginfo_t *info, void *_ucp) > act = _thr_sigact[sig-1].sigact; > _thr_rwl_unlock(&_thr_sigact[sig-1].lock); > errno = err; > + curthread->deferred_run = 0; > > /* > * if a thread is in critical region, for example it holds low level locks, > @@ -320,14 +321,18 @@ check_deferred_signal(struct pthread *curthread) > siginfo_t info; > int uc_len; > > - if (__predict_true(curthread->deferred_siginfo.si_signo == 0)) > + if (__predict_true(curthread->deferred_siginfo.si_signo == 0 || > + curthread->deferred_run)) > return; > > + curthread->deferred_run = 1; > uc_len = __getcontextx_size(); > uc = alloca(uc_len); > getcontext(uc); > - if (curthread->deferred_siginfo.si_signo == 0) > + if (curthread->deferred_siginfo.si_signo == 0) { > + curthread->deferred_run = 0; > return; > + } > __fillcontextx2((char *)uc); > act = curthread->deferred_sigact; > uc->uc_sigmask = curthread->deferred_sigmask; > The patch looks fine to me.