From owner-freebsd-current@FreeBSD.ORG Mon Oct 5 19:21:51 2009 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2165A1065670 for ; Mon, 5 Oct 2009 19:21:51 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (skuns.zoral.com.ua [91.193.166.194]) by mx1.freebsd.org (Postfix) with ESMTP id 140E58FC20 for ; Mon, 5 Oct 2009 19:21:49 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id n95JLiYM047061 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 5 Oct 2009 22:21:44 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3) with ESMTP id n95JLiwf094776; Mon, 5 Oct 2009 22:21:44 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3/Submit) id n95JLi4P094775; Mon, 5 Oct 2009 22:21:44 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Mon, 5 Oct 2009 22:21:44 +0300 From: Kostik Belousov To: Jilles Tjoelker Message-ID: <20091005192144.GA2259@deviant.kiev.zoral.com.ua> References: <20091001120730.GR3130@deviant.kiev.zoral.com.ua> <20091002201213.GA16633@stack.nl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="FCPLy5NpE1Kdjj9y" Content-Disposition: inline In-Reply-To: <20091002201213.GA16633@stack.nl> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-current@freebsd.org, Justin Teller Subject: Re: Signals and an exiting thread X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Oct 2009 19:21:51 -0000 --FCPLy5NpE1Kdjj9y Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Oct 02, 2009 at 10:12:13PM +0200, Jilles Tjoelker wrote: > On Thu, Oct 01, 2009 at 03:07:30PM +0300, Kostik Belousov wrote: > > On Wed, Sep 30, 2009 at 11:02:19AM -0700, Justin Teller wrote: > > > We're trying to control one process from another process through > > > signals (I know, revolutionary ;-), and we've found that a signal > > > occasionally gets lost. =9AThe process we're signaling is > > > multi-threaded. =9AIt looks like the signal is lost when the kernel > > > decides to post the signal to a thread that is in the process of dying > > > (calling pthread_exit, etc). >=20 > > > Is this expected behavior that we should just handle, or is it a race > > > in the kernel that should be/will be/already is fixed? >=20 > > > It may be that a fix is already in current, and I just haven't found > > > it in my searches through the source code (I'm working off of source > > > code for an older 8.0 image). If it is fixed, I'd appreciate a > > > pointer to the code that fixes it. >=20 > > When thread enters the kernel last time to be killed, it is very much > > bad idea to allow it to return to usermode to handle directed signal. > > And, there would always be window between entering the kernel and > > marking the thread as exiting. >=20 > > Moving the thread-directed signals back to the process queue is hard > > and there is no way to distinguish which signals were sent to process > > or to the thread. >=20 > > Possibly, we could clear the thread signal mask while still in user mod= e. > > I think it would still leave a very narrow window when a process > > signal could be directed to the dying thread and not be delivered to > > usermode; this happens when signal is generated while sigsetmask already > > entered the kernel, but did not changed the mask yet. This is worked > > around by rechecking the pending signals after setting the block mask > > and releasing it if needed. >=20 > SIGKILL cannot be masked. Is it possible that a kill(SIGKILL) is > assigned to a dying thread and lost? >=20 > (SIGSTOP cannot be masked either, but its processing is done by the > sending thread, so it should be safe.) >=20 > I suppose that race can also occur in other uses of pthread_sigmask(). > If a thread masks a signal for a while, and that signal is assigned to > that thread just as it is executing pthread_sigmask(), it will only be > processed when that thread unblocks or accepts it, even though other > threads may have the signal unmasked or be in a sigwait() for it. > Signals sent after pthread_sigmask() has changed the signal mask are > likely processed sooner because they will be assigned to a different > thread or left in the process queue. >=20 > POSIX seems to say that signals generated for the process should remain > queued for the process and should only be assigned to a thread at time > of delivery. >=20 > This could be implemented by leaving signals in the process queue or by > tracking for each signal in the thread queue whether it was directed at > the thread and moving the process signals back at sigmask/thr_exit. > Either way I am not sure of all the consequences at this time. I agree that postponing assignment of the thread for signal delivery till the actual delivery occurs is the proper fix. I tried to cheat in my previous patch. Below is an experimental change that did very minimal testing. diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c index 0386fc4..abd9714 100644 --- a/sys/kern/kern_sig.c +++ b/sys/kern/kern_sig.c @@ -581,20 +581,11 @@ void signotify(struct thread *td) { struct proc *p; - sigset_t set; =20 p =3D td->td_proc; =20 PROC_LOCK_ASSERT(p, MA_OWNED); =20 - /* - * If our mask changed we may have to move signal that were - * previously masked by all threads to our sigqueue. - */ - set =3D p->p_sigqueue.sq_signals; - SIGSETNAND(set, td->td_sigmask); - if (! SIGISEMPTY(set)) - sigqueue_move_set(&p->p_sigqueue, &td->td_sigqueue, &set); if (SIGPENDING(td)) { thread_lock(td); td->td_flags |=3D TDF_NEEDSIGCHK | TDF_ASTPENDING; @@ -1985,17 +1976,9 @@ tdsignal(struct proc *p, struct thread *td, int sig,= ksiginfo_t *ksi) KNOTE_LOCKED(&p->p_klist, NOTE_SIGNAL | sig); prop =3D sigprop(sig); =20 - /* - * If the signal is blocked and not destined for this thread, then - * assign it to the process so that we can find it later in the first - * thread that unblocks it. Otherwise, assign it to this thread now. - */ if (td =3D=3D NULL) { td =3D sigtd(p, sig, prop); - if (SIGISMEMBER(td->td_sigmask, sig)) - sigqueue =3D &p->p_sigqueue; - else - sigqueue =3D &td->td_sigqueue; + sigqueue =3D &p->p_sigqueue; } else { KASSERT(td->td_proc =3D=3D p, ("invalid thread")); sigqueue =3D &td->td_sigqueue; @@ -2409,8 +2392,9 @@ issignal(struct thread *td, int stop_allowed) { struct proc *p; struct sigacts *ps; + struct sigqueue *queue; sigset_t sigpending; - int sig, prop, newsig; + int sig, prop, newsig, signo; =20 p =3D td->td_proc; ps =3D p->p_sigacts; @@ -2420,6 +2404,7 @@ issignal(struct thread *td, int stop_allowed) int traced =3D (p->p_flag & P_TRACED) || (p->p_stops & S_SIG); =20 sigpending =3D td->td_sigqueue.sq_signals; + SIGSETOR(sigpending, p->p_sigqueue.sq_signals); SIGSETNAND(sigpending, td->td_sigmask); =20 if (p->p_flag & P_PPWAIT) @@ -2440,6 +2425,7 @@ issignal(struct thread *td, int stop_allowed) */ if (SIGISMEMBER(ps->ps_sigignore, sig) && (traced =3D=3D 0)) { sigqueue_delete(&td->td_sigqueue, sig); + sigqueue_delete(&p->p_sigqueue, sig); continue; } if (p->p_flag & P_TRACED && (p->p_flag & P_PPWAIT) =3D=3D 0) { @@ -2452,12 +2438,18 @@ issignal(struct thread *td, int stop_allowed) =20 if (sig !=3D newsig) { ksiginfo_t ksi; + + queue =3D &td->td_sigqueue; /* * clear old signal. * XXX shrug off debugger, it causes siginfo to * be thrown away. */ - sigqueue_get(&td->td_sigqueue, sig, &ksi); + if (sigqueue_get(queue, sig, &ksi) =3D=3D 0) { + queue =3D &p->p_sigqueue; + signo =3D sigqueue_get(queue, sig, &ksi); + KASSERT(signo =3D=3D sig, ("signo !=3D sig")); + } =20 /* * If parent wants us to take the signal, @@ -2472,7 +2464,7 @@ issignal(struct thread *td, int stop_allowed) * Put the new signal into td_sigqueue. If the * signal is being masked, look for other signals. */ - SIGADDSET(td->td_sigqueue.sq_signals, sig); + SIGADDSET(queue->sq_signals, sig); if (SIGISMEMBER(td->td_sigmask, sig)) continue; signotify(td); @@ -2567,6 +2559,7 @@ issignal(struct thread *td, int stop_allowed) return (sig); } sigqueue_delete(&td->td_sigqueue, sig); /* take the signal! */ + sigqueue_delete(&p->p_sigqueue, sig); } /* NOTREACHED */ } @@ -2613,7 +2606,9 @@ postsig(sig) ps =3D p->p_sigacts; mtx_assert(&ps->ps_mtx, MA_OWNED); ksiginfo_init(&ksi); - sigqueue_get(&td->td_sigqueue, sig, &ksi); + if (sigqueue_get(&td->td_sigqueue, sig, &ksi) =3D=3D 0 && + sigqueue_get(&p->p_sigqueue, sig, &ksi) =3D=3D 0) + return; ksi.ksi_signo =3D sig; if (ksi.ksi_code =3D=3D SI_TIMER) itimer_accept(p, ksi.ksi_timerid, &ksi); diff --git a/sys/kern/subr_trap.c b/sys/kern/subr_trap.c index 6d60ddb..ccf6479 100644 --- a/sys/kern/subr_trap.c +++ b/sys/kern/subr_trap.c @@ -90,6 +90,7 @@ userret(struct thread *td, struct trapframe *frame) =20 CTR3(KTR_SYSC, "userret: thread %p (pid %d, %s)", td, p->p_pid, td->td_name); +#if 0 #ifdef DIAGNOSTIC /* Check that we called signotify() enough. */ PROC_LOCK(p); @@ -100,6 +101,7 @@ userret(struct thread *td, struct trapframe *frame) thread_unlock(td); PROC_UNLOCK(p); #endif +#endif #ifdef KTRACE KTRUSERRET(td); #endif @@ -218,7 +220,14 @@ ast(struct trapframe *framep) ktrcsw(0, 1); #endif } - if (flags & TDF_NEEDSIGCHK) { + + /* + * Check for signals. Unlocked reads of p_pendingcnt or + * p_siglist might cause process-directed signal to be handled + * later. + */ + if (flags & TDF_NEEDSIGCHK || p->p_pendingcnt > 0 || + !SIGISEMPTY(p->p_siglist)) { PROC_LOCK(p); mtx_lock(&p->p_sigacts->ps_mtx); while ((sig =3D cursig(td, SIG_STOP_ALLOWED)) !=3D 0) diff --git a/sys/sys/signalvar.h b/sys/sys/signalvar.h index 89b40f0..680da40 100644 --- a/sys/sys/signalvar.h +++ b/sys/sys/signalvar.h @@ -252,9 +252,10 @@ typedef struct sigqueue { =20 /* Return nonzero if process p has an unmasked pending signal. */ #define SIGPENDING(td) \ - (!SIGISEMPTY((td)->td_siglist) && \ - !sigsetmasked(&(td)->td_siglist, &(td)->td_sigmask)) - + ((!SIGISEMPTY((td)->td_siglist) && \ + !sigsetmasked(&(td)->td_siglist, &(td)->td_sigmask)) || \ + (!SIGISEMPTY((td)->td_proc->p_siglist) && \ + !sigsetmasked(&(td)->td_proc->p_siglist, &(td)->td_sigmask))) /* * Return the value of the pseudo-expression ((*set & ~*mask) !=3D 0). Th= is * is an optimized version of SIGISEMPTY() on a temporary variable diff --git a/tools/regression/sigqueue/sigqtest1/sigqtest1.c b/tools/regres= sion/sigqueue/sigqtest1/sigqtest1.c index 0f40021..f0201c3 100644 --- a/tools/regression/sigqueue/sigqtest1/sigqtest1.c +++ b/tools/regression/sigqueue/sigqtest1/sigqtest1.c @@ -1,12 +1,14 @@ /* $FreeBSD$ */ -#include -#include #include #include +#include +#include +#include =20 int received; =20 -void handler(int sig, siginfo_t *si, void *ctx) +void +handler(int sig, siginfo_t *si, void *ctx) { if (si->si_code !=3D SI_QUEUE) errx(1, "si_code !=3D SI_QUEUE"); @@ -15,7 +17,8 @@ void handler(int sig, siginfo_t *si, void *ctx) received++; } =20 -int main() +int +main() { struct sigaction sa; union sigval val; diff --git a/tools/regression/sigqueue/sigqtest2/sigqtest2.c b/tools/regres= sion/sigqueue/sigqtest2/sigqtest2.c index 078ea81..50b579d 100644 --- a/tools/regression/sigqueue/sigqtest2/sigqtest2.c +++ b/tools/regression/sigqueue/sigqtest2/sigqtest2.c @@ -1,24 +1,29 @@ /* $FreeBSD$ */ -#include -#include -#include -#include #include #include +#include +#include +#include +#include +#include +#include =20 int stop_received; int exit_received; int cont_received; =20 -void job_handler(int sig, siginfo_t *si, void *ctx) +void +job_handler(int sig, siginfo_t *si, void *ctx) { int status; int ret; =20 if (si->si_code =3D=3D CLD_STOPPED) { + printf("%d: stop received\n", si->si_pid); stop_received =3D 1; kill(si->si_pid, SIGCONT); } else if (si->si_code =3D=3D CLD_EXITED) { + printf("%d: exit received\n", si->si_pid); ret =3D waitpid(si->si_pid, &status, 0); if (ret =3D=3D -1) errx(1, "waitpid"); @@ -26,11 +31,13 @@ void job_handler(int sig, siginfo_t *si, void *ctx) errx(1, "!WIFEXITED(status)"); exit_received =3D 1; } else if (si->si_code =3D=3D CLD_CONTINUED) { + printf("%d: cont received\n", si->si_pid); cont_received =3D 1; } } =20 -void job_control_test() +void +job_control_test(void) { struct sigaction sa; pid_t pid; @@ -43,9 +50,12 @@ void job_control_test() stop_received =3D 0; cont_received =3D 0; exit_received =3D 0; + fflush(stdout); pid =3D fork(); if (pid =3D=3D 0) { + printf("child %d\n", getpid()); kill(getpid(), SIGSTOP); + sleep(2); exit(1); } =20 @@ -60,11 +70,13 @@ void job_control_test() printf("job control test OK.\n"); } =20 -void rtsig_handler(int sig, siginfo_t *si, void *ctx) +void +rtsig_handler(int sig, siginfo_t *si, void *ctx) { } =20 -int main() +int +main() { struct sigaction sa; sigset_t set; --FCPLy5NpE1Kdjj9y Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (FreeBSD) iEYEARECAAYFAkrKR0gACgkQC3+MBN1Mb4gzYgCg4iGI9vSCMedPF3wn73LzKV56 Y2sAoJjnIer2nkgQLMpa2dwQSskC3whJ =J0CC -----END PGP SIGNATURE----- --FCPLy5NpE1Kdjj9y--