From owner-freebsd-arch@FreeBSD.ORG Fri Sep 7 16:24:32 2012 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D6CEE106566B for ; Fri, 7 Sep 2012 16:24:32 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 9830B8FC12 for ; Fri, 7 Sep 2012 16:24:32 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 0D03FB93B for ; Fri, 7 Sep 2012 12:24:32 -0400 (EDT) From: John Baldwin To: arch@freebsd.org Date: Fri, 7 Sep 2012 12:17:47 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p17; KDE/4.5.5; amd64; ; ) MIME-Version: 1.0 Content-Type: Text/Plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Message-Id: <201209071217.47439.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 07 Sep 2012 12:24:32 -0400 (EDT) Cc: Subject: [PATCH] Close a race between exit1() and SIGSTOP X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Sep 2012 16:24:33 -0000 We ran into a bug at work recently where processes were reporting an exit status from wait(2) of WIFSIGNALED() with WTERMSIG() of SIGSTOP. I narrowed this down to a race in exit1(). Specifically, exit1() sets p->p_xstat to the passed in exit status and sets P_WEXIT. It then drops the PROC_LOCK() to do other work such as freeing file descriptors, etc. Later it reacquires the PROC_LOCK(), marks the process as a zombie, and terminates. During the window where the PROC_LOCK() is not held, if a stop signal arrives (SIGSTOP, SIGTSTP, etc.), then the signal code will overwrite p->p_xstat with the signal that initiated the stop. The end result is that setting from SIGSTOP stays in p->p_xstat and is reported via wait(2). My first attempt at a fix was to simply ignore all signals once P_WEXIT is set by adding a check for P_WEXIT to the existing check for PRS_ZOMBIE. However, I think this is not quite right. For example, if an exiting process gets hung on an interruptible NFS mount while doing fdfree() I think we want a user to be able to kill the process to let it break out of NFS and finish exiting. The second fix I have is to explicitly clear P_STOPPED_SIGNAL to discard any pending SIGSTOP when setting P_WEXIT and to explicitly disallow any stop signals for a process that is currently exiting (that is, P_WEXIT is set). This fixes the race I ran into while still allowing other signals during process exit. The patch to do that is below. Below that is a test program that reproduces the issue. I would really like to get some feedback on which approach is best and if my concerns about allowing other signals during exit1() is a legitimate concern. (For some reason I feel like I've seen an argument for allowing that before.) Index: kern/kern_sig.c =================================================================== --- kern/kern_sig.c (revision 240144) +++ kern/kern_sig.c (working copy) @@ -2134,6 +2134,8 @@ * We try do the per-process part here. */ if (P_SHOULDSTOP(p)) { + KASSERT(!(p->p_flag & P_WEXIT), + ("signal to stopped but exiting process")); if (sig == SIGKILL) { /* * If traced process is already stopped, @@ -2248,7 +2250,7 @@ MPASS(action == SIG_DFL); if (prop & SA_STOP) { - if (p->p_flag & P_PPWAIT) + if (p->p_flag & (P_PPWAIT|P_WEXIT)) goto out; p->p_flag |= P_STOPPED_SIG; p->p_xstat = sig; @@ -2410,6 +2412,7 @@ struct proc *p = td->td_proc; PROC_LOCK_ASSERT(p, MA_OWNED); + KASSERT(!(p->p_flag & P_WEXIT), ("Stopping exiting process")); WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, &p->p_mtx.lock_object, "Stopping for traced signal"); @@ -2648,7 +2651,7 @@ * process group, ignore tty stop signals. */ if (prop & SA_STOP) { - if (p->p_flag & P_TRACED || + if (p->p_flag & (P_TRACED|P_WEXIT) || (p->p_pgrp->pg_jobc == 0 && prop & SA_TTYSTOP)) break; /* == ignore */ Index: kern/kern_exit.c =================================================================== --- kern/kern_exit.c (revision 240204) +++ kern/kern_exit.c (working copy) @@ -200,6 +200,16 @@ _STOPEVENT(p, S_EXIT, rv); /* + * Ignore any pending request to stop due to a stop signal. + * Once P_WEXIT is set, future requests will be ignored as + * well. + * + * XXX: Is this correct? + */ + p->p_flag &= ~P_STOPPED_SIG; + KASSERT(!P_SHOULDSTOP(p), ("exiting process is stopped")); + + /* * Note that we are exiting and do another wakeup of anyone in * PIOCWAIT in case they aren't listening for S_EXIT stops or * decided to wait again after we told them we are exiting. /*- * Test program to expose a race where SIGSTOP can be delivered to an * exiting process after it has set p_xstat to the exit status in * exit1() resulting in the p_xstat being overwritten. */ #include #include #include #include #include #include #include #include #include #include static volatile sig_atomic_t info; static void handler(int sig) { info = 1; } static void print_status(int status) { if (WIFCONTINUED(status)) printf("CONTINUED"); else if (WIFEXITED(status)) printf("EXITED: %d", WEXITSTATUS(status)); else if (WIFSIGNALED(status)) printf("SIGNALED: %d%s", WTERMSIG(status), WCOREDUMP(status) ? " (core dumped" : ""); else if (WIFSTOPPED(status)) printf("STOPPED: %d", WSTOPSIG(status)); else printf("UNKNOWN: %#x\n", status); } static void * kill_thread(void *arg) { pid_t pid; pid = (intptr_t)arg; for (;;) { if (kill(pid, SIGSTOP) < 0) { if (errno == ESRCH) break; err(1, "kill(SIGSTOP)"); } if (kill(pid, SIGCONT) < 0) { if (errno == ESRCH) break; err(1, "kill(SIGCONT)"); } } return (NULL); } int main(int ac, char **av) { pthread_t thread; pid_t pid, wpid; int count, error, status; if (signal(SIGINFO, handler) == SIG_ERR) errx(1, "signal(SIGINFO)"); for (count = 0;; count++) { if (info) { printf("count %d\n", count); info = 0; } pid = fork(); switch (pid) { case -1: err(1, "fork"); case 0: usleep(1000); exit(0); } error = pthread_create(&thread, NULL, kill_thread, (void *)(intptr_t)pid); if (error) errc(1, error, "pthread_create"); wpid = waitpid(pid, &status, 0); if (wpid == -1) err(1, "waitpid"); if (wpid != pid) errx(1, "waitpid mismatch"); if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) { printf("abnormal status: "); print_status(status); printf(" after %d loops\n", count); } error = pthread_join(thread, NULL); if (error) errc(1, error, "pthread_join"); } return (0); } -- John Baldwin