From: Sean McNeil
To: Daniel Eischen
Cc: David Xu, freebsd-threads@freebsd.org
Date: Fri, 11 Jun 2004 00:45:28 -0700
Subject: Re: signal handler priority issue
Message-Id: <1086939928.10026.26.camel@server.mcneil.com>

OK, I think I have it figured out.

The problem is that while the first signal handler (the one for SIGUSR1) is running, the SIGUSR2 signal is blocked. I think this is what Daniel was trying to say, or I didn't provide enough information for him to catch it.
So, I've fixed the sigaction call to leave SIGUSR2 unblocked while the SIGUSR1 handler is running, and rearranged a few things:

    me->stop_info.signal = 0;
    me->stop_info.last_stop_count = my_stop_count;

    /* Tell the thread that wants to stop the world that this   */
    /* thread has been stopped.  Note that sem_post() is        */
    /* the only async-signal-safe primitive in LinuxThreads.    */
    sem_post(&GC_suspend_ack_sem);

    #if DEBUG_THREADS
        GC_printf2("Waiting for restart #%d of 0x%lx\n", my_stop_count, my_thread);
    #endif

    /* Wait until that thread tells us to restart by sending    */
    /* this thread a SIG_THR_RESTART signal.                    */
    if (sigfillset(&mask) != 0) ABORT("sigfillset() failed");
    if (sigdelset(&mask, SIG_THR_RESTART) != 0) ABORT("sigdelset() failed");
    # ifdef NO_SIGNALS
        if (sigdelset(&mask, SIGINT) != 0) ABORT("sigdelset() failed");
        if (sigdelset(&mask, SIGQUIT) != 0) ABORT("sigdelset() failed");
        if (sigdelset(&mask, SIGTERM) != 0) ABORT("sigdelset() failed");
        if (sigdelset(&mask, SIGABRT) != 0) ABORT("sigdelset() failed");
    # endif
    while (me->stop_info.signal != SIG_THR_RESTART) {
        sigsuspend(&mask);             /* Wait for signal */
    }

There might still be a bug that I'm just hiding. I think the problem is that when a thread is in the handler for SIGUSR1 and someone calls pthread_kill with SIGUSR2, the signal isn't marked as pending because it is masked off. The original code appears to rely on the following behavior:

1. Thread 1 calls pthread_kill for SIGUSR1.
2. Thread 2 enters the SIGUSR1 handler, and SIGUSR2 is masked.
3. Thread 1 calls pthread_kill for SIGUSR2. The signal is set pending, since it is currently masked off in thread 2.
4. Thread 2 calls sigsuspend, thus unblocking SIGUSR2.
5. The signal handler for SIGUSR2 is called, and then the sigsuspend in the SIGUSR1 handler returns.

Is this correct behavior? Should a pthread_kill of a blocked signal pend, or should it be dropped? Right now, I've worked around this.
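
For reference, POSIX says that a signal generated for a thread that has it blocked stays pending until it is unblocked, unless its action is to ignore it. The five-step sequence above can be checked with a minimal sketch along the following lines. It is not the collector's code: it runs in a single thread, uses raise() to stand in for thread 1's pthread_kill() calls, and the names pending_signal_demo, suspend_handler, restart_handler, was_pending and restart_seen are made up for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>

/* Hypothetical flags, set only from signal handlers. */
static volatile sig_atomic_t restart_seen;
static volatile sig_atomic_t was_pending;

/* Stands in for the SIGUSR2 (restart) handler. */
static void restart_handler(int sig)
{
    (void)sig;
    restart_seen = 1;
}

/* Stands in for the SIGUSR1 (suspend) handler; SIGUSR2 is blocked */
/* here because it is listed in this handler's sa_mask.            */
static void suspend_handler(int sig)
{
    sigset_t pending, mask;
    int saved_errno = errno;

    (void)sig;

    /* Step 3: send the blocked signal (assumption: raise() stands */
    /* in for another thread calling pthread_kill()).              */
    raise(SIGUSR2);

    /* Record whether the blocked signal was kept pending. */
    sigemptyset(&pending);
    sigpending(&pending);
    was_pending = (sigismember(&pending, SIGUSR2) == 1);

    /* Steps 4 and 5: unblock only SIGUSR2 and wait for it. Skip   */
    /* the wait if the signal was dropped, so the demo cannot hang. */
    sigfillset(&mask);
    sigdelset(&mask, SIGUSR2);
    if (was_pending) {
        while (!restart_seen)
            sigsuspend(&mask);
    }
    errno = saved_errno;
}

/* Returns 1 if SIGUSR2 was pending and later delivered, 0 if it   */
/* was not, and -1 if the setup itself failed.                     */
int pending_signal_demo(void)
{
    struct sigaction sa;
    sigset_t unblock;

    restart_seen = 0;
    was_pending = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = restart_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR2, &sa, NULL) != 0) return -1;

    /* Step 2: block SIGUSR2 for the duration of the SIGUSR1 handler. */
    sa.sa_handler = suspend_handler;
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGUSR2);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;

    /* Make sure SIGUSR1 itself can be delivered to this thread. */
    sigemptyset(&unblock);
    sigaddset(&unblock, SIGUSR1);
    if (sigprocmask(SIG_UNBLOCK, &unblock, NULL) != 0) return -1;

    /* Step 1: raise() does not return until the handler has run. */
    if (raise(SIGUSR1) != 0) return -1;

    return (was_pending && restart_seen) ? 1 : 0;
}
```

Calling pending_signal_demo() from main() and printing the result should give 1 on a system that keeps the blocked signal pending, and 0 on one that drops it. A real test of the threaded case would need a second thread doing the pthread_kill() calls, which this sketch deliberately leaves out to stay deterministic.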