From owner-freebsd-arch@FreeBSD.ORG Fri Jan 4 01:19:23 2008 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5FCF816A417; Fri, 4 Jan 2008 01:19:23 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 2641213C465; Fri, 4 Jan 2008 01:19:23 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id E320B47954; Thu, 3 Jan 2008 20:19:21 -0500 (EST) Date: Fri, 4 Jan 2008 01:19:21 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Poul-Henning Kamp In-Reply-To: <2296.1199319966@critter.freebsd.dk> Message-ID: <20080104011149.M42109@fledge.watson.org> References: <2296.1199319966@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Attilio Rao , arch@freebsd.org, Andre Oppermann , freebsd-arch@freebsd.org Subject: Re: New "timeout" api, to replace callout X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Jan 2008 01:19:23 -0000 > In message <477C2A8A.8090604@freebsd.org>, Andre Oppermann writes: > >>>>>> I fear we have to go for the latter. Getting a non-sleeping callout >>>>>> drain seems to be non-trivial. >>>>> There is a crucial difference between "non-sleeping" and "not sleeping >>>>> on my lock" that you should be very careful about in this context. >>>>> >>>>> Which is your requirement ? >>>> Calling timeout_drain() must not sleep and not drop the lock in this >>>> context (while making any pending timeout go away forever). >>> >>> Ok, so if the timeouts callback function is running you propose to do what >>> ? spin until it returns ? >> >> As long as the spinning is bounded [...] I don't have a perfect solution >> handy. That's why I try to state the requirement and hope the timeout >> gurus can work out how to do it. ;-) > > It's all Alan Turings fault :-) > > You can't have that, it's that simple. > > What I'm proposing is that your thread will sleep on a plain, but unrelated > mutex (internal to the timeout code) until the function comes back. > > Based on your description above, you won't be able to tell the any > difference between this and what you wish for. In a world of bounded mutex-length sleeps and unbounded I/O-length sleeps, waiting a mutex-length sleep for callout_stop() to cancel and drain the callout is fine. However, callout_stop() needs to be done while holding locks that the callout may acquire (global tcbinfo and local inpcb locks), which in the current world order would lead rapidly to deadlock. In the model you have in mind, is it OK to hold locks that the callout being canceled using the call to callout_stop() might acquire? The following pseudo-code snippets captures what I'm probably putting into words poorly: TCP input path: mtx_lock(&tcbinfo); ... mtx_lock(inp->inp_mtx); ... /* * NB: We can't drop the locks here or there will be races, and we * also can't msleep() or related sorts of things as we're running in * an ithread. */ callout_stop(inp->inp_callout); mtx_destroy(inp->inp_mtx); free(inp); TCP timer path: mtx_lock(&tcbinfo); ... mtx_lock(inp->inp_mtx); ... mtx_unlock(inp->inp_mtx); ... mtx_unlock(&tcbinfo); Right now, we callout_stop() but not callout_drain() in the input path on the basis that we can't msleep(), and for better and mostly worse, we accept the resulting race. We'd like to eliminate that race. One way to do that is to hand the inpcb off to another kthread that *can* msleep() doing a callout_drain(), and have it GC the data structure when done. That's sort of ugly, and means that we're freeing the memory asynchronously which is somewhat undesirable. If it were possible to have the above callout_stop() also guarantee that once it returns, the timer is not currently running and will never run again, then we wouldn't need that. We may also have cases where we want to free the inpcb from within the timer... Robert N M Watson Computer Laboratory University of Cambridge