Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 11 Jul 2003 11:10:28 +0200 (CEST)
From:      Harti Brandt <brandt@fokus.fraunhofer.de>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        hackers@FreeBSD.org
Subject:   RE: Race in kevent
Message-ID:  <20030711110052.D38638@beagle.fokus.fraunhofer.de>
In-Reply-To: <XFMail.20030710152212.jhb@FreeBSD.org>
References:  <XFMail.20030710152212.jhb@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 10 Jul 2003, John Baldwin wrote:

JB>
JB>On 10-Jul-2003 Harti Brandt wrote:
JB>> On Wed, 9 Jul 2003, John Baldwin wrote:
JB>>
JB>> JB>On 09-Jul-2003 Harti Brandt wrote:
JB>> JB>>
JB>> JB>> Hi,
JB>> JB>>
JB>> JB>> I just had a crash while typing ^C to a program that has a kevent timer
JB>> JB>> running. The crash was:
JB>> JB>>
JB>> JB>> callout_stop
JB>> JB>> callout_reset
JB>> JB>> filt_timerexpire
JB>> JB>> softclock
JB>> JB>>
JB>> JB>> and callout_stop was accessing freed memory (0xdeadc0e2). After looking
JB>> JB>> some time at the filt_timerdetach, callout_stop and softclock I think the
JB>> JB>> following happened:
JB>>
JB>> JB>This is becoming a common race unfortunately. :(  See the hacks in
JB>> JB>msleep() that use TDF_TIMEOUT in coooperationg with endtsleep() and
JB>> JB>the recent commit to the realtimer callout code for ways to work around
JB>> JB>this race.
JB>>
JB>> In both places the thread just sleeps until the timeout has fired (when I
JB>> understand this correctly). While this is a possible workaround also for
JB>> kevent() (which only holds Giant as far as I can see) this is by no means
JB>> a solution for other callers. While looking through the tree I have found
JB>> several issues with timeouts which probably should be resolved or they
JB>> will hit us with SMP:
JB>
JB>Yes, they sleep until the callout has finished executing.  Note that the
JB>callout has _already_ fired.  The common case is that it is blocked on
JB>the lock that the code trying to stop the callout is holding.  Thus, you
JB>are going to have to have special case code in your callout handler
JB>_anyway_ to handle these edge cases, so there really isn't a super-duper
JB>easy-clean solution.

This is with MPSAFE callouts. !MPSAFE callouts will block before calling
the handler when the try to acquire Giant. I'm assuming here, that the
thread that calls callout_stop() must already have Giant (this is a valid
assumption, isn't it?). In this case we can clear the cached copy (c_func)
of the callout in callout_stop() and softclock will notice this when it
has aquired Giant. We just need to make this cached copy global. For the
MPSAFE case there is nothing we can do I suppose.

JB>> - the CALLOUT_ACTIVE flag is not maintained correctly. softclock() fails
JB>> to clear this flag after the timeout has fired. callout_stop() clears
JB>> CALLOUT_ACTIVE if it finds the callout not PENDING. This is wrong if
JB>> the callout is just about to be called (in this case it is !PENDING
JB>> but ACTIVE). This makes callout_active() useless.
JB>
JB>The problem is in the API.  One of the design goals is that a callout can
JB>re-fire itself.  Thus, softclock can't touch the callout once it has fired
JB>it.  This design goal is the reason for much of the confusion.
JB>
JB>> - using callout_active() on a callout_handle. Callouts for
JB>> callout_handles (timeout(9)) are allocated from a common pool. So you may
JB>> just check the wrong callout if the callout has already fired and has been
JB>> reallocated to another user. Handles allocated with timeout(9) can only
JB>> be passed to untimeout(9)
JB>
JB>The idea is that timeout(9) and untimeout(9) are a deprecated interface and
JB>code should be using the callout(9) API instead.  Note that timeout(9)'s can
JB>never be marked MPSAFE.
JB>
JB>> I think we should try to make the callout interface usable without races
JB>> for the !MPSAFE case (see mail from Eric Jacobs). For the MPSAFE case the
JB>> caller should be responsible for this. And we should probably better
JB>> document the interface.
JB>>
JB>> Going to think about this...
JB>
JB>Well, you need to consider the design goal above as it throws several
JB>wrenches into the works.  One possibility is that we could ditch the
JB>design goal.  Another possibility is that we could expand the callout
JB>API to allow for periodict callouts and not just one-shot callouts.

That would probably be a good idea. Scheduling the new callout before
calling the callout routine would be a good idea.

harti
-- 
harti brandt,
http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
brandt@fokus.fraunhofer.de, harti@freebsd.org



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20030711110052.D38638>