Date:      Wed, 9 Jul 2003 15:28:38 +0200 (CEST)
From:      Harti Brandt <brandt@fokus.fraunhofer.de>
To:        hackers@freebsd.org
Subject:   Race in kevent
Message-ID:  <20030709150708.O30571@beagle.fokus.fraunhofer.de>


Hi,

I just had a crash while typing ^C to a program that has a kevent timer
running. The crash was:

callout_stop
callout_reset
filt_timerexpire
softclock

and callout_stop was accessing freed memory (0xdeadc0e2). After looking
at filt_timerdetach, callout_stop and softclock for some time, I think the
following happened:


Proc 1                                  Proc 2
------                                  ------
filt_timerdetach			softclock called
called with Giant locked

					lock_spin(callout_lock)
					...
call callout_stop which hangs on
lock_spin(callout_lock)

					softclock finds the callout,
					removes it from its queue and
					clears PENDING

					unlock_spin(callout_lock)
					lock(&Giant) blocks

callout_stop finds the callout to
be not pending and returns

filt_timerdetach frees the callout

...

unlock(&Giant)
					softclock continues and calls
					the (stopped) callout

					KABOOM because the pointer used
					by filt_timerexpire is gone


The problem seems to be that there is a small window in which the callout
has already been taken off the callout queue but not yet called, and in
which all locks are released. callout_stop may slip into this window and
invalidate the callout that softclock() is about to call as soon as it gets
Giant (even with a non-MPSAFE callout the same problem exists, although
the window is much smaller).
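
To make the window concrete, here is a small user-space replay of the
interleaving from the diagram. It is only a sketch, not kernel code: all
names (struct sim_callout, softclock_take, stop, naive_detach,
softclock_call, replay_race) are made up for illustration, the two
processes are replayed sequentially instead of running concurrently, and
a "freed" flag stands in for the real free() so that nothing actually
touches released memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a callout; not the kernel structure. */
struct sim_callout {
	bool pending;	/* still on the callout queue */
	bool freed;	/* stands in for free(); memory is never released here */
};

/* Proc 2, first half: under callout_lock, take the callout off the queue. */
static void
softclock_take(struct sim_callout *c)
{
	c->pending = false;
}

/* Analogue of callout_stop: 1 if it dequeued the callout, 0 otherwise. */
static int
stop(struct sim_callout *c)
{
	if (!c->pending)
		return (0);
	c->pending = false;
	return (1);
}

/* Proc 1: a detach that ignores the return value and frees regardless. */
static void
naive_detach(struct sim_callout *c)
{
	(void)stop(c);
	c->freed = true;
}

/*
 * Proc 2, second half: after getting Giant, call the handler.
 * Returns true if the handler would be running on freed memory.
 */
static bool
softclock_call(const struct sim_callout *c)
{
	assert(c != NULL);
	return (c->freed);
}

/* Replay the diagram step by step; true means the KABOOM case was hit. */
static bool
replay_race(void)
{
	struct sim_callout c = { .pending = true, .freed = false };

	softclock_take(&c);	/* Proc 2 dequeues, drops the lock, blocks on Giant */
	naive_detach(&c);	/* Proc 1: stop() returns 0, memory is freed anyway */
	return (softclock_call(&c));	/* Proc 2 continues with a stale pointer */
}
```

In this replay replay_race() returns true, because the detach side frees
the callout after softclock has already committed to calling it.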

What to do?

callout_stop already detects this situation and returns 0. As far as I
understand, the way to handle this is not to free the callout memory in
filt_timerdetach() when callout_stop() returns 0, but to let the callout be
called. filt_timerexpire() should detect this situation, free the memory
and return. Is this a possible solution? (Actually this requires some
work, because the knote pointer that filt_timerexpire() gets is probably
also gone.)
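
A rough user-space sketch of that hand-off, under some assumptions: the
names (struct timer_ctx, sim_callout_stop, sim_softclock_dequeue,
sim_timer_detach, sim_timer_expire, sim_timer_attach, frees) are
hypothetical and are not the kern_event.c code; the "detached" flag lives
in a small context block that outlives the knote, which is one way around
the stale knote pointer; and the flag is assumed to be protected by
whatever lock both sides hold (Giant in the scenario above), which a
single-threaded sketch cannot show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-timer state handed to the callout instead of the knote. */
struct timer_ctx {
	bool pending;	/* still on the callout queue */
	bool detached;	/* detach lost the race; expire must clean up */
	int fired;	/* number of real expirations delivered */
};

static int frees;	/* counts frees so the hand-off can be observed */

static struct timer_ctx *
sim_timer_attach(void)
{
	struct timer_ctx *t = calloc(1, sizeof(*t));

	assert(t != NULL);
	t->pending = true;
	return (t);
}

/* Analogue of callout_stop: 1 if dequeued here, 0 if already taken. */
static int
sim_callout_stop(struct timer_ctx *t)
{
	if (!t->pending)
		return (0);
	t->pending = false;
	return (1);
}

/* Analogue of softclock taking the callout off its queue. */
static void
sim_softclock_dequeue(struct timer_ctx *t)
{
	t->pending = false;
}

/* Detach: free only if the callout was really stopped. */
static void
sim_timer_detach(struct timer_ctx *t)
{
	if (sim_callout_stop(t)) {
		free(t);
		frees++;
	} else {
		t->detached = true;	/* leave the memory for the expire side */
	}
}

/* Expire: if detach already happened, just free the state and return. */
static void
sim_timer_expire(struct timer_ctx *t)
{
	assert(t != NULL);
	if (t->detached) {
		free(t);
		frees++;
		return;
	}
	t->fired++;	/* normal expiration path */
}
```

With this split, exactly one side frees the state: detach when
callout_stop succeeds, the expire handler when it returns 0.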

harti
-- 
harti brandt,
http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
brandt@fokus.fraunhofer.de, harti@freebsd.org


