FreeBSD Mail Archives

Date:      Thu, 4 Jul 2002 10:30:58 -0500 (CDT)
From:      Jonathan Lemon <jlemon@flugsvamp.com>
To:        bsddiy@yahoo.com, Bruce Evans <bde@zeta.org.au>, current@freebsd.org
Subject:   Re: Timeout and SMP race
Message-ID:  <200207041530.g64FUwE48481@prism.flugsvamp.com>
In-Reply-To: <local.mail.freebsd-current/20020704132044.7338.qmail@web20909.mail.yahoo.com>

In article <local.mail.freebsd-current/20020704132044.7338.qmail@web20909.mail.yahoo.com> you write:
>in RELENG_4, when one calls callout_stop() (not nested in softclock execute
>path
>, I am not talking about this case), after it returns, he can believe that the
>callout is truely stopped, however in CURRENT, this assumption is false, now we
>
>must care if callout_stop() truely stopped the callout when it returned, this 
>is all difference I see, we bring in this race which not exists in RELENG_4, 
>see what hacking code put in kern_condvar.c and kern_synch.c in CURRENT source,

I don't believe this is true.  There is a corresponding race in -stable,
where the spl() level is dropped before performing the callout, thus 
allowing either a call to callout_stop() or callout_reset() to come in
immediately before the callout is actually made.

The callout function is responsible for checking to see if it has lost
the race, and perform the appropriate action.  This is done with the 
CALLOUT_PENDING and CALLOUT_ACTIVE flags:

        s = splnet();
        if (callout_pending(tp->tt_delack) || !callout_active(tp->tt_delack)) {
                splx(s);
                return;
        }
        callout_deactivate(tp->tt_delack);

If 'CALLOUT_PENDING' is set, then we lost a race with callout_reset(),
and should not perform the callout.  If 'CALLOUT_ACTIVE' is clear, then
we lost a race with callout_stop().

Either way, on both -current and -stable, you cannot assume that the
timer callback is completely gone immediately after calling callout_stop().
--
Jonathan


>----- Original Message ----- 
>From: "Bruce Evans" <bde@zeta.org.au>
>To: "David Xu" <davidx@viasoft.com.cn>
>Cc: "Julian Elischer" <julian@elischer.org>; <freebsd-current@FreeBSD.ORG>
>Sent: Thursday, July 04, 2002 7:02 PM
>Subject: Re: Timeout and SMP race
>
>
>> On Thu, 4 Jul 2002, David Xu wrote:
>> 
>> > ----- Original Message -----
>> > From: "Julian Elischer" <julian@elischer.org>
>> > To: "David Xu" <davidx@viasoft.com.cn>
>> > Cc: <freebsd-current@FreeBSD.ORG>
>> > Sent: Thursday, July 04, 2002 4:36 PM
>> > Subject: Re: Timeout and SMP race
>> > >
>
>> >
>> > if another thread other than softclock itself is calling callout_stop(),
>> > and callout_stop() detected that softclock is currently running the
>> > callout,  it should wait until softclock finishes the work, then return.
>> 
>> softclock() intentionally releases callout_lock() to allow other processes
>> to manipulate callouts.  What is the race exactly?  Concurrent calls to
>> softclock() seem to be possible but seem to be handled correctly (internal
>> locking prevents problems).  Well, I can see one race in softclock():
>> 
>> % c_func = c->c_func;
>> % c_arg = c->c_arg;
>> % c_flags = c->c_flags;
>> 
>> This caches some values, as is needed since the 'c' pointer may become
>> invalid after we release the lock ... but the things pointed to may become
>> invalid too.
>> 
>> % c->c_func = NULL;
>> % if (c->c_flags & CALLOUT_LOCAL_ALLOC) {
>> % c->c_flags = CALLOUT_LOCAL_ALLOC;
>> % SLIST_INSERT_HEAD(&callfree, c,
>> %     c_links.sle);
>> % } else
>> % c->c_flags &= ~CALLOUT_PENDING;
>> % mtx_unlock_spin(&callout_lock);
>> 
>> callout_stop() may stop 'c' here.  It won't do much, since we have already
>> set c->c_func to NULL, but its caller may want the callout actually stopped
>> so that it can do things like unloading the old c->c_func.
>> 
>> % if ((c_flags & CALLOUT_MPSAFE) == 0)
>> % mtx_lock(&Giant);
>> % c_func(c_arg);
>> 
>> This calls through a possibly-invalid function pointer.
>> 
>> % if ((c_flags & CALLOUT_MPSAFE) == 0)
>> % mtx_unlock(&Giant);
>> % mtx_lock_spin(&callout_lock);
>> 
>> This seems to be an old bug.  In RELENG_4, splsoftclock() gives a more
>> global lock, but there is nothing to prevent callout_stop() being run
>> at splsoftclock().  In fact, it must be able to run when called nested
>> from inside softclock(), since it might be called from the handler.
>> Waiting in callout_stop() for softclock() to finish would deadlock in
>> this case.  It's interesting that this case is (always?) avoided in
>> untimeout() by not calling callout_stop() when c->c_func == NULL.
>> 
>> softclock() can't do anything about c->c_func going away after it is
>> called.  Clients must somehow avoid killing it.
>> 
>> I think c->c_func rarely goes away, and the race that you noticed is
>> lost more often.
>> 
>> Bruce
>> 
>> 
>> To Unsubscribe: send mail to majordomo@FreeBSD.org
>> with "unsubscribe freebsd-current" in the body of the message
>
>__________________________________________________
>Do You Yahoo!?
>Sign up for SBC Yahoo! Dial - First Month Free
>http://sbc.yahoo.com
>
>To Unsubscribe: send mail to majordomo@FreeBSD.org
>with "unsubscribe freebsd-current" in the body of the message
>



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200207041530.g64FUwE48481>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation