Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 14 Nov 2000 22:58:25 -0800
From:      Mike Smith <msmith@freebsd.org>
To:        Jake Burkholder <jburkhol@home.com>
Cc:        John Baldwin <jhb@FreeBSD.org>, cvs-all@FreeBSD.org, cvs-committers@FreeBSD.org, jake@io.yi.org
Subject:   Re: cvs commit: src/sys/sys eventhandler.h 
Message-ID:  <200011150658.eAF6wPF01843@mass.osd.bsdi.com>
In-Reply-To: Your message of "Tue, 14 Nov 2000 22:37:08 PST." <20001115063708.614A0BA7A@io.yi.org> 

next in thread | previous in thread | raw e-mail | index | archive | help
> > > This is wrong and should be backed out; the tsleep() in the handler is 
> > > the bug.  You can't release the lock on the list while you're traversing 
> > > it, since the traversal is holding private state which has to remain 
> > > consistent with the list.
> > > 
> > > If you want to release the lock on the list, you would have to detect 
> > > when the list is changed and rescan it to find the 'new' current 
> > > location.  Only that doesn't work either, since the list traversal allows 
> > > you to remove the current handler from the list.
> > > 
> > > Please revert this change and fix eventhandler clients so that they don't 
> > > sleep.
> > 
> > Hrm, I guess I could go hard-code all the shutdown events to not use tsleep but
> > to use delay()'s.  This means hacking up or duplicating things like
> > suspend_kproc(). :(.   You shouldn't be holding a lock while you are calling
> > functions that aren't related to the resource you are holding.  Grrr.  Would
> > it be feasible to say that an eventhandler may not modify a list it is on? 
> > Perhaps extending the eventhandler interface to allow for once-only events
> > would be sufficient.
> 
> The timeout code deals with a similar situation.  It saves
> the next element in the list in a global variable, nextsoftcheck,
> spls down, calls the timeout handler, spls back up and uses nextsoftcheck
> as the next element.  If untimeout (callout_stop) finds that nextsoftcheck
> points to the timeout that's about to be deleted, it advances nextsoftcheck
> to point to the next element.
> 
> Wouldn't something similar work?

No.  Consider an example.  The list contains two event handlers, A and B. 
The eventhandler is invoked twice, 1) and 2)

 1) Eventhandler invoked
 1) Mutex acquired
 1) List traversed, A found, pointer to B saved
 1) Mutex released
 1) A called
 1) A sleeps
 2) Eventhandler invoked
 2) Mutex acquired
 2) List traversed, A found, pointer to B saved
 2) Mutex released
 2) A called
 2) Mutex acquired
 2) List traversed, end found
 2) Mutex released
 2) B called
 2) B unlinks self from list
 2) Eventhandler ends
 1) A wakes up
 1) Mutex acquired
 1) Dereferences invalid saved pointer to B

> Holding the mutex over the whole traversal also means you rely on mutex
> recursion in order to be able to deregister from an event handler.

True.  I've been thinking about Garrett's point about flagging deleted 
list elements too.

I'm actually trying to work out what the issue is with witness; I can 
only assume that it doesn't like the fact that an eventhandler mutex may 
be acquired either before or after giant, and this exposes a weakness in 
witness - that it can't deal with the situation where lock ordering 
*doesn't* matter.

-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]
           V I C T O R Y   N O T   V E N G E A N C E




To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-all" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200011150658.eAF6wPF01843>