Date:      Thu, 5 Jul 2001 09:32:47 -0700 (PDT)
From:      Matthew Jacob <mjacob@feral.com>
To:        Doug Rabson <dfr@nlsystems.com>
Cc:        John Baldwin <jhb@FreeBSD.org>, Matt Dillon <dillon@earth.backplane.com>, <cvs-all@FreeBSD.org>, <cvs-committers@FreeBSD.org>, Jake Burkholder <jake@FreeBSD.org>
Subject:   Re: cvs commit: src/sys/sys systm.h condvar.h src/sys/kern kern_
Message-ID:  <20010705091047.C37950-100000@wonky.feral.com>
In-Reply-To: <Pine.BSF.4.33.0107051143050.30393-100000@herring.nlsystems.com>


On Thu, 5 Jul 2001, Doug Rabson wrote:

> On Wed, 4 Jul 2001, Matthew Jacob wrote:
>
> > > Imagine an interrupt between the wakeup and lock release.  That would give
> > > plenty enough time for another CPU to grab the woken process and then block on
> > > the lock.
> >
> > John- show some numbers with even a 32 processor SGI that this happens
> > all that much. Yer reachin'....
>
> I think that all John is saying is that the possibility of taking an
> interrupt can make the size of the window large enough for another cpu (or
> for preemption on this cpu) to grab the woken process.
>
> I know from the original alpha port that in a buildworld, even 1-2
> instruction windows can cause races.

Of course.

There are two issues we've been bruiting about.

1. "Safe is Safe. Unsafe isn't."

If there's a window in wakeup, solving it by requiring the caller to
release a lock it holds in order for it to wake things up safely is
wrong, in my ill-informed opinion.

One of the arguments so far for the new functions is to avoid some kind
of (vaguely stated) breakage window. I haven't been massively convinced
by the explanations so far.

*Either the caller code is making incorrect assumptions or there's
something broken in the atomicity of the scheduler*

2. "Premature optimization is the root of all evil" - Donald Knuth

The other issue put forward has been that these functions will avoid
lock contention. I've been asking all along for somebody to put some
numbers on the table that show that this is a problem in practice.



The way I keep imagining the issues brought up by this discussion
is thusly (#B vague):


Scenario A:

	cpuX		cpuY
	signals cv
			reschedules thread
			slams against lock
			held on cpuX- i.e.,
			cpuY is idle, nothing
			to do, and we're
			always *so* quick
			that cpuY will get
			there before....

	releases lock

Problem: more contention than there should be.
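
For concreteness, here is a minimal userland sketch of the two orderings
at issue in Scenario A. It assumes POSIX threads as a stand-in for the
kernel primitives being discussed; the names (struct chan, waiter,
post_signal_then_unlock, post_unlock_then_signal, run_once) are
hypothetical and are not taken from the commit. In both orderings the
state change is made under the lock, so the waiter cannot miss it; the
only difference is whether the woken thread may find the lock still
held by the signaller.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state: one flag protected by one lock. */
struct chan {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    bool            ready;
};

/* Waiter: sleeps until 'ready' is set, rechecking it under the lock. */
static void *waiter(void *arg)
{
    struct chan *c = arg;

    pthread_mutex_lock(&c->lock);
    while (!c->ready)
        pthread_cond_wait(&c->cv, &c->lock);
    assert(c->ready);
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

/* Scenario A's order: signal while the lock is still held. A waiter
 * woken on another CPU may run straight into the held lock. */
static void post_signal_then_unlock(struct chan *c)
{
    pthread_mutex_lock(&c->lock);
    c->ready = true;
    pthread_cond_signal(&c->cv);
    pthread_mutex_unlock(&c->lock);
}

/* Alternative order: release the lock first, then signal. The state
 * was still changed under the lock, so no wakeup is lost. */
static void post_unlock_then_signal(struct chan *c)
{
    pthread_mutex_lock(&c->lock);
    c->ready = true;
    pthread_mutex_unlock(&c->lock);
    pthread_cond_signal(&c->cv);
}

/* Runs one waiter against the chosen poster; returns the final flag. */
static bool run_once(void (*post)(struct chan *))
{
    struct chan c;
    pthread_t t;
    bool result;

    pthread_mutex_init(&c.lock, NULL);
    pthread_cond_init(&c.cv, NULL);
    c.ready = false;

    pthread_create(&t, NULL, waiter, &c);
    post(&c);
    pthread_join(t, NULL);

    result = c.ready;
    pthread_cond_destroy(&c.cv);
    pthread_mutex_destroy(&c.lock);
    return result;
}
```

Both run_once(post_signal_then_unlock) and
run_once(post_unlock_then_signal) complete with the flag set. The sketch
only shows that the two orderings are equally correct; it measures
nothing about how often the contention in Scenario A occurs.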

Scenario B:

John's clearest comment:

> but I can see the state of the subsystem being altered by another CPU

"The Subsystem"... *Which* subsystem?

> before the wakeup is delivered (since the lock is unlocked and we
> may preempt on lock release and thus alter the state of the locked
> subsystem before coming back to the original thread and doing the
> wakeup) resulting in possibly bogus wakeups being sent.  Yuck.

	cpuX		cpuY
	signals cv
			goes to run thread

	releases lock

	tries to run	tries to run
	thread,		thread,
	collides	collides

It seems to me that this should be a "cannot happen", or if it does
happen (as it would with something like broadcast interrupts), the code
the lock covers has to cope.
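
One way to read "the code the lock covers has to cope" is the usual
predicate-recheck loop: every woken thread re-tests the condition under
the lock and goes back to sleep if another thread got there first. The
sketch below is a userland illustration only, assuming POSIX threads
rather than the kernel primitives; struct workq, consumer, post_one and
run_collision_demo are hypothetical names. The broadcast deliberately
wakes every waiter for a single item, so the collision is provoked
rather than avoided.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical work counter protected by one lock. */
struct workq {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    int             pending;  /* items available to take */
    int             taken;    /* items consumed so far */
};

/* Consumer: takes exactly one item. A wakeup that finds nothing
 * pending (bogus, or lost to another consumer) just sleeps again. */
static void *consumer(void *arg)
{
    struct workq *q = arg;

    pthread_mutex_lock(&q->lock);
    while (q->pending == 0)
        pthread_cond_wait(&q->cv, &q->lock);
    q->pending--;
    q->taken++;
    assert(q->pending >= 0);  /* no item is ever taken twice */
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Producer: adds one item but wakes every waiter on purpose. */
static void post_one(struct workq *q)
{
    pthread_mutex_lock(&q->lock);
    q->pending++;
    pthread_cond_broadcast(&q->cv);
    pthread_mutex_unlock(&q->lock);
}

/* Two consumers, two items posted one at a time; returns items taken. */
static int run_collision_demo(void)
{
    struct workq q;
    pthread_t a, b;
    int taken;

    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.cv, NULL);
    q.pending = 0;
    q.taken = 0;

    pthread_create(&a, NULL, consumer, &q);
    pthread_create(&b, NULL, consumer, &q);
    post_one(&q);
    post_one(&q);
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    taken = q.taken;
    pthread_cond_destroy(&q.cv);
    pthread_mutex_destroy(&q.lock);
    return taken;
}
```

run_collision_demo() returns 2 regardless of which consumer wins each
race: the loser of a collision simply finds pending == 0 and waits for
the next post.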

Look- I'm not trying to argue for the sake of arguing. I was asking what
I think are very simple questions. I'll conclude from this exercise that
I really don't understand what the SMP implementors here are trying to
accomplish and drop it. I don't agree with creeping featurism, and,
*wince*, I kind of have to agree with the other Matt about this not
addressing the real problem set, but it's also quite possible that I
have no clue as to what's going on as well. Sorry for raising a ruckus.


-matt







