From owner-freebsd-arch  Tue Sep 12 11:34:58 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from magnesium.net (toxic.magnesium.net [207.154.84.15])
	by hub.freebsd.org (Postfix) with SMTP id C438237B42C
	for <arch@FreeBSD.ORG>; Tue, 12 Sep 2000 11:34:53 -0700 (PDT)
Received: (qmail 2178 invoked by uid 1142); 12 Sep 2000 18:34:52 -0000
Date: 12 Sep 2000 11:34:52 -0700
Date: Tue, 12 Sep 2000 11:34:45 -0700
From: Jason Evans <jasone@canonware.com>
To: Matthew Jacob <mjacob@feral.com>
Cc: arch@FreeBSD.ORG
Subject: Necessary synchronization primitives (was Re: cvs commit: [...] yarrow.c [...])
Message-ID: <20000912113445.F31089@blitz.canonware.com>
References: <20000912161928.C23948@wantadilla.lemis.com> <Pine.BSF.4.21.0009120012350.70592-100000@beppo.feral.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: <Pine.BSF.4.21.0009120012350.70592-100000@beppo.feral.com>; from mjacob@feral.com on Tue, Sep 12, 2000 at 12:28:24AM -0700
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Tue, Sep 12, 2000 at 12:28:24AM -0700, Matthew Jacob wrote:
> I believe that, *so far*, FreeBSD is taking a very good middle ground approach
> between these two extremes. In particular, the witness code approach from BSDi
> is avoiding the absolute chaotic nightmare that the three way clustercoitus
> that streams mux modules, non-recursive Mutexes (which is what Solaris locks
> are) and the 'unsafe_driver' global mutex to allow for non-SMP ready modules
> to exist with SMP-ready modules made of Solaris.
> 
> Please don't allow 'creeping featurism/comp sci 101' in. Rather than add
> in reader/writer locks- make the people asking for them *really* justify why
> they need the expense of a locking mechanism that will be around for years.
> What problem does it solve? Can the problem be solved by redoing the subsystem
> in question to be more SMP friendly?

Executive summary: My experience has indicated that 1) mutexes, 2)
condition variables, 3) barriers, and 4) message queues are an adequate set
of locking primitives for almost any problem.

I've been drooling over threads since being introduced to OS/2 in 1992.  I
actually started using threads in significant ways in about 1996.  The last
5 years have taught me a few lessons about what is useful versus what
sounds good on paper.  To make it clear where this email is going, I'll
start off by saying "reader/writer locks are rarely useful", and later on
will add support to that opinion.

Here's a laundry list of synchronization primitives, in approximately
increasing order of complexity:

* mutex {blocking, spinning, adaptive, recursive}

  Simple mutual exclusion lock, in many flavors.  Only one thread can own a
  mutex at a time.

  - blocking

    If a thread tries to acquire a mutex that is already owned, the thread
    will block until it is granted the mutex (i.e. the previous owner
    releases the mutex).

  - spinning

    If a thread tries to acquire a mutex that is already owned, the threads
    spins in a tight loop, trying again and again to acquire the mutex
    until it succeeds.  Spin mutexes tend to be very dangerous and
    difficult to use correctly.  In userland, spinning mutexes are almost
    always a bad idea (though not *always*).  In the kernel, and in
    threading library implementations, there are decidedly legitimate
    reasons for using spinning mutexes.

  - adaptive

    A spinning mutex that blocks after a certain amount of spinning.  In
    userland programming, these pretty much remove any need for pure
    spinning mutexes.

  - recursive

    A thread can acquire the same mutex more than once, recursively.  Until
    the SMP work, I never used recursive mutexes.  In my opinion, if
    recursive mutexes are needed to solve a programming problem, then the
    problem needs to be re-thought.  That is, recursive mutexes are a
    crutch, not a necessity.  However, given that we're converting an old
    piece of code to being threaded, there are a number of places where
    recursive mutexes save us from having to rip major portions of the
    kernel to pieces.  Hopefully we can keep recursive mutex use to a
    minimum, but from a pragmatic point of view, we need them, at least
    for now.

* condition variable

  The name is pretty descriptive.  There are two basic operations on a
  condition variable: waiting and signaling.  A thread can wait for a
  condition to occur, and another thread can signal that the condition has
  occurred (or broadcast, in order to wake all threads waiting on the
  condition).  Condition variables are useful for waiting for state changes
  to occur.

  Note that the recently added msleep() is for all practical purposes a
  condition variable implementation.

The rest of the primitives can be constructed using mutexes and condition
variables as building blocks.

* Dijkstra semaphore

  I've gone through my library (including Knuth and various other books)
  and the best I've been able to determine is that Dijkstra semaphores have
  two operations: P (passeren: Dutch for "to pass") and V (vrijgeven: Dutch
  for "to give free").  It is unclear to me given the references at hand
  whether Dijkstra semaphores are generalized enough to include a count
  like counting semaphores (described below), but I suspect so.  Perhaps
  Greg can shed some light on that.

* (counting) semaphore

  A number is associated with a semaphore, and the two main operations are
  to post (increment) and wait (decrement).  Posting always succeeds
  (somewhat analogous to unlocking a mutex), but waiting will cause a
  thread to block if the decrement operation would cause the semaphore
  value to go below zero.

  Semaphores can be degenerately used as a mutex, by constraining the value
  of the semaphore to always be 0 or 1.  One major difference though is
  that semaphores can be waited on, then posted, by entirely different
  threads, whereas a mutex must be locked, then unlocked, by the same
  thread.

  Semaphores can also be degenerately used as a condition variable, by
  waiting until another thread posts (signals) that the condition has
  occurred.

* reader/writer lock

  Multiple readers can simultaneously own read locks, but only one writer
  (and no readers) can own a write lock at a time.  The idea behind
  reader/writer locks is to allow concurrent access to a data structure
  that has many potential readers, in cases where the data structure is
  rarely modified.

* barrier

  Multiple threads can wait on a barrier, and they all wait until a
  predetermined threshold (number of waiters) is reached, at which time the
  barrier breaks (the waiting threads can all resume running).

* message queue

  Messages can be written and read.  Message queues are useful for
  decoupling complex subsystems, in that they conceptually create an
  asynchronous link between separate state machines.

-----------

I've personally found mutexes, condition variables, barriers, and message
queues to be useful.  In reality, I've emulated barriers with semaphores,
which is the only thing I've used semaphores for, but in retrospect, a
barrier primitive would have been a better way to go.

That leaves out semaphores and reader/writer locks.  As mentioned above,
semaphores can be degenerately used as mutexes or condition variables, but
semaphores can also be avoided by using mutexes and condition variables in
combination.  The omission of reader/writer locks requires a more in-depth
explanation.

In fact, I have used reader/writer locks.  However, in every case where
I've done so, later analysis of the code has led me to conclude either
that:

* The additional overhead of reader/writer locks as opposed to plain
  mutexes has not been worth it.  That is, reader/writer locks were not
  increasing parallelism enough to compensate for the fact that they're
  more expensive to use.

  This is an important point; people (including me) tend to try to optimize
  a problem by using reader/writer locks, when in reality it usually harms
  performance.

* The problem should have been solved in a different way that did not
  involve the need for reader/writer locks.

There's one situation where I've used a form of reader/writer locks: in
combination with a job queue.  In that case, a job queue consisted of jobs
that needed to either read or write a shared data structure.  Only one
write job could be dispatched at a time, but all contiguous read jobs could
be dispatched in parallel (I say contiguous, because in this case, ordering
of operations was important).  So, this was a situation in which
reader/writer locking made sense, but it was in a context that a
reader/writer lock primitive was useless.

There may be places where a reader/writer locking primitive truly makes
sense, but I've never seen one.

Jason


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message