Date:      Tue, 31 Oct 2000 12:50:23 -0500 (EST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        freebsd-smp@FreeBSD.org
Subject:   Reference count invariants in a fine-grained threaded environment
Message-ID:  <Pine.NEB.3.96L.1001031123647.58688x-100000@fledge.watson.org>


While hacking up the prison code, I decided (for better or for worse) that
I should attempt to write it with a multi-threaded environment in mind,
protecting critical data structures against corrupting simultaneous use. 
John Baldwin and I discussed this a bit on IRC, and we had also discussed
it (with Jason and others) at BSDcon, so I wrote up some thoughts on the
matter and have attached them below. 

The premise is that right now, there are a number of shared data objects
in the FreeBSD kernel that are managed using reference counts -- either to
reduce overhead, or for legitimate sharing.  The concern is that (a)
these objects must be freed carefully, so that no pointer is dereferenced
after its reference has been released, and (b) the global data structures
managing these shared objects must also be protected properly.
We concluded that we may need to define rules about how reference counts
are handled, either on a per-object basis, or globally as a set of
recommendations.  Presumably it is also desirable to avoid high contention
on mutexes for performance reasons, so it may be desirable to combine
protection of multiple objects under the same mutex if they are frequently
accessed in combination, but will generate little contention.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services

Rules for interactions between mutexes and reference-counted kernel
objects:

Assumptions:

- Objects with reference counts have a mutex that can protect their
  reference count, and possibly other variables (or other instances).  For
  example, struct cred might have a mutex per instance, but all instances
  of struct prison might share the same mutex.

Definition:

- For a reference counted object, a "reference" refers to an outstanding
  set of pointers which are guaranteed not to become invalid until the
  reference is released.  In practice, this is guaranteed by mutexes
  protecting the reference count in the object, and by appropriate use
  of outstanding "references" by their consumers (which may involve
  mutexes or other system invariants). 

1) Consumers of reference counts must protect their reference (pointer)
   when using the reference: i.e., they must guarantee that they do not
   attempt to dereference the pointer at any time after the reference
   may potentially have been released.  This means they must either
   explicitly or implicitly protect the reference from release during
   use by virtue of a mutex, or other invariants.  If this introduces
   excessive contention due to extended use of the reference across
   blocking, then an additional reference must be acquired.

2) Reference counted objects should provide service routines that allow
   additional references to be acquired given an existing valid reference
   without introducing race conditions. 
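
As a rough illustration of rules 1 and 2, here is a minimal userland
sketch using POSIX threads in place of kernel mutexes.  The names
(struct refobj, refobj_alloc(), refobj_hold(), refobj_rele()) are
hypothetical and invented for this example; they are not existing kernel
interfaces.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; names are illustrative only. */
struct refobj {
	pthread_mutex_t	ro_mtx;		/* protects ro_refcnt */
	unsigned int	ro_refcnt;	/* number of outstanding references */
	int		ro_data;	/* payload */
};

/* Allocate an object; the caller receives the first reference. */
static struct refobj *
refobj_alloc(int data)
{
	struct refobj *ro;

	ro = malloc(sizeof(*ro));
	if (ro == NULL)
		return (NULL);
	pthread_mutex_init(&ro->ro_mtx, NULL);
	ro->ro_refcnt = 1;
	ro->ro_data = data;
	return (ro);
}

/*
 * Rule 2: acquire an additional reference.  The caller must already hold
 * a valid reference, so the object cannot be freed while we are here.
 */
static void
refobj_hold(struct refobj *ro)
{
	pthread_mutex_lock(&ro->ro_mtx);
	assert(ro->ro_refcnt > 0);
	ro->ro_refcnt++;
	pthread_mutex_unlock(&ro->ro_mtx);
}

/*
 * Release a reference.  Returns 1 if this was the last reference and the
 * object was freed, 0 otherwise.  Per rule 1, the caller must not touch
 * the pointer after this call.
 */
static int
refobj_rele(struct refobj *ro)
{
	unsigned int remaining;

	pthread_mutex_lock(&ro->ro_mtx);
	assert(ro->ro_refcnt > 0);
	remaining = --ro->ro_refcnt;
	pthread_mutex_unlock(&ro->ro_mtx);
	if (remaining == 0) {
		pthread_mutex_destroy(&ro->ro_mtx);
		free(ro);
		return (1);
	}
	return (0);
}
```

A consumer that needs to keep using the object across a blocking
operation would call refobj_hold() first and refobj_rele() when done,
rather than holding a mutex for the whole time.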

Example: struct proc and struct cred

Each struct cred has a reference count, and a mutex protecting that
reference count.  Unlike some credential objects, credentials are
immutable after creation (in effect copy-on-write) meaning that the
contents of the credential (when holding a valid reference) do not need to
be protected using the mutex. 

When a new credential is needed, a holder of a legitimate reference
invokes crcopy() which duplicates the original credential, and adds a
reference to the new credential for the caller.  The caller is responsible
for releasing the reference on the old credential when necessary.  The
caller can make changes to this credential as long as they are the only
holder (i.e., haven't committed it to a data structure that might permit
other callers to use it). 
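
The copy-then-modify pattern described above might be sketched as
follows.  This is a userland approximation with hypothetical names
(struct xcred, xcred_alloc(), xcred_copy(), xcred_rele()); it follows the
description in this message and is not the kernel's crcopy()
implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cred; not the kernel's definition. */
struct xcred {
	pthread_mutex_t	xc_mtx;		/* protects xc_ref only */
	unsigned int	xc_ref;		/* outstanding references */
	unsigned int	xc_uid;		/* contents: immutable once shared */
};

/* Allocate a credential; the caller receives the only reference. */
static struct xcred *
xcred_alloc(unsigned int uid)
{
	struct xcred *cr = malloc(sizeof(*cr));

	if (cr == NULL)
		abort();
	pthread_mutex_init(&cr->xc_mtx, NULL);
	cr->xc_ref = 1;
	cr->xc_uid = uid;
	return (cr);
}

/*
 * Duplicate a credential.  The caller holds a valid reference on "old"
 * and the contents are immutable, so they can be read without the mutex.
 * The caller still owns its reference on "old" afterwards.
 */
static struct xcred *
xcred_copy(const struct xcred *old)
{
	return (xcred_alloc(old->xc_uid));
}

/* Release a reference; returns 1 if the credential was freed. */
static int
xcred_rele(struct xcred *cr)
{
	unsigned int left;

	pthread_mutex_lock(&cr->xc_mtx);
	left = --cr->xc_ref;
	pthread_mutex_unlock(&cr->xc_mtx);
	if (left != 0)
		return (0);
	pthread_mutex_destroy(&cr->xc_mtx);
	free(cr);
	return (1);
}
```

The new credential may be modified freely (for example, setting xc_uid)
only while the caller is its sole holder; once it is stored somewhere
other threads can find it, it has to be treated as immutable.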

The cred reference in struct proc must be implicitly or explicitly
protected against simultaneous access in unfortunate ways.  Here's an
example of simultaneous access that we need to protect against: 

Right now, during exec(), a credential change occurs on the process if
exec() is invoked on a setuid binary.  execve() uses crcopy to acquire a
new credential, and replaces the reference in the struct proc with the new
reference, freeing the old reference. 

sysctl() is used by ps and other utilities to retrieve process
information.  The sysctl handler will use pfind() to acquire a reference
to the struct proc, and then follow the struct proc's credential reference
to fill out the eproc returned to userland along with the struct proc
contents.  It is important that sysctl() not race with execve() to use
that credential reference.  First, it is necessary that the handler not be
able to gain an additional pointer to the credential without atomically
gaining an additional reference.  In a world where struct proc has a
mutex, this can be done by grabbing the struct proc mutex, using the
struct proc reference to get an additional credential reference
atomically, then releasing the struct proc mutex.  If the struct proc
mutex is not grabbed, a race could occur: sysctl grabs the old pointer,
and exec releases the reference, causing struct cred to be garbage
collected.  The bad pointer is now dereferenced by sysctl, resulting in
incorrect results (as it may have been reused), or a page fault. 
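
To make the sysctl/exec interaction concrete, here is a minimal userland
sketch assuming a world where struct proc has its own mutex.  All names
(struct xproc, struct xcred, xproc_cred_hold(), xproc_cred_replace()) are
hypothetical, and POSIX mutexes stand in for kernel mutexes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct xcred {
	pthread_mutex_t	xc_mtx;		/* protects xc_ref */
	unsigned int	xc_ref;		/* outstanding references */
	unsigned int	xc_uid;		/* immutable once published */
};

struct xproc {
	pthread_mutex_t	xp_mtx;		/* protects the xp_cred pointer */
	struct xcred	*xp_cred;	/* the process's own reference */
};

static struct xcred *
xcred_alloc(unsigned int uid)
{
	struct xcred *cr = malloc(sizeof(*cr));

	if (cr == NULL)
		abort();
	pthread_mutex_init(&cr->xc_mtx, NULL);
	cr->xc_ref = 1;
	cr->xc_uid = uid;
	return (cr);
}

static void
xcred_hold(struct xcred *cr)
{
	pthread_mutex_lock(&cr->xc_mtx);
	cr->xc_ref++;
	pthread_mutex_unlock(&cr->xc_mtx);
}

/* Returns 1 if the last reference was dropped and the cred was freed. */
static int
xcred_rele(struct xcred *cr)
{
	unsigned int left;

	pthread_mutex_lock(&cr->xc_mtx);
	left = --cr->xc_ref;
	pthread_mutex_unlock(&cr->xc_mtx);
	if (left != 0)
		return (0);
	pthread_mutex_destroy(&cr->xc_mtx);
	free(cr);
	return (1);
}

/*
 * sysctl side: read the pointer and gain a reference in one step, with
 * the proc mutex held so exec cannot swap and free the cred in between.
 */
static struct xcred *
xproc_cred_hold(struct xproc *p)
{
	struct xcred *cr;

	pthread_mutex_lock(&p->xp_mtx);
	cr = p->xp_cred;
	xcred_hold(cr);
	pthread_mutex_unlock(&p->xp_mtx);
	return (cr);
}

/*
 * exec side: install newcr (taking over the caller's reference on it)
 * and drop the process's reference on the old cred.  Returns 1 if the
 * old cred was freed.
 */
static int
xproc_cred_replace(struct xproc *p, struct xcred *newcr)
{
	struct xcred *old;

	pthread_mutex_lock(&p->xp_mtx);
	old = p->xp_cred;
	p->xp_cred = newcr;
	pthread_mutex_unlock(&p->xp_mtx);
	return (xcred_rele(old));
}
```

In this sketch the lock order is always proc mutex first, then cred
mutex.  A reader that obtained its cred through xproc_cred_hold() can
keep using it after exec has replaced the process's pointer, and releases
it with xcred_rele() when finished.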

A similar example can be seen in the new prison code: struct proc
references struct prison, describing the process's prison centrally.  The
prison reference count, and global prison list, are protected using a
single prison_list mutex.  It is important that consumers of struct proc
only access the prison structure when they have a legitimate reference
that cannot be released during their use.  Right now, this is guaranteed
by the global lock, which protects all struct procs, but when this becomes
explicit later, it is important that every access to the "p->p_prison"
pointer (not just the structure itself, but also the *pointer*) be
protected from inappropriate simultaneous use by multiple threads in the
kernel.  It is
also important to note that, unlike struct cred, struct prison is not
immutable: certain aspects can be modified at run-time, which may have
important ramifications for all consumers.  If you hold an outstanding
reference on a cred, you don't have to grab a mutex to access the contents
safely (what about memory consistency?), but if you have one on a struct
prison, not all fields may remain the same, meaning additional protection
may be required.
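
A minimal userland sketch of the single-mutex arrangement described
above, assuming one mutex that covers the global list, every reference
count, and any mutable field.  The names (struct xprison,
xprison_list_mtx, xpr_level and the helper functions) are hypothetical;
in particular xpr_level is an invented example of a run-time-modifiable
field and is not taken from the prison code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct xprison {
	struct xprison	*xpr_next;	/* global list linkage */
	unsigned int	xpr_ref;	/* outstanding references */
	int		xpr_level;	/* hypothetical mutable field */
};

/* One mutex protects the list, all reference counts, and xpr_level. */
static pthread_mutex_t xprison_list_mtx = PTHREAD_MUTEX_INITIALIZER;
static struct xprison *xprison_list;

/* Create a prison, link it into the list, return the first reference. */
static struct xprison *
xprison_create(int level)
{
	struct xprison *pr = malloc(sizeof(*pr));

	if (pr == NULL)
		abort();
	pr->xpr_ref = 1;
	pr->xpr_level = level;
	pthread_mutex_lock(&xprison_list_mtx);
	pr->xpr_next = xprison_list;
	xprison_list = pr;
	pthread_mutex_unlock(&xprison_list_mtx);
	return (pr);
}

/* Gain an additional reference; the caller already holds one. */
static void
xprison_hold(struct xprison *pr)
{
	pthread_mutex_lock(&xprison_list_mtx);
	pr->xpr_ref++;
	pthread_mutex_unlock(&xprison_list_mtx);
}

/* Drop a reference; on the last one, unlink and free.  Returns 1 if freed. */
static int
xprison_rele(struct xprison *pr)
{
	struct xprison **pp;

	pthread_mutex_lock(&xprison_list_mtx);
	if (--pr->xpr_ref != 0) {
		pthread_mutex_unlock(&xprison_list_mtx);
		return (0);
	}
	for (pp = &xprison_list; *pp != pr; pp = &(*pp)->xpr_next)
		continue;
	*pp = pr->xpr_next;
	pthread_mutex_unlock(&xprison_list_mtx);
	free(pr);
	return (1);
}

/* Mutable fields are read and written only with the mutex held. */
static int
xprison_get_level(struct xprison *pr)
{
	int level;

	pthread_mutex_lock(&xprison_list_mtx);
	level = pr->xpr_level;
	pthread_mutex_unlock(&xprison_list_mtx);
	return (level);
}

static void
xprison_set_level(struct xprison *pr, int level)
{
	pthread_mutex_lock(&xprison_list_mtx);
	pr->xpr_level = level;
	pthread_mutex_unlock(&xprison_list_mtx);
}
```

Sharing one mutex keeps the unlink-on-last-release step simple, at the
cost of serializing access to all prisons; that trade-off seems
acceptable only if contention on the prison code is low.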

Speaking of this issue: in a shared struct where some fields are constant
and others are not, false sharing can be a problem.  For example, the
memory architecture might define locked memory access over a memory line
that's 8 bytes, while the compiler places two 32-bit variables in that
line, one protected by a mutex and the other not.  What should we assume
for all supported architectures?  Or is it better to cover the entire
structure with a mutex (in which case, must it be the same mutex as the
reference count in the object, given that the reference count and another
variable might be covered by the same locked operation)?
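
The layout concern can be illustrated with a small C11 sketch.  The
8-byte unit is the figure assumed in the paragraph above, not a statement
about any particular architecture, and the struct and macro names are
hypothetical.  The sketch only shows how two 32-bit fields can end up in
one such unit and how alignment can separate them; it does not answer
which approach is right for all supported architectures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	XLINE	8	/* assumed width of a locked memory access, bytes */

/* Two 32-bit fields packed next to each other by the compiler. */
struct xpacked {
	uint32_t	xp_ref;		/* protected by a mutex */
	uint32_t	xp_flags;	/* constant, read without the mutex */
};

/* The same fields, each forced onto its own XLINE-byte boundary. */
struct xpadded {
	_Alignas(XLINE) uint32_t	xq_ref;
	_Alignas(XLINE) uint32_t	xq_flags;
};

/*
 * Returns 1 if two field offsets fall within the same XLINE-byte unit,
 * assuming the structure itself starts on an XLINE boundary.
 */
static int
same_unit(size_t a, size_t b)
{
	return (a / XLINE == b / XLINE);
}
```

With these definitions, same_unit() applied to the offsetof() values of
the two fields reports that they share a unit in struct xpacked and do
not in struct xpadded, at the cost of a larger structure.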







