Date: Tue, 31 Oct 2000 12:50:23 -0500 (EST)
From: Robert Watson <rwatson@FreeBSD.org>
To: freebsd-smp@FreeBSD.org
Subject: Reference count invariants in a fine-grained threaded environment
Message-ID: <Pine.NEB.3.96L.1001031123647.58688x-100000@fledge.watson.org>
While hacking up the prison code, I decided (for better or for worse) that I should attempt to write it with a multi-threaded environment in mind, protecting critical data structures against corrupting simultaneous use. John Baldwin and I discussed this a bit on IRC, and we had also discussed it (with Jason and others) at BSDcon, so I wrote up some thoughts on the matter and have attached them below.

The premise is that right now, there are a number of shared data objects in the FreeBSD kernel that are managed using reference counts -- either to reduce overhead, or for legitimate sharing. The concern is that (a) freeing of these objects must be handled carefully to avoid dereferencing pointers to them after a reference has been released, and (b) global data structures managing these shared objects must also be protected properly. We concluded that we may need to define rules about how reference counts are handled, either on a per-object basis, or globally as a set of recommendations.

Presumably it is also desirable to avoid high contention on mutexes for performance reasons, so it may make sense to protect multiple objects with the same mutex if they are frequently accessed together and the shared mutex will see little contention.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services

Rules for interactions between mutexes and reference-counted kernel objects:

Assumptions:

- Objects with reference counts have a mutex that can protect their reference count, and possibly other variables (or other instances). For example, struct cred might have a mutex per instance, but all struct prisons might use the same mutex.

Definition:

- For a reference-counted object, a "reference" refers to an outstanding set of pointers which are guaranteed not to become invalid until the reference is released.
In practice, this is guaranteed by mutexes protecting the reference count in the object, and by appropriate use of outstanding "references" by their consumers (which may involve mutexes or other system invariants).

1) Consumers of reference counts must protect their reference (pointer) when using the reference: i.e., they must guarantee that they do not attempt to dereference the pointer at any time after the reference may potentially have been released. This means they must either explicitly or implicitly protect the reference from release during use by virtue of a mutex, or other invariants. If this introduces excessive contention due to extended use of the reference across blocking, then an additional reference must be acquired.

2) Reference-counted objects should provide service routines that allow additional references to be acquired given an existing valid reference without introducing race conditions.

Example: struct proc and struct cred

Each struct cred has a reference count, and a mutex protecting that reference count. Unlike some other reference-counted objects, credentials are immutable after creation (in effect copy-on-write), meaning that the contents of the credential (when holding a valid reference) do not need to be protected using the mutex. When a new credential is needed, a holder of a legitimate reference invokes crcopy(), which duplicates the original credential and adds a reference to the new credential for the caller. The caller is responsible for releasing the reference on the old credential when necessary. The caller can make changes to the new credential as long as it is the only holder (i.e., it hasn't committed the credential to a data structure that might permit other callers to use it).

The cred reference in struct proc must be implicitly or explicitly protected against unsafe simultaneous access.
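
As a rough illustration of rule 2 and the copy-on-write credential described above, here is a minimal userland sketch using POSIX threads in place of kernel mutexes. The names struct xcred, xcred_alloc(), xcred_hold(), xcred_free() and xcred_copy() are hypothetical stand-ins invented for this sketch, not the kernel's actual interfaces, and the single xc_uid field stands in for the real credential contents.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical userland stand-in for a reference-counted credential. */
struct xcred {
	pthread_mutex_t	xc_mtx;		/* protects xc_ref only */
	unsigned int	xc_ref;		/* number of outstanding references */
	uid_t		xc_uid;		/* immutable once the cred is shared */
};

/* Allocate a credential that holds one reference for the caller. */
struct xcred *
xcred_alloc(uid_t uid)
{
	struct xcred *cr = malloc(sizeof(*cr));

	if (cr == NULL)
		return (NULL);
	pthread_mutex_init(&cr->xc_mtx, NULL);
	cr->xc_ref = 1;
	cr->xc_uid = uid;
	return (cr);
}

/* Rule 2: gain an additional reference from an existing valid one. */
struct xcred *
xcred_hold(struct xcred *cr)
{
	pthread_mutex_lock(&cr->xc_mtx);
	assert(cr->xc_ref > 0);		/* caller must already hold a reference */
	cr->xc_ref++;
	pthread_mutex_unlock(&cr->xc_mtx);
	return (cr);
}

/* Release one reference; returns 1 if this was the last one. */
int
xcred_free(struct xcred *cr)
{
	unsigned int left;

	pthread_mutex_lock(&cr->xc_mtx);
	assert(cr->xc_ref > 0);
	left = --cr->xc_ref;
	pthread_mutex_unlock(&cr->xc_mtx);
	if (left != 0)
		return (0);
	/* No references remain, so no other thread can legitimately reach cr. */
	pthread_mutex_destroy(&cr->xc_mtx);
	free(cr);
	return (1);
}

/*
 * Duplicate a credential. The contents are read without the mutex
 * because they are immutable while a valid reference is held. The
 * caller still owns, and must eventually release, its reference on old.
 */
struct xcred *
xcred_copy(const struct xcred *old)
{
	return (xcred_alloc(old->xc_uid));
}
```

The copy returned by xcred_copy() is private to the caller until it is published in a shared structure, so it can be modified freely up to that point.
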

Here's an example of simultaneous access that we need to protect against:

Right now, during exec(), a credential change occurs on the process if exec() is invoked on a setuid binary. execve() uses crcopy() to acquire a new credential, and replaces the reference in the struct proc with the new reference, freeing the old reference.

sysctl() is used by ps and other utilities to retrieve process information. The sysctl handler will use pfind() to acquire a reference to the struct proc, and then follow the struct proc's credential reference to fill out the eproc returned to userland along with the struct proc contents.

It is important that sysctl() not race with execve() to use that credential reference. First, it is necessary that the handler not be able to gain an additional pointer to the credential without atomically gaining an additional reference. In a world where struct proc has a mutex, this can be done by grabbing the struct proc mutex, using the struct proc reference to get an additional credential reference atomically, then releasing the struct proc mutex. If the struct proc mutex is not grabbed, a race could occur: sysctl grabs the old pointer, and exec releases the reference, causing the struct cred to be garbage collected. The bad pointer is now dereferenced by sysctl, resulting in incorrect results (as the memory may have been reused), or a page fault.

A similar example can be seen in the new prison code: struct proc references struct prison, describing the process's prison centrally. The prison reference count, and global prison list, are protected using a single prison_list mutex. It is important that consumers of struct proc only access the prison structure when they have a legitimate reference that cannot be released during their use.
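
The sysctl()/execve() interaction described above can be sketched in userland with POSIX threads. This is only an illustration under assumed names: struct xproc, struct xcred, xproc_getcred() and xproc_setcred() are hypothetical, and the per-process mutex is the one the text imagines struct proc having, not something that exists today.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the real kernel structures. */
struct xcred {
	pthread_mutex_t	 xc_mtx;	/* protects xc_ref */
	int		 xc_ref;
	int		 xc_uid;	/* immutable after creation */
};

struct xproc {
	pthread_mutex_t	 xp_mtx;	/* protects the xp_cred pointer */
	struct xcred	*xp_cred;	/* the process holds one reference */
};

/* Create a credential holding one reference for the caller. */
struct xcred *
xcred_new(int uid)
{
	struct xcred *cr = malloc(sizeof(*cr));

	if (cr == NULL)
		abort();
	pthread_mutex_init(&cr->xc_mtx, NULL);
	cr->xc_ref = 1;
	cr->xc_uid = uid;
	return (cr);
}

/* Drop one reference and free the credential when none remain. */
void
xcred_rele(struct xcred *cr)
{
	int left;

	pthread_mutex_lock(&cr->xc_mtx);
	left = --cr->xc_ref;
	pthread_mutex_unlock(&cr->xc_mtx);
	if (left == 0) {
		pthread_mutex_destroy(&cr->xc_mtx);
		free(cr);
	}
}

/*
 * Reader side (sysctl-like): the pointer and the extra reference are
 * gained together while the process mutex is held, so a concurrent
 * xproc_setcred() cannot free the credential in between.
 */
struct xcred *
xproc_getcred(struct xproc *p)
{
	struct xcred *cr;

	pthread_mutex_lock(&p->xp_mtx);
	cr = p->xp_cred;
	pthread_mutex_lock(&cr->xc_mtx);
	cr->xc_ref++;
	pthread_mutex_unlock(&cr->xc_mtx);
	pthread_mutex_unlock(&p->xp_mtx);
	return (cr);
}

/*
 * Writer side (execve-like): swap the pointer under the process mutex,
 * then release the process's reference on the old credential.
 */
void
xproc_setcred(struct xproc *p, struct xcred *newcr)
{
	struct xcred *old;

	pthread_mutex_lock(&p->xp_mtx);
	old = p->xp_cred;
	p->xp_cred = newcr;	/* the process takes over the caller's reference */
	pthread_mutex_unlock(&p->xp_mtx);
	xcred_rele(old);
}
```

The lock order here is always process mutex first, credential mutex second. A reader that obtained its reference through xproc_getcred() can keep using the old credential after the swap and drops it with xcred_rele() when done.
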

Right now, this is guaranteed by the global lock, which protects all struct procs, but when this protection becomes explicit later, it is important that every access to the "p->p_prison" pointer (not just the structure itself, but also the *pointer*) be protected from inappropriate simultaneous use by multiple threads in the kernel.

It is also important to note that, unlike struct cred, struct prison is not immutable: certain aspects can be modified at run-time, which may have important ramifications for all consumers. If you hold an outstanding reference on a cred, you don't have to grab a mutex to access the contents safely (what about memory consistency?), but if you hold one on a struct prison, not all fields are guaranteed to remain the same, meaning additional protection may be required.

Speaking of this issue: in a shared struct where some fields are constant and others are not, false sharing can be a problem. For instance, the memory architecture may define locked memory access over a memory line that's 8 bytes, while the compiler places two 32-bit variables in that line, one protected by a mutex and the other not. What should we assume for all supported architectures? Or is it better to cover the entire structure with a mutex? In that case, must it be the same mutex that protects the reference count in the object, given that the reference count and another variable might be covered by the same locked operation?