Date: Wed, 2 Oct 2013 19:20:15 +0000 (UTC)
From: John Baldwin <jhb@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-9@freebsd.org
Subject: svn commit: r256005 - stable/9/share/man/man9
Message-ID: <201310021920.r92JKFWQ046924@svn.freebsd.org>
Author: jhb
Date: Wed Oct 2 19:20:15 2013
New Revision: 256005
URL: http://svnweb.freebsd.org/changeset/base/256005

Log:
  MFC 233422,233680,233681,237619,239904,249373,252346,252379,252423:
  Sync locking(9) with HEAD. The only change not merged is that 9 still
  supports !MPSAFE filesystems.

Modified:
  stable/9/share/man/man9/locking.9
Directory Properties:
  stable/9/share/man/man9/ (props changed)

Modified: stable/9/share/man/man9/locking.9
==============================================================================
--- stable/9/share/man/man9/locking.9 Wed Oct 2 19:18:00 2013 (r256004)
+++ stable/9/share/man/man9/locking.9 Wed Oct 2 19:20:15 2013 (r256005)
@@ -24,7 +24,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd May 25, 2012
+.Dd June 30, 2013
 .Dt LOCKING 9
 .Os
 .Sh NAME
@@ -33,44 +33,51 @@
 .Sh DESCRIPTION
 The
 .Em FreeBSD
-kernel is written to run across multiple CPUs and as such requires
-several different synchronization primitives to allow the developers
-to safely access and manipulate the many data types required.
+kernel is written to run across multiple CPUs and as such provides
+several different synchronization primitives to allow developers
+to safely access and manipulate many data types.
 .Ss Mutexes
-Mutexes (also called "sleep mutexes") are the most commonly used
+Mutexes (also called "blocking mutexes") are the most commonly used
 synchronization primitive in the kernel.
-Thread acquires (locks) a mutex before accessing data shared with other
+A thread acquires (locks) a mutex before accessing data shared with other
 threads (including interrupt threads), and releases (unlocks) it afterwards.
-If the mutex cannot be acquired, the thread requesting it will sleep.
+If the mutex cannot be acquired, the thread requesting it will wait.
+Mutexes are adaptive by default, meaning that
+if the owner of a contended mutex is currently running on another CPU,
+then a thread attempting to acquire the mutex will spin rather than yielding
+the processor.
 Mutexes fully support priority propagation.
 .Pp
 See
 .Xr mutex 9
 for details.
-.Ss Spin mutexes
-Spin mutexes are variation of basic mutexes; the main difference between
-the two is that spin mutexes never sleep - instead, they spin, waiting
-for the thread holding the lock, which runs on another CPU, to release it.
-Differently from ordinary mutex, spin mutexes disable interrupts when acquired.
-Since disabling interrupts is expensive, they are also generally slower.
-Spin mutexes should be used only when necessary, e.g. to protect data shared
+.Ss Spin Mutexes
+Spin mutexes are a variation of basic mutexes; the main difference between
+the two is that spin mutexes never block.
+Instead, they spin while waiting for the lock to be released.
+To avoid deadlock, a thread that holds a spin mutex must never yield its CPU.
+Unlike ordinary mutexes, spin mutexes disable interrupts when acquired.
+Since disabling interrupts can be expensive, they are generally slower to
+acquire and release.
+Spin mutexes should be used only when absolutely necessary,
+e.g. to protect data shared
 with interrupt filter code (see
 .Xr bus_setup_intr 9
-for details).
-.Ss Pool mutexes
-With most synchronization primitives, such as mutexes, programmer must
-provide a piece of allocated memory to hold the primitive.
+for details),
+or for scheduler internals.
+.Ss Mutex Pools
+With most synchronization primitives, such as mutexes, the programmer must
+provide memory to hold the primitive.
 For example, a mutex may be embedded inside the structure it protects.
-Pool mutex is a variant of mutex without this requirement - to lock or unlock
-a pool mutex, one uses address of the structure being protected with it,
-not the mutex itself.
-Pool mutexes are seldom used.
+Mutex pools provide a preallocated set of mutexes to avoid this
+requirement.
+Note that mutexes from a pool may only be used as leaf locks.
 .Pp
 See
 .Xr mtx_pool 9
 for details.
-.Ss Reader/writer locks
-Reader/writer locks allow shared access to protected data by multiple threads,
+.Ss Reader/Writer Locks
+Reader/writer locks allow shared access to protected data by multiple threads
 or exclusive access by a single thread.
 The threads with shared access are known as
 .Em readers
@@ -82,26 +89,16 @@ since it may modify protected data.
 Reader/writer locks can be treated as mutexes (see above and
 .Xr mutex 9 )
 with shared/exclusive semantics.
-More specifically, regular mutexes can be
-considered to be equivalent to a write-lock on an
-.Em rw_lock.
-The
-.Em rw_lock
-locks have priority propagation like mutexes, but priority
-can be propagated only to an exclusive holder.
+Reader/writer locks support priority propagation like mutexes,
+but priority is propagated only to an exclusive holder.
 This limitation comes from the fact that shared owners
 are anonymous.
-Another important property is that shared holders of
-.Em rw_lock
-can recurse, but exclusive locks are not allowed to recurse.
-This ability should not be used lightly and
-.Em may go away.
 .Pp
 See
 .Xr rwlock 9
 for details.
-.Ss Read-mostly locks
-Mostly reader locks are similar to
+.Ss Read-Mostly Locks
+Read-mostly locks are similar to
 .Em reader/writer
 locks but optimized for very infrequent write locking.
 .Em Read-mostly
@@ -113,21 +110,41 @@ data structure.
 See
 .Xr rmlock 9
 for details.
+.Ss Sleepable Read-Mostly Locks
+Sleepable read-mostly locks are a variation on read-mostly locks.
+Threads holding an exclusive lock may sleep,
+but threads holding a shared lock may not.
+Priority is propagated to shared owners but not to exclusive owners.
 .Ss Shared/exclusive locks
 Shared/exclusive locks are similar to reader/writer locks; the main difference
-between them is that shared/exclusive locks may be held during unbounded sleep
-(and may thus perform an unbounded sleep).
-They are inherently less efficient than mutexes, reader/writer locks
-and read-mostly locks.
-They don't support priority propagation.
-They should be considered to be closely related to
-.Xr sleep 9 .
-In fact it could in some cases be
-considered a conditional sleep.
+between them is that shared/exclusive locks may be held during unbounded sleep.
+Acquiring a contested shared/exclusive lock can perform an unbounded sleep.
+These locks do not support priority propagation.
 .Pp
 See
 .Xr sx 9
 for details.
+.Ss Lockmanager locks
+Lockmanager locks are sleepable shared/exclusive locks used mostly in
+.Xr VFS 9
+.Po
+as a
+.Xr vnode 9
+lock
+.Pc
+and in the buffer cache
+.Po
+.Xr BUF_LOCK 9
+.Pc .
+They have features other lock types do not have such as sleep
+timeouts, blocking upgrades,
+writer starvation avoidance, draining, and an interlock mutex,
+but this makes them complicated both to use and to implement;
+for this reason, they should be avoided.
+.Pp
+See
+.Xr lock 9
+for details.
 .Ss Counting semaphores
 Counting semaphores provide a mechanism for synchronizing access to a
 pool of resources.
@@ -140,43 +157,21 @@ See
 .Xr sema 9
 for details.
 .Ss Condition variables
-Condition variables are used in conjunction with mutexes to wait for
-conditions to occur.
-A thread must hold the mutex before calling the
-.Fn cv_wait* ,
+Condition variables are used in conjunction with locks to wait for
+a condition to become true.
+A thread must hold the associated lock before calling one of the
+.Fn cv_wait ,
 functions.
-When a thread waits on a condition, the mutex
-is atomically released before the thread is blocked, then reacquired
-before the function call returns.
+When a thread waits on a condition, the lock
+is atomically released before the thread yields the processor
+and reacquired before the function call returns.
+Condition variables may be used with blocking mutexes,
+reader/writer locks, read-mostly locks, and shared/exclusive locks.
 .Pp
 See
 .Xr condvar 9
 for details.
-.Ss Giant
-Giant is an instance of a mutex, with some special characteristics:
-.Bl -enum
-.It
-It is recursive.
-.It
-Drivers and filesystems can request that Giant be locked around them
-by not marking themselves MPSAFE.
-Note that infrastructure to do this is slowly going away as non-MPSAFE
-drivers either became properly locked or disappear.
-.It
-Giant must be locked first before other locks.
-.It
-It is OK to hold Giant while performing unbounded sleep; in such case,
-Giant will be dropped before sleeping and picked up after wakeup.
-.It
-There are places in the kernel that drop Giant and pick it back up
-again.
-Sleep locks will do this before sleeping.
-Parts of the network or VM code may do this as well, depending on the
-setting of a sysctl.
-This means that you cannot count on Giant keeping other code from
-running if your code sleeps, even if you want it to.
-.El
-.Ss Sleep/wakeup
+.Ss Sleep/Wakeup
 The functions
 .Fn tsleep ,
 .Fn msleep ,
@@ -185,7 +180,12 @@ The functions
 .Fn wakeup ,
 and
 .Fn wakeup_one
-handle event-based thread blocking.
+also handle event-based thread blocking.
+Unlike condition variables,
+arbitrary addresses may be used as wait channels and a dedicated
+structure does not need to be allocated.
+However, care must be taken to ensure that wait channel addresses are
+unique to an event.
 If a thread must wait for an external event, it is put to sleep by
 .Fn tsleep ,
 .Fn msleep ,
@@ -205,9 +205,10 @@ the thread is being put to sleep.
 All threads sleeping on a single
 .Fa chan
 are woken up later by
-.Fn wakeup ,
-often called from inside an interrupt routine, to indicate that the
-resource the thread was blocking on is available now.
+.Fn wakeup
+.Pq often called from inside an interrupt routine
+to indicate that the
+event the thread was blocking on has occurred.
 .Pp
 Several of the sleep functions including
 .Fn msleep ,
@@ -223,126 +224,170 @@ includes the
 flag, then the lock will not be reacquired before returning.
 The lock is used to ensure that a condition can be checked atomically,
 and that the current thread can be suspended without missing a
-change to the condition, or an associated wakeup.
+change to the condition or an associated wakeup.
 In addition, all of the sleep routines will fully drop the
 .Va Giant
 mutex
-(even if recursed)
+.Pq even if recursed
 while the thread is suspended and will reacquire the
 .Va Giant
-mutex before the function returns.
-.Pp
-See
-.Xr sleep 9
-for details.
+mutex
+.Pq restoring any recursion
+before the function returns.
 .Pp
-.Ss Lockmanager locks
-Shared/exclusive locks, used mostly in
-.Xr VFS 9 ,
-in particular as a
-.Xr vnode 9
-lock.
-They have features other lock types don't have, such as sleep timeout,
-writer starvation avoidance, draining, and interlock mutex, but this makes them
-complicated to implement; for this reason, they are deprecated.
+The
+.Fn pause
+function is a special sleep function that waits for a specified
+amount of time to pass before the thread resumes execution.
+This sleep cannot be terminated early by either an explicit
+.Fn wakeup
+or a signal.
 .Pp
 See
-.Xr lock 9
+.Xr sleep 9
 for details.
+.Ss Giant
+Giant is a special mutex used to protect data structures that do not
+yet have their own locks.
+Since it provides semantics akin to the old
+.Xr spl 9
+interface,
+Giant has special characteristics:
+.Bl -enum
+.It
+It is recursive.
+.It
+Drivers and filesystems can request that Giant be locked around them
+by not marking themselves MPSAFE.
+Note that infrastructure to do this is slowly going away as non-MPSAFE
+drivers either became properly locked or disappear.
+.It
+Giant must be locked before other non-sleepable locks.
+.It
+Giant is dropped during unbounded sleeps and reacquired after wakeup.
+.It
+There are places in the kernel that drop Giant and pick it back up
+again.
+Sleep locks will do this before sleeping.
+Parts of the network or VM code may do this as well.
+This means that you cannot count on Giant keeping other code from
+running if your code sleeps, even if you want it to.
+.El
 .Sh INTERACTIONS
-The primitives interact and have a number of rules regarding how
+The primitives can interact and have a number of rules regarding how
 they can and can not be combined.
-Many of these rules are checked using the
-.Xr witness 4
-code.
-.Ss Bounded vs. unbounded sleep
-The following primitives perform bounded sleep: mutexes, pool mutexes,
-reader/writer locks and read-mostly locks.
-.Pp
-The following primitives block (perform unbounded sleep): shared/exclusive locks,
-counting semaphores, condition variables, sleep/wakeup and lockmanager locks.
-.Pp
-It is an error to do any operation that could result in any kind of sleep while
-holding spin mutex.
-.Pp
-As a general rule, it is an error to do any operation that could result
-in unbounded sleep while holding any primitive from the 'bounded sleep' group.
-For example, it is an error to try to acquire shared/exclusive lock while
-holding mutex, or to try to allocate memory with M_WAITOK while holding
-read-write lock.
+Many of these rules are checked by
+.Xr witness 4 .
+.Ss Bounded vs. Unbounded Sleep
+In a bounded sleep
+.Po also referred to as
+.Dq blocking
+.Pc
+the only resource needed to resume execution of a thread
+is CPU time for the owner of a lock that the thread is waiting to acquire.
+In an unbounded sleep
+.Po
+often referred to as simply
+.Dq sleeping
+.Pc
+a thread waits for an external event or for a condition
+to become true.
+In particular,
+a dependency chain of threads in bounded sleeps should always make forward
+progress,
+since there is always CPU time available.
+This requires that no thread in a bounded sleep is waiting for a lock held
+by a thread in an unbounded sleep.
+To avoid priority inversions,
+a thread in a bounded sleep lends its priority to the owner of the lock
+that it is waiting for.
+.Pp
+The following primitives perform bounded sleeps:
+mutexes, reader/writer locks and read-mostly locks.
+.Pp
+The following primitives perform unbounded sleeps:
+sleepable read-mostly locks, shared/exclusive locks, lockmanager locks,
+counting semaphores, condition variables, and sleep/wakeup.
+.Ss General Principles
+.Bl -bullet
+.It
+It is an error to do any operation that could result in yielding the processor
+while holding a spin mutex.
+.It
+It is an error to do any operation that could result in unbounded sleep
+while holding any primitive from the 'bounded sleep' group.
+For example, it is an error to try to acquire a shared/exclusive lock while
+holding a mutex, or to try to allocate memory with M_WAITOK while holding a
+reader/writer lock.
 .Pp
-As a special case, it is possible to call
+Note that the lock passed to one of the
 .Fn sleep
 or
-.Fn mtx_sleep
-while holding a single mutex.
-It will atomically drop that mutex and reacquire it as part of waking up.
-This is often a bad idea because it generally relies on the programmer having
-good knowledge of all of the call graph above the place where
-.Fn mtx_sleep
-is being called and assumptions the calling code has made.
-Because the lock gets dropped during sleep, one must re-test all
-the assumptions that were made before, all the way up the call graph to the
-place where the lock was acquired.
-.Pp
-It is an error to do any operation that could result in any kind of sleep when
-running inside an interrupt filter.
-.Pp
+.Fn cv_wait
+functions is dropped before the thread enters the unbounded sleep and does
+not violate this rule.
+.It
+It is an error to do any operation that could result in yielding of
+the processor when running inside an interrupt filter.
+.It
 It is an error to do any operation that could result in unbounded sleep when
 running inside an interrupt thread.
+.El
 .Ss Interaction table
 The following table shows what you can and can not do while holding
-one of the synchronization primitives discussed:
-.Bl -column ".Ic xxxxxxxxxxxxxxxxxxx" ".Xr XXXXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXX" -offset indent
-.It Xo
-.Em "You have: You want:" Ta spin mtx Ta mutex Ta sx Ta rwlock Ta rmlock Ta sleep
-.Xc
-.It spin mtx Ta \&ok-1 Ta \&no Ta \&no Ta \&no Ta \&no Ta \&no-3
-.It mutex Ta \&ok Ta \&ok-1 Ta \&no Ta \&ok Ta \&ok Ta \&no-3
-.It sx Ta \&ok Ta \&ok Ta \&ok-2 Ta \&ok Ta \&ok Ta \&ok-4
-.It rwlock Ta \&ok Ta \&ok Ta \&no Ta \&ok-2 Ta \&ok Ta \&no-3
-.It rmlock Ta \&ok Ta \&ok Ta \&no-5 Ta \&ok Ta \&ok-2 Ta \&no-5
+one of the locking primitives discussed. Note that
+.Dq sleep
+includes
+.Fn sema_wait ,
+.Fn sema_timedwait ,
+any of the
+.Fn cv_wait
+functions,
+and any of the
+.Fn sleep
+functions.
+.Bl -column ".Ic xxxxxxxxxxxxxxxx" ".Xr XXXXXXXXX" ".Xr XXXXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXXXX" ".Xr XXXXXX" -offset 3n
+.It Em " You want:" Ta spin mtx Ta mutex/rw Ta rmlock Ta sleep rm Ta sx/lk Ta sleep
+.It Em "You have: " Ta -------- Ta -------- Ta ------ Ta -------- Ta ------ Ta ------
+.It spin mtx Ta \&ok Ta \&no Ta \&no Ta \&no Ta \&no Ta \&no-1
+.It mutex/rw Ta \&ok Ta \&ok Ta \&ok Ta \&no Ta \&no Ta \&no-1
+.It rmlock Ta \&ok Ta \&ok Ta \&ok Ta \&no Ta \&no Ta \&no-1
+.It sleep rm Ta \&ok Ta \&ok Ta \&ok Ta \&ok-2 Ta \&ok-2 Ta \&ok-2/3
+.It sx Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok-3
+.It lockmgr Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok
 .El
 .Pp
 .Em *1
-Recursion is defined per lock.
-Lock order is important.
+There are calls that atomically release this primitive when going to sleep
+and reacquire it on wakeup
+.Po
+.Fn mtx_sleep ,
+.Fn rw_sleep ,
+.Fn msleep_spin ,
+etc.
+.Pc .
 .Pp
 .Em *2
-Readers can recurse though writers can not.
-Lock order is important.
+These cases are only allowed while holding a write lock on a sleepable
+read-mostly lock.
 .Pp
 .Em *3
-There are calls that atomically release this primitive when going to sleep
-and reacquire it on wakeup (e.g.
-.Fn mtx_sleep ,
-.Fn rw_sleep
-and
-.Fn msleep_spin ) .
-.Pp
-.Em *4
-Though one can sleep holding an sx lock, one can also use
-.Fn sx_sleep
-which will atomically release this primitive when going to sleep and
+Though one can sleep while holding this lock,
+one can also use a
+.Fn sleep
+function to atomically release this primitive when going to sleep and
 reacquire it on wakeup.
 .Pp
-.Em *5
-.Em Read-mostly
-locks can be initialized to support sleeping while holding a write lock.
-See
-.Xr rmlock 9
-for details.
+Note that non-blocking try operations on locks are always permitted.
 .Ss Context mode table
 The next table shows what can be used in different contexts.
 At this time this is a rather easy to remember table.
-.Bl -column ".Ic Xxxxxxxxxxxxxxxxxxx" ".Xr XXXXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXX" -offset indent
-.It Xo
-.Em "Context:" Ta spin mtx Ta mutex Ta sx Ta rwlock Ta rmlock Ta sleep
-.Xc
+.Bl -column ".Ic Xxxxxxxxxxxxxxxxxxx" ".Xr XXXXXXXXX" ".Xr XXXXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXXXX" ".Xr XXXXXX" -offset 3n
+.It Em "Context:" Ta spin mtx Ta mutex/rw Ta rmlock Ta sleep rm Ta sx/lk Ta sleep
 .It interrupt filter: Ta \&ok Ta \&no Ta \&no Ta \&no Ta \&no Ta \&no
-.It interrupt thread: Ta \&ok Ta \&ok Ta \&no Ta \&ok Ta \&ok Ta \&no
-.It callout: Ta \&ok Ta \&ok Ta \&no Ta \&ok Ta \&no Ta \&no
-.It syscall: Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok
+.It interrupt thread: Ta \&ok Ta \&ok Ta \&ok Ta \&no Ta \&no Ta \&no
+.It callout: Ta \&ok Ta \&ok Ta \&ok Ta \&no Ta \&no Ta \&no
+.It system call: Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok
 .El
 .Sh SEE ALSO
 .Xr witness 4 ,