Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 28 Jun 1999 00:42:37 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Terry Lambert <tlambert@primenet.com>
Cc:        freebsd-smp@FreeBSD.ORG
Subject:   Re: high-efficiency SMP locks - submission for review
Message-ID:  <199906280742.AAA18132@apollo.backplane.com>
References:   <199906280503.WAA26806@usr04.primenet.com>

next in thread | previous in thread | raw e-mail | index | archive | help
:1)	SMP locking should be done utilizing intention modes.  I
:	had a nice discussion with Mike Smith and Julian over the
:	proposed locking mechanism for non-intention mode multiple
:	reader, single writer locks for the buffer cache.  This
:	boiled down to:

    Yes, I agree completely.  There are several states in the buffer
    structure management that could easily be separately locked.  The
    compare-and-exchange primitive I am using as the basis for my qlocks
    can just as easily be used as the basis for an exclusive intention
    lock.

:> struct qlock {
:> 	volatile int count;	/* lock refs: neg=exclusive, pos=shared */
:> 	int	waiting;	/* process(s) are waiting on lock	*/
:> 	struct proc *holder;	/* holder of an exclusive lock		*/
:> };
:
:	In particular, the use of a single wait count means that you
:	will not have an ordered list for the equivalent of a "wake
:	one".  This is a real problem, since it allows for a deadly
:	embrace deadlock to occur when two kernel contexts each hold
:	one lock and want to acquire the other contexts lock.
:
:	I believe that resolving this as an EWOULDBLOCK, and then

    Well, I differ with you here.  lockmgr() can't even handle that.
    Lock primitives that become too complex can no longer be categorized
    as primitives.  That is one of the biggest problems with lockmgr(),
    in fact.

    What should be done instead is to separate the functionality of the
    more complex locking functions - such as those that deal with deadlock
    situations - because these locking functions have a considerable amount
    of overhead compared to lower level primitives.

:	The use of a count, rather than a relationship between
:	the loc holder(s) and the things being locked is also
:	similarly problematic.

    The 'count' can mean anything.   In one set of functions it can be
    a shared/exclusive count, in another it can implement 32 structural
    zones.  I am not attempting to produce an all-encompassing solution,
    what I am attempting to do is produce a framework which can be used
    to implement different locking types with a minimum of overhead.  I've
    only implemented one:  A basic shared/exclusive mechanism.

:	Finally, there is insufficient mechanism to avoid competition
:	starvation, where a write blocks indefinitely as multiple
:	readers pass the resource between each other.

    Yes, this is an issue -- but the solution in lockmgr() has only led
    to more esoteric deadlock situations and, I think, harmed performance
    as much as it has helped it.  That isn't to say that we can't implement
    the same solution in qlocks, but the way I would do it would be by adding
    a new function, not modifying existing primitives.  For example, 
    qlock_wr() might not try to hold-off other shared locks, but 
    qlock_hipri_wr() could.

    The biggest problem we face at the moment is that locks are being held
    for much too long a period of time.  For example, locks are being held
    *through* I/O operations as a means of controlling access.  This is
    precisely the wrong way to use a lock.  The lock should be used to 
    protect the data structure and then released.  An I/O operation in
    progress should set a "the data is being messed with" flag and then
    release its lock, not attempt to hold the lock permanently.  Or, as you
    mentioned, intention locks can be used to separate the I/O op from other
    types of operations.

    Many of the shared/exclusive problems go away ( or at least go into
    hiding ) when the locks are used properly.

:	I believe the following is the minimal set of structures
:	required to resolve the blocking operation inheritance and
:	deadlock detection:
:
:/*
: *
: */
:     
:struct lockable;
:typedef struct lockable LKA;
:
:struct lockingentity; 
:typedef struct lockingentity LKE;
:
:struct locklistentry;
:typedef struct locklistentry LKLE;
:
:struct lockentitylistentry;  
:typedef struct lockentitylistentry LKEE;
:
:struct lockentitylistentry { 
:        LKEE    *next;          /* next list entry*/
:        LKEE    *inherit;       /* who inherits this dependency*/
:        LKE     *entity;        /* entity waiting on us*/
:};      
:
:struct locklistentry {
:        LKLE    *next;          /* next lock in list*/
:        LKA     *held;          /* a held lock*/
:};       
:
:            
:struct lockingentity { 
:        LKLE    *holding;       /* locks that entity is holding*/
:        LKA     *waiton;        /* lock that entity is waiting on*/
:};
:
:struct lockable {
:        LKEE    *holds;         /* the locking entities*/
:        LKEE    *waits;         /* entities waiting for the lock*/
:};
:
:	This presumes a model where each context which wishes to
:	acquire a lock is considered a lockingentity (this works
:	equally well for kernel threads vs. processes vs. async

    Holy cow, that is very expensive.  If I were to implement that sort
    of locking subsystem it would be at a higher level, it would not
    be a primitive.

    I tend to dislike complex locking solutions, but only because I
    strongly believe that it is possible to avoid the necessity of
    such by organizing code and algorithms properly.  As lockmgr() has
    shown, trying to implement complex locking solutions in an SMP
    environment can become *very* expensive.  So expensive that the
    complex locking solution winds up hurting performance more then
    a simpler solution would help performance.

:	Finally, one could envision a hierarchical relationship
:	between lockingentities, e.g. multiple threads within a
:	process, so as to avoid self-deadlock.  This really
:	depends on your threads implemenation; obviously, a user
:	space call conversion mechanism backed by multiple kernel
:	threads, and implemented wholly using asynchronous kernel
:	entries (e.g. an async call gate) would be immune to the
:	requirement.  Other implementations would need something
:	like:
:
:					Terry Lambert
:					terry@lambert.org

    You are right on the money here.  This is precisely the problem
    that the SMP implementation currently faces and is attempting
    to solve with spl-aware simple locks.  In fact, the way spl*() ops
    work currently is very similar to obtaining multiple locks
    simultaniously.  Again, I really hate having to implement complex
    solutions when it may be possible to obtain the same effect by
    reorganizing the functions that require the locking in the first place.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199906280742.AAA18132>