From owner-freebsd-smp Mon Jun 28 0:42:42 1999 Delivered-To: freebsd-smp@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2]) by hub.freebsd.org (Postfix) with ESMTP id A7EEC150D1 for ; Mon, 28 Jun 1999 00:42:39 -0700 (PDT) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.9.3/8.9.1) id AAA18132; Mon, 28 Jun 1999 00:42:37 -0700 (PDT) (envelope-from dillon) Date: Mon, 28 Jun 1999 00:42:37 -0700 (PDT) From: Matthew Dillon Message-Id: <199906280742.AAA18132@apollo.backplane.com> To: Terry Lambert Cc: freebsd-smp@FreeBSD.ORG Subject: Re: high-efficiency SMP locks - submission for review References: <199906280503.WAA26806@usr04.primenet.com> Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org :1) SMP locking should be done utilizing intention modes. I : had a nice discussion with Mike Smith and Julian over the : proposed locking mechanism for non-intention mode multiple : reader, single writer locks for the buffer cache. This : boiled down to: Yes, I agree completely. There are several states in the buffer structure management that could easily be separately locked. The compare-and-exchange primitive I am using as the basis for my qlocks can just as easily be used as the basis for an exclusive intention lock. :> struct qlock { :> volatile int count; /* lock refs: neg=exclusive, pos=shared */ :> int waiting; /* process(s) are waiting on lock */ :> struct proc *holder; /* holder of an exclusive lock */ :> }; : : In particular, the use of a single wait count means that you : will not have an ordered list for the equivalent of a "wake : one". This is a real problem, since it allows for a deadly : embrace deadlock to occur when two kernel contexts each hold : one lock and want to acquire the other contexts lock. : : I believe that resolving this as an EWOULDBLOCK, and then Well, I differ with you here. lockmgr() can't even handle that. Lock primitives that become too complex can no longer be categorized as primitives. That is one of the biggest problems with lockmgr(), in fact. What should be done instead is to separate the functionality of the more complex locking functions - such as those that deal with deadlock situations - because these locking functions have a considerable amount of overhead compared to lower level primitives. : The use of a count, rather than a relationship between : the loc holder(s) and the things being locked is also : similarly problematic. The 'count' can mean anything. In one set of functions it can be a shared/exclusive count, in another it can implement 32 structural zones. I am not attempting to produce an all-encompassing solution, what I am attempting to do is produce a framework which can be used to implement different locking types with a minimum of overhead. I've only implemented one: A basic shared/exclusive mechanism. : Finally, there is insufficient mechanism to avoid competition : starvation, where a write blocks indefinitely as multiple : readers pass the resource between each other. Yes, this is an issue -- but the solution in lockmgr() has only led to more esoteric deadlock situations and, I think, harmed performance as much as it has helped it. That isn't to say that we can't implement the same solution in qlocks, but the way I would do it would be by adding a new function, not modifying existing primitives. For example, qlock_wr() might not try to hold-off other shared locks, but qlock_hipri_wr() could. The biggest problem we face at the moment is that locks are being held for much too long a period of time. For example, locks are being held *through* I/O operations as a means of controlling access. This is precisely the wrong way to use a lock. The lock should be used to protect the data structure and then released. An I/O operation in progress should set a "the data is being messed with" flag and then release its lock, not attempt to hold the lock permanently. Or, as you mentioned, intention locks can be used to separate the I/O op from other types of operations. Many of the shared/exclusive problems go away ( or at least go into hiding ) when the locks are used properly. : I believe the following is the minimal set of structures : required to resolve the blocking operation inheritance and : deadlock detection: : :/* : * : */ : :struct lockable; :typedef struct lockable LKA; : :struct lockingentity; :typedef struct lockingentity LKE; : :struct locklistentry; :typedef struct locklistentry LKLE; : :struct lockentitylistentry; :typedef struct lockentitylistentry LKEE; : :struct lockentitylistentry { : LKEE *next; /* next list entry*/ : LKEE *inherit; /* who inherits this dependency*/ : LKE *entity; /* entity waiting on us*/ :}; : :struct locklistentry { : LKLE *next; /* next lock in list*/ : LKA *held; /* a held lock*/ :}; : : :struct lockingentity { : LKLE *holding; /* locks that entity is holding*/ : LKA *waiton; /* lock that entity is waiting on*/ :}; : :struct lockable { : LKEE *holds; /* the locking entities*/ : LKEE *waits; /* entities waiting for the lock*/ :}; : : This presumes a model where each context which wishes to : acquire a lock is considered a lockingentity (this works : equally well for kernel threads vs. processes vs. async Holy cow, that is very expensive. If I were to implement that sort of locking subsystem it would be at a higher level, it would not be a primitive. I tend to dislike complex locking solutions, but only because I strongly believe that it is possible to avoid the necessity of such by organizing code and algorithms properly. As lockmgr() has shown, trying to implement complex locking solutions in an SMP environment can become *very* expensive. So expensive that the complex locking solution winds up hurting performance more then a simpler solution would help performance. : Finally, one could envision a hierarchical relationship : between lockingentities, e.g. multiple threads within a : process, so as to avoid self-deadlock. This really : depends on your threads implemenation; obviously, a user : space call conversion mechanism backed by multiple kernel : threads, and implemented wholly using asynchronous kernel : entries (e.g. an async call gate) would be immune to the : requirement. Other implementations would need something : like: : : Terry Lambert : terry@lambert.org You are right on the money here. This is precisely the problem that the SMP implementation currently faces and is attempting to solve with spl-aware simple locks. In fact, the way spl*() ops work currently is very similar to obtaining multiple locks simultaniously. Again, I really hate having to implement complex solutions when it may be possible to obtain the same effect by reorganizing the functions that require the locking in the first place. -Matt Matthew Dillon To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message