From: Bosko Milekic
Date: Thu, 23 Jan 2003 21:27:22 -0500
To: Terry Lambert
Cc: Doug Rabson, John Baldwin, arch@FreeBSD.org, Andrew Gallatin
Subject: Re: M_ flags summary.
Message-ID: <20030123212722.A80406@unixdaemons.com>
References: <1043339738.29341.1.camel@builder02.qubesoft.com> <3E309FE5.F74564DC@mindspring.com>
In-Reply-To: <3E309FE5.F74564DC@mindspring.com>; from tlambert2@mindspring.com on Thu, Jan 23, 2003 at 06:07:33PM -0800

On Thu, Jan 23, 2003 at 06:07:33PM -0800, Terry Lambert wrote:
> Doug Rabson wrote:
> > On Thu, 2003-01-23 at 15:39, John Baldwin wrote:
> > > This would prevent the malloc implementation from using internal mutexes
> > > that it msleep's or cv_wait's on.  You only get to pass in one mutex
> > > to cv_wait* and msleep.
> >
> > That did occur to me too, which was why I wrote "or something". It looks
> > hard to DTRT here without a version of msleep which took a list of
> > mutexes to release.
>
> The hard part here is that this is almost entirely useless for
> most FS directory operations, which must hold both the mutex for
> the parent directory, and the mutex for the object being
> manipulated, plus potentially other mutexes (e.g. rename), etc.
> There are other places where this is true, too.

  Exactly.

> > > In my experience, one can often "fix" problems
> > > with holding locks across malloc() by malloc()'ing things earlier in the
> > > function before you need locks.
> >
> > This is obviously preferable.
>
> This is preferable for *most* cases.  For cases where a failure
> of an operation to complete immediately results in the operation
> being queued, which requires an allocation, then you are doing a
> preallocation for the failure code path.  Doing a preallocation
> that way is incredibly expensive.  If on the other hand, you are
> doing the allocation on the assumption of success, then it's
> "free".  The real question is whether or not the allocation is in
> the common or uncommon code path.

  In that case you shouldn't be holding the lock protecting the queue
  before actually detecting the failure.  Once you detect the failure,
  then you allocate your resource, _then_ you grab the queue lock,
  _then_ you queue the operation.  This works unless you left out some
  of the detail from your example.  The point is that I'm sure that a
  reasonable solution exists for each scenario, unless the design is
  wrong to begin with... but I'm willing to accept that my intuition
  has misled me.

> The easy way to mitigate the issue here is to maintain an object
> free list, and use that, instead of the allocator.  Of course, if
> you do that, you can often avoid holding a mutex altogether.  And
> if the code tolerates a failure to allocate reasonably well, you
> can signal a "need to refill free list", and not hold a mutex over
> an allocation at all.

  Although clever, this is somewhat bogus behavior w.r.t. the allocator.
  Remember that the allocator already keeps a cache but if you instead
  start maintaining your own (lock-free) cache, yes, maybe you're
  improving local performance but, overall, you're doing what the
  allocator should be doing anyway and, in some cases, this hampers the
  allocator's ability to manage the resources it is responsible for.
  But I'm sure you know this because, yes, you are technically correct.

> -- Terry

  In any case, it's good that we're discussing general solution
  possibilities for these sorts of problems but I think that we agree
  that they are rather special exception situations that, given good
  thought and MP-oriented design, can be avoided.  And that's what I
  think the allocator API should encourage: good design.  By specifying
  the wait-case as the default behavior, the allocator API is
  effectively encouraging all non-ISR code to be prepared to wait, for
  whatever amount of time (the actual amount of time is irrelevant in
  making my point).

Regards,
-- 
Bosko Milekic * bmilekic@unixdaemons.com * bmilekic@FreeBSD.org
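
For illustration only, the ordering described in the message (detect the
failure, allocate, then take the queue lock, then queue the operation)
might be sketched as below. This is a userland approximation using POSIX
threads and malloc(3), not kernel code; struct deferred_op, try_op(),
submit_op(), queue_drain() and queue_lock are hypothetical names that do
not come from the thread or from the FreeBSD sources.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical deferred-work entry (illustrative, not from FreeBSD). */
struct deferred_op {
    int id;
    struct deferred_op *next;
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static struct deferred_op *queue_head;  /* protected by queue_lock */
static int queue_len;                   /* protected by queue_lock */

/* Stand-in for an operation that may fail to complete immediately:
 * in this sketch even ids succeed and odd ids must be deferred. */
static int try_op(int id)
{
    return (id % 2 == 0) ? 0 : -1;
}

/* Returns 0 if the operation completed, 1 if it was queued,
 * -1 if it could not complete and could not be queued. */
int submit_op(int id)
{
    struct deferred_op *op;

    if (try_op(id) == 0)
        return 0;               /* common path: no allocation, no lock */

    /* Failure detected.  Allocate while holding no lock, so a slow or
     * blocking allocation never happens under queue_lock. */
    op = malloc(sizeof(*op));
    if (op == NULL)
        return -1;
    op->id = id;

    /* Only now take the queue lock, and hold it just for the insert. */
    pthread_mutex_lock(&queue_lock);
    op->next = queue_head;
    queue_head = op;
    queue_len++;
    pthread_mutex_unlock(&queue_lock);
    return 1;
}

/* Detach the whole list under the lock, then free the entries with the
 * lock released.  Returns the number of entries drained. */
int queue_drain(void)
{
    struct deferred_op *op, *next;
    int expected, n = 0;

    pthread_mutex_lock(&queue_lock);
    op = queue_head;
    expected = queue_len;
    queue_head = NULL;
    queue_len = 0;
    pthread_mutex_unlock(&queue_lock);

    for (; op != NULL; op = next) {
        next = op->next;
        free(op);
        n++;
    }
    assert(n == expected);      /* list length matched the counter */
    return n;
}
```

In this sketch the allocation sits only on the uncommon (failure) path,
and the queue lock is never held across malloc() or free(). Calling
submit_op(2) completes immediately, while submit_op(3) allocates an
entry and queues it.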