From owner-freebsd-arch  Thu Jan 23 18:26:15 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 1756B37B401; Thu, 23 Jan 2003 18:26:13 -0800 (PST)
Received: from tesla.distributel.net (nat.MTL.distributel.NET [66.38.181.24])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 40C5D43F18; Thu, 23 Jan 2003 18:26:12 -0800 (PST)
	(envelope-from bmilekic@unixdaemons.com)
Received: (from bmilekic@localhost)
	by tesla.distributel.net (8.11.6/8.11.6) id h0O2RME80438;
	Thu, 23 Jan 2003 21:27:22 -0500 (EST)
	(envelope-from bmilekic@unixdaemons.com)
Date: Thu, 23 Jan 2003 21:27:22 -0500
From: Bosko Milekic <bmilekic@unixdaemons.com>
To: Terry Lambert <tlambert2@mindspring.com>
Cc: Doug Rabson <dfr@nlsystems.com>, John Baldwin <jhb@FreeBSD.org>,
	arch@FreeBSD.org, Andrew Gallatin <gallatin@cs.duke.edu>
Subject: Re: M_ flags summary.
Message-ID: <20030123212722.A80406@unixdaemons.com>
References: <XFMail.20030123103959.jhb@FreeBSD.org> <1043339738.29341.1.camel@builder02.qubesoft.com> <3E309FE5.F74564DC@mindspring.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <3E309FE5.F74564DC@mindspring.com>; from tlambert2@mindspring.com on Thu, Jan 23, 2003 at 06:07:33PM -0800
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG


On Thu, Jan 23, 2003 at 06:07:33PM -0800, Terry Lambert wrote:
> Doug Rabson wrote:
> > On Thu, 2003-01-23 at 15:39, John Baldwin wrote:
> > > This would prevent the malloc implementation from using internal mutexes
> > > that it msleep's or cv_wait's on.  You only get to pass in one mutex
> > > to cv_wait* and msleep.
> > 
> > That did occur to me too, which was why I wrote "or something". It looks
> > hard to DTRT here without a version of msleep which took a list of
> > mutexes to release.
> 
> The hard part here is that this is almost entirely useless for
> most FS directory operations, which must hold both the mutex for
> the parent directory, and the mutex for the object being
> manipulated, plus potentially other mutexes (e.g. rename), etc.
> There are other places where this is true, too.

  Exactly.
 
> > >   In my experience, one can often "fix" problems
> > > with holding locks across malloc() by malloc()'ing things earlier in the
> > > function before you need locks.
> > 
> > This is obviously preferable.
> 
> This is preferable for *most* cases.  For cases where a failure
> of an operation to complete immediately results in the operation
> being queued, which requires an allocation, then you are doing a
> preallocation for the failure code path.  Doing a preallocation
> that way is incredibly expensive.  If on the other hand, you are
> doing the allocation on the assumption of success, then it's
> "free".  The real question is whether or not the allocation is in
> the common or uncommon code path.

  In that case you shouldn't be holding the lock that protects the queue
  before you have actually detected the failure.  Once you detect the
  failure, you allocate your resource, _then_ you grab the queue lock,
  and _then_ you queue the operation.  This works unless you left some
  detail out of your example.  My point is that I'm sure a reasonable
  solution exists for each scenario, unless the design is wrong to begin
  with... but I'm willing to accept that my intuition has misled me.
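
  A minimal userland sketch of that ordering, using pthreads as a
  stand-in for kernel mutexes.  Every name here (deferred_op,
  try_operation, submit_operation, queue_length) is hypothetical and
  only illustrates the sequence "detect failure, allocate, lock,
  queue"; it is not taken from any FreeBSD code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical record for an operation that could not complete at once. */
struct deferred_op {
    int                 op_id;
    struct deferred_op *next;
};

/* Hypothetical queue of deferred operations and the lock protecting it. */
static pthread_mutex_t     queue_lock = PTHREAD_MUTEX_INITIALIZER;
static struct deferred_op *queue_head;
static int                 queue_len;

/* Stand-in for the real operation: in this sketch, odd ids "fail". */
static bool
try_operation(int op_id)
{
    return (op_id % 2 == 0);
}

/*
 * Returns 0 if the operation completed immediately, 1 if it was queued,
 * and -1 if the allocation for the queue entry failed.
 */
int
submit_operation(int op_id)
{
    struct deferred_op *op;

    /* 1. Attempt the operation; the queue lock is not held. */
    if (try_operation(op_id))
        return (0);

    /* 2. Failure detected: allocate while holding no locks. */
    op = malloc(sizeof(*op));
    if (op == NULL)
        return (-1);
    op->op_id = op_id;

    /* 3. Only now take the queue lock, then 4. queue the operation. */
    pthread_mutex_lock(&queue_lock);
    op->next = queue_head;
    queue_head = op;
    queue_len++;
    pthread_mutex_unlock(&queue_lock);
    return (1);
}

/* Number of operations currently queued. */
int
queue_length(void)
{
    int n;

    pthread_mutex_lock(&queue_lock);
    n = queue_len;
    pthread_mutex_unlock(&queue_lock);
    assert(n >= 0);
    return (n);
}
```

  The allocation sits only on the failure path, so the common case pays
  nothing for it, and the queue lock is never held across malloc().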

> The easy way to mitigate the issue here is to maintain an object
> free list, and use that, instead of the allocator.  Of course, if
> you do that, you can often avoid holding a mutex altogether.  And
> if the code tolerates a failure to allocate reasonably well, you
> can signal a "need to refill free list", and not hold a mutex over
> an allocation at all.

  Although clever, this is somewhat bogus behavior w.r.t. the allocator.
  Remember that the allocator already keeps a cache.  If you start
  maintaining your own (lock-free) cache instead, then yes, you may
  improve local performance, but overall you're doing what the allocator
  should be doing anyway and, in some cases, you hamper the allocator's
  ability to manage the resources it is responsible for.  But I'm sure
  you know this because, yes, you are technically correct.
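
  For concreteness, the private free-list pattern under discussion might
  look roughly like the userland sketch below.  It is an illustration
  only: the names (obj, fl_get, fl_refill, fl_refill_needed, fl_cached)
  are hypothetical, and it uses a short-held pthread mutex for
  simplicity rather than the lock-free variant mentioned above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical object type kept on a subsystem-private free list. */
struct obj {
    struct obj *next;
    char        payload[64];
};

static pthread_mutex_t fl_lock = PTHREAD_MUTEX_INITIALIZER;
static struct obj     *fl_head;
static int             fl_count;
static bool            fl_need_refill;

/*
 * Take an object from the private list.  Never calls the allocator; on
 * an empty list it returns NULL and records that a refill is needed.
 */
struct obj *
fl_get(void)
{
    struct obj *o;

    pthread_mutex_lock(&fl_lock);
    o = fl_head;
    if (o != NULL) {
        fl_head = o->next;
        fl_count--;
    } else {
        fl_need_refill = true;
    }
    pthread_mutex_unlock(&fl_lock);
    return (o);
}

/*
 * Refill the list.  Each malloc() happens with no lock held; the lock is
 * taken only to link the new object in.  Returns the number added.
 * Objects parked here are invisible to the allocator, which is the cost
 * pointed out in the paragraph above.
 */
int
fl_refill(int want)
{
    int added = 0;

    for (int i = 0; i < want; i++) {
        struct obj *o = malloc(sizeof(*o));

        if (o == NULL)
            break;
        pthread_mutex_lock(&fl_lock);
        o->next = fl_head;
        fl_head = o;
        fl_count++;
        pthread_mutex_unlock(&fl_lock);
        added++;
    }
    pthread_mutex_lock(&fl_lock);
    if (fl_count > 0)
        fl_need_refill = false;
    pthread_mutex_unlock(&fl_lock);
    return (added);
}

/* True if a caller found the list empty since the last refill. */
bool
fl_refill_needed(void)
{
    bool need;

    pthread_mutex_lock(&fl_lock);
    need = fl_need_refill;
    pthread_mutex_unlock(&fl_lock);
    return (need);
}

/* Number of objects currently cached on the private list. */
int
fl_cached(void)
{
    int n;

    pthread_mutex_lock(&fl_lock);
    n = fl_count;
    pthread_mutex_unlock(&fl_lock);
    return (n);
}
```

  A caller that gets NULL from fl_get() has to tolerate the failure and
  let some other context call fl_refill() later.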

> -- Terry

  In any case, it's good that we're discussing general solutions to
  these sorts of problems, but I think we agree that they are rather
  exceptional situations that, given good thought and MP-oriented
  design, can be avoided.  And that's what I think the allocator API
  should encourage: good design.  By specifying the wait-case as the
  default behavior, the allocator API effectively encourages all non-ISR
  code to be prepared to wait, for whatever amount of time (the actual
  amount of time is irrelevant to my point).
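
  To illustrate what "waiting is the default" means at the call site,
  here is a small userland sketch of a resource pool in which a flags
  value of 0 sleeps until a unit is available and the caller must ask
  explicitly not to wait.  XA_NOWAIT, xa_reserve and xa_release are
  hypothetical names for this sketch; it is not malloc(9) and does not
  reproduce its flags.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical flag: the caller cannot sleep and wants failure instead. */
#define XA_NOWAIT 0x1

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  pool_cv = PTHREAD_COND_INITIALIZER;
static int             pool_free;   /* units currently available */

/*
 * Reserve one unit from the pool.  With flags == 0 (the default) the
 * caller sleeps until a unit is released.  With XA_NOWAIT the call
 * returns -1 at once if nothing is available.  Returns 0 on success.
 */
int
xa_reserve(int flags)
{
    pthread_mutex_lock(&pool_lock);
    while (pool_free == 0) {
        if (flags & XA_NOWAIT) {
            pthread_mutex_unlock(&pool_lock);
            return (-1);
        }
        /* Default path: sleep until xa_release() signals. */
        pthread_cond_wait(&pool_cv, &pool_lock);
    }
    pool_free--;
    pthread_mutex_unlock(&pool_lock);
    return (0);
}

/* Return one unit to the pool and wake one sleeper, if any. */
void
xa_release(void)
{
    pthread_mutex_lock(&pool_lock);
    pool_free++;
    pthread_cond_signal(&pool_cv);
    pthread_mutex_unlock(&pool_lock);
}
```

  With this shape, code that passes no flags has to be written so that
  it can sleep, so it cannot hold locks that must not be held across a
  sleep; only callers that really cannot wait opt out, and they must
  handle the -1.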

Regards,
-- 
Bosko Milekic * bmilekic@unixdaemons.com * bmilekic@FreeBSD.org

