From owner-freebsd-arch  Mon Feb 17 20:33:06 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id B16C337B401
	for ; Mon, 17 Feb 2003 20:33:03 -0800 (PST)
Received: from tesla.distributel.net (nat.MTL.distributel.NET [66.38.181.24])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 0A59043FB1
	for ; Mon, 17 Feb 2003 20:33:03 -0800 (PST)
	(envelope-from bmilekic@unixdaemons.com)
Received: (from bmilekic@localhost)
	by tesla.distributel.net (8.11.6/8.11.6) id h1I4W6t68519;
	Mon, 17 Feb 2003 23:32:06 -0500 (EST)
	(envelope-from bmilekic@unixdaemons.com)
Date: Mon, 17 Feb 2003 23:32:06 -0500
From: Bosko Milekic
To: Julian Elischer
Cc: freebsd-arch@FreeBSD.ORG
Subject: Re: mb_alloc cache balancer / garbage collector
Message-ID: <20030217233206.A68495@unixdaemons.com>
References: <20030217230327.A68207@unixdaemons.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: ; from julian@elischer.org on Mon, Feb 17, 2003 at 08:21:07PM -0800
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: 
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe: 
List-Unsubscribe: 
X-Loop: FreeBSD.ORG

On Mon, Feb 17, 2003 at 08:21:07PM -0800, Julian Elischer wrote:
> On Mon, 17 Feb 2003, Bosko Milekic wrote:
> > On Mon, Feb 17, 2003 at 07:45:45PM -0800, Julian Elischer wrote:
> > > On Mon, 17 Feb 2003, Bosko Milekic wrote:
> >
> >   Right, it basically means that in this scenario we degenerate to a
> > single cache.  The structure to which the mbuf is freed is called a
> > "bucket", and right now a "bucket" keeps a PAGE_SIZE worth of mbufs.
> > The idea is that you can move these buckets around from cache to
> > cache, even if they're not totally full.
> > In the scenario that you
> > describe (which by the way is still nonexistent), assuming that we
> > determine that it's really worth doing the binding of the threads to
> > individual CPUs (I'm not quite convinced that it is, ... yet), in
>
> that was a contrived example; however, I can imagine many cases where
> the networking thread runs on one CPU, and tries to stay there due to
> affinity issues, which means that the fielding of interrupts, and hence
> the filling of mbufs, is left to the other CPU.
>
> I'm not saying that NICs need to be bound to processors (though if
> they were part of the processor unit, as in some older Sun boxes, that
> might make sense), but I am saying that I think the producer and the
> consumer might quite easily be constantly on different CPUs.
>
> Here's another example.  One of the things that we will be doing in
> threads is the ability to bind a thread to a CPU.  If that thread opens
> a socket and starts receiving stuff, then the 'consumer' is now locked
> to one CPU.  Now let's make that thread also be using about 100% of
> that CPU.  The other CPU is idle, and therefore the producer is
> probably going to run there.  It is true that "on average" things
> should even out, but it is also very easy to construct scenarios where
> this isn't true.
>
> Or, two processes doing some set of transactions with each other
> (both using lots of CPU).  "On average" the producer and the consumer
> are going to be on different CPUs.  It still seems odd to me that the
> consumer has to pass it back to the producer's CPU, because "on
> average" it will require a locking cycle of some sort.

  Hmmm, to be perfectly honest with you, both of your examples are good
ones.  I guess what we'd have to do, at least eventually, is modify the
code so that, when freeing, we also migrate the bucket over to the local
CPU.  Then future frees that involve objects going to the same bucket
will need only the consuming CPU's cache lock and won't have to contend
with the producing CPU's cache lock.
  However, as I already mentioned, it seems to me that this will only
really work if you have a strict consumer/producer relationship, where
the consumer sits strictly on one CPU and the producer on another.  The
thing is that we don't know how often those cases are going to arise and
whether we're warranted in making the change.  Either way, I think we
need to lock things down, make the modifications, boot two separate
kernels (one that implements each variation), and whack away at it.  I
don't want to ignore this, but I'd like to put it aside for now, until
we're in a position that will allow us to look at it with more data on
hand.

  Either way, I don't think this counters the advantages of the kproc
(which is the original subject of this thread).  In fact, it is worth
noting that if we do notice that most consumers/producers are on
different CPUs in most cases, and we do make the change above, the mbufd
kproc can actually help in moving the objects to and from the global
cache faster, and with less cache ping-ponging going on (because the
recycle-moves would be made in larger chunks), and so less often.

-- 
Bosko Milekic * bmilekic@unixdaemons.com * bmilekic@FreeBSD.org

"If we open a quarrel between the past and the present, we shall find
that we have lost the future."

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message