From owner-freebsd-arch Wed Feb 27 11:33:50 2002
Date: Wed, 27 Feb 2002 14:33:30 -0500
From: Bosko Milekic <bmilekic@unixdaemons.com>
To: Terry Lambert <tlambert2@mindspring.com>
Cc: Jeff Roberson, arch@FreeBSD.ORG
Subject: Re: Slab allocator
Message-ID: <20020227143330.A34054@unixdaemons.com>
In-Reply-To: <3C7D1E31.B13915E7@mindspring.com>

On Wed, Feb 27, 2002 at 09:58:09AM -0800, Terry Lambert wrote:
> First, let me say OUTSTANDING WORK!
>
> Jeff Roberson wrote:
> > There are also per-CPU queues of items, with a per-CPU lock.  This
> > allows for very efficient allocation, and it also provides near-linear
> > performance as the number of CPUs increases.  I do still depend on
> > Giant to talk to the back-end page supplier (kmem_alloc, etc.).  Once
> > the VM is locked, the allocator will not require Giant at all.
>
> What is the per-CPU lock required for?  I think it can be gotten rid
> of, or at least taken out of the critical path, with more information.

Per-CPU caches.  They reduce lock contention and trash the CPU caches
less often.

> > I would eventually like to pull other allocators into uma (the slab
> > allocator).  We could get rid of some of the kernel submaps and make
> > the sizing of various resources much more dynamic.  Two things I had
> > in mind were pbufs and mbufs, which could easily come from uma.  This
> > gives us the ability to redistribute memory to wherever it is needed,
> > rather than locking it in a particular place once it's there.
>
> How do you handle interrupt-time allocation of mbufs in this case?
> zalloci() handles this by pre-creating the PTEs for the page mappings
> in the KVA; it then only has to grab free physical pages to back them,
> which is a non-blocking operation that can occur at interrupt time and
> which, if it fails, is not fatal (i.e., the failure is handled; I've
> considered doing the same for the page mappings and PTEs, but that
> would make the time-to-run far less deterministic).

Terry, how long will you keep thinking that mbufs come through the
zone allocator? :-)  For G*d's sake, man, we've been over this before!

> > There are a few things that need to be fixed right now.  For one, the
> > zone statistics don't reflect the items that are in the per-CPU
> > queues.  I'm thinking about clean ways to collect these without
> > locking every zone and per-CPU queue when someone calls sysctl.
>
> The easy way around this is to say that these values are snapshots.
> So you maintain the figures of merit on a per-CPU basis, in the
> context of the CPU doing the allocations and deallocations, and treat
> them as read-only for the purposes of statistics reporting.  This
> means that you don't need locks to get the statistics.  For debugging,
> you could provide a rigid locked interface (e.g. by only enabling
> locking for the statistics gathering via a sysctl that defaults to
> "off").

Yes, this is exactly what we did with mb_alloc.  It is also what I was
trying to say in my last email.
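To make the snapshot idea concrete, here is a minimal standalone C
sketch of the scheme being described.  It is not the actual mb_alloc or
UMA code; the names (cpu_stats, stat_record_alloc, stat_snapshot_allocs)
and the NCPU constant are hypothetical stand-ins.  Each counter has
exactly one writer, its owning CPU, so a reader can walk the array
without taking any locks; what it returns is only a snapshot, which is
exactly the trade-off described above.

    /*
     * Minimal sketch of lockless per-CPU allocator statistics.
     * Hypothetical names throughout; NCPU stands in for however the
     * kernel discovers its CPU count.
     */
    #define NCPU            4
    #define CACHE_LINE      64

    struct cpu_stats {
            unsigned long cs_allocs;  /* written only by the owning CPU */
            unsigned long cs_frees;   /* written only by the owning CPU */
            char cs_pad[CACHE_LINE - 2 * sizeof(unsigned long)];
    };

    static struct cpu_stats stats[NCPU];

    /* Fast path: runs on CPU `cpu' with no statistics lock held. */
    static void
    stat_record_alloc(int cpu)
    {
            stats[cpu].cs_allocs++;
    }

    /*
     * Reader (e.g. a sysctl handler).  No locks: each slot has a
     * single writer, and staleness is acceptable for reporting.
     */
    static unsigned long
    stat_snapshot_allocs(void)
    {
            unsigned long total = 0;
            int i;

            for (i = 0; i < NCPU; i++)
                    total += stats[i].cs_allocs;
            return (total);
    }

The padding keeps each CPU's counters on their own cache line, so the
hot allocation path never bounces a line between CPUs just to bump a
statistic.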
> > The other problem is with the per-CPU buckets.  They are a fixed
> > size right now.  I need to define several zones for the buckets to
> > come from, and a way to manage growing/shrinking the buckets.
>
> I built a "chain" allocator that dealt with this issue, and also with
> the object granularity issue.  Basically, it calculated the LCM of the
> object size (rounded up to a MAX(sizeof(long), 8) boundary, for
> processor alignment-sensitivity reasons) and the page size (also for
> processor-sensitivity reasons), and then allocated a contiguous region
> from which it obtained objects of that type.  All in all, it meant
> zero unnecessary space wastage (for 1,000,000 TCP connections, the
> savings were a quarter of a gigabyte for one zone alone).

That's great, until you run out of pre-allocated contiguous space.

[...]

> And thanks again for the most excellent work!
>
> -- Terry

-- 
Bosko Milekic
bmilekic@unixdaemons.com
bmilekic@FreeBSD.org