Date: Sat, 2 Feb 2008 09:59:44 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
To: Alexander Motin
Cc: freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org, Julian Elischer
Subject: Re: Memory allocation performance

On Sat, 2 Feb 2008, Alexander Motin wrote:

> Robert Watson wrote:
>> I guess the question is: where are the cycles going?  Are we suffering
>> excessive cache misses in managing the slabs?  Are you effectively
>> "cycling through" objects rather than using a smaller set that fits
>> better in the cache?
>
> In my test setup only a few objects from the zone are usually allocated
> at the same time, but they are allocated twice for every packet.
>
> To check the UMA dependency I have made a trivial one-element cache
> which in my test case avoids two of the four allocations per packet.

Avoiding unnecessary allocations is a good general principle, but
duplicating cache logic is a bad idea.  If you're able to structure the
below without using locking, it strikes me you'd do much better,
especially if it's in a single processing pass.  Can you not use a
per-thread/stack/session variable to avoid that?

> .....alloc.....
> -	item = uma_zalloc(ng_qzone, wait | M_ZERO);
> +	mtx_lock_spin(&itemcachemtx);
> +	item = itemcache;
> +	itemcache = NULL;
> +	mtx_unlock_spin(&itemcachemtx);

Why are you using spin locks?  They are quite a bit more expensive on
several hardware platforms, and any environment it's safe to call
uma_zalloc() from will be equally safe to use regular mutexes in (i.e.,
mutex-sleepable).
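For illustration, a minimal sketch of the same one-element cache under a
regular (MTX_DEF) mutex -- untested, and with ng_alloc_item_cached() and
ng_free_item_cached() as hypothetical wrapper names; ng_qzone and item_p
are the existing netgraph zone and item pointer type from
sys/netgraph/ng_base.c:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <vm/uma.h>
#include <netgraph/ng_message.h>
#include <netgraph/netgraph.h>

static struct mtx itemcachemtx;
static item_p itemcache;

MTX_SYSINIT(ng_itemcache, &itemcachemtx, "ng_itemcache", MTX_DEF);

/* Allocate an item, preferring the single cached one. */
static item_p
ng_alloc_item_cached(int wait)
{
	item_p item;

	mtx_lock(&itemcachemtx);	/* regular mutex, not spin */
	item = itemcache;
	itemcache = NULL;
	mtx_unlock(&itemcachemtx);
	if (item == NULL)
		item = uma_zalloc(ng_qzone, wait | M_ZERO);
	else
		bzero(item, sizeof(*item));
	return (item);
}

/* Free an item, parking it in the cache if the slot is empty. */
static void
ng_free_item_cached(item_p item)
{
	mtx_lock(&itemcachemtx);
	if (itemcache == NULL) {
		itemcache = item;
		item = NULL;
	}
	mtx_unlock(&itemcachemtx);
	if (item != NULL)
		uma_zfree(ng_qzone, item);
}

Even so, this just duplicates the caching UMA's per-CPU buckets already
do; carrying the item through the processing pass in a local or
per-thread variable, as suggested above, would avoid both the lock and
the duplication.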
> +	if (item == NULL)
> +		item = uma_zalloc(ng_qzone, wait | M_ZERO);
> +	else
> +		bzero(item, sizeof(*item));
> .....free.....
> -	uma_zfree(ng_qzone, item);
> +	mtx_lock_spin(&itemcachemtx);
> +	if (itemcache == NULL) {
> +		itemcache = item;
> +		item = NULL;
> +	}
> +	mtx_unlock_spin(&itemcachemtx);
> +	if (item)
> +		uma_zfree(ng_qzone, item);
> ...............
>
> To be sure that the test system is CPU-bound I have throttled it with
> sysctl to 1044 MHz.  With this patch my test PPPoE-to-PPPoE router
> throughput has grown from 17 to 21 Mbytes/s.  The profiling results I
> sent earlier promised similar gains.
>
>> Is some bit of debugging enabled that shouldn't be, perhaps due to a
>> failure of ifdefs?
>
> I have commented out all INVARIANTS and WITNESS options from the GENERIC
> kernel config.  What else should I check?

Hence my request for drilling down a bit on profiling -- the question I'm
asking is whether profiling shows things running or taking time that
shouldn't be.

Robert N M Watson
Computer Laboratory
University of Cambridge