Message-ID: <47A43873.40801@FreeBSD.org>
Date: Sat, 02 Feb 2008 11:31:31 +0200
From: Alexander Motin
To: Robert Watson
Cc: freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org, Julian Elischer
Subject: Re: Memory allocation performance
References: <47A25412.3010301@FreeBSD.org> <47A25A0D.2080508@elischer.org> <47A2C2A2.5040109@FreeBSD.org> <20080201185435.X88034@fledge.watson.org>
In-Reply-To: <20080201185435.X88034@fledge.watson.org>

Robert Watson wrote:
> I guess the question is: where are the cycles going? Are we suffering
> excessive cache misses in managing the slabs? Are you effectively
> "cycling through" objects rather than using a smaller set that fits
> better in the cache?

In my test setup only a few objects from the zone are usually allocated
at the same time, but they are allocated twice for every packet.

To check how much of the cost comes from UMA, I made a trivial
one-element cache which, in my test case, avoids two of the four
allocations per packet.

.....alloc.....
-       item = uma_zalloc(ng_qzone, wait | M_ZERO);
+       mtx_lock_spin(&itemcachemtx);
+       item = itemcache;
+       itemcache = NULL;
+       mtx_unlock_spin(&itemcachemtx);
+       if (item == NULL)
+               item = uma_zalloc(ng_qzone, wait | M_ZERO);
+       else
+               bzero(item, sizeof(*item));
.....free.....
-       uma_zfree(ng_qzone, item);
+       mtx_lock_spin(&itemcachemtx);
+       if (itemcache == NULL) {
+               itemcache = item;
+               item = NULL;
+       }
+       mtx_unlock_spin(&itemcachemtx);
+       if (item)
+               uma_zfree(ng_qzone, item);
...............

To be sure that the test system is CPU-bound, I throttled the CPU to
1044MHz with sysctl. With this patch the throughput of my test
PPPoE-to-PPPoE router grew from 17 to 21Mbytes/s. The profiling results
I sent earlier predicted a gain close to this.

> Is some bit of debugging enabled that shouldn't
> be, perhaps due to a failure of ifdefs?

I have commented out all INVARIANTS and WITNESS options in the GENERIC
kernel config. What else should I check?

-- 
Alexander Motin
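
For anyone who wants to try the one-element cache idea outside the kernel, here is a rough userspace sketch of the same pattern. It is not the netgraph code: struct item, item_alloc(), item_free() and the cache_hits counter are hypothetical names invented for illustration, a pthread mutex stands in for the spin mutex, and calloc()/free() stand in for uma_zalloc()/uma_zfree().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the netgraph item; fields are illustrative. */
struct item {
    int  type;
    char payload[64];
};

static pthread_mutex_t itemcache_mtx = PTHREAD_MUTEX_INITIALIZER;
static struct item *itemcache = NULL;  /* the single cached object */
static unsigned long cache_hits = 0;   /* instrumentation for this sketch only */

/* Return a zeroed item, taking the cached one before asking the allocator. */
struct item *item_alloc(void)
{
    struct item *it;

    pthread_mutex_lock(&itemcache_mtx);
    it = itemcache;
    itemcache = NULL;
    if (it != NULL)
        cache_hits++;
    pthread_mutex_unlock(&itemcache_mtx);

    if (it == NULL)
        return calloc(1, sizeof(*it));  /* cache miss: real allocation */

    memset(it, 0, sizeof(*it));         /* cache hit: emulate M_ZERO */
    return it;
}

/* Park the item in the cache if the slot is empty, otherwise really free it. */
void item_free(struct item *it)
{
    assert(it != NULL);

    pthread_mutex_lock(&itemcache_mtx);
    if (itemcache == NULL) {
        itemcache = it;
        it = NULL;
    }
    pthread_mutex_unlock(&itemcache_mtx);

    if (it != NULL)
        free(it);
}
```

With a workload that frees one item before allocating the next, every allocation after the first is served from the cache slot, which is the effect the patch relies on. The lock is held only for the pointer swap, so the zeroing and any real allocation happen outside it.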