From owner-freebsd-current@FreeBSD.ORG Thu Sep 9 06:00:49 2004
Date: Thu, 9 Sep 2004 02:00:14 -0400 (EDT)
From: Robert Watson <robert@fledge.watson.org>
To: Matthew Dillon
Cc: scottl@freebsd.org, Gerrit Nagelhout, current@freebsd.org
In-Reply-To: <200409090445.i894jRei071606@apollo.backplane.com>
Subject: Re: FreeBSD 5.3 Bridge performance take II

On Wed, 8 Sep 2004, Matthew Dillon wrote:

> I would recommend against per-thread caches.  Instead, make the per-cpu
> caches actually *be* per-cpu (that is, not require a mutex).  This is

One of the paragraphs you appear not to have quoted from my e-mail was this
one:

% One nice thing about using this experimental code is that I hope it will
% allow us to reason more effectively about the extent to which improving
% per-cpu data structures improves efficiency -- I can now much more easily
% say "OK, what happens if we eliminate the cost of locking for commonplace
% mbuf allocation/free".  I've also started looking at per-interface caches
% based on the same model, which has some similar limitations (but also
% some similar benefits), such as stuffing per-interface uma caches in
% struct ifnet.

I.e., using per-thread UMA caches is a 30-60 minute hack that allows me to
explore and measure the performance benefits (and costs) of several
different approaches -- per-cpu, per-thread, and per-data-structure/object
caching -- without doing the full implementation up front.  Per-thread
caching, for example, can simulate the effects of non-preemption and mutex
avoidance in micro-benchmarks, although from a macro-benchmark perspective
it suffers from a number of problems in the general case (draining,
balancing, and extra storage cost among them).  I didn't attempt to address
these problems, on the assumption that the current implementation is a tool
for exploring performance, not something to actually use.

In doing so, my hope was to identify which areas will offer the most
immediate performance benefits, be it simply cutting down on costly
operations (such as the entropy harvesting code for Yarrow, which appears
to have found its way into our interrupt path), rethinking locking
strategies, optimizing out or coalescing locking, optimizing out excess
memory allocation, optimizing synchronization primitives while keeping the
same semantics, or changing synchronization assumptions to offer
weaker/stronger semantics.
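To make that concrete, here is a rough sketch of the kind of lock-free fast
path I mean.  The names below (thr_mcache, thr_mcache_alloc(), and the idea
of hanging the cache off struct thread) are invented purely for
illustration and are not what is in the actual hack:

/*
 * Illustrative sketch only -- not the real patch.  Each thread carries a
 * small private stash of free items that is refilled from, and drained
 * back to, the ordinary UMA zone, so the common alloc/free path touches
 * no mutex at all.  In the real hack something like this would hang off
 * struct thread (say, via a hypothetical td_mcache pointer).
 */
#include <sys/param.h>
#include <sys/proc.h>
#include <vm/uma.h>

#define	THR_MCACHE_SIZE	32

struct thr_mcache {
	void	*tc_items[THR_MCACHE_SIZE];	/* privately cached items */
	int	 tc_count;			/* how many are stashed */
};

static __inline void *
thr_mcache_alloc(struct thr_mcache *tc, uma_zone_t zone, int flags)
{

	/* Fast path: the cache is private to this thread, so no lock. */
	if (tc->tc_count > 0)
		return (tc->tc_items[--tc->tc_count]);

	/* Slow path: fall back to the normal (locked) UMA allocation. */
	return (uma_zalloc(zone, flags));
}

static __inline void
thr_mcache_free(struct thr_mcache *tc, uma_zone_t zone, void *item)
{

	/* Stash locally if there is room; otherwise hand back to UMA. */
	if (tc->tc_count < THR_MCACHE_SIZE)
		tc->tc_items[tc->tc_count++] = item;
	else
		uma_zfree(zone, item);
}

The draining and balancing problems mentioned above fall directly out of
this structure: items stranded in an idle thread's stash are invisible to
every other thread until something forces them back into the zone.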
Right now, though, the greatest obstacle in my immediate path appears to be
a bug in the current version of the if_em driver that causes the interfaces
on my test box to wedge under even moderate load.  The if_em cards I have
on other machines seem not to do this, which suggests a driver weirdness
with this particular version of the chipset/card.  Go figure...

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research