Date:      Sun, 24 Dec 2006 22:08:45 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        cvs-src@FreeBSD.org, Scott Long <scottl@samsco.org>, src-committers@FreeBSD.org, cvs-all@FreeBSD.org, John Polstra <jdp@polstra.com>
Subject:   Re: cvs commit: src/sys/dev/bge if_bge.c
Message-ID:  <20061224211712.W25632@delplex.bde.org>
In-Reply-To: <20061224085231.Y37996@fledge.watson.org>
References:  <XFMail.20061223102713.jdp@polstra.com> <20061223213014.U35809@fledge.watson.org> <458E11AE.2000004@samsco.org> <20061224085231.Y37996@fledge.watson.org>

On Sun, 24 Dec 2006, Robert Watson wrote:

> From the perspective of optimizing these particular paths, small packet sizes
> best reveal processing overhead up to about the TCP/socket buffer layer on
> modern hardware (DMA, etc).  The uni/bidirectional axis is interesting 
> because it helps reveal the impact of the direct dispatch vs. netisr dispatch 
> choice for the IP layer with respect to exercising parallelism.  I didn't 
> explicitly measure CPU, but as the configurations max out the CPUs in my test 
> bed, typically any significant CPU reduction is measurable in an improvement 
> in throughput.  For example, I was easily able to measure the CPU reduction 
> in switching from using the socket reference to the file descriptor reference 
> in sosend() on small packet transmit, which was a relatively minor functional 
> change in locking and reference counting.

Be careful with micro-optimizations.  I saw a single change (adding
about 1K of unrelated code that is never executed) give a pessimization
of about 17% for tx on bge (from 360 kpps to 300 kpps).  Before that I was
trying harder than now to find optimizations involving avoiding copying,
and thought that I had increased the speed from 330 kpps to 360 kpps
by removing things, but I may have just increased the speed by moving
cache effects around.  The effects in this case seem to be related to
instructions more than data, and I suspect that they are very
machine-dependent (MD).  The machine that shows them doesn't support APIC
or ACPI, so hwpmc cannot do anything useful on it.
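
For scale, the two throughput changes quoted above can be checked with a
small sketch.  The function name is hypothetical and the only inputs are the
kpps figures from this message.  Relative to the 360 kpps baseline the drop
is about 17%, while the earlier apparent gain from removing things is about
9%, so the layout-induced loss is the larger of the two.

```python
def relative_change(before_kpps, after_kpps):
    """Fractional change in throughput going from before_kpps to after_kpps."""
    return (after_kpps - before_kpps) / before_kpps


# Figures quoted in the message (kpps = thousand packets per second).
pessimization = relative_change(360, 300)  # after adding ~1K of unexecuted code
earlier_gain = relative_change(330, 360)   # after removing things

print(f"360 -> 300 kpps: {pessimization:+.1%}")
print(f"330 -> 360 kpps: {earlier_gain:+.1%}")
```

A change that is meant to save a few percent is therefore well inside the
range that an unrelated shift in code layout can produce on this machine.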

Bruce


