From: Andre Oppermann <andre@freebsd.org>
To: Bruce Evans
Cc: FreeBSD Net <freebsd-net@freebsd.org>, Ingo Flaschberger, Paul
Date: Mon, 07 Jul 2008 15:37:28 +0200
Subject: Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]

Bruce Evans wrote:
> On Mon, 7 Jul 2008, Andre Oppermann wrote:
>
>> Bruce Evans wrote:
>>> What are the other overheads?  I calculate 1.644 Mpps counting the
>>> inter-frame gap, with 64-byte packets and 64-header_size payloads.
>>> If the 64 bytes is for the payload, then the max is much lower.
>>
>> The theoretical maximum at 64-byte frames is 1,488,100 pps.  I've
>> looked up my notes; the 1.244 Mpps number can be adjusted to
>> 1.488 Mpps.
>
> Where is the extra?  I still get 1.644736 Mpps (10^9/(8*64+96)).
> 1.488095 Mpps is for 64 bits extra (10^9/(8*64+96+64)).

The preamble is 64 bits and comes in addition to the inter-frame gap.
That accounts for the difference between the two numbers.
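For reference, here is a quick sketch (mine, not from the thread) that
reproduces both figures.  It assumes the standard GigE wire overheads:
a 96-bit inter-frame gap and a 64-bit preamble plus start-of-frame
delimiter on top of the 64-byte minimum frame.

#include <stdio.h>

int
main(void)
{
	const double bps = 1e9;		/* GigE line rate in bits/sec */
	const int frame_bits = 8 * 64;	/* minimum frame, incl. CRC */
	const int ifg_bits = 96;	/* inter-frame gap */
	const int preamble_bits = 64;	/* preamble + SFD */

	/* Counting only frame + IFG gives 1644736.84 pps. */
	printf("frame+IFG:          %.2f pps\n",
	    bps / (frame_bits + ifg_bits));
	/* Adding the preamble gives the usual 1488095.24 pps figure. */
	printf("frame+IFG+preamble: %.2f pps\n",
	    bps / (frame_bits + ifg_bits + preamble_bits));
	return (0);
}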
>>>>> I hoped to reach 1 Mpps with the hardware I mentioned some mails
>>>>> before, but 2 Mpps is far, far away.  Currently I get 160 kpps via
>>>>> a 32-bit/33 MHz PCI bus and a 1.2 GHz Mobile Pentium.
>>>>
>>>> This is more or less expected.  PCI32 is not able to sustain high
>>>> packet rates.  The bus setup times kill the speed.  For larger
>>>> packets the ratio gets much better and some reasonable throughput
>>>> can be achieved.
>>>
>>> I get about 640 kpps without forwarding (sendto: slightly faster;
>>> recvfrom: slightly slower) on a 2.2GHz A64.  Underclocking the
>>> memory from 200MHz to 100MHz only reduces the speed by about 10%,
>>> while not overclocking the CPU by 10% reduces the speed by the same
>>> 10%, so the system is apparently still mainly CPU-bound.
>>
>> On PCI32@33MHz?  He's using a 1.2GHz Mobile Pentium on top of that.
>
> Yes.  My example shows that FreeBSD is more CPU-bound than I/O-bound
> up to CPUs considerably faster than a 1.2GHz Pentium (though the
> Pentium M is fast relative to its clock speed).  The memory interface
> may matter more than the CPU clock.
>
>>>> NetFPGA doesn't have enough TCAM space to be useful for real
>>>> routing (as in an Internet-sized routing table).  The trick many
>>>> embedded networking CPUs use is cache prefetching that is
>>>> integrated with the network controller.  The first 64-128 bytes of
>>>> every packet are transferred automatically into the L2 cache by the
>>>> hardware.  This allows relatively slow CPUs (the 700 MHz Broadcom
>>>> BCM1250 in the Cisco NPE-G1, or the 1.67 GHz Freescale 7448 in the
>>>> NPE-G2) to get more than 1 Mpps.  Until something like this is
>>>> possible on Intel or AMD x86 CPUs, we have a ceiling limited by RAM
>>>> speed.
>>>
>>> Does using faster memory (speed and/or latency) help here?  64 bytes
>>> is so small that latency may be more of a problem, especially
>>> without a prefetch.
>>
>> Latency.  For IPv4 packet forwarding only one cache line per packet
>> is fetched.  More memory speed only helps with the DMA from/to the
>> network card.
>
> I use low-end memory, but the machine that does 640 kpps somehow has
> latency almost 4 times lower than the new FreeBSD cluster machines
> (~42 nsec instead of ~150).  perfmon (fixed for AXP and A64) and hwpmc
> report an average of 11 k8-dc-misses per sendto() while sending via
> bge at 640 kpps.  11 * 42 accounts for 462 nsec out of the 1562 per
> packet at this rate.  11 * 150 = 1650 would probably make this rate
> unachievable, despite the system having 20 times as much CPU and bus.

We were talking about routing here.  That is, a packet received via one
network interface and sent out on another.  It crosses the PCI bus
twice.

-- 
Andre
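To put numbers on the per-packet budget above: at 640 kpps each packet
gets 1/640000 s, about 1562 ns.  A small sketch (mine, not from the
thread) of that arithmetic, using the miss count and the two memory
latencies quoted above:

#include <stdio.h>

int
main(void)
{
	const double pps = 640000.0;		/* observed sendto() rate */
	const double budget = 1e9 / pps;	/* ns available per packet */
	const int misses = 11;			/* k8-dc-misses per sendto() */
	const double lat_ns[] = { 42.0, 150.0 };/* per-miss latency, ns */
	unsigned int i;

	printf("per-packet budget at %.0f pps: %.1f ns\n", pps, budget);
	for (i = 0; i < sizeof(lat_ns) / sizeof(lat_ns[0]); i++) {
		double cost = misses * lat_ns[i];

		printf("%d misses * %.0f ns = %.0f ns (%.0f%% of budget)\n",
		    misses, lat_ns[i], cost, 100.0 * cost / budget);
	}
	return (0);
}

At ~42 ns per miss the misses consume about 30% of the budget; at
~150 ns they alone exceed it, which matches the claim that 640 kpps
would be unachievable on the higher-latency machines.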