FreeBSD Mail Archives

Date:      Mon, 07 Jul 2008 11:56:44 +0200
From:      Andre Oppermann <andre@freebsd.org>
To:        Paul <paul@gtcomm.net>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Message-ID:  <4871E85C.8090907@freebsd.org>
In-Reply-To: <486B41D5.3060609@gtcomm.net>
References:  <4867420D.7090406@gtcomm.net>	<200806301944.m5UJifJD081781@lava.sentex.ca>	<20080701004346.GA3898@stlux503.dsto.defence.gov.au>	<alpine.LFD.1.10.0807010257570.19444@filebunker.xip.at>	<20080701010716.GF3898@stlux503.dsto.defence.gov.au>	<alpine.LFD.1.10.0807010308320.19444@filebunker.xip.at>	<486986D9.3000607@monkeybrains.net>	<48699960.9070100@gtcomm.net>	<ea7b9c170806302005n2a66f592h2127f87a0ba2c6d2@mail.gmail.com>	<20080701033117.GH83626@cdnetworks.co.kr>	<ea7b9c170806302050p2a3a5480t29923a4ac2d7c852@mail.gmail.com>	<4869ACFC.5020205@gtcomm.net>	<4869B025.9080006@gtcomm.net>	<486A7E45.3030902@gtcomm.net>	<486A8F24.5010000@gtcomm.net> <486A9A0E.6060308@elischer.org> <486B41D5.3060609@gtcomm.net>

Paul wrote:
> SMP DISABLED on my Opteron 2212  (ULE, Preemption on)
> Yields ~750kpps in em0 and out em1  (one direction)
> I am miffed why this yields more pps than
> a) with all 4 cpus running and b) 4 cpus with lagg load balanced over 3 
> incoming connections so 3 taskq threads

SMP adds quite some overhead in the generic case and is currently not
well suited for high-performance packet forwarding.

On SMP, interrupts are delivered to one CPU, but not necessarily the
one that will later run the taskqueue that processes the packets.
That adds overhead.  Ideally the interrupt for each network interface
is bound to exactly one predetermined CPU and the taskqueue is bound
to the same CPU.  That way the overhead for interrupt and taskqueue
scheduling can be kept to a minimum.  Most of the infrastructure to
do this binding already exists in the kernel but is not yet exposed
for us to make use of.  I'm also not sure whether the ULE scheduler
skips the more global locks when the interrupt and the thread are on
the same CPU.
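
To make the intended placement concrete, here is a minimal sketch in
Python.  It only computes a plan and binds nothing in a real kernel.
The names Binding and plan_bindings are hypothetical, and the
round-robin assignment is an assumption chosen for illustration, not
something the kernel does today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    interface: str  # interface name, e.g. "em0"
    irq_cpu: int    # CPU that would receive the interface's interrupt
    taskq_cpu: int  # CPU that would run the interface's taskqueue thread


def plan_bindings(interfaces, ncpus):
    """Give each interface one CPU and put its interrupt and taskqueue there."""
    if ncpus < 1:
        raise ValueError("need at least one CPU")
    plan = []
    for index, name in enumerate(interfaces):
        # Round-robin (an assumption); wraps when interfaces outnumber CPUs.
        cpu = index % ncpus
        plan.append(Binding(name, irq_cpu=cpu, taskq_cpu=cpu))
    return plan


if __name__ == "__main__":
    for b in plan_bindings(["em0", "em1", "em2"], ncpus=4):
        print(f"{b.interface}: interrupt -> cpu{b.irq_cpu}, taskq -> cpu{b.taskq_cpu}")
```

The point of the sketch is the invariant irq_cpu == taskq_cpu for
every interface, which is what would avoid the cross-CPU handoff
described above.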

Distributing the interrupts and taskqueues among the available CPUs
gives concurrent forwarding with bi- or multi-directional traffic.
All incoming traffic from any particular interface is still serialized
though.
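
A toy model of that serialization, again as a hedged Python sketch:
each interface is pinned to one CPU, and each CPU forwards at most a
fixed number of packets per second.  The function name forwarded_pps
is hypothetical, and the 750 kpps per-CPU capacity is an assumption
borrowed from the UP figure quoted at the top of this message.  The
model ignores the SMP locking overhead mentioned earlier, so real
numbers would be lower.

```python
def forwarded_pps(offered, cpu_of, per_cpu_capacity):
    """Toy model: every CPU forwards at most per_cpu_capacity pps in total.

    offered  maps interface name -> offered inbound load in pps
    cpu_of   maps interface name -> CPU handling that interface
    """
    load_per_cpu = {}
    for iface, pps in offered.items():
        cpu = cpu_of[iface]
        load_per_cpu[cpu] = load_per_cpu.get(cpu, 0) + pps
    # Each CPU is capped independently; idle CPUs contribute nothing.
    return sum(min(load, per_cpu_capacity) for load in load_per_cpu.values())


CAPACITY = 750_000  # assumed per-CPU forwarding limit in pps

# One direction only: a single interface stays capped by its one CPU,
# no matter how many other CPUs sit idle.
one_way = forwarded_pps({"em0": 1_000_000}, {"em0": 0}, CAPACITY)

# Two directions on two CPUs: the interfaces are forwarded concurrently.
two_way = forwarded_pps(
    {"em0": 1_000_000, "em1": 1_000_000}, {"em0": 0, "em1": 1}, CAPACITY
)

print(one_way, two_way)
```

In this model extra CPUs only help when the traffic arrives on more
than one interface, which matches the limitation described above.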

-- 
Andre

> I would be willing to set up test equipment (several servers plugged 
> into a switch) with ipkvm and power port access
> if someone or a group of people want to figure out ways to improve the 
> routing process, ipfw, and lagg.
> 
> Maximum PPS with one ipfw rule on UP:
> tops out about 570Kpps.. almost 200kpps lower ? (frown)
> 
> I'm going to drop in a 3ghz opteron instead of the 2ghz 2212 that's in 
> here and see how that scales, using UP same kernel etc I have now.
> 
> 
> 
> 
> 
> Julian Elischer wrote:
>> Paul wrote:
>>> ULE without PREEMPTION is now yielding better results.
>>>         input          (em0)           output
>>>   packets  errs      bytes    packets  errs      bytes colls
>>>    571595 40639   34564108          1     0        226     0
>>>    577892 48865   34941908          1     0        178     0
>>>    545240 84744   32966404          1     0        178     0
>>>    587661 44691   35534512          1     0        178     0
>>>    587839 38073   35544904          1     0        178     0
>>>    587787 43556   35540360          1     0        178     0
>>>    540786 39492   32712746          1     0        178     0
>>>    572071 55797   34595650          1     0        178     0
>>>  
>>> *OUCH, IPFW HURTS..
>>> loading ipfw, and adding one ipfw rule allow ip from any to any drops 
>>> 100Kpps off :/ what's up with THAT?
>>> unloaded ipfw module and back 100kpps more again, that's not right 
>>> with ONE rule.. :/
>>
>> ipfw needs to gain a lock on the firewall before running,
>> and is quite complex..  I can believe it..
>>
>> in FreeBSD 4.8 I was able to use ipfw and filter 1Gb between two 
>> interfaces (bridged) but I think it has slowed down since then due to 
>> the SMP locking.
>>
>>
>>>
>>> em0 taskq is still jumping cpus.. is there any way to lock it to one 
>>> cpu or is this just a function of ULE
>>>
>>> running a tar czpvf all.tgz *  and seeing if pps changes..
>>> negligible.. guess scheduler is doing its job at least..
>>>
>>> Hmm. even when it's getting 50-60k errors per second on the interface 
>>> I can still SCP a file through that interface although it's not 
>>> fast.. 3-4MB/s..
>>>
>>> You know, I wouldn't care if it added 5ms latency to the packets when 
>>> it was doing 1mpps as long as it didn't drop any.. Why can't it do 
>>> that? Queue them up and do them in bigggg chunks so none are 
>>> dropped........hmm?
>>>
>>> 32 bit system is compiling now..  won't do > 400kpps with GENERIC 
>>> kernel, as with 64 bit did 450k with GENERIC, although that could be
>>> the difference between opteron 270 and opteron 2212..
>>>
>>> Paul
>>>
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>>
> 
> 
>
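
For reference, the em0 input counters quoted above (ULE without
PREEMPTION) can be summarized with a short Python sketch.  The rows
are copied from the quoted output.  Treating each row as one second
and treating "errs" as inbound packets that were dropped rather than
counted under "packets" are both assumptions about how the counters
were collected.

```python
# (packets, errs, bytes) per sample, copied from the quoted em0 input columns.
samples = [
    (571595, 40639, 34564108),
    (577892, 48865, 34941908),
    (545240, 84744, 32966404),
    (587661, 44691, 35534512),
    (587839, 38073, 35544904),
    (587787, 43556, 35540360),
    (540786, 39492, 32712746),
    (572071, 55797, 34595650),
]

total_packets = sum(p for p, _, _ in samples)
total_errs = sum(e for _, e, _ in samples)
total_bytes = sum(b for _, _, b in samples)

avg_packets = total_packets / len(samples)
avg_errs = total_errs / len(samples)
bytes_per_packet = total_bytes / total_packets
# Assumption: errs are arrivals that were not forwarded and not in "packets".
err_share = total_errs / (total_packets + total_errs)

print(f"average packets per sample: {avg_packets:.0f}")
print(f"average errs per sample:    {avg_errs:.0f}")
print(f"bytes per packet:           {bytes_per_packet:.1f}")
print(f"errs share of arrivals:     {err_share:.1%}")
```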

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4871E85C.8090907>
