Date:      Sun, 29 Jun 2008 04:04:29 -0400
From:      Paul <paul@gtcomm.net>
To:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Freebsd IP Forwarding  performance (question, and some info) [7-stable, current, em, smp]
Message-ID:  <4867420D.7090406@gtcomm.net>

This is mostly a question: has anyone gotten more than 400k pps of 
forwarding performance?
I have tested FreeBSD 6, 7, and 8 so far with many different configs (all 
using Intel PCI Express NICs and SMP).
7-STABLE and 8-CURRENT seem to be the fastest, and both always hit this 
ceiling of 400k pps. As soon as they hit it, I get errors galore: 
received-no-buffers, missed packets, rx overruns... all because 'em0 
taskq' is at 90% CPU or so.
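(For anyone who wants to watch the same counters: the em driver can dump 
its MAC statistics to the console, and netstat shows the per-interface 
input-error totals. The sysctl name below is from memory, so treat it as 
approximate:

    sysctl dev.em.0.stats=1   # dump em0 MAC stats (incl. receive-no-buffers) to the console
    netstat -i                # the Ierrs column is the input-error total
)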
Meanwhile, two CPUs sit 100% idle and the other two are at about 60% and 
20%.
So why in the world can't it use more CPUs?
Simple test setup:
 - packet generator feeding em0
 - traffic routed out em1
 - IP forwarding and fastforwarding both enabled (the sysctls are shown 
   just below this list); fastforwarding definitely makes a big 
   difference, another 100kpps or so, and without it the box can barely 
   hit 300k
 - packets are TCP with randomized source addresses, randomized src and 
   dst ports, and a single destination IP
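For reference, these are the standard sysctl names on 7.x/8.x; enabling 
both at runtime looks like:

    sysctl net.inet.ip.forwarding=1
    sysctl net.inet.ip.fastforwarding=1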
I even tried the Yandex driver on FreeBSD 6, but it could barely reach 
200k pps and had a lot of weird issues; FreeBSD 6 couldn't hit 400k pps 
on its own either.
I am not using polling; I tried that too and it made no difference (the 
recipe I used is sketched below).
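The polling test was the usual recipe, assuming the standard per-interface 
knob in 7.x and later:

    # kernel config
    options         DEVICE_POLLING

    # then, at runtime, per interface
    ifconfig em0 polling
    ifconfig em1 polling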
So, the question: what can I do for more performance (SMP)? Are there any 
good kernel options?
If I disable IP forwarding I can take in 750kpps with no errors, since the 
packets aren't going anywhere; em0 taskq CPU usage is less than half of 
what it is when forwarding. So the bottleneck is clearly somewhere in the 
forwarding path, and fastforwarding greatly helps!! See the numbers below.
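(The tables below are one-second interface samples, in the format produced 
by netstat's traffic display, i.e. something like:

    netstat -w 1 -I em0
)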
forwarding off:
            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    757223     0   46947830          1     0        226     0
    753551     0   46720166          1     0        178     0
    756359     0   46894262          1     0        178     0
    757570     0   46969344          1     0        178     0
    753724     0   46730830          1     0        178     0
    745372     0   46213130          1     0        178     0


(For the two forwarding tests below I had to slow the packet generation 
down to about 420-430kpps.)
forwarding on:
           input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    285918 151029   17726936        460     0      25410     0
    284929 146151   17665602        417     0      22642     0
    284253 147000   17623690        442     0      23884     0
    285438 147765   17697160        448     0      24316     0
    286582 147171   17768088        456     0      24748     0
    287194 147088   17806032        422     0      22912     0
    285812 141713   17720348        440     0      23884     0
    284958 137579   17667412        457     0      25104     0

fastforwarding on:

            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    399795 22790   24787310        459     0      25130     0
    397425 25254   24640354        434     0      23560     0
    403223 26937   24999830        431     0      23452     0
    396587 21431   24588398        467     0      25288     0
    400970 25776   24860144        459     0      24910     0
    397819 23657   24664782        432     0      23452     0
    406222 27418   25185768        432     0      23506     0
    406718 12407   25216520        461     0      25018     0
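
CPU usage while the fastforwarding test is running, from top's threads 
view (top -SH or similar):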

  PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   11 root     171 ki31     0K    64K CPU1   1  29:24 100.00% {idle: cpu1}
   11 root     171 ki31     0K    64K RUN    0  28:46 100.00% {idle: cpu0}
   11 root     171 ki31     0K    64K CPU3   3  24:32 84.62% {idle: cpu3}
    0 root     -68    0     0K   128K CPU2   2  12:59 84.13% {em0 taskq}
    0 root     -68    0     0K   128K -      3   2:12 19.92% {em1 taskq}
   11 root     171 ki31     0K    64K RUN    2  19:46 19.63% {idle: cpu2}



Well, if nothing else, it's a good demonstration of the difference 
fastforwarding makes!! :)
Other details of the configuration:

options         NO_ADAPTIVE_MUTEXES     # improve routing performance?
options         STOP_NMI                # stop CPUs using NMI instead of IPI

 - no IPv6
 - no firewall loaded
 - no netgraph
 - HZ is 4000
 - em driver is set to 4096 receive descriptors
 - output goes over VLAN devices on em1
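For reference, the HZ and receive-descriptor settings correspond to a 
kernel config option and an em loader tunable (hw.em.rxd is the tunable 
name as I recall it), and the VLAN output is an ordinary vlan(4) device:

    # kernel config
    options         HZ=4000

    # /boot/loader.conf
    hw.em.rxd=4096

    # e.g. a vlan(4) interface stacked on em1 (tag 100 is just an example)
    ifconfig vlan0 create vlan 100 vlandev em1 up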
I tested on both Xeon and Opteron processors; I don't have the exact 
results handy, but the numbers above are from a dual Opteron 2212 running 
-CURRENT:
FreeBSD 8.0-CURRENT #0: Sat Jun 28 23:37:39 CDT 2008

Well, I'm curious to hear others' results.

Thanks for reading!! :)




