Date:      Fri, 11 Jul 2014 12:32:04 -0700
From:      Navdeep Parhar <nparhar@gmail.com>
To:        John Jasem <jjasen@gmail.com>, FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: tuning routing using cxgbe and T580-CR cards?
Message-ID:  <53C03BB4.2090203@gmail.com>
In-Reply-To: <53C01EB5.6090701@gmail.com>
References:  <53C01EB5.6090701@gmail.com>

On 07/11/14 10:28, John Jasem wrote:
> In testing two Chelsio T580-CR dual port cards with FreeBSD 10-STABLE,
> I've been able to use a collection of clients to generate approximately
> 1.5-1.6 million TCP packets per second sustained, and routinely hit
> 10GB/s, both measured by netstat -d -b -w1 -W (I usually use -h for the
> quick read, accepting the loss of granularity).

When forwarding, the pps rate is usually more interesting than the total
amount of data being moved, and it is almost always the limiting factor.
10GB/s at this pps probably means a 9000-byte MTU.  Try with 1500 too if
possible.

"netstat -d 1" and "vmstat 1" for a few seconds when your system is
under maximum load would be useful.  And what kind of CPU is in this system?
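
Something along these lines while the traffic is flowing would cover it
(the file names are just examples):

  # netstat -d -b -w 1 -W | tee netstat.txt   # a few seconds' worth, then ^C
  # vmstat 1 | tee vmstat.txt
  # sysctl hw.model hw.ncpu                   # answers the CPU question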

> 
> While performance has so far been stellar, and I'm honestly speculating
> I will need more CPU depth and horsepower to get much faster, I'm
> curious if there is any gain to tweaking performance settings. I'm
> seeing, under multiple streams, with N targets connecting to N servers,
> interrupts on all CPUs peg at 99-100%, and I'm curious if tweaking
> configs will help, or it's a free clue to get more horsepower.
> 
> So far, except for temporarily turning off pflogd and setting the
> following sysctl variables, I've not done any performance tuning on the
> system yet.
> 
> /etc/sysctl.conf
> net.inet.ip.fastforwarding=1
> kern.random.sys.harvest.ethernet=0
> kern.random.sys.harvest.point_to_point=0
> kern.random.sys.harvest.interrupt=0
> 
> a) One of the first things I did in prior testing was to turn
> hyperthreading off. I presume this is still prudent, as HT doesn't help
> with interrupt handling?

It is always worthwhile to try your workload with and without
hyperthreading.
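
If you'd rather not keep toggling it in the BIOS, I believe the
machdep.hyperthreading_allowed loader tunable does the same thing from
/boot/loader.conf (reboot required):

  /boot/loader.conf
  machdep.hyperthreading_allowed="0"   # don't schedule on the HT siblings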

> b) I briefly experimented with using cpuset(1) to stick interrupts to
> physical CPUs, but it offered no performance enhancements, and indeed,
> appeared to decrease performance by 10-20%. Has anyone else tried this?
> What were your results?
> 
> c) the defaults for the cxgbe driver appear to be 8 rx queues, and N tx
> queues, with N being the number of CPUs detected. For a system running
> multiple cards, routing or firewalling, does this make sense, or would
> balancing tx and rx be more ideal? And would reducing queues per card
> based on NUMBER-CPUS and NUM-CHELSIO-PORTS make sense at all?

The defaults are nrxq = min(8, ncores) and ntxq = min(16, ncores); the
man page mentions this.  The reason for 8 vs. 16 is that tx queues are
"cheaper" as they don't have to be backed by rx buffers: a tx queue only
needs some memory for its descriptor ring and some hardware resources.

It appears that your system has >= 16 cores.  For forwarding it probably
makes sense to have nrxq = ntxq.  If you're left with 8 or fewer cores
after disabling hyperthreading, you'll automatically get matching rx and
tx queue counts.  Otherwise you'll have to fiddle with the
hw.cxgbe.nrxq10g and ntxq10g tunables (documented in the man page).
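
For example, something like this in /boot/loader.conf would give you
matching queue counts; adjust the number to whatever you settle on per
port (see cxgbe(4) for the exact tunable names on your version):

  /boot/loader.conf
  hw.cxgbe.nrxq10g="8"
  hw.cxgbe.ntxq10g="8"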


> d) dev.cxl.$PORT.qsize_rxq: 1024 and dev.cxl.$PORT.qsize_txq: 1024.
> These appear to not be writeable when if_cxgbe is loaded, so I speculate
> they are not to be messed with, or are loader.conf variables? Is there
> any benefit to messing with them?

They can't be changed after the port has been administratively brought
up even once; this is mentioned in the man page.  I don't really
recommend changing them anyway.
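
For what it's worth, they are loader tunables, so if you ever do want to
experiment they have to go into /boot/loader.conf before the ports come
up.  Something like this (again, I wouldn't bother):

  /boot/loader.conf
  hw.cxgbe.qsize_rxq="2048"   # default 1024
  hw.cxgbe.qsize_txq="2048"   # default 1024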

> 
> e) dev.t5nex.$CARD.toe.sndbuf: 262144. These are writeable, but messing
> with values did not yield an immediate benefit. Am I barking up the
> wrong tree, trying?

The TOE tunables won't make a difference unless you have enabled TOE,
the TCP endpoints are on the system itself, and the connections are
being handled by the TOE on the chip.  That is not the case on your
system.  The driver does not enable TOE by default and the only way to
use it is to switch it on explicitly, so there is no possibility that
you're using it without knowing that you are.
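
If you want to double-check: TOE only shows up under an interface's
enabled options after you load the TOE module and switch it on
explicitly, roughly like this (from memory, see cxgbe(4); there's no
reason to do any of this for plain forwarding):

  # ifconfig cxl0 | grep -i toe   # TOE4/TOE6 under capabilities only, not options
  # kldload t4_tom                # TOE module; not loaded by default
  # ifconfig cxl0 toe             # explicitly enable the TOE capability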

> 
> f) based on prior experiments with other vendors, I tried tweaks to
> net.isr.* settings, but did not see any benefits worth discussing. Am I
> correct in this speculation, based on others' experience?
> 
> g) Are there other settings I should be looking at, that may squeeze out
> a few more packets?

The pps rates that you've observed are at least an order of magnitude
below the chip's hardware limits, so tuning the kernel rather than the
driver is probably the best bang for your buck.
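
Before touching anything, it's worth seeing where the cycles are going;
something like this is usually enough (pmcstat needs hwpmc(4) and a
supported CPU):

  # top -SHPz                      # per-CPU view of the interrupt/kernel threads
  # pmcstat -TS instructions -w 1  # top-like profile of where the kernel spends time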

Regards,
Navdeep

> 
> Thanks in advance!
> 
> -- John Jasen (jjasen@gmail.com)



