From: Navdeep Parhar
Date: Fri, 11 Jul 2014 12:32:04 -0700
To: John Jasem, FreeBSD Net
Subject: Re: tuning routing using cxgbe and T580-CR cards?
Message-ID: <53C03BB4.2090203@gmail.com>
In-Reply-To: <53C01EB5.6090701@gmail.com>

On 07/11/14 10:28, John Jasem wrote:
> In testing two Chelsio T580-CR dual port cards with FreeBSD 10-STABLE,
> I've been able to use a collection of clients to generate approximately
> 1.5-1.6 million TCP packets per second sustained, and routinely hit
> 10GB/s, both measured by netstat -d -b -w1 -W (I usually use -h for the
> quick read, accepting the loss of granularity).

When forwarding, the pps rate is often more interesting, and almost
always the limiting factor, as compared to the total amount of data
being passed around.  10GB at this pps probably means 9000 MTU.  Try
with 1500 too if possible.

"netstat -d 1" and "vmstat 1" for a few seconds when your system is
under maximum load would be useful.  And what kind of CPU is in this
system?

>
> While performance has so far been stellar, and I'm honestly speculating
> I will need more CPU depth and horsepower to get much faster, I'm
> curious if there is any gain to tweaking performance settings.
> I'm seeing, under multiple streams, with N targets connecting to N
> servers, interrupts on all CPUs peg at 99-100%, and I'm curious if
> tweaking configs will help, or it's a free clue to get more horsepower.
>
> So far, except for temporarily turning off pflogd, and setting the
> following sysctl variables, I've not done any performance tuning on the
> system yet.
>
> /etc/sysctl.conf
> net.inet.ip.fastforwarding=1
> kern.random.sys.harvest.ethernet=0
> kern.random.sys.harvest.point_to_point=0
> kern.random.sys.harvest.interrupt=0
>
> a) One of the first things I did in prior testing was to turn
> hyperthreading off. I presume this is still prudent, as HT doesn't help
> with interrupt handling?

It is always worthwhile to try your workload with and without
hyperthreading.

> b) I briefly experimented with using cpuset(1) to stick interrupts to
> physical CPUs, but it offered no performance enhancements, and indeed,
> appeared to decrease performance by 10-20%. Has anyone else tried this?
> What were your results?
>
> c) the defaults for the cxgbe driver appear to be 8 rx queues, and N tx
> queues, with N being the number of CPUs detected. For a system running
> multiple cards, routing or firewalling, does this make sense, or would
> balancing tx and rx be more ideal? And would reducing queues per card
> based on NUMBER-CPUS and NUM-CHELSIO-PORTS make sense at all?

The defaults are nrxq = min(8, ncores) and ntxq = min(16, ncores).  The
man page mentions this.  The reason for 8 vs. 16 is that tx queues are
"cheaper" as they don't have to be backed by rx buffers.  A tx queue
only needs some memory for the tx descriptor ring and some hardware
resources.

It appears that your system has >= 16 cores.  For forwarding it probably
makes sense to have nrxq = ntxq.  If you're left with 8 or fewer cores
after disabling hyperthreading you'll automatically get the same number
of rx and tx queues (8 each with exactly 8 cores).  Otherwise you'll
have to fiddle with the hw.cxgbe.nrxq10g and ntxq10g tunables
(documented in the man page).
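
To make the queue arithmetic above concrete, here is a minimal sketch in
Python.  It only restates the min() rule quoted above and formats two
loader tunables.  The function names are made up for illustration, the
value 8 is an arbitrary example rather than a recommendation, and the
full name hw.cxgbe.ntxq10g is assumed from the shorthand "ntxq10g"
above; check cxgbe(4) on your release before relying on either tunable.

```python
def default_queue_counts(ncores):
    """Per-port defaults as described in this mail:
    nrxq = min(8, ncores), ntxq = min(16, ncores)."""
    return min(8, ncores), min(16, ncores)


def balanced_tunables(nqueues):
    """Hypothetical helper: loader.conf-style lines asking for the same
    number of rx and tx queues (tunable names assumed, see cxgbe(4))."""
    return [
        'hw.cxgbe.nrxq10g="%d"' % nqueues,
        'hw.cxgbe.ntxq10g="%d"' % nqueues,
    ]


if __name__ == "__main__":
    # Show where the defaults stop being balanced as the core count grows.
    for ncores in (4, 8, 16, 32):
        nrxq, ntxq = default_queue_counts(ncores)
        print("%2d cores: nrxq=%d ntxq=%d" % (ncores, nrxq, ntxq))
    # Example lines for a box where the defaults would be unbalanced.
    print("\n".join(balanced_tunables(8)))
```

With 8 or fewer cores the two defaults come out equal, so the tunables
only matter on larger systems, where rx stays at 8 while tx grows up to
16.
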

> d) dev.cxl.$PORT.qsize_rxq: 1024 and dev.cxl.$PORT.qsize_txq: 1024.
> These appear to not be writeable when if_cxgbe is loaded, so I speculate
> they are not to be messed with, or are loader.conf variables? Is there
> any benefit to messing with them?

They can't be changed after the port has been administratively brought
up even once.  This is mentioned in the man page.  I don't really
recommend changing them anyway.

>
> e) dev.t5nex.$CARD.toe.sndbuf: 262144. These are writeable, but messing
> with values did not yield an immediate benefit. Am I barking up the
> wrong tree, trying?

The TOE tunables won't make a difference unless you have enabled TOE,
the TCP endpoints lie on the system, and the connections are being
handled by the TOE on the chip.  This is not the case on your systems.
The driver does not enable TOE by default and the only way to use it is
to switch it on explicitly.  There is no possibility that you're using
it without knowing that you are.

>
> f) based on prior experiments with other vendors, I tried tweaks to
> net.isr.* settings, but did not see any benefits worth discussing. Am I
> correct in this speculation, based on others' experience?
>
> g) Are there other settings I should be looking at, that may squeeze out
> a few more packets?

The pps rates that you've observed are within the chip's hardware limits
by at least an order of magnitude.  Tuning the kernel rather than the
driver may be the best bang for your buck.

Regards,
Navdeep

>
> Thanks in advance!
>
> -- John Jasen (jjasen@gmail.com)