Date: Thu, 27 Jan 2011 11:57:41 -0800
From: Jeremy Chadwick
To: Damien Fleuriot
Cc: Sergey Lobanov, freebsd-stable@freebsd.org, freebsd-pf@freebsd.org
Subject: Re: High interrupt rate on a PF box + performance

On Thu, Jan 27, 2011 at 08:39:40PM +0100, Damien Fleuriot wrote:
> On 1/27/11 7:46 PM, Sergey Lobanov wrote:
> > On Friday, 28 January 2011 00:55:35, Damien Fleuriot wrote:
> >> On 1/27/11 6:41 PM, Vogel, Jack wrote:
> >>> Jeremy is right, if you have a problem the first step is to try the
> >>> latest code.
> >>>
> >>> However, when I look at the interrupts below I don't see what the
> >>> problem is. The Broadcom seems to have about the same rate, it just
> >>> doesn't have MSIX (multiple vectors).
> >>>
> >>> Jack
> >>
> >> My main concern is that the CPU %interrupt is quite high; also, we seem
> >> to be experiencing input errors on the interfaces.
> >
> > Would you show the igb tuning done in loader.conf and the output of
> > sysctl dev.igb.0?
> > Did you raise the number of igb descriptors, e.g.:
> > hw.igb.rxd=4096
> > hw.igb.txd=4096 ?
>
> There is no tuning at all on our part in loader.conf.
>
> Find below the sysctls:
>
> # sysctl -a | grep igb
> dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 1.7.3
> dev.igb.0.%driver: igb
> dev.igb.0.%location: slot=0 function=0
> dev.igb.0.%pnpinfo: vendor=0x8086 device=0x10d6 subvendor=0x8086 subdevice=0x145a class=0x020000
> dev.igb.0.%parent: pci14
> dev.igb.0.debug: -1
> dev.igb.0.stats: -1
> dev.igb.0.flow_control: 3
> dev.igb.0.enable_aim: 1
> dev.igb.0.low_latency: 128
> dev.igb.0.ave_latency: 450
> dev.igb.0.bulk_latency: 1200
> dev.igb.0.rx_processing_limit: 100
> dev.igb.1.%desc: Intel(R) PRO/1000 Network Connection version - 1.7.3
> dev.igb.1.%driver: igb
> dev.igb.1.%location: slot=0 function=1
> dev.igb.1.%pnpinfo: vendor=0x8086 device=0x10d6 subvendor=0x8086 subdevice=0x145a class=0x020000
> dev.igb.1.%parent: pci14
> dev.igb.1.debug: -1
> dev.igb.1.stats: -1
> dev.igb.1.flow_control: 3
> dev.igb.1.enable_aim: 1
> dev.igb.1.low_latency: 128
> dev.igb.1.ave_latency: 450
> dev.igb.1.bulk_latency: 1200
> dev.igb.1.rx_processing_limit: 100

I'm not familiar with tuning igb(4) myself, so the advice Sergey gave you
may well be applicable. You'll need to schedule downtime to adjust those
tunables, however, since they are set in loader.conf and require a reboot
(a rough example is at the end of this mail).

I also reviewed the munin graphs. I don't see anything necessarily wrong.
However, you omitted yearly graphs for the network interfaces. Why I care
about that: the pf state table (yearly) graph correlates closely with the
CPU usage (yearly) graph, and I expect the yearly network graphs would show
the same trend, namely an increase in your overall traffic over the course
of the year.

What I'm trying to figure out is exactly what you're concerned about. You
are in fact pushing anywhere between 60 and 120 MBytes/sec across these
interfaces. Given those numbers, I'm not surprised by the "high" interrupt
usage. Graphs of this nature usually indicate that you're hitting a
bottleneck where you're simply asking too much of a single machine, given
its network throughput. The machine is spending a tremendous amount of CPU
time handling network traffic, and just as much on pf.

If you want my opinion based on the information I have so far, it's this:
you need to scale your infrastructure. You can no longer rely on a single
machine to handle this amount of traffic.

As for the network errors you see -- to get low-level NIC and driver
statistics, run "sysctl dev.igb.X.stats=1" and then run "dmesg" and look at
the numbers shown; the sysctl command itself prints nothing, the counters
go to the kernel message buffer (again, an example is appended below).
This may help indicate where the packets are being lost.

You should also check the interface counters on the switch these interfaces
are connected to. I sure hope it's a managed switch that can give you those
statistics.

Hope this helps, or at least acts as food for thought.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
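To spell out the loader.conf tuning Sergey is referring to, here is a rough
sketch. The 4096 values come straight from his mail; I have not tested them
on your hardware, so treat them as a starting point to benchmark rather
than a recommendation:

    # /boot/loader.conf -- boot-time tunables, hence the need for a reboot
    hw.igb.rxd=4096    # receive descriptors per queue
    hw.igb.txd=4096    # transmit descriptors per queue

After rebooting, "kenv | grep hw.igb" will show whether the entries were
picked up by the loader.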
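And the stats-gathering procedure spelled out, using igb0 as the example
interface (substitute the unit number for X; netstat is just my own
suggestion for cross-checking the error counters, it is not part of the igb
stats mechanism):

    # sysctl dev.igb.0.stats=1   # tells the driver to dump its counters; prints nothing
    # dmesg | tail -n 100        # the counters land in the kernel message buffer
    # netstat -i                 # per-interface Ierrs/Oerrs for comparison

If the dmesg output attributes the loss to missed packets or descriptor
exhaustion, that would argue for trying the descriptor tuning above.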