Date:      Thu, 27 Jan 2011 12:02:18 -0800
From:      Jack Vogel <jfvogel@gmail.com>
To:        Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc:        Sergey Lobanov <wmn@siberianet.ru>, "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>, "freebsd-pf@freebsd.org" <freebsd-pf@freebsd.org>
Subject:   Re: High interrupt rate on a PF box + performance
Message-ID:  <AANLkTikxNXJHpLwp8-m9cbpKw5GvXt0WUEaA15G85VUg@mail.gmail.com>
In-Reply-To: <20110127195741.GA40449@icarus.home.lan>
References:  <4D41417A.20904@my.gd> <1DB50624F8348F48840F2E2CF6040A9D014BEB8833@orsmsx508.amr.corp.intel.com> <4D41B197.6070308@my.gd> <201101280146.57028.wmn@siberianet.ru> <4D41C9FC.10503@my.gd> <20110127195741.GA40449@icarus.home.lan>

If you go to 8.2 and the latest driver you will also get better stats,
ahem...

Jack


On Thu, Jan 27, 2011 at 11:57 AM, Jeremy Chadwick
<freebsd@jdc.parodius.com> wrote:

> On Thu, Jan 27, 2011 at 08:39:40PM +0100, Damien Fleuriot wrote:
> >
> >
> > On 1/27/11 7:46 PM, Sergey Lobanov wrote:
> > > In a message of Friday, 28 January 2011 00:55:35, Damien Fleuriot wrote:
> > >> On 1/27/11 6:41 PM, Vogel, Jack wrote:
> > >>> Jeremy is right, if you have a problem the first step is to try the
> > >>> latest code.
> > >>>
> > >>> However, when I look at the interrupts below I don't see what the
> problem
> > >>> is? The Broadcom seems to have about the same rate, it just doesn't
> have
> > >>> MSIX (multiple vectors).
> > >>>
> > >>> Jack
> > >>
> > >> My main concern is that the CPU %interrupt is quite high, also, we
> seem
> > >> to be experiencing input errors on the interfaces.
> > > Would you show igb tuning which is done in loader.conf and output of
> sysctl
> > > dev.igb.0?
> > > Did you raise the number of igb descriptors, such as:
> > > hw.igb.rxd=4096
> > > hw.igb.txd=4096 ?
> >
> > There is no tuning at all on our part in the loader's conf.
> >
> > Find below the sysctls:
> >
> > # sysctl -a |grep igb
> > dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 1.7.3
> > dev.igb.0.%driver: igb
> > dev.igb.0.%location: slot=0 function=0
> > dev.igb.0.%pnpinfo: vendor=0x8086 device=0x10d6 subvendor=0x8086
> > subdevice=0x145a class=0x020000
> > dev.igb.0.%parent: pci14
> > dev.igb.0.debug: -1
> > dev.igb.0.stats: -1
> > dev.igb.0.flow_control: 3
> > dev.igb.0.enable_aim: 1
> > dev.igb.0.low_latency: 128
> > dev.igb.0.ave_latency: 450
> > dev.igb.0.bulk_latency: 1200
> > dev.igb.0.rx_processing_limit: 100
> > dev.igb.1.%desc: Intel(R) PRO/1000 Network Connection version - 1.7.3
> > dev.igb.1.%driver: igb
> > dev.igb.1.%location: slot=0 function=1
> > dev.igb.1.%pnpinfo: vendor=0x8086 device=0x10d6 subvendor=0x8086
> > subdevice=0x145a class=0x020000
> > dev.igb.1.%parent: pci14
> > dev.igb.1.debug: -1
> > dev.igb.1.stats: -1
> > dev.igb.1.flow_control: 3
> > dev.igb.1.enable_aim: 1
> > dev.igb.1.low_latency: 128
> > dev.igb.1.ave_latency: 450
> > dev.igb.1.bulk_latency: 1200
> > dev.igb.1.rx_processing_limit: 100
>
> I'm not aware of how to tune igb(4), so the advice Sergey gave you may
> be applicable.  You'll need to schedule downtime to adjust those
> tunables, however, since a reboot will be required.
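>
> For example, Sergey's suggestion would go in /boot/loader.conf, like
> this (a sketch; 4096 is just his suggested value, not one validated
> for your hardware):
>
>   hw.igb.rxd=4096
>   hw.igb.txd=4096
>
> After the reboot, "kenv hw.igb.rxd" will confirm that the loader
> actually picked the value up.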
>
> I also reviewed the munin graphs.  I don't see anything necessarily
> wrong.  However, you omitted yearly graphs for the network interfaces.
> Why I care about that:
>
> The pf state table (yearly) graph basically correlates with the CPU
> usage (yearly) graph, and I expect that the yearly network graphs would
> show a similar trend: an increase in your overall traffic over the
> course of a year.
>
> What I'm trying to figure out is what you're concerned about.  You are
> in fact pushing anywhere from 60 to 120 MBytes/sec across these
> interfaces.  Given those numbers, I'm not surprised by the "high"
> interrupt usage.
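>
> If you want to see exactly where that interrupt time is going, the
> stock tools should be enough (a quick sketch; device and queue names
> will differ on your box):
>
>   # vmstat -i    (per-device interrupt counts and rates)
>   # top -SH      (per-thread CPU use, including the igb queue threads)
>
> If the bulk of the rate sits on the igb MSIX vectors, that's
> consistent with your traffic level rather than with a misbehaving
> driver.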
>
> Graphs of this nature usually indicate that you're hitting a
> "bottleneck" (for lack of a better word): you're simply doing "too
> much" with a single machine, given its network throughput.  The machine
> is spending a tremendous amount of CPU time handling network traffic,
> and just as much on pf processing.
>
> If you want my opinion based on the information I have so far, it's
> this: you need to scale your infrastructure.  You can no longer rely on
> a single machine to handle this amount of traffic.
>
> As for the network errors you see -- to get low-level NIC and driver
> statistics, you'll need to run "sysctl dev.igb.X.stats=1" then run
> "dmesg" and look at the numbers shown (the sysctl command won't output
> anything itself).  This may help indicate where the packets are being
> lost.  You should also check the interface counters on the switch which
> these interfaces are connected to.  I sure hope it's a managed switch
> which can give you those statistics.
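>
> For example (igb0 shown; repeat for each interface):
>
>   # sysctl dev.igb.0.stats=1
>   # dmesg | tail
>
> The driver dumps its hardware counters (missed packets, receive
> errors, and so on) into the kernel message buffer, which is where
> dmesg picks them up.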
>
> Hope this helps, or at least acts as food for thought.
>
> --
> | Jeremy Chadwick                                   jdc@parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.               PGP 4BD6C0CB |
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>


