From owner-freebsd-current@FreeBSD.ORG Fri Mar 5 18:16:47 2010
Date: Fri, 05 Mar 2010 20:16:31 +0200
From: Ian FREISLICH <ianf@clue.co.za>
To: pyunyh@gmail.com
Cc: current@freebsd.org
Subject: Re: dev.bce.X.com_no_buffers increasing and packet loss
In-Reply-To: <20100305175639.GB14818@michelle.cdnetworks.com>

Pyun YongHyeon wrote:
> On Fri, Mar 05, 2010 at 01:20:57PM +0200, Ian FREISLICH wrote:
> > Hi
> >
> > I have a system that is experiencing mild to severe packet loss.
> > The interfaces are configured as follows:
> >
> >   lagg0: bce0, bce1, bce2, bce3, laggproto lacp
> >
> > lagg0 is then used as the hwdev for the vlan interfaces.
> >
> > I have pf with a few queues for bandwidth management.
> >
> > There isn't much traffic on it (200-500Mbit/s).
> >
> > The only suspect I can see for the packet loss is the following:
> >
> >   dev.bce.0.com_no_buffers: 140151466
> >   dev.bce.1.com_no_buffers: 514723247
> >   dev.bce.2.com_no_buffers: 10454050
> >   dev.bce.3.com_no_buffers: 369371
> >
> > Most of the time these numbers are static, but every once in a
> > while they jump by several thousand, and only on two of the
> > interfaces. The 1-minute average rate on those interfaces is 266/s
> > and 123/s.
> >
> > Does anyone think this is related to the packet loss, or are these
> > counters just a red herring? Is there anything that can be done
> > to reduce this count?
>
> I think this sysctl node indicates the number of frames dropped by the
> completion processor of the NetXtreme II. The counter is incremented
> when the processor receives a frame successfully but cannot pass it to
> the system because there are no RX buffers available, so the completion
> processor drops the frame.
> If you see an mbuf shortage from netstat, that would be normal. But if
> the system has plenty of free mbuf resources, it may indicate another
> issue: bce(4) may not be able to replenish the controller with RX
> buffers if the system is suffering from high load.

I don't think I've ever seen an mbuf shortage on this host, and the load
isn't that high: typically 12% CPU (88% idle), which is only 2 of the 16
cores busy. There's plenty of free memory (~12G) if I need to increase
the number of buffers available, but I'm not sure which tunable does
that. The routing table also isn't large, at about 4000 prefixes.
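A quick way to keep confirming that the bursts never coincide with mbuf
pressure is to poll the drop counters next to the "denied" lines from
netstat -m. This is only a throwaway sketch; the 10-second interval and
the grep pattern are arbitrary choices, not anything from the driver:

    #!/bin/sh
    # Poll the bce completion-processor drop counters and the mbuf/cluster
    # denial counters every 10 seconds so bursts can be matched against
    # mbuf pressure.
    while :; do
        date
        sysctl dev.bce.0.com_no_buffers dev.bce.1.com_no_buffers \
               dev.bce.2.com_no_buffers dev.bce.3.com_no_buffers
        netstat -m | grep denied
        sleep 10
    done

A single snapshot already shows no denials: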
[firewall1.jnb1] ~ # netstat -m
4118/7147/11265 mbufs in use (current/cache/total)
3092/6850/9942/131072 mbuf clusters in use (current/cache/total/max)
2060/4212 mbuf+clusters out of packet secondary zone in use (current/cache)
0/678/678/65536 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/32768 9k jumbo clusters in use (current/cache/total/max)
0/0/0/16384 16k jumbo clusters in use (current/cache/total/max)
7214K/18198K/25412K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

I currently set the following in loader.conf:

net.isr.maxthreads="8"
net.isr.direct=0
if_igb_load="yes"
kern.ipc.nmbclusters="131072"
kern.maxusers="1024"

Ian

--
Ian Freislich
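Since netstat shows no denied requests, the knob to look at is probably
the driver's RX chain size rather than kern.ipc.nmbclusters, which is
already raised above. Assuming the bce(4) in this tree exposes an
hw.bce.rx_pages loader tunable (that is an assumption here; check the
driver and its man page before relying on it), the loader.conf addition
would look something like:

    # Assumption: hw.bce.rx_pages exists in this bce(4).  Grow the RX
    # chain beyond its default so the completion processor has more
    # buffers to hand frames to.
    hw.bce.rx_pages="8"

For reference, the lagg/vlan layering described at the top of the thread
is typically expressed in rc.conf along these lines; only a sketch, with
the bce members taken from the report but the vlan tag 100 and the
192.0.2.1/24 address purely illustrative:

    ifconfig_bce0="up"
    ifconfig_bce1="up"
    ifconfig_bce2="up"
    ifconfig_bce3="up"
    cloned_interfaces="lagg0 vlan100"
    ifconfig_lagg0="laggproto lacp laggport bce0 laggport bce1 laggport bce2 laggport bce3"
    # vlan100 rides on lagg0; tag and address are placeholders
    ifconfig_vlan100="inet 192.0.2.1/24 vlan 100 vlandev lagg0"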