Date:      Mon, 13 Sep 2010 11:48:33 -0700
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        Tom Judge <tom@tomjudge.com>
Cc:        freebsd-net@freebsd.org, davidch@broadcom.com, yongari@freebsd.org
Subject:   Re: bce(4) - com_no_buffers (Again)
Message-ID:  <20100913184833.GF1229@michelle.cdnetworks.com>
In-Reply-To: <4C8E3D79.6090102@tomjudge.com>
References:  <4C894A76.5040200@tomjudge.com> <20100910002439.GO7203@michelle.cdnetworks.com> <4C8E3D79.6090102@tomjudge.com>

On Mon, Sep 13, 2010 at 10:04:25AM -0500, Tom Judge wrote:
> On 09/09/2010 07:24 PM, Pyun YongHyeon wrote:
> > On Thu, Sep 09, 2010 at 03:58:30PM -0500, Tom Judge wrote:
> >   
> >> Hi,
> >> I am just following up on the thread from March (I think) about this issue.
> >>
> >> We are seeing this issue on a number of systems running 7.1. 
> >>
> >> The systems in question are all Dell:
> >>
> >> * R710 R610 R410
> >> * PE2950
> >>
> >> The latter do not show the issue as much as the R series systems.
> >>
> >> The cards in one of the R610's that I am testing with are:
> >>
> >> bce0@pci0:1:0:0:        class=0x020000 card=0x02361028 chip=0x163914e4
> >> rev=0x20 hdr=0x00
> >>     vendor     = 'Broadcom Corporation'
> >>     device     = 'NetXtreme II BCM5709 Gigabit Ethernet'
> >>     class      = network
> >>     subclass   = ethernet
> >>
> >> They are connected to Dell PowerConnect 5424 switches.
> >>
> >> uname -a:
> >> FreeBSD bandor.chi-dc.mintel.ad 7.1-RELEASE-p4 FreeBSD 7.1-RELEASE-p4
> >> #3: Wed Sep  8 08:19:03 UTC 2010    
> >> tj@dev-tj-7-1-amd64.chicago.mintel.ad:/usr/obj/usr/src/sys/MINTELv10  amd64
> >>
> >> We are also using 8192-byte jumbo frames, if_lagg, and if_vlan in the
> >> configuration (the NICs are in promiscuous mode as we are currently
> >> capturing netflow data on another VLAN for diagnostic purposes):
> >>
> >>
> >>     
> <SNIP IFCONFIG/>
> >> I have updated the bce driver and the Broadcom MII driver to the
> >> version from stable/7 and am still seeing the issue.
> >>
> >> This morning I did a test with RX_PAGES increased to 8, but the
> >> system just hung while starting the network.  The route command got
> >> stuck in a zone state (sorry, I can't remember exactly which).
> >>
> >> The real question is, how do we go about increasing the number of RX
> >> BDs?  I guess we have to bump more than just RX_PAGES...
> >>
> >>
> >> The cause for us, from what we can see, is the OpenLDAP server sending
> >> large group search results back to nss_ldap or pam_ldap.  When it does
> >> this it seems to send each of the 600 results in its own TCP segment,
> >> creating a small packet storm (600 x ~100-byte PDUs) at the destination
> >> host.  The kernel then retransmits 2 blocks of 100 results each after
> >> SACK kicks in for the data that was dropped by the NIC.
> >>
> >>
> >> Thanks in advance
> >>
> >> Tom
> >>
> >>
> >>     
> <SNIP SYSCTL OUTPUT/>
> > The firmware may drop incoming frames when it does not see any
> > available RX buffers. Increasing the number of RX buffers slightly
> > reduces the chance of dropping frames, but it wouldn't completely
> > fix the problem. Alternatively, the driver could announce available
> > RX buffers in the middle of RX ring processing instead of handing
> > back updated buffers only at the end of RX processing. That way the
> > firmware may see available RX buffers while the driver/upper stack
> > is busy processing received frames. But this may introduce coherency
> > issues because the RX ring is shared between the host and the
> > firmware. If FreeBSD had a way to sync a partial region of a DMA
> > map, this could be implemented without fear of coherency issues.
> > Another way to improve RX performance would be switching to
> > multiple RX queues with RSS, but that would require a lot of work
> > and I have had no time to implement it.
> >   
> 
> Does this mean that these cards are going to perform badly? That was
> what I gathered from the previous thread.
> 

I mean there is still a lot of room for improvement in the driver.
bce(4) controllers are among the best controllers for servers, and
the driver doesn't yet take full advantage of them.

> > BTW, given that you've updated to bce(4)/mii(4) of stable/7, I
> > wonder why TX/RX flow control did not kick in.
> >   
> 
> The working copy I used for grabbing the upstream source is at r212371.
> 
> Last changes for the directories in my working copy:
> 
> sys/dev/bce @  211388
> sys/dev/mii @ 212020
> 
> 
> I discovered that flow control was disabled on the switches, so I set it
> to auto and added a pair of BCE_PRINTF's in the code where it enables
> and disables flow control and now it gets enabled.
> 

Ok.

> 
> Without BCE_JUMBO_HDRSPLIT we see no errors.  With it we see a number
> of errors; however, the rate seems to be reduced compared to the
> previous version of the driver.
> 

It seems there are issues in the header-splitting code, which is why
it was disabled by default. Header splitting reduces packet processing
overhead in the upper layers, so it's normal to see better performance
with header splitting enabled.


