Date:      Fri, 24 Sep 2010 12:44:32 -0500
From:      Tom Judge <tom@tomjudge.com>
To:        David Christensen <davidch@broadcom.com>
Cc:        "pyunyh@gmail.com" <pyunyh@gmail.com>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, "yongari@freebsd.org" <yongari@freebsd.org>
Subject:   Re: bce(4) - com_no_buffers (Again)
Message-ID:  <4C9CE380.6020906@tomjudge.com>
In-Reply-To: <4C9BABA4.1060805@tomjudge.com>
References:  <4C894A76.5040200@tomjudge.com>	<20100910002439.GO7203@michelle.cdnetworks.com>	<4C8E3D79.6090102@tomjudge.com>	<20100913184833.GF1229@michelle.cdnetworks.com>	<4C8E768E.7000003@tomjudge.com>	<20100913193322.GG1229@michelle.cdnetworks.com>	<4C8E8BD1.5090007@tomjudge.com>	<20100913205348.GJ1229@michelle.cdnetworks.com>	<4C9B6CBD.2030408@tomjudge.com>	<5D267A3F22FD854F8F48B3D2B52381933B5A78B484@IRVEXCHCCR01.corp.ad.broadcom.com>	<4C9BA9FD.50406@tomjudge.com> <4C9BABA4.1060805@tomjudge.com>

This is a multi-part message in MIME format.
--------------000604030004000001090500
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

On 09/23/2010 02:33 PM, Tom Judge wrote:
> The throttle command I am using in the tests is the one from here:
>
> http://klicman.org/throttle/
>
>
> On 09/23/2010 02:26 PM, Tom Judge wrote:
>
>> On 09/23/2010 01:21 PM, David Christensen wrote:
>>
>>>>>> Under testing I have yet to see a memory fragmentation issue with this
>>>>>> driver.  I follow up if/when I find a problem with this again.
>>>>>>
>>>> So here we are again.  The system is locking up again because of 9k mbuf
>>>> allocation failures.
>>>>
>>> Failure to allocate a new buffer should cause the driver to
>>> drop the received frame and reuse the buffer, not lock up the
>>> system.  Are you seeing the lockup come from bce(4) or does
>>> it come from somewhere else due to the dropped data?
>>>
>> The lockup is not from the NIC as such, the systems have the appearance
>> of locking up as home directories are on NFS and the user information is
>> stored in a remote LDAP server.   When the system starts to drop frames
>> due to lack of 9k memory regions it tends to last for a few minutes
>> (when it is really bad) and stop all traffic into the system.  This
>> appears to the average user as a complete system pause.
>>
>>>>>> Is there a way to fix the RX buffer shortage issues (when header
>>>>>> splitting is turned on) so that they are guarded by flow control.  Maybe
>>>>>> change the low watermark for flow control when its enabled?
>>>>>>
>>>>> I'm not sure how much it would help but try changing RX low
>>>>> watermark. Default value is 32 which seems to be reasonable value.
>>>>> But it's only for 5709/5716 controllers and Linux seems to use
>>>>> different default value.
>>>>>
>>>> These are: NetXtreme II BCM5709 Gigabit Ethernet
>>>>
>>>> So my next task is to turn the watermark related defines into sysctls
>>>> and turn on header splitting so that I can try to tune them without
>>>> having to reboot.
>>>>
>>> Do you have flow control enabled?  There are arguments both for
>>> and against flow control.  For bce(4), I haven't tested flow control
>>> for quite a while and it's behavior may have changed since it is
>>> controlled by firmware.   Keep an eye on the hardware statistics
>>> to see that's it's actively generating pause frames.
>>>
>> 3) With flow control enabled and header splitting on flood the server
>> with very small frames (200 bytes). (Using the same test as in case 1). 
>> My aim is to tune the watermark here so that there are no frames dropped
>> due to BD shortages.
>>

Card info unhidden:

bce0: ASIC (0x57092003); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.2.2);
Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.8)

After a lot of testing with both flow control and header splitting
turned on, it looks like flow control may be broken when header
splitting is enabled?


I have been using the attached patch to play with the flow control
watermarks.

I have tried the following data points and am finding it difficult
to get flow control to kick in before the card runs out of descriptors
and starts dropping frames:

low:    16    high:    127
low:    32    high:    127
low:    64    high:    127
low:    96    high:    127
low:    32    high:    196
low:    64    high:    196
low:    128   high:    256

None of these seems to have any noticeable effect on the drop rate or
on the dev.bce.0.stat_FlowControlDone count over the sample period.
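
For anyone following along, here is a rough standalone sketch of the
scaling step that the patch hooks into, so the pre and post scaling
numbers for the data points above can be worked out without loading the
driver.  The two SKETCH_* scale factors and the function names are
placeholders made up for illustration; the real scale values are
BCE_L2CTX_RX_LO_WATER_MARK_SCALE and BCE_L2CTX_RX_HI_WATER_MARK_SCALE in
if_bcereg.h and should be checked there.  Only the steps visible in the
patch context are mirrored.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * ASSUMPTION: placeholder scale factors, for illustration only.  The
 * driver uses BCE_L2CTX_RX_LO_WATER_MARK_SCALE and
 * BCE_L2CTX_RX_HI_WATER_MARK_SCALE from if_bcereg.h.
 */
#define	SKETCH_LO_SCALE	4
#define	SKETCH_HI_SCALE	16

/* Hypothetical container for the two scaled values. */
struct water_marks {
	uint32_t lo;
	uint32_t hi;
};

/* Mirror the compare, scale and clamp steps visible in the patch hunks. */
struct water_marks
scale_water_marks(uint32_t lo_water, uint32_t hi_water)
{
	struct water_marks wm;

	assert(SKETCH_LO_SCALE > 0 && SKETCH_HI_SCALE > 0);

	/* A high mark at or below the low mark zeroes the low mark. */
	if (hi_water <= lo_water)
		lo_water = 0;

	lo_water /= SKETCH_LO_SCALE;
	hi_water /= SKETCH_HI_SCALE;

	/* The scaled high mark is capped at 0xf. */
	if (hi_water > 0xf)
		hi_water = 0xf;
	else if (hi_water == 0)
		lo_water = 0;

	wm.lo = lo_water;
	wm.hi = hi_water;
	return (wm);
}

/* Print the pre and post scaling values for one data point. */
void
print_water_marks(uint32_t lo, uint32_t hi)
{
	struct water_marks wm = scale_water_marks(lo, hi);

	printf("low %3u high %3u -> scaled low %2u high %2u\n",
	    (unsigned)lo, (unsigned)hi, (unsigned)wm.lo, (unsigned)wm.hi);
}
```

Under those assumed factors, a high mark of 127 scales to 7 and a high
mark of 256 is capped at 0xf, so the output is worth comparing against
the "Post Scaling" line that the patch prints.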


Thoughts?


Tom

-- 
TJU13-ARIN


--------------000604030004000001090500
Content-Type: text/plain;
 name="if_bce.patch.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="if_bce.patch.txt"

Index: if_bce.c
===================================================================
--- if_bce.c	(revision 949)
+++ if_bce.c	(working copy)
@@ -511,6 +511,21 @@
 SYSCTL_UINT(_hw_bce, OID_AUTO, msi_enable, CTLFLAG_RDTUN, &bce_msi_enable, 0,
 "MSI-X|MSI|INTx selector");
 
+
+/* Tunable RX flow control low water mark. */
+/* Without header splitting the default is 32 */
+static int bce_rx_low_water_mark = BCE_L2CTX_RX_LO_WATER_MARK_DEFAULT;
+TUNABLE_INT("hw.bce.rx_low_water_mark", &bce_rx_low_water_mark);
+SYSCTL_UINT(_hw_bce, OID_AUTO, rx_low_water_mark, CTLFLAG_RDTUN, &bce_rx_low_water_mark, 0,
+"Default RX Flow Control Low Water Mark");
+
+/* Tunable RX flow control high water mark. */
+/* The default is USABLE_RX_BD / 4. */
+static int bce_rx_high_water_mark = USABLE_RX_BD / 4;
+TUNABLE_INT("hw.bce.rx_high_water_mark", &bce_rx_high_water_mark);
+SYSCTL_UINT(_hw_bce, OID_AUTO, rx_high_water_mark, CTLFLAG_RDTUN, &bce_rx_high_water_mark, 0,
+"Default RX Flow Control High Water Mark");
+
 /* ToDo: Add tunable to enable/disable strict MTU handling. */
 /* Currently allows "loose" RX MTU checking (i.e. sets the  */
 /* H/W RX MTU to the size of the largest receive buffer, or */
@@ -1780,11 +1795,15 @@
 	}
 
  	if (mii->mii_media_active & IFM_FLAG1) {
+		BCE_PRINTF("%s(%d): Enabling TX flow control.\n",
+		    __FILE__, __LINE__);
 		DBPRINT(sc, BCE_INFO_PHY,
 		    "%s(): Enabling TX flow control.\n", __FUNCTION__);
 		BCE_SETBIT(sc, BCE_EMAC_TX_MODE, BCE_EMAC_TX_MODE_FLOW_EN);
 		sc->bce_flags |= BCE_USING_TX_FLOW_CONTROL;
 	} else {
+		BCE_PRINTF("%s(%d): Disabling TX flow control.\n",
+		    __FILE__, __LINE__);
 		DBPRINT(sc, BCE_INFO_PHY,
 		    "%s(): Disabling TX flow control.\n", __FUNCTION__);
 		BCE_CLRBIT(sc, BCE_EMAC_TX_MODE, BCE_EMAC_TX_MODE_FLOW_EN);
@@ -5414,7 +5433,7 @@
 		u32 lo_water, hi_water;
 
-		if (sc->bce_flags && BCE_USING_TX_FLOW_CONTROL) {
-			lo_water = BCE_L2CTX_RX_LO_WATER_MARK_DEFAULT;
+		if (sc->bce_flags & BCE_USING_TX_FLOW_CONTROL) {
+			lo_water = bce_rx_low_water_mark;
 		} else {
 			lo_water = 0;
 		}
@@ -5423,11 +5442,12 @@
 			lo_water = 0;
 		}
 
-		hi_water = USABLE_RX_BD / 4;
+		hi_water = bce_rx_high_water_mark;
 
 		if (hi_water <= lo_water) {
 			lo_water = 0;
 		}
+		BCE_PRINTF("Setting Up Flow Control (Pre Scaling), Low Watermark: %d, High Watermark: %d\n", (int)lo_water, (int)hi_water);
 
 		lo_water /= BCE_L2CTX_RX_LO_WATER_MARK_SCALE;
 		hi_water /= BCE_L2CTX_RX_HI_WATER_MARK_SCALE;
@@ -5436,7 +5456,8 @@
 			hi_water = 0xf;
 		else if (hi_water == 0)
 			lo_water = 0;
-
+
+		BCE_PRINTF("Setting Up Flow Control (Post Scaling), Low Watermark: %d, High Watermark: %d\n", (int)lo_water, (int)hi_water);
 		val |= (lo_water << BCE_L2CTX_RX_LO_WATER_MARK_SHIFT) |
 		    (hi_water << BCE_L2CTX_RX_HI_WATER_MARK_SHIFT);
 	}

--------------000604030004000001090500--


