From: Tom Judge
Date: Thu, 23 Sep 2010 14:33:56 -0500
To: David Christensen
Cc: "pyunyh@gmail.com", "freebsd-net@freebsd.org", "yongari@freebsd.org"
Subject: Re: bce(4) - com_no_buffers (Again)
Message-ID: <4C9BABA4.1060805@tomjudge.com>
In-Reply-To: <4C9BA9FD.50406@tomjudge.com>
List-Id: Networking and TCP/IP with FreeBSD

The throttle command I am using in the tests is the one from here:

http://klicman.org/throttle/

On 09/23/2010 02:26 PM, Tom Judge wrote:
> On 09/23/2010 01:21 PM, David Christensen wrote:
>>>>> Under testing I have yet to see a memory fragmentation issue with
>>>>> this driver. I will follow up if/when I find a problem with this
>>>>> again.
>>>
>>> So here we are again. The system is locking up again because of 9k
>>> mbuf allocation failures.
>>
>> Failure to allocate a new buffer should cause the driver to
>> drop the received frame and reuse the buffer, not lock up the
>> system. Are you seeing the lockup come from bce(4) or does
>> it come from somewhere else due to the dropped data?
>
> The lockup is not from the NIC as such. The systems have the appearance
> of locking up because home directories are on NFS and the user
> information is stored in a remote LDAP server. When the system starts
> to drop frames due to a lack of 9k memory regions, the condition tends
> to last for a few minutes (when it is really bad) and stops all traffic
> into the system. This appears to the average user as a complete system
> pause.
>
>>>>> Is there a way to fix the RX buffer shortage issues (when header
>>>>> splitting is turned on) so that they are guarded by flow control?
>>>>> Maybe change the low watermark for flow control when it's enabled?
>>>>
>>>> I'm not sure how much it would help, but try changing the RX low
>>>> watermark. The default value is 32, which seems to be a reasonable
>>>> value. But it's only for 5709/5716 controllers, and Linux seems to
>>>> use a different default value.
>>>
>>> These are: NetXtreme II BCM5709 Gigabit Ethernet
>>>
>>> So my next task is to turn the watermark related defines into sysctls
>>> and turn on header splitting so that I can try to tune them without
>>> having to reboot.
>>
>> Do you have flow control enabled? There are arguments both for
>> and against flow control. For bce(4), I haven't tested flow control
>> for quite a while and its behavior may have changed since it is
>> controlled by firmware. Keep an eye on the hardware statistics
>> to see that it's actively generating pause frames.
>
> At the moment I have a number of tests:
>
> 1) With flow control disabled and header splitting on or off, flood the
> server with very small frames (200 bytes). This will trigger the
> firmware to drop frames due to BD shortages (incrementing
> dev.bce.X.com_no_buffers).
>
> Traffic source:
>
>     route change test-system -mtu 200
>     dd if=/dev/zero bs=8000 | nc -l 1111
>
> Test system:
>
>     nc source 1111 > /dev/null
>
> 2) With flow control enabled and header splitting off, flood the server
> with traffic while userland processing is very slow:
>
> Traffic source:
>
>     for I in 1 2 3 4 5 6 7 8; do ( dd if=/dev/zero bs=8000 | nc -l 1111$I & ); done
>
> Test system (8 instances, I = 1..8):
>
>     nc source 1111$I | throttle -k 1 > /dev/null
>
> On our systems this will reliably trigger denied 9k allocations.
>
> 3) With flow control enabled and header splitting on, flood the server
> with very small frames (200 bytes), using the same test as in case 1.
> My aim is to tune the watermark here so that no frames are dropped
> due to BD shortages.
>
> I am under the impression that the best solution is to tune the RX ring
> so that flow control can be disabled, but I am not sure I could do this.
>
>>> My next question is, is it possible to increase the size of the RX
>>> ring without switching to RSS?
>>
>> I have a change I've been working on to allow RX/TX ring size
>> to be adjusted through a sysctl. Let me pretty it up a bit and
>> send it to you for test. You should be able to adjust the ring
>> size without enabling RSS.
>
> If you can provide a patch I have hardware available to test on.
>
> Thanks
>
> Tom

-- 
TJU13-ARIN
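
For anyone repeating tests 1 and 3 above, a small sh sketch along these lines might help log when the drop counter moves. It assumes the counter is readable with sysctl under the dev.bce.X.com_no_buffers name quoted in the thread; the function names, the unit number 0 and the 5 second interval are placeholders of mine, not anything taken from the driver.

```shell
#!/bin/sh
# Sketch only: poll a bce(4) drop counter and report how much it grew.
# Assumption: the counter is exposed as dev.bce.<unit>.com_no_buffers,
# the name quoted in the thread. Function names here are hypothetical.

# Read the current counter value for the given unit number.
read_no_buffers() {
    sysctl -n "dev.bce.$1.com_no_buffers"
}

# Print the increase between two samples; print 0 if the counter went
# backwards (for example after the interface was reinitialised).
counter_delta() {
    prev=$1
    cur=$2
    if [ "$cur" -ge "$prev" ]; then
        echo $((cur - prev))
    else
        echo 0
    fi
}

# Poll every <interval> seconds and print a line whenever the counter grew.
watch_no_buffers() {
    unit=${1:-0}
    interval=${2:-5}
    last=$(read_no_buffers "$unit") || return 1
    while sleep "$interval"; do
        now=$(read_no_buffers "$unit") || return 1
        delta=$(counter_delta "$last" "$now")
        if [ "$delta" -gt 0 ]; then
            echo "$(date '+%H:%M:%S') com_no_buffers +$delta"
        fi
        last=$now
    done
}
```

Calling `watch_no_buffers 0 5` on the test system while the traffic source is running would print one timestamped line per interval in which frames were dropped, which makes it easier to line the drops up with watermark changes.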