From: Luigi Rizzo
Date: Fri, 6 Jul 2012 08:11:26 +0200
To: "Alexander V. Chernikov"
Cc: Doug Barton, net@freebsd.org
Subject: Re: FreeBSD 10G forwarding performance @Intel
Message-ID: <20120706061126.GA65432@onelab2.iet.unipi.it>
In-Reply-To: <4FF59955.5090406@FreeBSD.org>

On Thu, Jul 05, 2012 at 05:40:37PM +0400, Alexander V. Chernikov wrote:
> On 04.07.2012 19:48, Luigi Rizzo wrote:
...
> Traffic stats with most possible counters eliminated:
> (there is a possibility in ixgbe code to update rx/tx packets once per
> rx_process_limit (which is 100 by default)):
>
>             input          (ix0)           output
>    packets  errs idrops      bytes    packets  errs      bytes colls
>       2.8M     0      0       186M       2.8M     0       186M     0
>       2.8M     0      0       187M       2.8M     0       186M     0
>
> And it seems that netstat uses 1024 as divisor (no HN_DIVISOR_1000
> passed in if.c to show_stat), so real frame count from Ixia side is much
> closer to 3MPPS (~ 2.961600 ).
...
> IPFW contention:
> Same setup as shown upper, same traffic level
>
> 17:48 [0] test15# ipfw show
> 00100 0 0 allow ip from any to any
> 65535 0 0 deny ip from any to any
>
> net.inet.ip.fw.enable: 0 -> 1
>             input          (ix0)           output
>    packets  errs idrops      bytes    packets  errs      bytes colls
>       2.1M  734k      0       187M       2.1M     0       139M     0
>       2.1M  736k      0       187M       2.1M     0       139M     0
>       2.1M  737k      0       187M       2.1M     0        89M     0
>       2.1M  735k      0       187M       2.1M     0       189M     0
> net.inet.ip.fw.update_counters: 1 -> 0
>       2.3M  636k      0       187M       2.3M     0       148M     0
>       2.5M  343k      0       187M       2.5M     0       164M     0
>       2.5M  351k      0       187M       2.5M     0       164M     0
>       2.5M  345k      0       187M       2.5M     0       164M     0
...
> It seems that ipfw counters are suffering from this problem, too.
> Unfortunately, there is no DPCPU allocator in our kernel.
> I'm planning to make a very simple per-cpu counters patch:
> (
> allocate 65k*(u64_bytes+u64_packets) memory for each CPU per vnet
> instance init and make ipfw use it as counter backend.
>
> There is a problem with several rules residing in single entry. This can
> (probably) be worked-around by using fast counters for the first such
> rule (or not using fast counters for such rules at all)
> )
>
> What do you think about this?

the thing discussed a few years ago (at least the one i took out
of the discussion) was that the counter fields in rules should hold
the index of a per-cpu counter associated to the rule.
So CTR_INC(rule->ctr) becomes something like

	pcpu->ipfw_ctrs[rule->ctr]++

Once you create a new rule you also grab one free index from
ipfw_ctrs[], and the same should go for dummynet counters.

The alternative would be to allocate the rule and a set of counters
within the rule itself, but that kills 64 bytes per core per rule
to avoid cache contention.

cheers
luigi
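
P.S. to make the indexed-counter idea above concrete, here is a minimal
userspace sketch, not kernel code. Everything except the names
ipfw_ctrs and rule->ctr is an assumption made for illustration: NCPU,
NCTRS, struct ctr_pair, ctr_alloc(), ctr_free(), ctr_inc() and
ctr_read() are hypothetical, the free-slot search is a plain linear
scan, and the "cpu" argument stands in for whatever the kernel would
use to find the current CPU's private area. It also ignores preemption
and torn reads, which a real implementation would have to handle.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NCPU     4          /* hypothetical CPU count for this sketch */
#define NCTRS    65536      /* one slot per possible rule, as in the quoted proposal */
#define CTR_NONE UINT32_MAX /* returned when no free slot is left */

struct ctr_pair {
	uint64_t packets;
	uint64_t bytes;
};

/* One private array per CPU: a CPU only ever writes its own array. */
struct pcpu_ctrs {
	struct ctr_pair ipfw_ctrs[NCTRS];
};

static struct pcpu_ctrs pcpu[NCPU];
static uint8_t ctr_used[NCTRS];	/* slot allocation map, shared, slow path only */

/* The rule stores only the index of its counter slot. */
struct rule {
	uint32_t ctr;
};

/* Rule creation: grab one free index and zero that slot on every CPU. */
static uint32_t
ctr_alloc(void)
{
	for (uint32_t i = 0; i < NCTRS; i++) {
		if (ctr_used[i])
			continue;
		ctr_used[i] = 1;
		for (int c = 0; c < NCPU; c++)
			memset(&pcpu[c].ipfw_ctrs[i], 0, sizeof(struct ctr_pair));
		return (i);
	}
	return (CTR_NONE);
}

/* Rule deletion: give the index back. */
static void
ctr_free(uint32_t idx)
{
	assert(idx < NCTRS && ctr_used[idx]);
	ctr_used[idx] = 0;
}

/* Fast path: touch only the current CPU's slot, no lock, no shared line. */
static void
ctr_inc(int cpu, const struct rule *r, uint32_t len)
{
	struct ctr_pair *c = &pcpu[cpu].ipfw_ctrs[r->ctr];

	c->packets++;
	c->bytes += len;
}

/* Slow path (e.g. "ipfw show"): sum the slot over all CPUs. */
static struct ctr_pair
ctr_read(const struct rule *r)
{
	struct ctr_pair sum = { 0, 0 };

	for (int c = 0; c < NCPU; c++) {
		sum.packets += pcpu[c].ipfw_ctrs[r->ctr].packets;
		sum.bytes += pcpu[c].ipfw_ctrs[r->ctr].bytes;
	}
	return (sum);
}
```

With 16 bytes per slot this costs NCTRS * 16 bytes = 1 MiB per CPU
regardless of how many rules exist, whereas the in-rule alternative
costs 64 bytes per core per rule but needs no index allocation.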