From owner-freebsd-ipfw@FreeBSD.ORG Wed Apr 24 19:51:38 2013 Return-Path: Delivered-To: freebsd-ipfw@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id DD38A8FA; Wed, 24 Apr 2013 19:51:38 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2]) by mx1.freebsd.org (Postfix) with ESMTP id A317D1AF6; Wed, 24 Apr 2013 19:51:38 +0000 (UTC) Received: from dhcp170-36-red.yandex.net ([95.108.170.36]) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1UV5mT-000A67-I5; Wed, 24 Apr 2013 23:55:05 +0400 Message-ID: <51783798.4020004@FreeBSD.org> Date: Wed, 24 Apr 2013 23:50:48 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130418 Thunderbird/17.0.5 MIME-Version: 1.0 To: Luigi Rizzo Subject: Re: [patch] ipfw interface tracking and opcode rewriting References: <517801D3.5040502@FreeBSD.org> <20130424162349.GA8439@onelab2.iet.unipi.it> <51780C49.7000204@FreeBSD.org> <20130424190930.GA10395@onelab2.iet.unipi.it> In-Reply-To: <20130424190930.GA10395@onelab2.iet.unipi.it> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-ipfw@freebsd.org, luigi@freebsd.org X-BeenThere: freebsd-ipfw@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: IPFW Technical Discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Apr 2013 19:51:38 -0000 On 24.04.2013 23:09, Luigi Rizzo wrote: > On Wed, Apr 24, 2013 at 08:46:01PM +0400, Alexander V. Chernikov wrote: >> On 24.04.2013 20:23, Luigi Rizzo wrote: > ... >>>> vesrion) in the middle of the next week. >>> hmmm.... this is quite a large change, and from the description it >>> is a bit unclear to me how the "opcode rewriting" thing relates to >>> the use of strings vs index for name matching. >> sorry, I havent't describe this explicitly. >> Index matching is done via storing interface index in in p.glob field of >> ipfw_insn_if instruction. > understood. the reasons why i did not use the index is that > one could specify a non-existing interface name, and also interfaces > can be renamed. If you want to use indexses, you should add > (perhaps you do, i haven't checked) Yes, this is done (without 'good' renaming handling), but still. > hooks to the interface add/rename/delete code in order to > update the ruleset upon changes on the if list, and it > seemed to me a bad idea to add this dependency > (lockingwise, too). > > Really, with 16-byte fixed size interface names, the match > is as simple as this: > > #if CAN_DO_FAST_MATCH && IFNAMSIZ == 16 /* archs with no align requirements */ > { > uint64_t *a = (uint64_t *)ifp->if_xname; > uint64_t *b = (uint64_t *)cmd->name; > if (a[0] == b[0] && a[1] == b[1]) > return 1; > } > #else > if (strncmp(ifp->if_xname, cmd->name, IFNAMSIZ) == 0) > return 1 > #endif > > (assuming the names are zero-padded, which should be the case already). > Since you have the measurement infrastructure in place, perhaps > you have an easy way to try this patch and see how effective > it is in terms of performance. I'll try this tomorrow, thanks. > >>> Additionally, i wonder if there isn't a better way to replace strncmp >>> with some two 64-bit comparisons (the name is 16 bytes) by making >>> sure that the fields are zero-padded and suitably aligned. >>> At this point, on the machines you care about (able to sustain >>> 1+ Mpps) the two comparison should have the same cost as >>> the index comparison, without the need to track update in the names. >> Well, actually I'm thinking of the next 2 steps: >> 1) making kernel rule header more compact (20 bytes instead of 48) and >> making it invisible for userland. >> This involves rule counters to be stored separately (and possibly as >> pcpu-based ones). >> 2) since ruleset is now nearly readonly and more or less compact we can >> try to store it in >> contiguous address space to optimize cache line usage. > certainly a worthwhile goal (also using gleb's new counters) > but i suspect that compacting rules are a second order effect. > I a bit skeptical they make a big difference on the in-kernel > version of ipfw. You might see some difference in the My current numbers are ~5mpps of IPv4 forwarding with ipfw turned on (1 rule) for vlans over ixgbe, with 60% cpu usage (2xE5646). For lagg with 2x ixgbe it is ~7mpps with the same 60% usage. (And, say, 70% of CPU usage on our production is ipfw, despite low number of rules). > userspace version, which runs on top of netmap. We are preparing to move forward in this direction (and thinking of 20-30mpps as our goal). (And I hope some changes of kernel-based version can migrate to userland one :)) > > cheers > luigi >