From owner-cvs-src@FreeBSD.ORG Sat Nov 15 02:35:51 2003
Message-ID: <3FB60181.4256A519@pipeline.ch>
Date: Sat, 15 Nov 2003 11:35:45 +0100
From: Andre Oppermann
To: Luigi Rizzo
References: <200311142102.hAEL2Nen073186@repoman.freebsd.org>
    <20031114153145.A54064@xorpc.icir.org>
    <3FB593F5.1053E7E2@pipeline.ch>
    <20031115002921.B68056@xorpc.icir.org>
cc: cvs-src@FreeBSD.org
cc: src-committers@FreeBSD.org
cc: cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/netinet in_var.h ip_fastfwd.c ip_flow.c
    ip_flow.h ip_input.c ip_output.c src/sys/sys mbuf.h src/sys/conf files
    src/sys/net if_arcsubr.c if_ef.c if_ethersubr.c if_fddisubr.c
    if_iso88025subr.c if_ppp.c

Luigi Rizzo wrote:
>
> [mu
> On Sat, Nov 15, 2003 at 03:48:21AM +0100, Andre Oppermann wrote:
> > Luigi Rizzo wrote:
> ...
> > > Given that there are large segments of common code between
> > > ip_fastforward() and ip_input()/ip_output() (i am thinking of the
> > > entire ipfw handling code, for one, and also some basic integrity
> > > checks, the fragmentation code, etc.) I also wonder if it wouldn't
> > > be beneficial to put the optimizations into the standard path rather
> > > than create a new (partial) replica of the same code, with the
> > > potential of introducing bugs, and with some substantial I-cache
> > > pollution which might well destroy the benefits of minor optimizations.
> >
> > I don't see much cache pollution here. Normally you use ip_fastforward
>
> i said I-cache, not data cache. Even a routed does some substantial
> amount of local communication (bgp and routing processes etc.) so

Here on my CORE2 router (4.8-REL), with two full and 130 peering BGP4
feeds, I see about three to four route changes per second that make it
to the kernel (route -nv monitor). Out of 303'394'603 packets,
44'721'091 were for the machine itself. But that is probably bogus,
since the counters are only 32-bit (and the machine has an uptime of
63 days). So the overall packet count has wrapped at least once (if not
more) and is more likely to be around 4'303'394'603. So about 1 percent
of all packets (as I said, it's probably even less because the counter
has likely wrapped more than once) are for the machine. All other
packets use the fast path.

To put this more into perspective wrt counter wrapping: on my interfaces
I have a byte counter wrap every 40 minutes or so. So the true ratio is
probably far less than one percent and more in the region of one per
mille.

The wrapping looks really ugly on MRTG and RRDtool graphs. Interface
counters should be 64-bit or they become useless with today's traffic
levels...

> i am pretty sure that in any non-trivial case you will end up having
> both the slow path and the fast path conflicting for the instruction
> cache.
> Merging them might help -- i have seen many cases where
> inlining code as opposed to explicit function calls makes things
> slower for this precise reason.

I will try to measure that with more precision. You once had code that
could record and timestamp events several thousand times per second. Do
you still have that code somewhere?

> > > Minor comments on the code:
> > >
> > > + one of the initial comments in the new code states
> > >
> > >   ... The only part of the packet we touch with the CPU is the
> > >   IP header. ...
> > >
> > > this is not true if you use ipfw because that code touches many
> > > places in the packet (and can also do some expensive computation
> > > like trying to locate the uid/gid of a packet; the fact that we
> > > only deal with packets not for us does not prevent the existence
> > > of such firewall rules).
> >
> > Well, as I said, everybody is free to shoot himself with such highly
> > complex firewall rules. I'd say the ipfw code could be optimized with
> > some of the ideas I've specified earlier. I don't think the ipfw code
> > would do a uid/gid lookup if neither the destination nor source is
>
> i was just saying that the comment is untrue.

Ok, I will modify it to say something like "as long as firewalling is
not going through the whole packet" or so.

> > > + could you clarify the divert logic ? I am a bit rusty with that
> > > part of the code, but i am under the impression that in
> > > ip_fastforward() you are passing along args.divert_rule and
> > > losing track of divert_info which is instead what you need too.
> >
> > It's not you being rusty, the code is indeed hard to follow. :-/
> > divert_info is used for ip packet reassembly. ip_divert() is then
> > just using it to determine whether the packet was catched on the
> > way into the machine or out of it. It seems to have it's largest
> > significance for the ip reassembly. I've tested that too with an
> > earlier version of my code.
> > However I will redo those tests to be
> > sure it is working as expected.
>
> ok, the specific case where i think it fails is when you divert a
> fragmented packet -- your code seems to store the divert_info
> (the port you divert to) into divert_rule, and lose track of
> the former.

It looks like I need both of them...

--
Andre
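
The counter-wrap estimate in the reply above can be checked with a few lines of arithmetic. The following is only a sketch: the two packet counts and the 40-minute wrap interval are taken from the post, while the function names and the wrap counts tried in the loop are hypothetical. It also assumes that the counter for locally destined packets has not itself wrapped. Note that a single wrap adds 2^32 = 4'294'967'296, so one wrap gives a total of 4'598'361'899, a little more than the rounded figure quoted in the post; the local share still comes out at about one percent.

```python
WRAP = 2 ** 32  # a 32-bit counter rolls over after this many increments


def true_count(observed: int, wraps: int) -> int:
    """Reconstruct the real total from a 32-bit counter that wrapped `wraps` times."""
    return observed + wraps * WRAP


def local_share(local: int, observed_total: int, wraps: int) -> float:
    """Fraction of all packets addressed to the router itself.

    Assumes the `local` counter has not wrapped (an assumption, not from the post).
    """
    return local / true_count(observed_total, wraps)


def bytes_per_second(wrap_interval_s: float) -> float:
    """Average byte rate implied by a 32-bit byte counter wrapping once per interval."""
    return WRAP / wrap_interval_s


if __name__ == "__main__":
    observed_total = 303_394_603  # total packets, as reported in the post
    local_packets = 44_721_091    # packets for the router itself, as reported

    for wraps in (0, 1, 10):      # hypothetical wrap counts
        total = true_count(observed_total, wraps)
        share = local_share(local_packets, observed_total, wraps)
        print(f"wraps={wraps:2d}  total={total:>14,d}  local share={share:.2%}")

    # Byte counter wrapping every 40 minutes, as mentioned in the post
    rate = bytes_per_second(40 * 60)
    print(f"implied average rate: {rate * 8 / 1e6:.1f} Mbit/s")
```

The more often the total counter has wrapped during the 63 days of uptime, the smaller the local share becomes, which is the point made in the reply about the true ratio being well below one percent.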