From owner-freebsd-arch@FreeBSD.ORG Tue Jan 8 19:03:58 2008
Date: Tue, 8 Jan 2008 11:03:47 -0800
Message-ID: <305C539CA2F86249BF51CDCE8996AFF4096E12A7@bcs-mail2.internal.cacheflow.com>
References: <4772F123.5030303@elischer.org> <477416CC.4090906@elischer.org> <477D2EF3.2060909@elischer.org> <4780E5E7.2070202@FreeBSD.org> <4781197F.1000105@elischer.org> <47814AF0.9070509@freebsd.org>
From: "Li, Qing"
To: "Vadim Goncharov", "Andre Oppermann"
Cc: Qing Li, FreeBSD Net, arch@freebsd.org, Ivo Vachkov, Robert Watson, Julian Elischer, "Bruce M. Simpson"
Subject: RE: resend: multiple routing table roadmap (format fix)
List-Id: Discussion related to FreeBSD architecture

>
> Why a full walk, why such a dumb way?
>

Correct, we don't do a full walk.

>
> To remove an ARP entry for host A.B.C.D in L2 table of form
> (A.B.C.D -> 00:01:02:03:04:05), it is enough to do a (usual speed)
> routing lookup for host A.B.C.D and modify a one pointer in
> its rtentry to NULL or remove rtentry (if it's selected to
> be implemented as cloned). Thus, when on regular forwarding
> (table read) a routing lookup is done, we already have a FAST
> access - one pointer dereference - for its L2 table entry,
> be it ARP or any other L2 type (which support becoming easily
> with separation of L2 and L3). And on every modification of
> L2 table - which is RARE - do lookup with usual speed to
> modify cached pointer. Compare it with a scheme where for
> EVERY forwarded packet, there is a need for DOUBLE lookup -
> after a routing one, do another in L2 table.
>

Is it really a double lookup, though?

With the current routing table, which contains the ARP entries, a
search has to proceed past the interface route and further down the
routing tree, and the depth depends on the number of ARP entries in
the table. With L2/L3 separation, the routing search stops at the
interface route, and the search for the exact entry continues in a
separate L2 table.

From a high level it does seem there could be performance issues,
such as a cache invalidation problem. However, I cannot quantify at
this point what that degradation translates into, or what impact it
has on the overall scheme of things. I am not sure anyone can
quantify such a performance question at this point.
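
To make the two lookup shapes in this exchange concrete, here is a
minimal sketch in C. It is an illustration only, under loud
assumptions: every type and function name below (l2_entry, rt_entry,
l2_table, rt_lookup, resolve_combined, resolve_separated) is
hypothetical and is not the FreeBSD kernel's rtentry or ARP code. The
routing table is a linear scan standing in for the radix tree, and
the L2 table is a small open-addressed hash with no deletion support.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical types for illustration; not the FreeBSD kernel structures. */
struct l2_entry {
    uint32_t ip;        /* host address, host byte order */
    uint8_t  mac[6];    /* link-layer address */
    int      valid;     /* nonzero once the slot is in use */
};

struct rt_entry {
    uint32_t prefix;        /* network prefix, host byte order */
    uint32_t mask;          /* contiguous netmask */
    int      ifindex;       /* outgoing interface */
    struct l2_entry *l2;    /* cached L2 pointer (combined scheme only) */
};

#define L2_BUCKETS 64

struct l2_table {
    struct l2_entry slot[L2_BUCKETS];
};

/* Toy longest-prefix match: linear scan, the widest matching mask wins. */
static struct rt_entry *
rt_lookup(struct rt_entry *tbl, size_t n, uint32_t dst)
{
    struct rt_entry *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if ((dst & tbl[i].mask) != tbl[i].prefix)
            continue;
        if (best == NULL || tbl[i].mask > best->mask)
            best = &tbl[i];
    }
    return best;
}

/* Open addressing with linear probing; an empty slot ends the probe. */
static struct l2_entry *
l2_lookup(struct l2_table *t, uint32_t ip)
{
    for (size_t i = 0; i < L2_BUCKETS; i++) {
        struct l2_entry *e = &t->slot[(ip + i) % L2_BUCKETS];

        if (!e->valid)
            return NULL;
        if (e->ip == ip)
            return e;
    }
    return NULL;
}

/* Insert or update the entry for ip; aborts if the toy table is full. */
static struct l2_entry *
l2_insert(struct l2_table *t, uint32_t ip, const uint8_t mac[6])
{
    for (size_t i = 0; i < L2_BUCKETS; i++) {
        struct l2_entry *e = &t->slot[(ip + i) % L2_BUCKETS];

        if (!e->valid || e->ip == ip) {
            e->ip = ip;
            memcpy(e->mac, mac, 6);
            e->valid = 1;
            return e;
        }
    }
    assert(!"L2 table full");
    return NULL;
}

/*
 * Combined scheme: the search continues past the interface route to a
 * host route, which carries the L2 pointer (one dereference).
 */
static const uint8_t *
resolve_combined(struct rt_entry *tbl, size_t n, uint32_t dst)
{
    struct rt_entry *rt = rt_lookup(tbl, n, dst);

    if (rt == NULL || rt->l2 == NULL)
        return NULL;
    return rt->l2->mac;
}

/*
 * Separated scheme: the L3 search stops at the interface route, and
 * the exact host is then looked up in that interface's L2 table.
 */
static const uint8_t *
resolve_separated(struct rt_entry *tbl, size_t n,
    struct l2_table *l2_by_if, uint32_t dst)
{
    struct rt_entry *rt = rt_lookup(tbl, n, dst);
    struct l2_entry *e;

    if (rt == NULL)
        return NULL;
    e = l2_lookup(&l2_by_if[rt->ifindex], dst);
    return e != NULL ? e->mac : NULL;
}
```

In this sketch, removing an ARP entry in the combined scheme amounts
to one routing lookup for the host followed by setting rt->l2 to
NULL. The sketch says nothing about relative cost: the combined
table holds one extra entry per resolved host, while the separated
table stays small and pays for a second, hash-based step. Which one
wins would have to be measured.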
>
> Current routing table implementation, with all disadvantages
> of combining L2 and L3, have from the same combining a one HUGE
> benefit - performance.
> And never, ever, ever, ever even try to split L2 from L3 with
> losing that performance - then it should be still never
> split, despite all disadvantages, and you'll become an enemy
> of many, many users. Especially while caching allows to do
> things reasonably fast.
>

No disagreement here.

-- Qing