From: "Alexander V. Chernikov"
Subject: Re: MPLS
Date: Mon, 18 Mar 2013 13:20:30 +0100
To: Andre Oppermann
Cc: Sami Halabi, "Alexander V. Chernikov", "freebsd-net@freebsd.org"
Message-Id: <3659B942-7C37-431F-8945-C8A5BCD8DC67@ipfw.ru>
References: <5146121B.5080608@FreeBSD.org> <514649A5.4090200@freebsd.org>
In-Reply-To: <514649A5.4090200@freebsd.org>
List-Id: Networking and TCP/IP with FreeBSD

On 17.03.2013, at 23:54, Andre Oppermann wrote:

> On 17.03.2013 19:57, Alexander V. Chernikov wrote:
>> On 17.03.2013 13:20, Sami Halabi wrote:
>>>> ITOH OpenBSD has a complete implementation of MPLS out of the box, maybe
>> Their control plane code is mostly useless due to design approach (routing daemons talk via kernel).
>
> What's your approach?

It is actually not mine. We have discussed this a bit in the radix-related thread.
Generally quagga/bird (and other high-performance hardware-accelerated and software routers) have a feature-rich RIB from which the best routes (possibly multipath) are installed into the kernel FIB. The kernel's main task should be to do efficient lookups, while every other advanced feature should be implemented in userland.

>
>> Their data plane code, well.. Yes, we can use some defines from their headers, but that's all :)
>>>> porting it would be short and more straight forward than porting linux LDP
>>>> implementation of BIRD.
>>
>> It is not 'linux' implementation. LDP itself is cross-platform.
>> The most tricky place here is control plane.
>> However, making _fast_ MPLS switching is tricky too, since it requires changes in our netisr/ethernet
>> handling code.
>
> Can you explain what changes you think are necessary and why?

We definitely need the ability to dispatch a chain of mbufs - this was already discussed in the intel rx ring lock thread in -net.

Currently a significant number of drivers support interrupt moderation, permitting several/tens/hundreds of packets to be received per interrupt.
For each packet we have to run some basic checks, PFIL hooks, netisr code and L3 code, resulting in many locks being acquired and released for every packet.
Typically we rely on the NIC to put a packet in a given queue (direct isr), which works badly for non-hashable types of traffic like GRE, PPPoE and MPLS. Additionally, the hashing function is either standard (from M$ NDIS) or documented, permitting someone malicious to generate 'special' traffic that all lands in a single queue.

Currently, even if we add an m2flowid/m2cpu function able to hash, say, GRE or MPLS, it is inefficient since we have to lock/unlock netisr queues for every packet.

I'm thinking of
* utilizing the m_nextpkt field in the mbuf header
* adding some nh_chain flag to netisr
If a given netisr does not support the flag and nextpkt is not null, we simply call that netisr in a loop.
* netisr hash function accepts an mbuf 'chain' and a pointer to an array (sizeof N * ptr), sorts the mbufs into N netisr queues, saving the list heads to the supplied array. After that we put the resulting lists on the appropriate queues.
* teach ethersubr RX code to deal with mbuf chains (not an easy one)
* add some partial support for handling chains to the fastfwd code

>
> --
> Andre
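
To make the chain idea above a bit more concrete, here is a rough userland sketch in C. It is not kernel code and not an existing netisr API: struct pkt stands in for struct mbuf, nextpkt for m_nextpkt, takes_chain for the proposed nh_chain flag, and pkt_hash() for an m2cpu-style function; all of these names are made up for illustration. It shows the two pieces described in the list: sorting one chain into N per-queue lists in a single pass (so each queue would need to be locked once per chain, not once per packet), and the fallback of calling a handler packet by packet when it does not accept chains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mbuf: only what this sketch needs. */
struct pkt {
	struct pkt *nextpkt;	/* plays the role of m_nextpkt */
	uint32_t    label;	/* e.g. an MPLS label or GRE key to hash on */
};

/* Head/tail pair: O(1) append, arrival order preserved within a queue. */
struct pkt_list {
	struct pkt *head;
	struct pkt *tail;
	size_t      count;
};

/* Hypothetical flow hash; any reasonable mixing function would do here. */
unsigned
pkt_hash(const struct pkt *p, unsigned nqueues)
{
	uint32_t h = p->label * 2654435761u;	/* multiplicative hash */

	return ((unsigned)((h >> 16) % nqueues));
}

/*
 * Split 'chain' into nqueues lists, storing the results in the
 * caller-supplied array. A real implementation would then take each
 * queue lock once and append the whole list.
 */
void
chain_sort(struct pkt *chain, struct pkt_list *lists, unsigned nqueues)
{
	assert(nqueues > 0);
	for (unsigned i = 0; i < nqueues; i++) {
		lists[i].head = lists[i].tail = NULL;
		lists[i].count = 0;
	}
	while (chain != NULL) {
		struct pkt *p = chain;
		struct pkt_list *l;

		chain = p->nextpkt;
		p->nextpkt = NULL;	/* unlink before re-queueing */
		l = &lists[pkt_hash(p, nqueues)];
		if (l->tail == NULL)
			l->head = p;
		else
			l->tail->nextpkt = p;
		l->tail = p;
		l->count++;
	}
}

/* Hypothetical handler descriptor; takes_chain mimics an nh_chain flag. */
struct handler {
	void (*func)(struct pkt *);
	int    takes_chain;
};

/* Hand over the whole chain if supported, otherwise one packet at a time. */
void
dispatch(const struct handler *h, struct pkt *chain)
{
	if (h->takes_chain) {
		h->func(chain);
		return;
	}
	while (chain != NULL) {
		struct pkt *next = chain->nextpkt;

		chain->nextpkt = NULL;
		h->func(chain);
		chain = next;
	}
}
```

Keeping a tail pointer per list is one way to preserve packet order within a flow, which matters for anything TCP-like carried over the hashed traffic.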