Date: Sun, 18 Aug 2013 14:54:35 -0700
Subject: Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux)
From: Adrian Chadd
To: Luigi Rizzo
Cc: Barney Cordoba, "freebsd-net@freebsd.org"

Hi,

I think the "UNIX architecture" is a bit broken for anything other than the occasional (for various traffic levels defining "occasional"!) traffic connection. It's serving us well purely through the sheer force of will of modern CPU power, but I think we can do a lot better.

_I_ think the correct model is a netmap model: batched packet handling, lightweight drivers pushing and pulling batches of things, with some lightweight plugins to service that inside the kernel and/or push into the netmap ring buffer in userland. Interfacing into the Ethernet and socket layer should be something that bolts on the side, kind of netgraph style. It would likely look a lot more like a switching backplane, with socket IO being one of many processing possibilities.

If socket IO stays packet-at-a-time then great; but that's messing up the ability to do a lot of other interesting things.

That's why I'm (more) interested in what you've done architecture-wise than just saying "dump it in userland and be done with it." I think the VALE kernel stuff is very interesting from an architectural perspective.

The questions (to me!) are:

* how do we implement this in the current framework?
  (That's not too scary though; we'd just have the existing Ethernet input/output path be one of many processing modules, and VALE would be another; netmap-userland would be another; etc, etc);
* how do we make it a compile-time fallback to the traditional model, for platforms that continue to be memory and/or cache constrained? (read: everything that's embedded)
* ... and not simply have lots of #ifdef NETMAP everywhere, but make the fallback be something sane that falls out of the API design?

I'll try to rope some more ideas into that design at the Cambridge and EuroBSD developer summits. I'll try to post some kind of work roadmap to the list(s) for comments and potential code hacking.

Anyway. I'll continue waving hands and hacking on code until I have something that works.

Luigi - when are you next at a BSD developer summit / conference? Will you be in Malta?

-adrian
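
One way to picture the "processing modules" and "compile-time fallback" questions above is the rough C sketch below. It is not netmap, VALE, or any existing FreeBSD API; every name in it (PKT_BATCH_MAX, struct pkt_batch, struct pkt_module, count_process, deliver) is hypothetical and chosen only for illustration. The assumption is that a driver hands batches to whichever module is plugged in, and that a constrained platform could build with -DPKT_BATCH_MAX=1 to get packet-at-a-time behaviour without further #ifdefs in the modules themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time knob: an embedded target could compile with
 * -DPKT_BATCH_MAX=1 and fall back to packet-at-a-time handling. */
#ifndef PKT_BATCH_MAX
#define PKT_BATCH_MAX 32
#endif

/* Minimal stand-in for a packet descriptor (not an mbuf). */
struct pkt {
    const void *data;
    size_t len;
};

/* A batch is simply up to PKT_BATCH_MAX packet descriptors. */
struct pkt_batch {
    struct pkt pkts[PKT_BATCH_MAX];
    size_t count;
};

/* A processing module consumes a whole batch at once. The existing
 * Ethernet path, a VALE-like switch, or a userland ring could each be
 * one of these (an assumption of this sketch). */
struct pkt_module {
    const char *name;
    size_t (*process)(void *ctx, const struct pkt_batch *b);
    void *ctx;
};

/* Example module: counts packets and bytes it was handed. */
struct counter {
    size_t pkts;
    size_t bytes;
};

static size_t count_process(void *ctx, const struct pkt_batch *b)
{
    struct counter *c = (struct counter *)ctx;
    for (size_t i = 0; i < b->count; i++) {
        c->pkts++;
        c->bytes += b->pkts[i].len;
    }
    return b->count;
}

/* Driver side: cut n received packets into batches and pass each batch
 * to the module. Returns how many times the module was invoked. */
static size_t deliver(const struct pkt_module *m, const struct pkt *in, size_t n)
{
    size_t calls = 0;

    assert(m != NULL && m->process != NULL);
    while (n > 0) {
        struct pkt_batch b;

        b.count = n < PKT_BATCH_MAX ? n : PKT_BATCH_MAX;
        for (size_t i = 0; i < b.count; i++)
            b.pkts[i] = in[i];
        m->process(m->ctx, &b);
        in += b.count;
        n -= b.count;
        calls++;
    }
    return calls;
}
```

With the default batch size, delivering 70 packets invokes the module three times (32, 32, and 6 packets); with PKT_BATCH_MAX set to 1 the same code invokes it 70 times, which is the traditional per-packet model.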