Date: Mon, 27 Sep 2010 15:09:38 +0200
From: Andre Oppermann
To: Julian Elischer
Cc: FreeBSD Net
Subject: Re: mbuf changes

On 26.09.2010 08:32, Julian Elischer wrote:
> On 9/25/10 1:20 AM, Andre Oppermann wrote:
>> On 25.09.2010 09:19, Julian Elischer wrote:
>>> over the last few years there has been a bit of talk about some changes people want to see in mbufs
>>> for 9.x
>>> extra fields, changes in the way things are done, etc.
>>>
>>> If you are one of these people, pipe up now..
>>>
>>> to get the ball rolling..
>>>
>>> * Add a field for the current FIB.. currently this is 4 bits stolen from the flags.
>>> what would be a good width: 8,12,16,24,32 bits?
>>> this would allow setfib to use numbers greater than 16 (the current max)
>>
>> 16 bits for 65535 FIBs should be sufficient. More than that seems really
>> excessive.
>>
>>> * Preallocating some room for some number of tags before we start allocating
>>> (expensively) new ones.
>>
>> Within the mbuf? Or at external and attached mbuf allocation time? Tags
>> are variable width and as such not really suitable for pre-allocation.
>
> yes possibly within.. there could be for example a reserved 20 byte field and if it
> doesn't fit in that we go to expensive tags.
> I'm just waving my arms here.

See my reply to Luigi for a detailed view on this.

>>> * dynamically working out what the front padding size should be.. per session.. i.e.
>>> when a packet is sent out and needs to be adjusted to add more headers, the originating
>>> socket should be notified, or maybe the route should have this information...
>>> so that future packets can start out with enough head room.
>>> (this is not strictly to do with mbufs but might need some added field to point to the structure
>>> that needs to be
>>> updated.
>>
>> We already have "max_linkhdr" that specifies how much space is left
>> for prepends at the start of each packet. The link protocols set
>> this and also IPSec adds itself in there if enabled. If you have
>> other encapsulations you should make them add in there as well.
>
> this doesn't take into account tunneling and encapsulation.

It should/could, but the tunneling and encapsulation protocols have
to add themselves to it when active. IPSec does this.

> we could do a lot better than this.
> especially on a per-route basis.
> if the first mbuf in a session had a pointer to the relevant rtentry,
> then as it is processed that could be updated..

Please please please don't add a rtentry pointer to the mbuf. Besides
that, the routing table is a very poor place to do this.
We don't have host routes anymore and the locking and refcounting is
rather expensive.

max_linkhdr should be sufficient (with small fixes to some protocol mbuf
allocators) even for excessive cases of encapsulation:

  TCP over IPv4 over IPSec(AH+ESP) over UDP over IPv6 over PPPoE over Ethernet
  = 60 + 20 + (8+24) + 8 + 40 + 8 + 14 = 182 total,

of which 102 are prepends (everything except the 60 + 20 bytes of TCP
and IPv4 header).

Maybe we need an API for the tunneling and encapsulation protocols to
add their overhead to max_linkhdr.

--
Andre
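
To make the last suggestion a bit more concrete, here is a rough userland
sketch of what such a registration API might look like. It is only a model
under my own assumptions: none of the names below (encap_register,
encap_unregister, encap_headroom, model_max_linkhdr) exist in FreeBSD,
model_max_linkhdr merely stands in for the kernel's max_linkhdr, and locking
is ignored entirely. Each layer registers the bytes it prepends and the
running total is the headroom to reserve.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model; none of these names exist in FreeBSD. */
#define ENCAP_MAX 16

struct encap_overhead {
	const char *name;	/* label of the encapsulating layer */
	int bytes;		/* header bytes that layer prepends */
};

static struct encap_overhead encap_tab[ENCAP_MAX];
static size_t encap_cnt;
static int model_max_linkhdr;	/* stand-in for the kernel's max_linkhdr */

/* Forget all registered layers. */
void
encap_reset(void)
{
	encap_cnt = 0;
	model_max_linkhdr = 0;
}

/* Register a layer and add its overhead; 0 on success, -1 on error. */
int
encap_register(const char *name, int bytes)
{
	if (name == NULL || bytes < 0 || encap_cnt == ENCAP_MAX)
		return (-1);
	encap_tab[encap_cnt].name = name;
	encap_tab[encap_cnt].bytes = bytes;
	encap_cnt++;
	model_max_linkhdr += bytes;
	return (0);
}

/* Remove a layer by name and subtract its overhead; -1 if not found. */
int
encap_unregister(const char *name)
{
	for (size_t i = 0; i < encap_cnt; i++) {
		if (strcmp(encap_tab[i].name, name) == 0) {
			model_max_linkhdr -= encap_tab[i].bytes;
			encap_tab[i] = encap_tab[encap_cnt - 1];
			encap_cnt--;
			assert(model_max_linkhdr >= 0);
			return (0);
		}
	}
	return (-1);
}

/* Headroom a packet allocator would reserve in front of the payload. */
int
encap_headroom(void)
{
	return (model_max_linkhdr);
}

/* The prepend layers from the example stack in this mail. */
int
encap_example_from_mail(void)
{
	encap_reset();
	encap_register("IPSec AH+ESP", 8 + 24);
	encap_register("UDP", 8);
	encap_register("IPv6", 40);
	encap_register("PPPoE", 8);
	encap_register("Ethernet", 14);
	return (encap_headroom());
}
```

Calling encap_example_from_mail() gives 102, matching the prepend figure
above; adding the 60 + 20 bytes of TCP and IPv4 header gets to the 182
total. A protocol that goes inactive would call encap_unregister() so the
reserved headroom shrinks again.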