From owner-freebsd-current Wed Nov 13 08:22:57 1996
Return-Path: owner-current
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id IAA26456 for current-outgoing; Wed, 13 Nov 1996 08:22:57 -0800 (PST)
Received: from premise.CS.Berkeley.EDU (root@premise.CS.Berkeley.EDU [128.32.33.172]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id IAA26440; Wed, 13 Nov 1996 08:22:39 -0800 (PST)
Received: from premise.CS.Berkeley.EDU (bmah@localhost.Berkeley.EDU [127.0.0.1]) by premise.CS.Berkeley.EDU (8.8.2/8.8.2) with ESMTP id IAA03524; Wed, 13 Nov 1996 08:22:30 -0800 (PST)
Message-Id: <199611131622.IAA03524@premise.CS.Berkeley.EDU>
X-Mailer: exmh version 1.6.9 8/22/96
To: Chris Csanady
cc: dyson@freebsd.org, gibbs@freefall.freebsd.org (Justin T. Gibbs), roberto@keltia.freenix.fr, freebsd-current@freebsd.org
Subject: Re: pbufs (was: Re: ufs is too slow?)
In-reply-to: Your message of "Wed, 13 Nov 1996 07:15:18 CST." <199611131315.HAA03860@friley216.res.iastate.edu>
From: bmah@cs.berkeley.edu (Bruce A. Mah)
Reply-to: bmah@cs.berkeley.edu
X-Face: g~c`.{#4q0"(V*b#g[i~rXgm*w;:nMfz%_RZLma)UgGN&=j`5vXoU^@n5v4:OO)c["!w)nD/!!~e4Sj7LiT'6*wZ83454H""lb{CC%T37O!!'S$S&D}sem7I[A 2V%N&+
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 13 Nov 1996 08:22:26 -0800
Sender: owner-current@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Chris Csanady writes:
> VJ's pbufs? ive not heard of these before, what are they? i remember hearing
> something about him rewriting the tcp/ip stack to work well on gigabit
> networks... this wouldnt have anything to do with it, would it? probably
> not i suppose, but if anyone could point me to papers on anything related, id
> apreciate it. :)

I don't think Van has written any papers, but he's given a few talks on
it. Going largely from memory:

One of the losses of the current BSD TCP/IP implementation is that the
unit of memory allocation (mbufs) doesn't really match the unit of
network transmission.
You'd like to allocate memory in packet- (or frame-) sized chunks, e.g.
~1500 bytes for Ethernet, rather than small mbufs (~112 bytes) or page
mbufs (4K), which require a lot of copying and munging around of data.
So the idea behind pbufs is that the lower protocol layers expose enough
details to higher layers (e.g. the socket layer and TCP) to make this
feasible.

This also dovetails rather nicely with some assertions he's made about
putting memory on network interfaces and mapping it into kernel memory
(e.g. using the card's memory as socket/protocol buffers). One example
of this is the Medusa/Afterburner series of high-speed network
interfaces from HP/HP Labs.

(Editorial note: Packet traces have shown that many packets, at least
on LANs, tend to be small. So it's not clear to me what effect this
would have for "typical" network traffic, though the wins for large bulk
transfers have been shown to be substantial.)

Some of the other modifications also streamlined the protocol processing
by combining layers (in the implementation, not the protocol design).
By doing several layers at once you can save overhead. Layering is
great when designing a protocol stack, but many times not so great when
you go to build something.

Somewhere in the mess I call a desk I think I have hardcopies from a
talk he gave on this stuff at a Gigabit TCP workshop a few years ago,
but a cursory search has not revealed them.

Bruce.
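
To put rough numbers on the allocation mismatch described above, here
is a small back-of-the-envelope sketch in Python. The buffer sizes are
the approximate figures quoted in this message (~112 bytes, 4K, ~1500
bytes); the function names are made up for illustration and don't
correspond to anything in the BSD sources or in the pbuf work itself.

```python
# Approximate sizes quoted in the message; real values depend on the
# kernel version and architecture (assumption for illustration only).
SMALL_MBUF_BYTES = 112       # data bytes in a small mbuf
PAGE_MBUF_BYTES = 4096       # data bytes in a page-sized mbuf ("4K")
ETHERNET_FRAME_BYTES = 1500  # packet-sized chunk for Ethernet


def buffers_needed(payload_bytes, buffer_bytes):
    """Number of fixed-size buffers needed to hold payload_bytes."""
    # Integer ceiling division, avoiding floating point.
    return -(-payload_bytes // buffer_bytes)


def wasted_bytes(payload_bytes, buffer_bytes):
    """Bytes allocated but left unused when storing payload_bytes."""
    allocated = buffers_needed(payload_bytes, buffer_bytes) * buffer_bytes
    return allocated - payload_bytes


if __name__ == "__main__":
    candidates = [
        ("small mbuf", SMALL_MBUF_BYTES),
        ("page mbuf", PAGE_MBUF_BYTES),
        ("packet-sized buffer", ETHERNET_FRAME_BYTES),
    ]
    for name, size in candidates:
        count = buffers_needed(ETHERNET_FRAME_BYTES, size)
        waste = wasted_bytes(ETHERNET_FRAME_BYTES, size)
        print(f"{name}: {count} buffer(s), {waste} byte(s) unused")
```

With these figures, one 1500-byte frame is spread across 14 small mbufs
or fills a bit over a third of one page mbuf, while a packet-sized
buffer holds it exactly.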