From: "Darren Reed" <darrenr@freebsd.org>
To: "Robert Watson"
Cc: arch@freebsd.org, freebsd-current@freebsd.org, "Christian S.J. Peron"
Date: Wed, 23 Apr 2008 12:02:37 +0200
Subject: Re: HEADS UP: zerocopy bpf commits impending

On Tue, 8 Apr 2008 13:28:18 +0100 (BST), "Robert Watson" said:
>
> On Tue, 8 Apr 2008, Darren Reed wrote:
>
> > Is there a performance analysis of the copy vs zerocopy available?
> > (I don't see one in the paper, just a "to do" item.)
> >
> > The numbers I'm interested in seeing are how many Mb/s you can capture
> > before you start suffering packet loss.  This needs to be done with
> > sequenced packets so that you can observe gaps in the sequence captured.
>
> We've done some analysis, and a couple of companies have the zero-copy BPF
> code deployed.  I hope to generate a more detailed analysis before the
> developer summit so we can review it at BSDCan.  The basic observation is
> that for quite a few types of network links, the win isn't in packet loss
> per se, but in reduced CPU use, freeing up CPU for other activities.
> There are a number of sources of win:
>
> - Reduced system call overhead -- as load increases, the number of system
>   calls goes down, especially if you get a two-CPU pipeline going.
>
> - Reduced memory access, especially for larger buffer sizes, avoids
>   filling the cache twice (first in copyout, then again in using the
>   buffer in userspace).
>
> - Reduced lock contention, as only a single thread, the device driver
>   ithread, is acquiring the bpf descriptor's lock, and it's no longer
>   contending with the user thread.
>
> One interesting, and in retrospect reasonable, side effect is that user
> CPU time goes up in the SMP scenario, as cache misses on the BPF buffer
> move from the read() system call to userspace.
> And, as you observe, you have to use somewhat larger buffer sizes, as in
> the previous scenario there were three buffers: two kernel buffers and a
> user buffer, and now there are simply two kernel buffers shared directly
> with user space.
>
> The original committed version has a problem in that it allows only one
> kernel buffer to be "owned" by userspace at a time, which can lead to
> excess calls to select(); this has now been corrected, so if people have
> run performance benchmarks, they should update to the new code and re-run
> them.
>
> I don't have numbers off-hand, but 5%-25% were numbers that appeared in
> some of the measurements, and I'd like to think that the recent fix will
> further improve that.

Out of curiosity, were those numbers from single-CPU/core systems, or from
systems with more than one CPU/core active and available?

I know the testing I did was all single-threaded, so moving time from the
kernel to userspace couldn't be expected to make a large overall difference
on a non-SMP kernel (NetBSD-something at the time).

Darren

--
Darren Reed
darrenr@fastmail.net
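
To make the mechanism under discussion concrete, here is a minimal sketch of
the consumer side of zero-copy BPF as described above: two user-mapped buffers
are handed to the kernel, and ownership is passed back and forth via the
per-buffer generation counters.  This assumes the BPF_BUFMODE_ZBUF /
BIOCSETZBUF / BIOCGETZMAX ioctls and the struct bpf_zbuf / bpf_zbuf_header
layout documented for the new code; exact names, the memory-barrier handling
on the generation store, and error handling should be checked against the
committed sources rather than taken from this sketch.

    #include <sys/types.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <net/if.h>
    #include <net/bpf.h>
    #include <err.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    /*
     * A buffer belongs to userspace once the kernel generation count is
     * ahead of the user generation count.  After walking the bpf_hdr
     * records that follow the header, the consumer catches the user
     * counter up to return the buffer to the kernel.  (A real consumer
     * needs a release barrier before that store.)
     */
    static void
    check_buffer(struct bpf_zbuf_header *bzh)
    {
            if (bzh->bzh_kernel_gen == bzh->bzh_user_gen)
                    return;                 /* still owned by the kernel */
            /* ... process the packet records following the header ... */
            bzh->bzh_user_gen = bzh->bzh_kernel_gen;   /* hand it back */
    }

    int
    main(int argc, char **argv)
    {
            struct bpf_zbuf zb;
            struct ifreq ifr;
            u_int mode = BPF_BUFMODE_ZBUF;
            size_t zmax;
            int fd;

            if (argc != 2)
                    errx(1, "usage: %s ifname", argv[0]);
            if ((fd = open("/dev/bpf", O_RDWR)) == -1)
                    err(1, "open");

            /* Switch the descriptor into zero-copy buffer mode. */
            if (ioctl(fd, BIOCSETBUFMODE, &mode) == -1)
                    err(1, "BIOCSETBUFMODE");
            /* Ask how large a shared buffer the kernel will accept. */
            if (ioctl(fd, BIOCGETZMAX, &zmax) == -1)
                    err(1, "BIOCGETZMAX");

            /* Map the two buffers that will be shared with the kernel. */
            memset(&zb, 0, sizeof(zb));
            zb.bz_buflen = zmax;
            zb.bz_bufa = mmap(NULL, zmax, PROT_READ | PROT_WRITE,
                MAP_ANON | MAP_SHARED, -1, 0);
            zb.bz_bufb = mmap(NULL, zmax, PROT_READ | PROT_WRITE,
                MAP_ANON | MAP_SHARED, -1, 0);
            if (zb.bz_bufa == MAP_FAILED || zb.bz_bufb == MAP_FAILED)
                    err(1, "mmap");
            if (ioctl(fd, BIOCSETZBUF, &zb) == -1)
                    err(1, "BIOCSETZBUF");

            /* Attach to the interface named on the command line. */
            memset(&ifr, 0, sizeof(ifr));
            strlcpy(ifr.ifr_name, argv[1], sizeof(ifr.ifr_name));
            if (ioctl(fd, BIOCSETIF, &ifr) == -1)
                    err(1, "BIOCSETIF");

            for (;;) {
                    fd_set rset;

                    FD_ZERO(&rset);
                    FD_SET(fd, &rset);
                    /* Sleep until the kernel completes a buffer. */
                    if (select(fd + 1, &rset, NULL, NULL, NULL) == -1)
                            err(1, "select");
                    check_buffer(zb.bz_bufa);
                    check_buffer(zb.bz_bufb);
            }
    }

The interesting property for the performance discussion above is that no
copyout() ever runs: the only per-buffer work in the steady state is the
select() wakeup and the generation-counter handoff, which is where the
reduced system call and cache traffic comes from.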