From: Alexander Sack
To: Jack Vogel
Cc: Murat Balaban, freebsd-net@freebsd.org, freebsd-performance@freebsd.org, Andrew Gallatin
Date: Fri, 14 May 2010 13:20:21 -0400
Subject: Re: Intel 10Gb

On Fri, May 14, 2010 at 1:01 PM, Jack Vogel wrote:
>
>
> On Fri, May 14, 2010 at 8:18 AM, Alexander Sack wrote:
>>
>> On Fri, May 14, 2010 at 10:07 AM, Andrew Gallatin
>> wrote:
>> > Alexander Sack wrote:
>> > <...>
>> >>> Using this driver/firmware combo, we can receive minimal packets at
>> >>> line rate (14.8Mpps) to userspace. You can even access this using a
>> >>> libpcap interface. The trick is that the fast paths are OS-bypass,
>> >>> and don't suffer from OS overheads, like lock contention. See
>> >>> http://www.myri.com/scs/SNF/doc/index.html for details.
>> >>
>> >> But your timestamps will be atrocious at 10G speeds. Myricom doesn't
>> >> timestamp packets AFAIK. If you want reliable timestamps you need to
>> >> look at companies like Endace, Napatech, etc.
>> >
>> > I see your old help ticket in our system. Yes, our timestamping
>> > is not as good as a dedicated capture card with a GPS reference,
>> > but it is good enough for most people.
>>
>> I was told btw that it doesn't timestamp at ALL. I am assuming NOW
>> that is incorrect.
>>
>> Define *most* people.
>>
>> I am not knocking the Myricom card. In fact I so wish you guys would
>> just add the ability to latch to a 1PPS for timestamping and it would
>> be perfect.
>>
>> We use, I think, an older version of the card internally for replay.
>> It's a great multi-purpose card.
>>
>> However with IPG at 10G in the nanoseconds, anyone trying to do OWDs
>> or RTT will find it difficult compared to an Endace or Napatech card.
>>
>> Btw, I was referring to bpf(4) specifically, so please don't take my
>> comments as a knock against it.
>>
>> >> PS I am not sure but Intel also supports writing packets directly in
>> >> cache (yet I thought the 82599 driver actually does a prefetch anyway
>> >> which had me confused on why that helps)
>> >
>> > You're talking about DCA. We support DCA as well (and I suspect some
>> > other 10G NICs do too). There are a few barriers to using DCA on
>> > FreeBSD, not least of which is that FreeBSD doesn't currently have the
>> > infrastructure to support it (no IOATDMA or DCA drivers).
>>
>> Right.
>>
>> > DCA is also problematic because support from system/motherboard
>> > vendors is very spotty. The vendor must provide the correct tag table
>> > in BIOS such that the tags match the CPU/core numbering in the system.
>> > Many motherboard vendors don't bother with this, and you cannot enable
>> > DCA on a lot of systems, even though the underlying chipset supports
>> > DCA. I've done hacks to force-enable it in the past, with mixed
>> > results. The problem is that DCA depends on having the correct tag
>> > table, so that packets can be prefetched into the correct CPU's cache.
>> > If the tag table is incorrect, DCA is a big pessimization, because it
>> > blows the cache in other CPUs.
>>
>> Right.
>>
>> > That said, I would *love* it if FreeBSD grew ioatdma/dca support.
>> > Jack, does Intel have any interest in porting DCA support to FreeBSD?
>>
>> Question for Jack or Drew, what DOES FreeBSD have to do to support
>> DCA? I thought DCA was something you just enable on the NIC chipset
>> and if the system is IOATDMA aware, it just works. Is that not right
>> (assuming cache tags are correct and accessible)? i.e. I thought this
>> was more hardware black magic than anything specific the OS has to do.
>>
>
> OK, let me see if I can clarify some of this.
> First, there IS an I/OAT driver that I did for FreeBSD like 3 or 4
> years ago, in the timeframe that we put the feature out. However, at
> that time all it was good for was the DMA aspect of things, and
> Prafulla used it to accelerate the stack copies; interest did not seem
> that great so I put the code aside, it's not badly dated and needs to
> be brought up to date due to there being a few different versions of
> the hardware now.
>
> At one point maybe a year back I started to take the code apart thinking
> I would JUST do DCA, that got back-burnered due to other higher priority
> issues, but it's still an item in my queue.
>
> I also had a nibble of an interest in using the DMA engine so perhaps I
> should not go down the road of just doing the DCA support in the I/OAT
> part of the driver. The question is how to make the infrastructure work.
>
> To answer Alexander's question, DCA support is NOT in the NIC, it's in
> the chipset, that's why the I/OAT driver was done as a separate driver,
> but the NIC was the user of the info, it's been a while since I was into
> the code but if memory serves the I/OAT driver just enables the support
> in the chipset, and then the NIC driver configures its engine to use it.

Thank you very much Jack! :) It was not clear to me from the docs what
lived where. I just assumed this was "Intel NIC knows Intel chipset"
black magic! LOL.

> DCA and DMA were supported in Linux in the same driver because
> the chipset features were easily handled together perhaps, I'm not
> sure :)

Ok! (it was my other reference)

> Fabien's data earlier in this thread suggested that a strategically
> placed prefetch did you more good than DCA did if I recall, what
> do you all think of that?

I thought there was a thread where prefetch didn't do much for you....lol...
If you just prefetch willy-nilly then don't you run the risk of
packets landing in the caches of cores other than the one the
application reading them is running on, thereby defeating the whole
purpose of prefetch?

> As far as I'm concerned right now I am willing to resurrect the driver,
> clean it up and make the features available, we can see how valuable
> they are after that, how does that sound??

Sounds good to me. I'd at least put it somewhere public for people to
look at.

-aps
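
For reference, the "14.8Mpps" line rate for minimal packets and the remark
that the IPG at 10G is "in the nanoseconds" both follow from standard
Ethernet framing arithmetic. Below is a minimal sketch of that arithmetic,
assuming the usual 802.3 overheads (64-byte minimum frame, 8 bytes of
preamble/SFD, 12-byte minimum inter-packet gap). The function and constant
names are illustrative only and do not come from any driver discussed in
this thread.

```python
# Back-of-the-envelope 10GbE framing arithmetic (illustrative names only).
# Assumes standard IEEE 802.3 overheads; nothing here is driver code.

MIN_FRAME = 64       # minimum Ethernet frame in bytes, including the 4-byte FCS
PREAMBLE_SFD = 8     # 7-byte preamble + 1-byte start-of-frame delimiter
MIN_IPG = 12         # minimum inter-packet gap in bytes

LINK_BPS = 10_000_000_000  # 10 Gb/s


def wire_bits(frame_bytes: int = MIN_FRAME) -> int:
    """Bits one frame occupies on the wire, including preamble and IPG."""
    return (frame_bytes + PREAMBLE_SFD + MIN_IPG) * 8


def line_rate_pps(frame_bytes: int = MIN_FRAME, link_bps: int = LINK_BPS) -> float:
    """Maximum packets per second for back-to-back frames of this size."""
    return link_bps / wire_bits(frame_bytes)


def ns_per_packet(frame_bytes: int = MIN_FRAME, link_bps: int = LINK_BPS) -> float:
    """Time between the starts of consecutive back-to-back frames, in ns."""
    return wire_bits(frame_bytes) * 1e9 / link_bps


def ipg_ns(link_bps: int = LINK_BPS) -> float:
    """Duration of the minimum inter-packet gap alone, in ns."""
    return MIN_IPG * 8 * 1e9 / link_bps


if __name__ == "__main__":
    print(f"line rate:      {line_rate_pps() / 1e6:.2f} Mpps")  # 14.88 Mpps
    print(f"per-packet gap: {ns_per_packet():.1f} ns")           # 67.2 ns
    print(f"minimum IPG:    {ipg_ns():.1f} ns")                  # 9.6 ns
```

With minimal frames arriving roughly every 67 ns and a gap of under 10 ns
between them, a timestamp source with microsecond-level jitter cannot
resolve the spacing of individual packets, which is the concern raised
above about one-way delay and RTT measurements.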