Date:      Fri, 14 May 2010 13:20:21 -0400
From:      Alexander Sack <pisymbol@gmail.com>
To:        Jack Vogel <jfvogel@gmail.com>
Cc:        Murat Balaban <murat@enderunix.org>, freebsd-net@freebsd.org, freebsd-performance@freebsd.org, Andrew Gallatin <gallatin@cs.duke.edu>
Subject:   Re: Intel 10Gb
Message-ID:  <AANLkTil-kmThBinyxxCRxNyHQKFbD0ndalN3STreRghC@mail.gmail.com>
In-Reply-To: <AANLkTikD-ndv7WKPRzeLh932lGxDBbouQoyD9Oy6ybC5@mail.gmail.com>
References:  <AANLkTimMrsM08Rmdr-l6RFu83VkqFw0Pk2sHxpV5Yl5x@mail.gmail.com> <4BE52856.3000601@unsane.co.uk> <1273323582.3304.31.camel@efe> <20100511135103.GA29403@grapeape2.cs.duke.edu> <AANLkTikROvNKUmpax-CbhEyj5o7TW0hfV_x79Bm_nU2V@mail.gmail.com> <4BED5929.5020302@cs.duke.edu> <AANLkTikAow9ZdK4XokeWXkbmusva2rKxeLO2EBBe3tsZ@mail.gmail.com> <AANLkTikD-ndv7WKPRzeLh932lGxDBbouQoyD9Oy6ybC5@mail.gmail.com>

On Fri, May 14, 2010 at 1:01 PM, Jack Vogel <jfvogel@gmail.com> wrote:
>
>
> On Fri, May 14, 2010 at 8:18 AM, Alexander Sack <pisymbol@gmail.com> wrote:
>>
>> On Fri, May 14, 2010 at 10:07 AM, Andrew Gallatin <gallatin@cs.duke.edu>
>> wrote:
>> > Alexander Sack wrote:
>> > <...>
>> >>> Using this driver/firmware combo, we can receive minimal packets at
>> >>> line rate (14.8Mpps) to userspace.  You can even access this using a
>> >>> libpcap interface.  The trick is that the fast paths are OS-bypass,
>> >>> and don't suffer from OS overheads, like lock contention.  See
>> >>> http://www.myri.com/scs/SNF/doc/index.html for details.
>> >>
>> >> But your timestamps will be atrocious at 10G speeds.  Myricom doesn't
>> >> timestamp packets AFAIK.  If you want reliable timestamps you need to
>> >> look at companies like Endace, Napatech, etc.
>> >
>> > I see your old help ticket in our system.  Yes, our timestamping
>> > is not as good as a dedicated capture card with a GPS reference,
>> > but it is good enough for most people.
>>
>> I was told btw that it doesn't timestamp at ALL.  I am assuming NOW
>> that is incorrect.
>>
>> Define *most* people.
>>
>> I am not knocking the Myricom card.  In fact I so wish you guys would
>> just add the ability to latch to a 1PPS for timestamping and it would
>> be perfect.
>>
>> We use, I think, an older version of the card internally for replay.
>> It's a great multi-purpose card.
>>
>> However with IPG at 10G in the nanoseconds, anyone trying to do OWDs
>> or RTT will find it difficult compared to an Endace or Napatech card.
>>
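For reference, the 14.8Mpps and "IPG in the nanoseconds" figures above can be reproduced with a little arithmetic. This is a minimal sketch, assuming standard Ethernet framing (7-byte preamble, 1-byte SFD, 12-byte minimum inter-packet gap); the function and macro names are made up for illustration and do not come from any driver.

```c
/* Bytes each Ethernet frame occupies on the wire beyond the frame itself:
 * 7-byte preamble + 1-byte SFD + 12-byte minimum inter-packet gap. */
#define WIRE_OVERHEAD_BYTES 20u
#define MIN_IPG_BYTES       12u

/* Frames per second for a link speed in bits/s and a frame size in
 * bytes (frame size includes the 4-byte FCS, e.g. 64 for a minimal frame). */
double frames_per_sec(double link_bps, unsigned frame_bytes)
{
    return link_bps / (8.0 * (frame_bytes + WIRE_OVERHEAD_BYTES));
}

/* Time in nanoseconds that `bytes` bytes occupy on the wire. */
double wire_time_ns(double link_bps, unsigned bytes)
{
    return 8.0 * bytes * 1e9 / link_bps;
}
```

With `link_bps = 10e9`, `frames_per_sec(10e9, 64)` comes out to roughly 14.88 million frames per second, `wire_time_ns(10e9, 64 + WIRE_OVERHEAD_BYTES)` is 67.2 ns per minimal frame slot, and `wire_time_ns(10e9, MIN_IPG_BYTES)` is 9.6 ns for the gap alone. A timestamp source needs resolution well below those numbers to order back-to-back minimal packets.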
>> Btw, I was referring to bpf(4) specifically, so please don't take my
>> comments as a knock against it.
>>
>> >> PS I am not sure but Intel also supports writing packets directly in
>> >> cache (yet I thought the 82599 driver actually does a prefetch anyway
>> >> which had me confused on why that helps)
>> >
>> > You're talking about DCA.  We support DCA as well (and I suspect some
>> > other 10G NICs do too).  There are a few barriers to using DCA on
>> > FreeBSD, not least of which is that FreeBSD doesn't currently have the
>> > infrastructure to support it (no IOATDMA or DCA drivers).
>>
>> Right.
>>
>> > DCA is also problematic because support from system/motherboard
>> > vendors is very spotty.  The vendor must provide the correct tag table
>> > in BIOS such that the tags match the CPU/core numbering in the system.
>> > Many motherboard vendors don't bother with this, and you cannot enable
>> > DCA on a lot of systems, even though the underlying chipset supports
>> > DCA.  I've done hacks to force-enable it in the past, with mixed
>> > results. The problem is that DCA depends on having the correct tag
>> > table, so that packets can be prefetched into the correct CPU's cache.
>> > If the tag table is incorrect, DCA is a big pessimization, because it
>> > blows the cache in other CPUs.
>>
>> Right.
>>
>> > That said, I would *love* it if FreeBSD grew ioatdma/dca support.
>> > Jack, does Intel have any interest in porting DCA support to FreeBSD?
>>
>> Question for Jack or Drew, what DOES FreeBSD have to do to support
>> DCA?  I thought DCA was something you just enable on the NIC chipset
>> and if the system is IOATDMA aware, it just works.  Is that not right
>> (assuming cache tags are correct and accessible)?  i.e. I thought this
>> was more hardware black magic than anything specific the OS has to do.
>>
>
> OK, let me see if I can clarify some of this. First, there IS an I/OAT
> driver that I did for FreeBSD like 3 or 4 years ago, in the timeframe that
> we put the feature out. However, at that time all it was good for was the
> DMA aspect of things, and Prafulla used it to accelerate the stack copies;
> interest did not seem that great so I put the code aside. It's now badly
> dated and needs to be brought up to date, since there are a few different
> versions of the hardware now.
>
> At one point maybe a year back I started to take the code apart thinking
> I would JUST do DCA. That got back-burnered due to other higher priority
> issues, but it's still an item in my queue.
>
> I also had a nibble of an interest in using the DMA engine so perhaps I
> should not go down the road of just doing the DCA support in the I/OAT
> part of the driver. The question is how to make the infrastructure work.
>
> To answer Alexander's question, DCA support is NOT in the NIC, it's in
> the chipset. That's why the I/OAT driver was done as a separate driver,
> but the NIC was the user of the info. It's been a while since I was into
> the code, but if memory serves the I/OAT driver just enables the support
> in the chipset, and then the NIC driver configures its engine to use it.

Thank you very much Jack!  :)  It was not clear to me from the docs what
lived where.  I just assumed this was "Intel NIC knows Intel chipset"
black magic!  LOL.

> DCA and DMA were supported in Linux in the same driver because
> the chipset features were easily handled together perhaps, I'm not
> sure :)

Ok!  (it was my other reference)

> Fabien's data earlier in this thread suggested that a strategically
> placed prefetch did you more good than DCA did, if I recall. What
> do you all think of that?

I thought there was a thread where prefetch didn't do much for you....lol...

If you just prefetch willy-nilly then don't you run the risk of
packets hitting caches on cores outside of what the application
reading them is on thereby defeating the whole purpose of prefetch?
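To make the concern concrete, here is a minimal sketch of a "strategically placed" prefetch in a receive loop. It assumes GCC or Clang (`__builtin_prefetch` is a compiler builtin), and `struct pkt` and `process_ring` are hypothetical names for illustration, not ixgbe or any other driver's code. A software prefetch pulls the line toward the cache of the core that executes it, so it only pays off when that same core goes on to touch the packet.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet descriptor; not a real driver structure. */
struct pkt {
    const uint8_t *data;  /* start of packet payload */
    size_t         len;   /* payload length in bytes */
};

/* Walk a ring of received packets. While packet i is being processed,
 * hint the CPU to start loading packet i+1 so its first cache line is
 * (hopefully) resident by the time the loop gets there. The "work" here
 * is just summing the first byte of each non-empty packet. */
uint64_t process_ring(const struct pkt *ring, size_t n)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n)
            __builtin_prefetch(ring[i + 1].data, 0 /* read */, 3 /* keep in cache */);
        if (ring[i].len > 0)
            sum += ring[i].data[0];
    }
    return sum;
}
```

If the loop runs on one core and the consumer of the packets runs on another, the prefetched lines land in the wrong place, which is the case I am asking about.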

> As far as I'm concerned right now I am willing to resurrect the driver,
> clean it up and make the features available, we can see how valuable
> they are after that, how does that sound??

Sounds good to me.  I'd at least put it somewhere public for people to look at.

-aps