Date:      Tue, 20 Jan 2015 12:22:16 +0000
From:      Zoltan Kiss <zoltan.kiss@linaro.org>
To:        Luigi Rizzo <rizzo@iet.unipi.it>
Cc:        Mike Holmes <mike.holmes@linaro.org>, Ciprian Barbu <ciprian.barbu@linaro.org>, net@freebsd.org
Subject:   Re: ixgbe TX desc prefetch
Message-ID:  <54BE4878.70703@linaro.org>
In-Reply-To: <20150119202225.GA77414@onelab2.iet.unipi.it>
References:  <54BD5CDF.20602@linaro.org> <20150119202225.GA77414@onelab2.iet.unipi.it>



On 19/01/15 20:22, Luigi Rizzo wrote:
> On Mon, Jan 19, 2015 at 07:37:03PM +0000, Zoltan Kiss wrote:
>> Hi,
>>
>> I'm using netmap on Ubuntu 14.04 (3.13.0-44 kernel, ixgbe 3.15.1-k), and
>> I can't max out a 10G link with pkt-gen:
>>
>> Sent 1068122462 packets, 64 bytes each, in 84.29 seconds.
>> Speed: 12.67 Mpps Bandwidth: 6.49 Gbps (raw 8.92 Gbps)
>>
>> The README says "ixgbe tops at about 12.5 Mpps unless the driver
>> prefetches tx descriptors". I've checked the actual driver code: it does
>> a prefetch(tx_desc) in the TX completion interrupt handler. Is that what
>> you mean? Top shows ksoftirqd eating up one core while the pkt-gen
>> process is around 45%.
>
> That comment is about how the TXDCTL register in the NIC is
> programmed, not about the CPU-side prefetch(), and it applies only
> when you transmit on a single queue. With multiple tx queues you
> should be able to reach line rate regardless of that setting.
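> (For reference: the descriptor prefetch behaviour lives in the
> PTHRESH, HTHRESH and WTHRESH fields of TXDCTL. Roughly, and with
> illustrative values, the driver programs them per tx queue like this:
>
>     /* prefetch more descriptors, batch completion writebacks */
>     u32 txdctl = IXGBE_READ_REG(hw, IXGBE_TXDCTL(reg_idx));
>     txdctl &= ~(0x7F | (0x7F << 8) | (0x7F << 16));
>     txdctl |= 32;        /* PTHRESH: prefetch when the cache runs low */
>     txdctl |= (1 << 8);  /* HTHRESH: >=1 descriptor in host memory    */
>     txdctl |= (8 << 16); /* WTHRESH: write back completions in 8s     */
>     IXGBE_WRITE_REG(hw, IXGBE_TXDCTL(reg_idx), txdctl);
>
> Check the 82599 datasheet before copying these numbers.)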
>
> However, there are other things you might be hitting:
> - if you have the IOMMU enabled, that adds overhead to the memory
>    mappings, and I seem to remember it caused a drop in the tx rate;
> - try pkt-gen with sizes 60 and 64 (before CRC) to see if there is any
>    difference. Especially on the receive side, if the driver strips
>    the CRC, performance with 60 bytes is worse (and I assume you have
>    disabled flow control on both sender and receiver; see the example
>    commands below).
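>
> (For reference, and if memory serves: flow control can usually be
> turned off with "ethtool -A eth1 rx off tx off", and the two sizes
> tested with pkt-gen's -l flag, e.g. "pkt-gen -i eth1 -f tx -l 60"
> versus "-l 64". The device name here is just an example.)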
>
> Finally, we have seen degradation on recent Linux kernels (> 3.5, I
> would say), and this seems to be due to the driver disabling interrupt
> moderation when the NAPI handler reports that all work has been
> completed. Since netmap does almost nothing in the NAPI handler, the
> OS thinks there is no load and might as well optimize for low latency.
>
> A fix is to hardwire interrupt moderation to some 20-50us (not sure if
> you can do it with ethtool; we tweaked the driver's code to stop it
> from changing the moderation). That should deal with the high
> ksoftirqd load.
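>
> ("ethtool -C ethX rx-usecs 50" may work on recent ixgbe versions; the
> driver-side equivalent is roughly this sketch, using the 82599 field
> layout with an illustrative value:
>
>     /* Pin the per-vector interrupt throttle instead of letting the
>      * driver adapt it. On 82599 the EITR interval field (bits 11:3)
>      * counts 2us units, so 25 units ~= 50us between interrupts;
>      * check the datasheet for other NIC generations. */
>     u32 eitr = 25 << 3;
>     IXGBE_WRITE_REG(hw, IXGBE_EITR(vector), eitr);
>
> )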
>
> Lastly, multiple ports on the same NIC contend for PCIe bandwidth,
> so it is quite possible that the bus does not have the capacity
> for full traffic on both ports.
That was it! This is an HP 560SFP+ card, which has a PCIe v2.0 x8 
connector. That should do 32 Gbit/s, which is enough for me (although 
not enough for the 40 Gbit/s of dual-port full-duplex traffic HP 
explicitly promises in its marketing material, unless you count the raw 
signaling rate before 8b/10b overhead, which would be a gross lie ...).
My problem was that the card sat in a x16 slot which is electrically 
only x4 ... (Maybe I'll write to Lenovo that this was a bit mean, and 
that they should mark it in bigger letters than the usual motherboard 
labeling ...) Now I can receive 10 Gbps in both directions! Thanks for 
the tip.
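
For anyone hitting the same wall, the arithmetic as I understand it:

    PCIe 2.0: 5 GT/s per lane, with 8b/10b encoding -> 4 Gbit/s usable
    x8 slot:  8 lanes * 4 Gbit/s = 32 Gbit/s per direction
    x4 slot:  4 lanes * 4 Gbit/s = 16 Gbit/s per direction

Dual-port full duplex needs 2 * 10 = 20 Gbit/s per direction, so an
electrically x4 slot cannot carry it even before counting the
per-packet descriptor and TLP overhead, which is substantial with
64-byte packets.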

Regards,

Zoltan


>
> cheers
> luigi
>
>> My problem gets even worse when I use the other port of the same
>> dual-port card to receive the traffic back (I'm sending my packets
>> through a device whose switching performance I want to test). The
>> sending rate drops to 9.39 Mpps (6.61 Gbps), and the receiving rate
>> drops to about the same. I'm trying to bind the threads to cores with
>> "-a 3" and so on, but judging from top they don't seem to obey: TX now
>> uses ca. 50% CPU while RX is at 20%, but they don't seem to run on
>> their assigned CPUs.
>> My card is an Intel 82599ES, the CPU an i5-4570 @ 3.2GHz (no HT).
>> Maybe the fact that it is a workstation CPU contributes to this
>> problem?
>>
>> All suggestions welcome!
>>
>> Regards,
>>
>> Zoltan Kiss
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"


