Date: Tue, 20 Jan 2015 12:22:16 +0000
From: Zoltan Kiss <zoltan.kiss@linaro.org>
To: Luigi Rizzo <rizzo@iet.unipi.it>
Cc: Mike Holmes <mike.holmes@linaro.org>, Ciprian Barbu <ciprian.barbu@linaro.org>, net@freebsd.org
Subject: Re: ixgbe TX desc prefetch
Message-ID: <54BE4878.70703@linaro.org>
In-Reply-To: <20150119202225.GA77414@onelab2.iet.unipi.it>
References: <54BD5CDF.20602@linaro.org> <20150119202225.GA77414@onelab2.iet.unipi.it>
On 19/01/15 20:22, Luigi Rizzo wrote:
> On Mon, Jan 19, 2015 at 07:37:03PM +0000, Zoltan Kiss wrote:
>> Hi,
>>
>> I'm using netmap on Ubuntu 14.04 (3.13.0-44 kernel, ixgbe 3.15.1-k), and
>> I can't max out a 10G link with pktgen:
>>
>> Sent 1068122462 packets, 64 bytes each, in 84.29 seconds.
>> Speed: 12.67 Mpps Bandwidth: 6.49 Gbps (raw 8.92 Gbps)
>>
>> The README says "ixgbe tops at about 12.5 Mpps unless the driver
>> prefetches tx descriptors". I've checked the actual driver code; it does
>> a prefetch(tx_desc) in the TX completion interrupt handler — is that what
>> you mean? Top shows ksoftirqd eating up one core while the pktgen process
>> is around 45%.
>
> That comment is related to how the TXDCTL register
> in the NIC is programmed, not the CPU's prefetch, and it applies
> only to the case where you use only one queue to transmit.
> With multiple tx queues you should be able to do line rate
> regardless of that setting.
>
> However, there are other things you might be hitting:
> - if you have the IOMMU enabled, that adds overhead to the memory mappings,
>   and I seem to remember that caused a drop in the tx rate;
> - try pkt-gen with sizes 60 and 64 (before CRC) to see if there is any
>   difference. Especially on the receive side, if the driver strips
>   the CRC, performance with 60 bytes is worse (and I assume you have
>   disabled flow control on both sender and receiver).
>
> Finally, we have seen degradation on recent Linux kernels (> 3.5, I would say),
> and this seems to be due to the driver disabling interrupt moderation
> if the NAPI handler reports that work has been completed. Since netmap
> does almost nothing in the NAPI handler, the OS is confused and thinks
> there is no load, so it could as well optimize for low latency.
>
> A fix is to hardwire interrupt moderation to some 20-50us (not sure if
> you can do it with ethtool; we tweaked the driver's code to avoid
> the changes in moderation). That should deal with the high ksoftirq load.
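Before patching the driver, the interrupt-moderation fix Luigi describes can be attempted from userspace. A hedged sketch, assuming the interface is named eth1 (a placeholder) and that the ixgbe driver honours `ethtool -C` for this setting — Luigi notes he is not sure it does, so this may have no effect on some kernel versions:

```shell
# Pin the receive interrupt throttle interval to ~30 microseconds,
# inside the 20-50us range suggested above, so the driver's low-load
# heuristic cannot turn moderation off under netmap.
ethtool -C eth1 rx-usecs 30

# Verify the setting took effect.
ethtool -c eth1 | grep rx-usecs
```

If `ethtool -C` is ignored, the alternative described above is to hardwire the moderation value in the driver source itself.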
> And finally, multiple ports on the same NIC contend for PCIe bandwidth,
> so it is well possible that the bus does not have the capacity
> for full traffic on both ports.

That was it! This is an HP SFP560+ card, which has a PCIe v2.0 x8 lane
connector. That should do 32 Gbit/s, which is enough for me (although not
enough for the 40 Gbit/s dual-port full-duplex traffic promised explicitly
by HP in their marketing material, unless you take into account the
overhead of 8b/10b encoding, which would be a gross lie ...)
My problem was that this card was in a x16 lane slot which was in practice
only a x4 lane slot ... (I'll maybe write a mail to Lenovo that this was a
bit mean, and they should mark it with bigger letters than the usual
labeling on the motherboard ...)
Now I can receive 10 Gbps in both directions! Thanks for the tip!

Regards,

Zoltan

> cheers
> luigi
>
>> My problem gets even worse when I want to use the other port on this
>> same dual-port card to receive back the traffic (I'm sending my
>> packets through a device I want to test for switching performance). The
>> sending performance drops to 9.39 Mpps (6.61 Gbps), and the
>> receiving side drops by about as much. I'm trying to bind the threads to
>> cores with "-a 3" and so on, but they don't seem to obey, based on top.
>> The TX now uses ca. 50% CPU while RX is at 20%, but they don't seem to
>> run on their assigned CPUs.
>> My card is an Intel 82599ES, the CPU is an i5-4570 @ 3.2GHz (no HT).
>> Maybe the fact that it is a workstation CPU contributes to this problem?
>>
>> All suggestions welcome!
>>
>> Regards,
>>
>> Zoltan Kiss
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
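The bandwidth arithmetic behind the diagnosis above can be checked quickly. PCIe 2.0 signals at 5 GT/s per lane, and 8b/10b encoding leaves 4 Gbit/s usable per lane, so an x8 link carries 32 Gbit/s while a slot that only wires up x4 carries 16 Gbit/s — less than the 20 Gbit/s needed for dual-port 10G in even one direction:

```shell
# PCIe 2.0: 5 GT/s per lane, 8b/10b encoding -> 4 Gbit/s usable per lane.
per_lane=4

echo "x8 link: $((8 * per_lane)) Gbit/s"   # the card's connector
echo "x4 link: $((4 * per_lane)) Gbit/s"   # what the slot actually provided
```

On Linux the negotiated width can be confirmed with `lspci -vv`: compare the device's `LnkCap` line (what the card supports) against `LnkSta` (what was actually negotiated with the slot).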