Date:      Thu, 28 Apr 2011 09:30:27 -0700
From:      YongHyeon PYUN <pyunyh@gmail.com>
To:        Adam Stylinski <kungfujesus06@gmail.com>
Cc:        freebsd-net@freebsd.org, Pierre Lamy <pierre@userid.org>
Subject:   Re: em0 performance subpar
Message-ID:  <20110428163027.GA17185@michelle.cdnetworks.com>
In-Reply-To: <20110428155124.GD19362@ossumpossum.geop.uc.edu>
References:  <20110428072946.GA11391@zephyr.adamsnet> <4DB961EA.4080407@userid.org> <20110428132513.GB2800@ossumpossum.geop.uc.edu> <4DB994E7.9060805@userid.org> <20110428155124.GD19362@ossumpossum.geop.uc.edu>

On Thu, Apr 28, 2011 at 11:51:24AM -0400, Adam Stylinski wrote:
> On Thu, Apr 28, 2011 at 11:25:11AM -0500, Pierre Lamy wrote:
> > Someone mentioned on freebsd-current:
> > 
> > > With the 7.2.2 driver you will also use different mbuf pools depending on
> > > the MTU you are using. If you use jumbo frames it will use 4K clusters;
> > > if you go to 9K jumbos it will use 9K mbuf clusters. *The number of these
> > > allocated by default is small (only about 6400 :).*
> > >
> > > I would use 'netstat -m' to see what the pools look like.
> > 
> > Hope this helps. Check the perf with 1500 byte frames.
> > 
> > Adam Stylinski wrote:
> > > On Thu, Apr 28, 2011 at 08:47:38AM -0400, Pierre Lamy wrote:
> > >   
> > >> Try using netblast on FreeBSD instead of iperf; there have been a lot of
> > >> discussions about this on this list.
> > >>
> > >> Is it possible you're maxing out the system's PCI-xxx bus? Did you tune
> > >> up the system buffers? Data doesn't just get queued up in the NIC
> > >> driver; it also queues within the system's kernel buffers. Try these; I
> > >> have no idea if they will help:
> > >>
> > >> sysctl -w net.inet.tcp.sendspace=373760
> > >> sysctl -w net.inet.tcp.recvspace=373760
> > >> sysctl -w net.local.stream.sendspace=373760
> > >> sysctl -w net.local.stream.recvspace=373760
> > >> sysctl -w net.raw.recvspace=373760
> > >> sysctl -w net.raw.sendspace=373760
> > >> sysctl -w net.inet.tcp.local_slowstart_flightsize=10
> > >> sysctl -w net.inet.tcp.delayed_ack=0
> > >> sysctl -w kern.maxvnodes=600000
> > >> sysctl -w net.local.dgram.recvspace=8192
> > >> sysctl -w net.local.dgram.maxdgram=8192
> > >> sysctl -w net.inet.tcp.slowstart_flightsize=10
> > >> sysctl -w net.inet.tcp.path_mtu_discovery=0
> > >>
> > >> They're all tunable while the system is running.
> > >>
> > >> -Pierre
> > >>
> > >> On 4/28/2011 3:29 AM, Adam Stylinski wrote:
> > >>     
> > >>> Hello,
> > >>>
> > >>> I have an Intel gigabit network adapter (the 1000 GT w/chipset 82541PI) which performs poorly in FreeBSD compared to the same card in Linux.  I've tried this card in two different FreeBSD boxes and for whatever reason I get poor transmit performance.  I've done all of the tweaking specified in just about every guide out there (the usual TCP window scaling, larger nmbclusters, delayed ACKs, etc.) and still I get only around 600 Mbit/s.  I'm using jumbo frames, with an MTU of 9000.  I'm testing this with iperf.  While I realize that this may not be the most realistic test, Linux hosts with the same card can achieve 995 Mbit/s to another host running it.  When the FreeBSD box is the server, Linux hosts can transmit to it at around 800 Mbit/s.  I've increased the transmit descriptors as specified in the if_em man page, and while that gave me 20 or 30 more Mbit/s, my transmit performance is still below normal.
> > >>>
> > >>> sysctl stats report that the card is triggering a lot of tx_desc_fail2 events:
> > >>> 	dev.em.0.tx_desc_fail2: 3431
> > >>>
> > >>> Looking at a comment in the source code, this indicates that the card was not able to obtain enough transmit descriptors (even though I've given the card the maximum of 4096 via the loader.conf tunable).  Is this a bug or a performance regression of some kind?  Does anybody have a fix for this?  I tried another card with the same chip in a different box on 8-STABLE to no avail (the box I'm trying to improve performance on is on 8.2-RELEASE-p1).
> > >>>
> > >>> Has anybody managed to make this card push above 600 Mbit/s in ideal network benchmarks?  Any help would be greatly appreciated.
> > >>>       
> > >>
> > >>     
> > >
> > > I doubt I'm saturating the PCI bus; the only other thing on the bus is a really crappy PCI video card.  The same card in less powerful Linux machines (Pentium 4) is able to achieve much more throughput, so it's not likely a bus limitation.
> > >
> > > I adjusted the listed live sysctl tunables that I hadn't already set, and it didn't seem to have an effect.
> > >
> > >   
> > 
> > 
> > 
> I tweaked that, but it doesn't seem to help.  I'd rather not go down to a 1500-byte MTU just yet.  I may try that later, but most of my network is configured for jumbo frames.
> 

Do you get reasonable performance numbers with the standard MTU but
see poor performance with jumbo frames? If so, it indicates the
driver configured the PBA (packet buffer allocation) poorly for
jumbo frames. The data sheet does not give detailed information
about the PBA. It seems each controller has a different PBA size,
so correct configuration is very important to get the best
performance. I don't have an 82541 controller that shows this
problem, but Jack may be able to experiment with various PBA
configurations for jumbo frames in the lab.
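
Roughly speaking, the PBA register divides the controller's on-chip
packet buffer between the RX and TX FIFOs, and the driver picks the
split at reset time based on the frame size. Below is a minimal,
self-contained C sketch of that kind of sizing logic; the constant
names mirror the Intel shared code in sys/dev/e1000, but the exact
values and the per-controller split the 82541 actually needs are
assumptions, not verified settings:

/*
 * Sketch only: how a driver might size the PBA when jumbo frames are
 * enabled.  Constant values mirror the Intel shared code headers, but
 * which split the 82541 really needs is exactly the open question.
 */
#include <stdint.h>

#define E1000_PBA_48K	0x0030	/* 48KB of packet buffer for RX */
#define E1000_PBA_40K	0x0028	/* 40KB for RX, leaving more FIFO for TX */

static uint32_t
pick_pba(uint32_t max_frame_size)
{
	/*
	 * With 9K jumbo frames the TX side needs room for a full frame,
	 * so RX typically gets a smaller share of the on-chip buffer.
	 */
	if (max_frame_size > 8192)
		return (E1000_PBA_40K);
	return (E1000_PBA_48K);
}

The value chosen this way is written to the PBA register at reset;
getting the split wrong for a given controller tends to show up only
as degraded jumbo-frame throughput, which matches what you describe.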

> [adam@nasbox ~]$ sysctl kern.ipc.nmbjumbo9
> kern.ipc.nmbjumbo9: 12800
> 
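For what it's worth, to rule out cluster or descriptor exhaustion on
your side, something along these lines should show it. The 25600 below
is just an arbitrary example value, and the tunable names are from
memory of 8.x, so double-check them against tuning(7) and em(4):

# Look for denied requests and 9k jumbo cluster usage under load:
netstat -m | grep -E 'denied|9k'

# Raise the 9K jumbo cluster limit (at runtime, or in /boot/loader.conf
# if the running kernel refuses to change it):
sysctl kern.ipc.nmbjumbo9=25600

# Descriptor counts are boot-time loader tunables, e.g. in /boot/loader.conf:
hw.em.txd="4096"
hw.em.rxd="4096"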


