Date:      Wed, 19 Aug 2015 08:13:59 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        pyunyh@gmail.com
Cc:        Hans Petter Selasky <hps@selasky.org>,  FreeBSD stable <freebsd-stable@freebsd.org>,  FreeBSD Net <freebsd-net@freebsd.org>,  Slawa Olhovchenkov <slw@zxy.spb.ru>,  Christopher Forgeron <csforgeron@gmail.com>,  Daniel Braniss <danny@cs.huji.ac.il>
Subject:   Re: ix(intel) vs mlxen(mellanox) 10Gb performance
Message-ID:  <1154739904.25677089.1439986439408.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <20150819081308.GC964@michelle.fasterthan.com>
References:  <473274181.23263108.1439814072514.JavaMail.zimbra@uoguelph.ca> <1721122651.24481798.1439902381663.JavaMail.zimbra@uoguelph.ca> <55D333D6.5040102@selasky.org> <1325951625.25292515.1439934848268.JavaMail.zimbra@uoguelph.ca> <55D429A4.3010407@selasky.org> <20150819074212.GB964@michelle.fasterthan.com> <55D43590.8050508@selasky.org> <20150819081308.GC964@michelle.fasterthan.com>

Yonghyeon PYUN wrote:
> On Wed, Aug 19, 2015 at 09:51:44AM +0200, Hans Petter Selasky wrote:
> > On 08/19/15 09:42, Yonghyeon PYUN wrote:
> > >On Wed, Aug 19, 2015 at 09:00:52AM +0200, Hans Petter Selasky wrote:
> > >>On 08/18/15 23:54, Rick Macklem wrote:
> > >>>Ouch! Yes, I now see that the code that counts the # of mbufs is before
> > >>>the code that adds the tcp/ip header mbuf.
> > >>>
> > >>>In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to
> > >>>whatever the driver provides - 1. It is not the driver's responsibility
> > >>>to know if a tcp/ip header mbuf will be added, and this is a lot less
> > >>>confusing than expecting the driver author to know to subtract one.
> > >>>(I had mistakenly thought that tcp_output() had added the tcp/ip header
> > >>>mbuf before the loop that counts mbufs in the list. Btw, this tcp/ip
> > >>>header mbuf also has leading space for the MAC layer header.)
> > >>>
> > >>
> > >>Hi Rick,
> > >>
> > >>Your question is good. With the Mellanox hardware we have separate
> > >>so-called inline data space for the TCP/IP headers, so if the TCP stack
> > >>subtracts something, then we would need to add something to the limit,
> > >>because then the scatter gather list is only used for the data part.
> > >>
> > >
> > >I think all drivers in tree don't subtract 1 for
> > >if_hw_tsomaxsegcount.  Probably touching Mellanox driver would be
> > >simpler than fixing all other drivers in tree.
> > 
> > Hi,
> > 
> > If you change the behaviour don't forget to update and/or add comments
> > describing it. Maybe the amount of subtraction could be defined by some
> > macro? Then drivers which inline the headers can subtract it?
> > 
> 
> I'm also ok with your suggestion.
> 
> > Your suggestion is fine by me.
> > 
> 
> > The initial TSO limits were tried to be preserved, and I believe that
> > TSO limits never accounted for IP/TCP/ETHERNET/VLAN headers!
> > 
> 
> I guess FreeBSD used to follow MS LSOv1 specification with minor
> exception in pseudo checksum computation. If I recall correctly the
> specification says upper stack can generate up to IP_MAXPACKET sized
> packet.  The size of L2 headers such as the ethernet/vlan header is not
> included in the packet, and it's the driver's responsibility to allocate
> additional DMA buffers/segments for L2 headers.
> 
Yep. The default for if_hw_tsomax was reduced from IP_MAXPACKET to
  32 * MCLBYTES - max_ethernet_header_size
as a workaround/hack so that devices limited to 32 transmit segments would
work (i.e. the entire packet, including the MAC header, would fit in 32
MCLBYTES-sized clusters). This meant that many drivers ended up using
m_defrag() to copy the mbuf list to one made up of 32 MCLBYTES-sized clusters.
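
To make the arithmetic concrete, here is a minimal sketch of that default.
The numeric values (MCLBYTES = 2048, a 14 byte ethernet header plus a 4 byte
vlan tag, IP_MAXPACKET = 65535) are assumed typical values and are not taken
from this thread, and the function names are made up for illustration:

```c
/*
 * Assumed values for illustration only; check the headers of the
 * system in question before relying on them.
 */
#define MCLBYTES             2048   /* assumed size of one mbuf cluster */
#define IP_MAXPACKET         65535  /* largest IP datagram */
#define ETHER_HDR_LEN        14     /* assumed ethernet header size */
#define ETHER_VLAN_ENCAP_LEN 4      /* assumed vlan tag size */
#define MAX_ETHER_HDR        (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN)
#define MAX_TX_SEGS          32     /* segment limit of the constrained devices */

/* Hypothetical helper: the reduced default for if_hw_tsomax. */
static unsigned int
default_tsomax(void)
{
	return (MAX_TX_SEGS * MCLBYTES - MAX_ETHER_HDR);
}

/* Hypothetical helper: clusters needed to hold len bytes after a defrag. */
static unsigned int
clusters_needed(unsigned int len)
{
	return ((len + MCLBYTES - 1) / MCLBYTES);
}
```

With these assumed values the default works out to 65518 bytes, so the TSO
payload plus the MAC header is exactly 65536 bytes and fits in 32 clusters,
whereas an IP_MAXPACKET sized packet plus the MAC header would need 33.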

If a driver sets if_hw_tsomaxsegcount correctly, then it can set if_hw_tsomax
to the largest TSO packet (without MAC header) that the hardware can handle.
If the hardware can handle > IP_MAXPACKET, then the driver can set it to that.
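
As a rough sketch of what "correctly" could mean for the segment count,
given the header mbuf issue quoted above: the stack would reserve one
segment for the tcp/ip header mbuf unless the hardware carries the headers
outside the scatter/gather list (as described for the Mellanox case). The
macro and function names below are hypothetical and this is not the actual
tcp_output() code:

```c
#include <assert.h>

/* Hypothetical: segments reserved for the tcp/ip header mbuf. */
#define TSO_HDR_SEGS 1

/*
 * Hypothetical helper: number of data mbufs that may be chained for one
 * TSO packet. hw_maxsegs is the hardware scatter/gather limit;
 * headers_inlined is nonzero when the headers do not use an S/G entry.
 */
static unsigned int
tso_data_segs(unsigned int hw_maxsegs, int headers_inlined)
{
	if (headers_inlined)
		return (hw_maxsegs);
	/* The limit must leave room for at least one data segment. */
	assert(hw_maxsegs > TSO_HDR_SEGS);
	return (hw_maxsegs - TSO_HDR_SEGS);
}
```

For a device limited to 32 segments this gives 31 data mbufs when the header
mbuf uses a segment, and 32 when the headers are inlined.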

rick

> > >
> > >>Maybe it can be controlled by some kind of flag, if all the three TSO
> > >>limits should include the TCP/IP/ethernet headers too. I'm pretty sure
> > >>we want both versions.
> > >>
> > >
> > >Hmm, I'm afraid it's already complex.  Drivers have to tell almost
> > >the same information to both bus_dma(9) and network stack.
> > 
> > You're right it's complicated. Not sure if bus_dma can provide an API
> > for this though.
> > 
> > --HPS