Date:      Mon, 14 Sep 2015 11:29:00 +0200
From:      Hans Petter Selasky <hps@selasky.org>
To:        Roger Pau Monné <royger@FreeBSD.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r271946 - in head/sys: dev/oce dev/vmware/vmxnet3 dev/xen/netfront kern net netinet ofed/drivers/net/mlx4 sys
Message-ID:  <55F6935C.9000000@selasky.org>
In-Reply-To: <55F69093.5050807@FreeBSD.org>
References:  <201409220827.s8M8RRHB031526@svn.freebsd.org> <55F69093.5050807@FreeBSD.org>

On 09/14/15 11:17, Roger Pau Monné wrote:
> On 22/09/14 at 10:27, Hans Petter Selasky wrote:
>> Author: hselasky
>> Date: Mon Sep 22 08:27:27 2014
>> New Revision: 271946
>> URL: http://svnweb.freebsd.org/changeset/base/271946
>>
>> Log:
>>    Improve transmit sending offload, TSO, algorithm in general.
>>
>>    The current TSO limitation feature only takes the total number of
>>    bytes in an mbuf chain into account and does not limit by the number
>>    of mbufs in a chain. Some kinds of hardware are limited by two
>>    factors. One is the fragment length and the second is the fragment
>>    count. Both of these limits need to be taken into account when doing
>>    TSO. Else some kinds of hardware might have to drop completely valid
>>    mbuf chains because they cannot be loaded into the given hardware's DMA
>>    engine. The new way of doing TSO limitation has been made backwards
>>    compatible as input from other FreeBSD developers and will use
>>    defaults for values not set.
>>
>>    Reviewed by:	adrian, rmacklem
>>    Sponsored by:	Mellanox Technologies
>
> This commit makes xen-netfront tx performance drop from ~5Gbits/sec
> (with debug options enabled) to 446 Mbits/sec. I'm currently looking,
> but if anyone has ideas they are welcome.
>

Hi Roger,

Looking at the netfront code, you should subtract 1 from tsomaxsegcount
on revisions prior to r287775. The reason for the slowdown might simply
be that 2K clusters are used instead of 4K clusters, causing m_defrag()
to be called.

>         ifp->if_hw_tsomax = 65536 - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN);
>         ifp->if_hw_tsomaxsegcount = MAX_TX_REQ_FRAGS;
>         ifp->if_hw_tsomaxsegsize = PAGE_SIZE;
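
To make the 2K-versus-4K cluster point concrete, here is a minimal sketch of the arithmetic only. It is not netfront code: the names SKETCH_PAGE_SIZE, SKETCH_MAX_FRAGS, bufs_needed() and exceeds_seg_limit() are hypothetical, and the values 4096 and 18 are assumptions standing in for PAGE_SIZE and MAX_TX_REQ_FRAGS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real values come from the platform and driver. */
#define SKETCH_PAGE_SIZE 4096u /* assumed PAGE_SIZE */
#define SKETCH_MAX_FRAGS 18u   /* assumed value of MAX_TX_REQ_FRAGS */

/* Number of buffers of buf_size bytes needed to hold payload bytes. */
static size_t
bufs_needed(size_t payload, size_t buf_size)
{
	assert(buf_size > 0);
	return (payload + buf_size - 1) / buf_size;
}

/*
 * Nonzero if a chain built from equal-sized buffers has more segments
 * than the limit allows, so it would have to be compacted before it
 * could be handed to the device.
 */
static int
exceeds_seg_limit(size_t payload, size_t buf_size, size_t max_segs)
{
	return bufs_needed(payload, buf_size) > max_segs;
}
```

Under these assumptions, a 65536-byte chain needs 32 buffers when built from 2K clusters, which is over an 18-segment limit, but only 16 buffers when each segment holds a full 4K page, which fits.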

After r287775, can you try these settings:

ifp->if_hw_tsomax = 65536;
ifp->if_hw_tsomaxsegcount = MAX_TX_REQ_FRAGS;
ifp->if_hw_tsomaxsegsize = PAGE_SIZE;

And see whether the performance is the same as before?

Thank you!

--HPS


