Date:      Tue, 12 Aug 2014 23:13:49 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        Wei Hu <weh@microsoft.com>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, "d@delphij.net" <d@delphij.net>
Subject:   Re: vRSS support on FreeBSD
Message-ID:  <CAJ-VmokFTP64C84_a+mDu_YW0y3yY37tJfjpyMNHJ1tzL_kEEg@mail.gmail.com>
In-Reply-To: <dd994385fc5e4fa49847828a78c204c8@BY1PR0301MB0902.namprd03.prod.outlook.com>
References:  <184b69414bd246eeacc0d4234a730f2f@BY1PR0301MB0902.namprd03.prod.outlook.com> <CAJ-VmokJ8G-mz+L=zJkbQnCHHFBcvqhHjCrHjoWS5i9ViF5qrw@mail.gmail.com> <e514d7fdb89c4d2988dfb4300c633a0c@BY1PR0301MB0902.namprd03.prod.outlook.com> <CAJ-Vmonr6qP=Gwu600bwnQg0o74-V491gr_1u1uJ0qnp=wZ_Zg@mail.gmail.com> <dd994385fc5e4fa49847828a78c204c8@BY1PR0301MB0902.namprd03.prod.outlook.com>

Hi!

Is there a spec for this stuff floating around somewhere?

What do other platforms do for receive/transmit affinity on hyperv?



-a


On 12 August 2014 21:08, Wei Hu <weh@microsoft.com> wrote:
> Hi Adrian,
>
> The send mapping table is an array with a fixed number of elements, say VRSS_TAB_SIZE. Each entry holds the tx queue number on which a TX packet should be sent, so vCPU = Send_table[hash_value % VRSS_TAB_SIZE % number_of_tx_queues] is the way to choose the tx queue. Send_table is updated by the host every few minutes (on a busy system) or every few hours (on a lightly loaded system).
>
> Since the vNIC doesn't give the guest VM the hash value for an rx packet, I am thinking maybe I can put the rx queue number in the m_pkthdr.flowid of the mbuf on the receiving path. The queue number will then be carried in the mbuf on the sending path. This way we choose the same queue to send the packet, and we don't need to calculate the hash value in software.
>
> The other way is calculating the hash value on the send path and choosing the tx queue based on the send table, letting the host decide which queue to send on (since the send table is given by the host).
>
> I may implement both and see which one has better performance.
>
> Thanks,
> Wei
>
>
>
> -----Original Message-----
> From: adrian.chadd@gmail.com [mailto:adrian.chadd@gmail.com] On Behalf Of Adrian Chadd
> Sent: Tuesday, August 12, 2014 2:27 AM
> To: Wei Hu
> Cc: d@delphij.net; freebsd-net@freebsd.org
> Subject: Re: vRSS support on FreeBSD
>
> On 11 August 2014 02:48, Wei Hu <weh@microsoft.com> wrote:
>> CC freebsd-net@ for wider discussion.
>>
>> Hi Adrian,
>>
>> Many thanks for the explanation.  I checked if_igb.c and found the flowid field was set on the RX side in igb_rxeof():
>>
>> igb_rxeof()
>> {
>>         ...
>> #ifdef  RSS
>>                         /* XXX set flowtype once this works right */
>>                         rxr->fmp->m_pkthdr.flowid =
>>                             le32toh(cur->wb.lower.hi_dword.rss);
>>                         rxr->fmp->m_flags |= M_FLOWID;
>>                         ...
>> }
>>
>> I have two questions regarding this.
>>
>> 1. Is the RSS hash value stored in cur->wb.lower.hi_dword.rss set by the NIC hardware?
>
> Yup.
>
>> 2. So the hash value and m_flags are stored in the mbuf for the received packet on the rx side (igb_rxeof()). But we check the hash value and m_flags in the mbuf for the packet being sent on the tx side (in igb_mq_start()). Does the kernel re-use the same mbuf for tx? If so, how does it know that for the same network stream it should use the same mbuf it got from rx for packet sending? If not, how does the kernel preserve the same hash value across the rx mbuf and tx mbuf for the same network stream? This seems quite magical to me.
>
> The mbuf flowid/flowtype ends up in inpcb->inp_flowid / inpcb->inp_flowtype as part of the TCP receive path.
>
> Then whenever the TCP code outputs an mbuf, it copies the inpcb flow details out to the outbound mbufs.
>
>>
>> For the Hyper-V case, the host controls which vCPU it wants to interrupt, and the rule can change dynamically based on load. For a non-busy VM, the host will send most packets to the same vCPU for power-saving purposes. For a busy VM, the host will distribute the packets evenly across all vCPUs. This means the host can change the RSS bucket mapping dynamically. Hyper-V does this by sending a mapping table to the VM whenever it needs an update. This also means we cannot use FreeBSD's own bucket mapping, which I believe is fixed. Also, Hyper-V uses its own hash key. So do you think it is possible we could still use the existing RSS infrastructure built into FreeBSD for this purpose?
>
> Eventually. Doing rebalancing in RSS is on the TODO list, after I get the rest of the basic packet handling / routing done.
>
> How does vRSS notify the VM that the mapping table has changed? What does its format look like?
>
>
> -a


