Date:      Sun, 9 Jun 2013 17:04:49 -0400
From:      George Neville-Neil <gnn@neville-neil.com>
To:        "hackers@freebsd.org" <hackers@freebsd.org>
Cc:        "devsummit@freebsd.org" <devsummit@freebsd.org>
Subject:   Network Receive Performance Working Group
Message-ID:  <8537DE82-46F4-4E11-AECA-42F118AB179F@neville-neil.com>

Howdy,

At the Network Receive Performance working group at BSDCan we covered a
narrower set of topics than we normally do, which seems to have resulted
in a reasonably sized work list for improving our systems in this area.
The main issues relate to getting a good API that addresses multi-queue
NICs.  The notes are on the wiki page and are also reproduced here.

Best,
George

https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance

The discussion opened with an attempt to constrain the problem we were
trying to solve, including pointing out that any KPI/API suggested
needed to be achievable in the next six months.

Some of the existing solutions to the problem of talking to hardware
with multiple queues, which all high-end NICs currently have, were:

	• Connection Groups
		• Not really a KPI
		• RSS vs. Flow Table is an issue to solve; we have things for the former, but little for the latter
		• Socket affinity is also an issue
	• NAPI
		• This is an API in Linux. It uses upcalls.
	• Flow table mapping. Chelsio may have some of this.
	• SRIO
	• VLL Cloner

There are several ways to map flows, including: 4-tuple, MAC filter, and
arbitrary offset. An API that only handles offset, length, value is too
simple from the standpoint of getting the right data into the hardware.
We need something richer on the kernel side of the API so that driver
writers don't have to figure out our intentions.
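One way to picture the difference between a typed flow description and a bare offset/length/value match is the sketch below. It is only an illustration under stated assumptions: `struct flow_match`, `flow_match_frame()` and every field name are hypothetical and were not proposed at the meeting, and the 4-tuple case assumes an untagged Ethernet II frame carrying IPv4. The point it tries to show is that the port offset depends on the IPv4 header length, so a fixed-offset match can miss a flow that a typed 4-tuple match still finds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flow description; all names here are illustrative only. */
enum flow_match_kind { FLOW_MATCH_4TUPLE, FLOW_MATCH_MAC, FLOW_MATCH_OFFSET };

struct flow_match {
	enum flow_match_kind kind;
	union {
		struct {			/* all fields in network byte order */
			uint8_t src_ip[4], dst_ip[4];
			uint8_t src_port[2], dst_port[2];
			uint8_t proto;		/* IPv4 protocol number, e.g. 6 for TCP */
		} tuple4;
		struct { uint8_t dst_addr[6]; } mac;
		struct { uint16_t offset, length; uint8_t value[16]; } raw;
	} u;
	unsigned rx_queue;			/* queue the flow should be steered to */
};

/* Software model of the match; returns 1 if the frame belongs to the flow. */
int
flow_match_frame(const struct flow_match *m, const uint8_t *f, size_t len)
{
	size_t ihl, l4;

	assert(m != NULL && f != NULL);
	switch (m->kind) {
	case FLOW_MATCH_MAC:
		/* Destination MAC is the first six bytes of the frame. */
		return (len >= 6 && memcmp(f, m->u.mac.dst_addr, 6) == 0);
	case FLOW_MATCH_OFFSET:
		if (m->u.raw.length > sizeof(m->u.raw.value) ||
		    (size_t)m->u.raw.offset + m->u.raw.length > len)
			return (0);
		return (memcmp(f + m->u.raw.offset, m->u.raw.value,
		    m->u.raw.length) == 0);
	case FLOW_MATCH_4TUPLE:
		if (len < 14 + 20 || f[12] != 0x08 || f[13] != 0x00)
			return (0);		/* too short or not IPv4 */
		if ((f[14] >> 4) != 4 || f[23] != m->u.tuple4.proto)
			return (0);
		ihl = (size_t)(f[14] & 0x0f) * 4;	/* header length varies */
		l4 = 14 + ihl;
		if (ihl < 20 || len < l4 + 4)
			return (0);
		return (memcmp(f + 26, m->u.tuple4.src_ip, 4) == 0 &&
		    memcmp(f + 30, m->u.tuple4.dst_ip, 4) == 0 &&
		    memcmp(f + l4, m->u.tuple4.src_port, 2) == 0 &&
		    memcmp(f + l4 + 2, m->u.tuple4.dst_port, 2) == 0);
	}
	return (0);
}
```

With a typed description like this, a driver could translate the 4-tuple into whatever its hardware filter format needs, instead of reverse-engineering intent from raw byte offsets.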

Some methods that a good KPI/API ought to have include:

	• Query device for information about its queues, including how many exist and how they are mapped to other resources, including CPU and memory
	• Map CPUID to a Flow
	• Set up RSS
	• Request RxRing local memory
	• Solaris Mapping API might be a way to go (http://www.oracle.com/technetwork/articles/servers-storage-admin/crossbowsetup-191326.pdf)
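As a rough sketch of what the "query device" method in the list above might look like, the fragment below models a device that reports how many receive queues it has and which CPU and memory domain each one is tied to. Everything in it is an assumption made for illustration: `struct rxq_info`, `struct nic_dev`, `nic_query_rxqueues()` and `nic_rxqueue_for_cpu()` are hypothetical names, not an existing FreeBSD KPI, and the device is a plain in-memory table rather than real hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-queue description returned by a query call. */
struct rxq_info {
	unsigned queue_id;	/* index of the receive queue */
	int	 cpu;		/* CPU servicing the queue's interrupt thread */
	int	 mem_domain;	/* memory domain backing the RX ring */
	unsigned ring_slots;	/* number of descriptors in the ring */
};

/* Hypothetical device handle; a real driver would own this table. */
struct nic_dev {
	const char	      *name;
	size_t		       nqueues;
	const struct rxq_info *queues;
};

/*
 * Copy up to 'max' queue descriptions into 'out' and return the total
 * number of queues, so a caller can pass max == 0 to size its buffer.
 */
size_t
nic_query_rxqueues(const struct nic_dev *dev, struct rxq_info *out, size_t max)
{
	size_t i, n;

	assert(dev != NULL);
	n = dev->nqueues < max ? dev->nqueues : max;
	for (i = 0; out != NULL && i < n; i++)
		out[i] = dev->queues[i];
	return (dev->nqueues);
}

/* Return the id of the first queue bound to 'cpu', or -1 if none is. */
int
nic_rxqueue_for_cpu(const struct nic_dev *dev, int cpu)
{
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < dev->nqueues; i++)
		if (dev->queues[i].cpu == cpu)
			return ((int)dev->queues[i].queue_id);
	return (-1);
}
```

Returning the total count regardless of the buffer size is one common convention for this kind of query; a real interface might equally use an ioctl or sysctl, which the notes do not decide.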

Some consumers of such an API include: Performance, affinity,
virtualization, policy, kernel bypass, QoS, and VIMAGE.

We have two patches, for different bits, to start from, including Vijay's
[RobertWatson] and Randall's [RandallStewart], [GeorgeNevilleNeil].

We need quite a few things, including:

	• Per-connection flow table
	• Describing queues in the stack such that we can expose interesting parts via netstat.
	• Packet Batching. This was not overwhelmingly popular.

A straw person API includes:

	• MBUF Flag
	• Hash Value
	• The whole thing may be used as opaque
		• Used by the stack for inpcb
	• Get number of buckets
	• Map bucket to RSS
	• Map queue/ithread to CPU
	• Get width of the hash
	• RSS get CPU
	• RSS get hash algo
	• Pick hash inputs
	• Get and set key
	• Rebalance
	• Software hash table
	• Query queue length
	• Get queue affinity
	• Set mask (CPUSET) on socket
	• Set policy on CPU/socket
	• Queue event reporting
	• Load distribution stats
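
Several of the items above (hash value, get number of buckets, get and set key, RSS get CPU, rebalance) fit together in a small software model, sketched here. This is not the API from the meeting: the function and type names (`rss_toeplitz`, `struct rss_map`, `rss_get_cpu`, `rss_rebalance`) and the bucket count of 128 are assumptions chosen for illustration. The hash is a straightforward software Toeplitz hash, the algorithm commonly used for RSS, and the key is the sample key widely published in Microsoft's RSS documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RSS_KEY_LEN	40
#define RSS_NBUCKETS	128	/* assumed; must be a power of two */

/* Sample 40-byte key as published in Microsoft's RSS documentation. */
const uint8_t rss_sample_key[RSS_KEY_LEN] = {
	0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
	0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
	0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
	0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
	0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa
};

/*
 * Toeplitz hash: for every set bit of the input, taken most significant
 * bit first, XOR in the 32-bit window of the key that starts at that
 * bit position.
 */
uint32_t
rss_toeplitz(const uint8_t *key, size_t keylen, const uint8_t *data,
    size_t datalen)
{
	uint32_t hash = 0, window;
	size_t i;
	int b;

	assert(keylen >= 4);
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
	    ((uint32_t)key[2] << 8) | key[3];
	for (i = 0; i < datalen; i++) {
		for (b = 7; b >= 0; b--) {
			if (data[i] & (1u << b))
				hash ^= window;
			/* Slide the window one bit along the key. */
			window <<= 1;
			if (i + 4 < keylen && (key[i + 4] & (1u << b)))
				window |= 1;
		}
	}
	return (hash);
}

/* Hypothetical indirection table: hash -> bucket -> CPU. */
struct rss_map {
	uint8_t bucket_cpu[RSS_NBUCKETS];
};

/* Spread buckets round-robin over 'ncpus' CPUs. */
void
rss_map_init(struct rss_map *map, unsigned ncpus)
{
	unsigned i;

	assert(ncpus > 0 && ncpus <= 256);
	for (i = 0; i < RSS_NBUCKETS; i++)
		map->bucket_cpu[i] = (uint8_t)(i % ncpus);
}

unsigned
rss_bucket(uint32_t hash)
{
	return (hash & (RSS_NBUCKETS - 1));
}

unsigned
rss_get_cpu(const struct rss_map *map, uint32_t hash)
{
	return (map->bucket_cpu[rss_bucket(hash)]);
}

/* "Rebalance" in its simplest form: move one bucket to another CPU. */
void
rss_rebalance(struct rss_map *map, unsigned bucket, unsigned cpu)
{
	assert(bucket < RSS_NBUCKETS && cpu < 256);
	map->bucket_cpu[bucket] = (uint8_t)cpu;
}
```

Keeping the bucket table separate from the hash means load can be shifted between CPUs by rewriting a single table entry, without changing the key or rehashing existing connections.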



