Date:      Tue, 16 Aug 2011 11:56:10 +0200
From:      Vlad Galu <dudu@dudu.ro>
To:        Vlad Galu <dudu@dudu.ro>
Cc:        Takuya ASADA <syuu@dokukino.com>, net@freebsd.org
Subject:   Re: Multiqueue support for bpf
Message-ID:  <0BB87D28-3094-422D-8262-5FA0E40BFC7C@dudu.ro>
In-Reply-To: <2AB05A3E-BDC3-427D-B4A7-ABDDFA98D194@dudu.ro>
References:  <CALG4x-VwhLmnh+Rq0T8zdzp=yMD8o_WQ64_eqzc_dEhF-_mrGA@mail.gmail.com> <2AB05A3E-BDC3-427D-B4A7-ABDDFA98D194@dudu.ro>

On Aug 16, 2011, at 11:50 AM, Vlad Galu wrote:
> On Aug 16, 2011, at 11:13 AM, Takuya ASADA wrote:
>> Hi all,
>>
>> I implemented multiqueue support for bpf; I'd like to present it for review.
>> This is a Google Summer of Code project; the goal is to support
>> multiqueue network interfaces in BPF, and to provide interfaces
>> for multithreaded packet processing using BPF.
>> Modern high-performance NICs have multiple receive/send queues and an
>> RSS feature, which allows packets to be processed concurrently on
>> multiple processors.
>> The main purpose of the project is to support such hardware and to
>> benefit from the parallelism.
>>
>> This provides the following new APIs:
>> - queue filter for each bpf descriptor (bpf ioctl), sketched below:
>>   - BIOCENAQMASK     Enable the multiqueue filter on the descriptor
>>   - BIOCDISQMASK     Disable the multiqueue filter on the descriptor
>>   - BIOCSTRXQMASK    Set the mask bit for the specified RX queue
>>   - BIOCCRRXQMASK    Clear the mask bit for the specified RX queue
>>   - BIOCGTRXQMASK    Get the mask bit for the specified RX queue
>>   - BIOCSTTXQMASK    Set the mask bit for the specified TX queue
>>   - BIOCCRTXQMASK    Clear the mask bit for the specified TX queue
>>   - BIOCGTTXQMASK    Get the mask bit for the specified TX queue
>>   - BIOCSTOTHERMASK  Set the mask bit for packets not tied to any queue
>>   - BIOCCROTHERMASK  Clear the mask bit for packets not tied to any queue
>>   - BIOCGTOTHERMASK  Get the mask bit for packets not tied to any queue
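>>
>> A rough per-thread usage sketch (simplified, not taken verbatim from
>> the patch; the u_int queue-index argument is an assumption, see the
>> patched net/bpf.h for the real definitions):
>>
>> #include <sys/types.h>
>> #include <sys/ioctl.h>
>> #include <sys/socket.h>
>> #include <net/if.h>
>> #include <net/bpf.h>    /* patched headers providing BIOC*QMASK */
>> #include <err.h>
>> #include <fcntl.h>
>> #include <string.h>
>>
>> int
>> main(void)
>> {
>>         struct ifreq ifr;
>>         u_int qidx = 0;         /* RX queue this thread services */
>>         int fd;
>>
>>         if ((fd = open("/dev/bpf", O_RDONLY)) == -1)
>>                 err(1, "open(/dev/bpf)");
>>
>>         memset(&ifr, 0, sizeof(ifr));
>>         strlcpy(ifr.ifr_name, "ix0", sizeof(ifr.ifr_name));
>>         if (ioctl(fd, BIOCSETIF, &ifr) == -1)
>>                 err(1, "BIOCSETIF");
>>
>>         /* Enable per-queue filtering on this descriptor. */
>>         if (ioctl(fd, BIOCENAQMASK, NULL) == -1)
>>                 err(1, "BIOCENAQMASK");
>>
>>         /* Accept packets from RX queue 0 only. */
>>         if (ioctl(fd, BIOCSTRXQMASK, &qidx) == -1)
>>                 err(1, "BIOCSTRXQMASK");
>>
>>         /* A read(2) loop would follow; one descriptor per thread. */
>>         return (0);
>> }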
>>
>> - generic interface for getting hardware queue information from the NIC
>> driver (socket ioctl), sketched below:
>>   - SIOCGIFQLEN          Get interface RX/TX queue length
>>   - SIOCGIFRXQAFFINITY   Get interface RX queue affinity
>>   - SIOCGIFTXQAFFINITY   Get interface TX queue affinity
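>>
>> For example (a hypothetical sketch: the request structure and its
>> field names below are illustrative assumptions, the patch defines the
>> real ones):
>>
>> #include <sys/types.h>
>> #include <sys/ioctl.h>
>> #include <sys/socket.h>
>> #include <net/if.h>
>> #include <err.h>
>> #include <stdio.h>
>> #include <string.h>
>> #include <unistd.h>
>>
>> /* Assumed layout for the SIOCGIFQLEN request; see the patch. */
>> struct ifqlenreq {
>>         char    ifqr_name[IFNAMSIZ];
>>         int     ifqr_rxqlen;    /* number of RX queues */
>>         int     ifqr_txqlen;    /* number of TX queues */
>> };
>>
>> int
>> main(void)
>> {
>>         struct ifqlenreq qr;
>>         int s;
>>
>>         if ((s = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
>>                 err(1, "socket");
>>         memset(&qr, 0, sizeof(qr));
>>         strlcpy(qr.ifqr_name, "ix0", sizeof(qr.ifqr_name));
>>         if (ioctl(s, SIOCGIFQLEN, &qr) == -1)
>>                 err(1, "SIOCGIFQLEN");
>>         printf("ix0: %d RX queues, %d TX queues\n",
>>             qr.ifqr_rxqlen, qr.ifqr_txqlen);
>>         close(s);
>>         return (0);
>> }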
>>
>> A patch for -CURRENT is here; right now it only supports igb(4),
>> ixgbe(4), and mxge(4):
>> http://www.dokukino.com/mq_bpf_20110813.diff
>>
>> And below are the performance benchmarks:
>>
>> ====
>> I implemented benchmark programs based on
>> bpfnull(//depot/projects/zcopybpf/utils/bpfnull/),
>>
>> test_sqbpf measures bpf throughput on one thread, without using multiqueue APIs.
>> http://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/soc2011/mq_bpf/src/tools/regression/bpf/mq_bpf/test_sqbpf/test_sqbpf.c
>>
>> test_mqbpf is a multithreaded version of test_sqbpf, using the multiqueue APIs.
>> http://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/soc2011/mq_bpf/src/tools/regression/bpf/mq_bpf/test_mqbpf/test_mqbpf.c
>>
>> I benchmarked under six conditions:
>> - benchmark1 only reads from bpf, doesn't write packets anywhere
>> - benchmark2 writes packets to memory (mfs)
>> - benchmark3 writes packets to hdd (zfs)
>> - benchmark4 only reads from bpf, doesn't write packets anywhere, with zerocopy
>> - benchmark5 writes packets to memory (mfs), with zerocopy
>> - benchmark6 writes packets to hdd (zfs), with zerocopy
>>
>> From the benchmark results, I can say the performance is increased using
>> mq_bpf on 10GbE, but not on GbE.
>>
>> * Throughput benchmark
>> - Test environment
>> - FreeBSD node
>>  CPU: Core i7 X980 (12 threads)
>>  MB: ASUS P6X58D Premium (Intel X58)
>>  NIC1: Intel Gigabit ET Dual Port Server Adapter (82576)
>>  NIC2: Intel Ethernet X520-DA2 Server Adapter (82599)
>> - Linux node
>>  CPU: Core 2 Quad (4 threads)
>>  MB: GIGABYTE GA-G33-DS3R (Intel G33)
>>  NIC1: Intel Gigabit ET Dual Port Server Adapter (82576)
>>  NIC2: Intel Ethernet X520-DA2 Server Adapter (82599)
>>
>> iperf was used to generate network traffic, with the following options:
>>  - Linux node: iperf -c [IP] -i 10 -t 100000 -P12
>>  - FreeBSD node: iperf -s
>>  # 12 threads, TCP
>>
>> the following sysctl parameter was changed:
>>  sysctl -w net.bpf.maxbufsize=1048576
>
>
> Thank you for your work! You may want to increase that (4x/8x) and rerun the test, though.

More, actually. Your current buffer is easily filled.
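
For instance (an untested sketch, just to illustrate the two knobs: the
global net.bpf.maxbufsize cap and the per-descriptor buffer requested
with BIOCSBLEN, which has to be set before attaching the interface):

#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/sysctl.h>
#include <net/bpf.h>
#include <err.h>
#include <fcntl.h>

int
main(void)
{
        u_int bufsize = 8 * 1048576;    /* 8x the value used in the test */
        int fd;

        /* Raise the global cap (needs root). */
        if (sysctlbyname("net.bpf.maxbufsize", NULL, NULL,
            &bufsize, sizeof(bufsize)) == -1)
                err(1, "sysctlbyname");

        if ((fd = open("/dev/bpf", O_RDONLY)) == -1)
                err(1, "open(/dev/bpf)");

        /* Ask for the larger buffer before BIOCSETIF. */
        if (ioctl(fd, BIOCSBLEN, &bufsize) == -1)
                err(1, "BIOCSBLEN");

        /* BIOCSETIF, filter setup and the read loop would follow. */
        return (0);
}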


