Date:      Tue, 24 Sep 2013 01:46:46 +0300
From:      Sami Halabi <sodynet1@gmail.com>
To:        "Alexander V. Chernikov" <melifaro@yandex-team.ru>
Cc:        Adrian Chadd <adrian@freebsd.org>, Andre Oppermann <andre@freebsd.org>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>, "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>, Luigi Rizzo <luigi@freebsd.org>, "Andrey V. Elsukov" <ae@freebsd.org>, FreeBSD Net <net@freebsd.org>
Subject:   Re: Network stack changes
Message-ID:  <CAEW+ogZttyScUBQQWht+YGfLEDU_APcoRyYeMy_wDseAcZwVnA@mail.gmail.com>
In-Reply-To: <523F4F14.9090404@yandex-team.ru>
References:  <521E41CB.30700@yandex-team.ru> <CAJ-Vmo=N=HnZVCD41ZmDg2GwNnoa-tD0J0QLH80x=f7KA5d+Ug@mail.gmail.com> <523F4F14.9090404@yandex-team.ru>

Hi,
> http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
> http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff
I've tried the diff on 10-CURRENT; it applied cleanly, but I got errors
compiling the new kernel... Is there any ongoing work to get it building?
I'd love to test it.

Sami


On Sun, Sep 22, 2013 at 11:12 PM, Alexander V. Chernikov <
melifaro@yandex-team.ru> wrote:

> On 29.08.2013 15:49, Adrian Chadd wrote:
>
>> Hi,
>>
> Hello Adrian!
> I'm very sorry for the looong delay in replying.
>
>
>
>> There's a lot of good stuff to review here, thanks!
>>
>> Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless to
>> keep locking things like that on a per-packet basis. We should be able to
>> do this in a cleaner way - we can defer RX into a CPU pinned taskqueue and
>> convert the interrupt handler to a fast handler that just schedules that
>> taskqueue. We can ignore the ithread entirely here.
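
A minimal sketch of that shape, using taskqueue(9). The ix_rxq layout and
the taskqueue_start_threads_cpuset() pinning call are illustrative
assumptions (the latter only exists in newer trees), not the actual
ixgbe(4) code:

    /*
     * Sketch: fast interrupt filter that defers all RX work to a
     * CPU-pinned taskqueue, so no per-packet locking happens in the
     * interrupt path.
     */
    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/bus.h>
    #include <sys/cpuset.h>
    #include <sys/errno.h>
    #include <sys/interrupt.h>
    #include <sys/malloc.h>
    #include <sys/priority.h>
    #include <sys/taskqueue.h>

    struct ix_rxq {
        struct taskqueue *rxq_tq;
        struct task       rxq_task;
        /* ... RX ring state ... */
    };

    /* Fast filter: no per-packet work, just kick the taskqueue. */
    static int
    ix_rxq_filter(void *arg)
    {
        struct ix_rxq *rxq = arg;

        taskqueue_enqueue(rxq->rxq_tq, &rxq->rxq_task);
        return (FILTER_HANDLED);
    }

    /* Taskqueue handler: drain the RX ring outside interrupt context. */
    static void
    ix_rxq_task(void *arg, int pending)
    {
        /* struct ix_rxq *rxq = arg; ... process completed descriptors. */
    }

    static int
    ix_rxq_setup(struct ix_rxq *rxq, int cpu)
    {
        cpuset_t mask;

        TASK_INIT(&rxq->rxq_task, 0, ix_rxq_task, rxq);
        rxq->rxq_tq = taskqueue_create_fast("ix_rxq", M_NOWAIT,
            taskqueue_thread_enqueue, &rxq->rxq_tq);
        if (rxq->rxq_tq == NULL)
            return (ENOMEM);
        CPU_ZERO(&mask);
        CPU_SET(cpu, &mask);
        /* Pin the worker thread to one core. */
        return (taskqueue_start_threads_cpuset(&rxq->rxq_tq, 1, PI_NET,
            &mask, "ix rxq %d", cpu));
    }
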
>>
>> What do you think?
>>
> Well, it sounds good :) But performance numbers and Jack's opinion are more
> important :)
>
> Are you going to Malta?
>
>
>> Totally pie in the sky handwaving at this point:
>>
>> * create an array of mbuf pointers for completed mbufs;
>> * populate the mbuf array;
>> * pass the array up to ether_demux().
>>
>> For vlan handling, it may end up populating its own list of mbufs to push
>> up to ether_demux(). So maybe we should extend the API to have a bitmap of
>> packets to actually handle from the array, so we can pass up a larger array
>> of mbufs, note which ones are for the destination, and then the upcall can
>> mark which frames it has consumed.
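
A rough sketch of what such a batched input API with a consumed bitmap
could look like; every name here (mbuf_batch, ether_demux_batch) is
hypothetical, nothing like this exists in the tree today:

    /*
     * Sketch of a batched input path with a "consumed" bitmap.
     */
    #include <sys/param.h>
    #include <sys/mbuf.h>
    #include <net/if.h>
    #include <net/if_var.h>

    #define MBUF_BATCH_SIZE 32

    struct mbuf_batch {
        struct mbuf *mb_pkts[MBUF_BATCH_SIZE];
        uint32_t     mb_count;     /* valid entries in mb_pkts[] */
        uint32_t     mb_consumed;  /* bit i set: frame i was taken */
    };

    /*
     * An upcall (vlan(4), say) walks the array, claims the frames that
     * are for it by setting bits, and leaves the rest to the caller.
     */
    static void
    ether_demux_batch(struct ifnet *ifp, struct mbuf_batch *b)
    {
        uint32_t i;

        for (i = 0; i < b->mb_count; i++) {
            if (b->mb_consumed & (1u << i))
                continue;          /* already claimed upstream */
            /*
             * ... classify b->mb_pkts[i]; if this layer consumes it,
             * set the bit: b->mb_consumed |= 1u << i;
             */
        }
    }
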
>>
>> I specifically wonder how much work/benefit we may see by doing:
>>
>> * batching packets into lists so various steps can batch process things
>> rather than run to completion;
>> * batching the processing of a list of frames under a single lock
>> instance - eg, if the forwarding code could do the forwarding lookup for
>> 'n' packets under a single lock, then pass that list of frames up to
>> inet_pfil_hook() to do the work under one lock, etc, etc.
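
For the forwarding half, amortizing the radix head lock over a batch
might look roughly like the sketch below. rt_lookup_batch() is invented
for illustration, and it deliberately glosses over the hard part, which
is exactly the rtentry/egress-ifp lifetime problem raised in the reply
that follows:

    #include <sys/param.h>
    #include <sys/socket.h>
    #include <net/radix.h>
    #include <net/route.h>

    /* Sketch only: n route lookups under a single head lock. */
    static void
    rt_lookup_batch(struct radix_node_head *rnh, struct sockaddr *dst[],
        struct rtentry *rt[], int n)
    {
        int i;

        RADIX_NODE_HEAD_RLOCK(rnh);
        for (i = 0; i < n; i++)
            rt[i] = (struct rtentry *)rnh->rnh_matchaddr(dst[i], rnh);
        /* Real code: RT_LOCK()/RT_ADDREF() each hit before unlocking. */
        RADIX_NODE_HEAD_RUNLOCK(rnh);
    }
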
>>
> I'm thinking the same way, but we're stuck with the 'forwarding lookup' due
> to the problem with the egress interface pointer, as I mentioned earlier.
> However, it would be interesting to see how much it helps, regardless of
> locking.
>
> Currently I'm thinking that we should try changing the radix to something
> different (it seems this can be evaluated quickly) and see what happens.
> Luigi's performance numbers for our radix are awful, and there is a patch
> implementing an alternative trie:
> http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
> http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff
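
For anyone skimming the paper: the heart of DXR is a direct table indexed
by the top bits of the destination plus a binary search over range
entries. A toy sketch, with invented field names (the real patch packs
these tables much more tightly):

    #include <stdint.h>

    struct dxr_range {
        uint16_t start;    /* low 16 bits where the range begins */
        uint16_t nexthop;  /* next-hop index */
    };

    struct dxr_direct {
        uint32_t base;     /* first range entry for this /16 chunk */
        uint32_t count;    /* number of range entries */
    };

    static uint16_t
    dxr_lookup(const struct dxr_direct *dt, const struct dxr_range *rt,
        uint32_t dst)
    {
        const struct dxr_direct *d = &dt[dst >> 16];
        uint16_t key = dst & 0xffff;
        uint32_t lo = d->base, hi = d->base + d->count;

        /* Find the last range starting at or below key. */
        while (hi - lo > 1) {
            uint32_t mid = lo + (hi - lo) / 2;

            if (rt[mid].start <= key)
                lo = mid;
            else
                hi = mid;
        }
        return (rt[lo].nexthop);
    }
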
>
>
>
>
>> Here, the processing would look less like "grab lock and process to
>> completion" and more like "mark and sweep" - ie, we have a list of frames
>> that we mark as needing processing and mark as having been processed at
>> each layer, so we know where to next dispatch them.
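
In code, that mark-and-sweep could look something like this; the stage
enum and pkt_work struct are purely illustrative:

    #include <sys/param.h>
    #include <sys/mbuf.h>

    /* Each layer sweeps the batch and re-marks frames for the next one. */
    enum pkt_stage { PS_ETHER, PS_IP, PS_FORWARD, PS_DONE };

    struct pkt_work {
        struct mbuf   *pw_m;
        enum pkt_stage pw_stage;  /* which layer handles it next */
    };

    static void
    ip_stage_sweep(struct pkt_work *w, int n)
    {
        int i;

        for (i = 0; i < n; i++) {
            if (w[i].pw_stage != PS_IP)
                continue;          /* not marked for this layer */
            /* ... IP-layer processing of w[i].pw_m ... */
            w[i].pw_stage = PS_FORWARD;  /* mark for the next sweep */
        }
    }
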
>>
>> I still have some tool coding to do with PMC before I even think about
>> tinkering with this as I'd like to measure stuff like per-packet latency as
>> well as top-level processing overhead (ie, CPU_CLK_UNHALTED.THREAD_P /
>> lagg0 TX bytes/pkts, RX bytes/pkts, NIC interrupts on that core, etc.)
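
For what it's worth, the counting side of that can already be
approximated from the command line: something like
"pmcstat -s CPU_CLK_UNHALTED.THREAD_P -w 1" alongside
"netstat -w 1 -I lagg0", dividing cycles by packets, gives a rough
cycles-per-packet figure (the exact event name varies by CPU).
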
>>
> That will be great to see!
>
>>
>> Thanks,
>>
>>
>>
>> -adrian
>>
>>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>



-- 
Sami Halabi
Information Systems Engineer
NMS Projects Expert
FreeBSD SysAdmin Expert

