Date:      Tue, 2 Jan 2018 12:36:18 +0100
From:      Vincenzo Maffione <v.maffione@gmail.com>
To:        Charlie Smurthwaite <charlie@atech.media>
Cc:        "freebsd-net@freebsd.org" <net@freebsd.org>
Subject:   Re: Linux netmap memory allocation
Message-ID:  <CA%2B_eA9hs-GUCRH%2B5FAs1SPyR8S8GFndq_ScgDAmJ8njgOsQBCQ@mail.gmail.com>
In-Reply-To: <da1e5904-30c8-b06b-6e7f-0bf26fc99a17@atech.media>
References:  <7b85fc73-9cc8-0a60-5264-d26f47af5eae@atech.media> <CA%2B_eA9hthoig%2B_UZQNZhM-aBndM44f0wz-NKqWUoYpBA8Ss0jQ@mail.gmail.com> <6c5de1ed-0545-31b3-d0e2-4258fa4ccf1c@atech.media> <CA%2B_eA9hxQuej8L3SdY%2BhgpnDH3tccgsqOBtw1S=RkvURxu=Ktg@mail.gmail.com> <da1e5904-30c8-b06b-6e7f-0bf26fc99a17@atech.media>

2018-01-01 23:05 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:

>
> On 01/01/18 21:05, Vincenzo Maffione wrote:
>
>
>
> 2018-01-01 17:14 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:
>
>> Hi,
>>
>> Thank you for your reply. I was able to resolve this.
>>
>> 1) I do indeed open one FD per NIC
>> 2) I no longer specify nr_arg1, nr_arg2 nor nr_arg3. Instead I just
>> verify that all NICs return with identical nr_arg2 so that the memory is
>> shared between them.
>> 3) I properly initialized my memory; my failure to do so was causing me a
>> lot of confusion.
>>
>> The resulting memory space is large enough for all the NICs, and
>> everything works perfectly with zero-copy forwarding, great!
>>
>> The only thing I am still having trouble with is the ability to
>> simultaneously trigger a TX and an RX sync on all NICs. I have tried
>> select, poll, and epoll, and in all cases, RX rings are updated but TX
>> rings are not and TX packets are not pushed out (this occurs using both
>> native and emulated netmap modes). I notice the documentation says "Note
>> that on epoll and kqueue, NETMAP_NO_TX_POLL and NETMAP_DO_RX_POLL only have
>> an effect when some event is posted for the file descriptor.", but the
>> behaviour seems the same on poll and select as well as epoll, perhaps this
>> is a linux-specific implementation detail?
>>
>> I have also found that all of these mechanisms seem to incur a very high
>> cost in terms of CPU time (making them no more efficient than busy waiting
>> at 1Mpps+). My current approach is as follows, but I feel like there should
>> be a better option:
>>
>>     for(int n=0; n<NIC_COUNT; n++) {
>>       // More CPU time seems to be saved with a careful sleep
>>       // than with select/poll/epoll
>>       // usleep(10);
>>       ioctl(fds[n], NIOCTXSYNC);
>>       ioctl(fds[n], NIOCRXSYNC);
>>       rxring = rxrings[n];
>>       while (!nm_ring_empty(rxring)) {
>>         // Forward any packets waiting in this NIC's RX ring to the
>>         // appropriate TX ring
>>       }
>>     }
>>
>
> If you are using poll() or select() you should not use ioctl(NIOC*XSYNC),
> as the txsync/rxsync operations are automatically performed within the
> poll()/select() syscall (at least assuming you did not specify
> NETMAP_NO_TX_POLL).
> Also, whether netmap calls or does not call txsync/rxsync on certain rings
> depends on the parameters passed to nm_open().
> Make sure you check for nm_ring_space(txring) when forwarding.
>
> Cheers,
>   Vincenzo
>
>
>
> Hi Vincenzo,
>
> Thanks again for your assistance. You state the following (as does the
> manual):
>
> "If you are using poll() or select() you should not use ioctl(NIOC*XSYNC),
> as the txsync/rxsync operations are automatically performed within the
> poll()/select() syscall (at least assuming you did not specify
> NETMAP_NO_TX_POLL)."
>
> However, this is not happening for me :(
>
> I am using poll(), and I am not specifying NETMAP_NO_TX_POLL, and have
> found that sometimes frames are sent only when the TX buffer is full, and
> sometimes they are not sent at all. They are never sent as expected on
> every invocation of poll(). If I run ioctl(NIOCTXSYNC) manually, everything
> works correctly. I assume I have simply missed something from my nmreq.
>

I don't think you have missed anything within nmreq.  I see that you are
waiting for POLLIN only (and this is right in your router case), so poll()
will actually invoke txsync on interface #i only when netmap intercepts an
RX or TX interrupt on interface #i. This means that packets may stall for a
long time in the TX rings if you don't call ioctl(NIOCTXSYNC). The manual is
not wrong, however. You can look at the apps/bridge/bridge.c example to
understand where this "poll automatically calls txsync" thing is useful.
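
To make the point above concrete, here is a minimal sketch of how a two-port
bridge can pick its poll() events so that the automatic txsync is actually
useful. It is written in the spirit of bridge.c as I recall it, not copied
from it. The helper name choose_poll_events() and its boolean arguments are
hypothetical; in a real program the booleans would be derived from the netmap
RX rings (for example with nm_ring_empty()).

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of netmap): choose poll() events for two
 * ports A and B that forward traffic to each other.
 *
 * If a port already has received packets waiting to be forwarded, there is
 * no point in waiting for more input on it: ask for POLLOUT on the *other*
 * port instead, so poll() runs txsync there and reports TX room.
 * Otherwise, wait for input (POLLIN) on the port itself.
 */
void choose_poll_events(bool a_has_rx_pending, bool b_has_rx_pending,
                        struct pollfd *a, struct pollfd *b)
{
    assert(a != NULL && b != NULL);

    a->events = 0;
    b->events = 0;
    a->revents = 0;
    b->revents = 0;

    if (a_has_rx_pending)
        b->events |= POLLOUT;   /* packets from A need TX room on B */
    else
        a->events |= POLLIN;    /* nothing pending: wait for RX on A */

    if (b_has_rx_pending)
        a->events |= POLLOUT;   /* packets from B need TX room on A */
    else
        b->events |= POLLIN;    /* nothing pending: wait for RX on B */
}
```

A router that only ever asks for POLLIN never takes the POLLOUT branch, which
is why it cannot rely on poll() alone to flush its TX rings.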


> You also mentioned: "whether netmap calls or does not call txsync/rxsync
> on certain rings depends on the parameters passed to nm_open()". I do not
> use the nm_open helper method, but I am extremely interested to know what
> parameters would affect this behaviour, as this would seem very relevant to
> my problem.
>

Yes, we do not normally use the low-level interface (ioctl(NIOCREGIF)),
because it's just simpler to use the nm_open() interface. Within the first
parameter of nm_open() you can specify to open just one RX/TX ring pair,
e.g. with "enp1f0s1-3". Then you usually want to mmap() just once (as you
do in your program); with nm_open(), you do that with the NM_OPEN_NO_MMAP
flag.

>
> If you are interested or if it helps explain my question, my complete code
> (hopefully well commented but far from complete) can be found here:
> https://github.com/catphish/netmap-router/blob/
> 58a9b957c19b0a012088c491bd58bc3161a56ff1/router.c
>
> Specifically, if the ioctl call at line 92 is removed, the code does not
> work (packets are not transmitted, or are only transmitted when the buffer
> is full, which of these 2 behaviours seems to be random), however I would
> expect it to work because I do not specify NETMAP_NO_TX_POLL, and I would
> therefore hope that the poll() call on line 80 would have the same effect.
>

Yes, that depends on when netmap_poll() is called by the kernel, which in
turn depends on when something is ready for receive on the file descriptor.
Looking at your program, I think you need to call ioctl(NIOCTXSYNC), at
least because you don't want to introduce artificial/unbounded latency.
However, since these calls are expensive, you could use them only when
necessary (e.g. when nm_ring_space(txring) == 0 or when you actually
forwarded some packets on txring).
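
As a rough sketch of the "sync only when necessary" rule above: the struct
and helper names below are hypothetical bookkeeping, not netmap types. The
tx_space field stands in for whatever nm_ring_space(txring) returns, and
tx_queued counts the packets placed on txring since the last sync.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-port bookkeeping; NOT a netmap structure. */
struct port_state {
    uint32_t tx_space;   /* stand-in for nm_ring_space(txring) */
    uint32_t tx_queued;  /* packets put on txring since the last TXSYNC */
};

/* A TXSYNC is worth its cost only if the ring is full or has new packets. */
bool need_txsync(const struct port_state *p)
{
    assert(p != NULL);
    return p->tx_space == 0 || p->tx_queued > 0;
}

/* Record that 'count' packets were forwarded onto this port's TX ring. */
void note_forwarded(struct port_state *p, uint32_t count)
{
    assert(p != NULL && count <= p->tx_space);
    p->tx_space -= count;
    p->tx_queued += count;
}

/* Record that a TXSYNC was issued and the ring now reports 'new_space'. */
void note_synced(struct port_state *p, uint32_t new_space)
{
    assert(p != NULL);
    p->tx_space = new_space;
    p->tx_queued = 0;
}
```

In the forwarding loop this would guard the existing call, i.e. issue
ioctl(fds[n], NIOCTXSYNC) only when need_txsync() is true for port n, then
refresh the state from the ring.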

>
> I hope this all makes sense, and again, I hope I have simply missed
> something from the nmreq I pass to NIOCREGIF.
>
> It is worth mentioning that with the exception of this problem /
> confusion, I am getting extremely good results from this code and netmap in
> general.
>

That's nice to hear :)
Your program looks simple enough that we could even add it to the examples
(as an example of routing logic).

Cheers,
  Vincenzo


>
> Charlie
>
>
> *Charlie Smurthwaite*
> Technical Director
>
> *tel.* *email.* charlie@atech.media *web.* https://atech.media
>
> *This e-mail has been sent by aTech Media Limited (or one of its
> associated group companies, Dial 9 Communications Limited or Viaduct Hosting
> Limited). Its contents are confidential therefore if you have received this
> message in error, we would appreciate it if you could let us know and
> delete the message. aTech Media Limited is a UK limited company,
> registration number 5523199. Dial 9 Communications Limited is a UK limited
> company, registration number 7740921. Viaduct Hosting Limited is a UK
> limited company, registration number 8514362. All companies are registered
> at Unit 9 Winchester Place, North Street, Poole, Dorset, BH15 1NX.*
>



-- 
Vincenzo Maffione


