Date:      Mon, 1 Jan 2018 22:05:53 +0100
From:      Vincenzo Maffione <v.maffione@gmail.com>
To:        Charlie Smurthwaite <charlie@atech.media>
Cc:        "freebsd-net@freebsd.org" <net@freebsd.org>
Subject:   Re: Linux netmap memory allocation
Message-ID:  <CA+_eA9hxQuej8L3SdY+hgpnDH3tccgsqOBtw1S=RkvURxu=Ktg@mail.gmail.com>
In-Reply-To: <6c5de1ed-0545-31b3-d0e2-4258fa4ccf1c@atech.media>
References:  <7b85fc73-9cc8-0a60-5264-d26f47af5eae@atech.media> <CA+_eA9hthoig+_UZQNZhM-aBndM44f0wz-NKqWUoYpBA8Ss0jQ@mail.gmail.com> <6c5de1ed-0545-31b3-d0e2-4258fa4ccf1c@atech.media>

2018-01-01 17:14 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:

> Hi,
>
> Thank you for your reply. I was able to resolve this.
>
> 1) I do indeed open one FD per NIC
> 2) I no longer specify nr_arg1, nr_arg2 or nr_arg3. Instead I just verify
> that all NICs return with identical nr_arg2 so that the memory is shared
> between them.
> 3) I properly initialized my memory; my failure to do so was causing me a
> lot of confusion.
>
> The resulting memory space is large enough for all the NICs, and
> everything works perfectly with zero-copy forwarding, great!
>
> The only thing I am still having trouble with is the ability to
> simultaneously trigger a TX and an RX sync on all NICs. I have tried
> select, poll, and epoll, and in all cases, RX rings are updated but TX
> rings are not and TX packets are not pushed out (this occurs using both
> native and emulated netmap modes). I notice the documentation says "Note
> that on epoll and kqueue, NETMAP_NO_TX_POLL and NETMAP_DO_RX_POLL only have
> an effect when some event is posted for the file descriptor.", but the
> behaviour seems the same on poll and select as well as epoll. Perhaps this
> is a Linux-specific implementation detail?
>
> I have also found that all of these mechanisms seem to incur a very high
> cost in terms of CPU time (making them no more efficient than busy waiting
> at 1Mpps+). My current approach is as follows, but I feel like there should
> be a better option:
>
>     for(int n=0; n<NIC_COUNT; n++) {
>       // More CPU time seems to be saved with a careful sleep than with
>       // select/poll/epoll:
>       // usleep(10);
>       ioctl(fds[n], NIOCTXSYNC);
>       ioctl(fds[n], NIOCRXSYNC);
>       rxring = rxrings[n];
>       while (!nm_ring_empty(rxring)) {
>         // Forward any packets waiting in this NIC's RX ring to the
>         // appropriate TX ring
>       }
>     }
>

If you are using poll() or select() you should not use ioctl(NIOC*XSYNC),
as the txsync/rxsync operations are automatically performed within the
poll()/select() syscall (at least assuming you did not specify
NETMAP_NO_TX_POLL).
Also, whether netmap calls txsync/rxsync on a given ring depends on the
parameters passed to nm_open().
Make sure you check nm_ring_space(txring) before forwarding, so that you do
not write into a full TX ring.

Cheers,
  Vincenzo
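
The nm_ring_space() advice above can be illustrated with a small,
self-contained sketch. It does not use the real netmap headers: toy_ring,
toy_slot, toy_ring_space() and forward() are hypothetical stand-ins that
model only the head/tail bookkeeping, so the example compiles without netmap
installed. In real netmap code the swapped slots would also need the
NS_BUF_CHANGED flag set, and the ring's cur pointer advanced along with head.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SLOTS 8u

/* Hypothetical, simplified stand-in for a netmap slot. */
struct toy_slot {
    uint32_t buf_idx;   /* index of the packet buffer owned by this slot */
    uint16_t len;       /* packet length in bytes */
};

/* Hypothetical, simplified stand-in for a netmap ring. */
struct toy_ring {
    uint32_t head;      /* first slot available to the application */
    uint32_t tail;      /* first slot still owned by the kernel */
    struct toy_slot slot[NUM_SLOTS];
};

/* Slots the application may use right now: the span from head up to tail.
 * On an RX ring this is "packets waiting", on a TX ring "free slots". */
static uint32_t toy_ring_space(const struct toy_ring *r)
{
    return (r->tail + NUM_SLOTS - r->head) % NUM_SLOTS;
}

/* Advance a slot index by one, wrapping at the end of the ring. */
static uint32_t toy_ring_next(uint32_t i)
{
    assert(i < NUM_SLOTS);
    return (i + 1) % NUM_SLOTS;
}

/* Zero-copy forward: swap buffer indices between RX and TX slots instead of
 * copying payloads. Stops as soon as the RX ring is drained or the TX ring
 * has no free slot left. Returns the number of packets moved. */
static unsigned forward(struct toy_ring *rx, struct toy_ring *tx)
{
    unsigned moved = 0;

    while (toy_ring_space(rx) > 0 && toy_ring_space(tx) > 0) {
        struct toy_slot *rs = &rx->slot[rx->head];
        struct toy_slot *ts = &tx->slot[tx->head];

        /* Exchange buffers so each ring still owns exactly one per slot. */
        uint32_t spare = ts->buf_idx;
        ts->buf_idx = rs->buf_idx;
        ts->len = rs->len;
        rs->buf_idx = spare;

        rx->head = toy_ring_next(rx->head);
        tx->head = toy_ring_next(tx->head);
        moved++;
    }
    return moved;
}
```

With five packets pending on the RX side and only three free TX slots,
forward() moves three packets and leaves two in the RX ring for the next
pass, rather than writing into TX slots that are still owned by the kernel.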


> Thanks again,
>
> Charlie
>
>
> On 01/01/18 15:40, Vincenzo Maffione wrote:
>
> Hi,
>   If you have 32 NICs you should open 32 netmap file descriptors (and you
> should not specify 64 in nr_arg1 or 256 in nr_arg3; these are for different
> use cases). Also, since you want to do zero-copy you must not specify a
> separate memory area (nr_arg2), but use the same one.
> You may want to use the high-level API nm_open():
> https://github.com/luigirizzo/netmap/blob/master/sys/net/netmap_user.h#L307
>
> You may also want to look at the netmap tutorial to get a better idea of
> how the API works (https://github.com/vmaffione/netmap-tutorial).
>
> Cheers,
>   Vincenzo
>
> 2017-12-28 18:34 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:
>
>> Hi,
>>
>> I'm just starting to use netmap and it is my intention to do zero-copy
>> forwarding of frames between a large number of NICs. I am using Intel
>> i350 (igb) on Linux. I therefore require a large memory area for rings
>> and buffers.
>>
>> My calculation:
>> 32 NICs * 2 rings (TX+RX) * 256 frames * 2048 bytes = 32MB
>>
>> I am currently having a problem (or perhaps just a misunderstanding)
>> regarding allocation of this memory. I am attempting to use the
>> following code:
>>
>> void thread_main(int thread_id) {
>>   struct nmreq req; // A struct for the netmap request
>>   int fd;           // File descriptor for netmap socket
>>   void * mem;       // Pointer to allocated memory area
>>
>>   fd = open("/dev/netmap", 0);     // Open a generic netmap socket
>>   strcpy(req.nr_name, "enp8s0f0"); // Copy NIC name into request
>>   req.nr_version = NETMAP_API;     // Set version number
>>   req.nr_flags = NR_REG_ONE_NIC;   // We will be using a single hw ring
>>
>>   // Select ring 0, disable TX on poll
>>   req.nr_ringid = NETMAP_NO_TX_POLL | NETMAP_HW_RING | 0;
>>
>>   // Ask for 64 additional rings to be allocated (32 * (TX+RX))
>>   req.nr_arg1 = 64;
>>
>>   // Allocate a separate memory area for each thread
>>   req.nr_arg2 = 10 + thread_id;
>>
>>   // Ask for additional buffers (256 per ring)
>>   req.nr_arg3 = 64*256;
>>
>>   // Initialize port
>>   ioctl(fd, NIOCREGIF, &req);
>>
>>   // Check the allocated memory size
>>   printf("memsize: %u\n", req.nr_memsize);
>>   // Check the allocated memory area
>>   printf("nr_arg2: %u\n", req.nr_arg2);
>> }
>>
>> The output is as follows:
>>
>> memsize: 4206859
>> nr_arg2: 10
>>
>> This is far short of the amount of memory I am hoping to be allocated.
>> Am I doing something wrong, or is this simply an indication that the
>> driver is unwilling to allocate more than 4MB?
>>
>> A secondary (related) problem is that if I don't set arg1, arg2, arg3 in
>> my code (i.e. they will be zero), then I get varying output (it varies
>> between each of the following):
>>
>> memsize: 4206843
>> nr_arg2: 0
>>
>> memsize: 343019520
>> nr_arg2: 1
>>
>> Any pointers would be appreciated. Thanks!
>>
>> Charlie
>>
>>
>> Charlie Smurthwaite
>> Technical Director
>>
>
>
>
> --
> Vincenzo Maffione
>
>
>


-- 
Vincenzo Maffione
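
As a quick check of the sizing estimate in the original message, the
arithmetic can be written out in C. This is only a sketch of the
buffer-space calculation: buffer_bytes() and to_mib() are hypothetical helper
names, and the figure ignores whatever extra space netmap needs for ring
headers and interface descriptors.

```c
#include <stdint.h>

/* Bytes needed for packet buffers alone:
 * NICs * rings per NIC * slots per ring * bytes per buffer. */
static uint64_t buffer_bytes(uint64_t nics, uint64_t rings_per_nic,
                             uint64_t slots_per_ring, uint64_t buf_size)
{
    return nics * rings_per_nic * slots_per_ring * buf_size;
}

/* Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes). */
static double to_mib(uint64_t bytes)
{
    return (double)bytes / (1024.0 * 1024.0);
}
```

buffer_bytes(32, 2, 256, 2048) evaluates to 33554432 bytes, exactly 32 MiB,
whereas the reported memsize of 4206859 bytes is only about 4 MiB.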


