Date:      Sat, 28 Jul 2018 17:01:43 +0530
From:      Sumit Lakra <sumitlakradev@gmail.com>
To:        Giuseppe Lettieri <g.lettieri@iet.unipi.it>
Cc:        "Alexander V. Chernikov" <melifaro@freebsd.org>, net@freebsd.org, freebsd-hackers@freebsd.org
Subject:   Re: PSPAT subsystem Implementation in FreeBSD - GSoC 2018
Message-ID:  <CALsHEA9FDeX0mg98EB53ZG=79mEA46zRJ6yT4TAaOP-6Wi8b_Q@mail.gmail.com>
In-Reply-To: <CALsHEA9mhsCNXJT2OYvg%2BH9aw3cLYR1hUNEiqT8bkkXNLOsA2Q@mail.gmail.com>
References:  <CALsHEA_f2290Oc7e7GKGxWK4RWDT=98_PyG=B8XffJxLkYuJwQ@mail.gmail.com> <b22143f1-401e-3068-bb24-cf56d936d2bb@iet.unipi.it> <CALsHEA94Fv2bvPDU%2BaMbeE8ybB9thH%2B5%2BT3x2WSen0eF4FN2wQ@mail.gmail.com> <CALsHEA_mYR-4X1m4dhnsc5YLtweS8PuR4DpepNXWp4EHrEnLDg@mail.gmail.com> <CALsHEA_aVjmbHcoOUcNUYYbZ-1ndb-akqies8ZpKn7UMEWAC6w@mail.gmail.com> <2bb73b27-d5c7-93dd-aaf8-ff47b64b7d70@iet.unipi.it> <CALsHEA8YbOkxCqp%2BPYZydOssxzKE4Jxn7cq0UwW00hRHbcUYrg@mail.gmail.com> <CALsHEA-gU-No-ovc%2BeDQ8HKvF591pe7V-okJP-YuG37T6rWFSQ@mail.gmail.com> <CALsHEA9eSmJgg%2BpLFBnktMKwERP8HUTV5mOkwAFAsUJ134TLnw@mail.gmail.com> <4686483f-21de-129e-efd3-359a5189eb46@iet.unipi.it> <CALsHEA95i-MUFVo3J4NNyhhuwAmBp6DyPoYNYu4HqnubcN6ypw@mail.gmail.com> <ad9d93b1-3e8d-a92b-4a6b-460bb5b0c3a8@iet.unipi.it> <CALsHEA_JDYNU5pr4AJ9%2BqsQcTnD_=5tYXG3m93TkcNAyqWNnRQ@mail.gmail.com> <4a7920a2-26ba-9abe-d677-233aa7d47cd0@iet.unipi.it> <CALsHEA_rCH91yKjFYV_NZa=6DSbnrLCSAWMFHOVF9HzE4h6iAQ@mail.gmail.com> <e1fe418f-8b99-e22c-1465-5d024d5ce8e4@iet.unipi.it> <CALsHEA9mhsCNXJT2OYvg%2BH9aw3cLYR1hUNEiqT8bkkXNLOsA2Q@mail.gmail.com>

Hello,

I tried some other, simpler tests today. First, I tried intercepting the
packets from ip_output.c again and added some printf statements to track
the path of the packets (code
<https://github.com/freebsd/freebsd/commit/1f1adef538ae31560e2283f06ccfed1ddaf1e678>).
As before, they were successfully intercepted and placed in the PSPAT
client queues, but most of the time (not always) the arbiter was unable to
find them when scanning the queues. As per my previous assumption, this was
probably because the client threads return early without any error
indication, so the caller assumes that the packet has been dispatched. To
test this, I did this
<https://github.com/freebsd/freebsd/commit/e0c7deb1b66d7722e3ca54c981f164fb86072a3d>.
I couldn't make the client threads pause, as they apparently hold some
non-sleepable locks, so I made them go through a really long loop before
returning, hoping this would give PSPAT enough time to pick the packets up
and dispatch them. And bingo, it worked. The packets no longer disappeared
from the PSPAT client queues and reached pspat_txqs_flush().
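
The hand-off rule that this test points at can be sketched in userspace C.
This is only an illustration under stated assumptions: struct pkt, struct
mailbox, mb_enqueue(), mb_dequeue(), client_send() and arbiter_drain() are
hypothetical names, not the PSPAT or FreeBSD API, and the single-threaded
mailbox leaves out the memory barriers a real client/arbiter pair would
need. It only shows the convention that a successful enqueue transfers
ownership of the packet, so the client must not free it on its way back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf; NOT the kernel's struct mbuf. */
struct pkt {
	int id;
};

#define MB_SLOTS 4	/* hypothetical mailbox capacity (power of two) */

/* Hypothetical client mailbox: one producer (client), one consumer (arbiter). */
struct mailbox {
	struct pkt *slot[MB_SLOTS];
	unsigned head;	/* next slot the arbiter reads */
	unsigned tail;	/* next slot the client writes */
};

/* Returns 0 and takes ownership of p, or -1 (caller keeps p) if full. */
static int
mb_enqueue(struct mailbox *mb, struct pkt *p)
{
	if (mb->tail - mb->head == MB_SLOTS)
		return (-1);
	mb->slot[mb->tail % MB_SLOTS] = p;
	mb->tail++;
	return (0);
}

/* Returns the oldest queued packet, or NULL if the mailbox is empty. */
static struct pkt *
mb_dequeue(struct mailbox *mb)
{
	struct pkt *p;

	if (mb->head == mb->tail)
		return (NULL);
	p = mb->slot[mb->head % MB_SLOTS];
	mb->head++;
	return (p);
}

/*
 * Client path. After a successful enqueue the packet belongs to the
 * arbiter: the client neither frees nor touches it again. Only on
 * failure is the packet still the client's to free.
 */
static int
client_send(struct mailbox *mb, struct pkt *p)
{
	if (mb_enqueue(mb, p) != 0) {
		free(p);
		return (-1);
	}
	return (0);
}

/* Arbiter path: drain the mailbox; the arbiter is the one that frees. */
static int
arbiter_drain(struct mailbox *mb)
{
	struct pkt *p;
	int n = 0;

	while ((p = mb_dequeue(mb)) != NULL) {
		/* A real dispatcher would transmit the packet here. */
		free(p);
		n++;
	}
	return (n);
}
```

If the real caller of the intercepted function frees the mbuf after a
"no error" return, a convention like this would have to be made visible to
that caller; whether that is what actually happens in the stack is still
an assumption on my side.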

This could also be the reason why the packets with PROTO_LAYER2 tags
disappeared, although, as I mentioned in the previous mail, they were not
really good for interception anyway.

Next, I uncommented the actual if_output() call in pspat_txqs_flush() to
dispatch the packets that were reaching this point, but somehow the
function call failed again (code
<https://github.com/freebsd/freebsd/commit/03ba5313658a346101e6ec8ad8a17a006a772eec>).
In order to check whether the function was called with the correct
parameters, I used some printf statements to check them (code
<https://github.com/freebsd/freebsd/commit/08a753f94704e66415ee01f8b445093edce6c39e>).
They were intact, but the function call was still failing when called by
the arbiter thread to dispatch packets. The exact same function, called
with the exact same arguments, fails when called from a thread other than
the client thread. Why does this happen? I can't figure it out.

This suggests that my second assumption from the previous mail may be
correct too, and this is probably why calling dummynet_send() from
pspat_txqs_flush() didn't work either. Put simply, there seems to be some
thread-specific state associated with the client threads, and they don't
like any other thread trying to step into their shoes and dispatch their
packets. This may not be restricted to dummynet/ipfw but may be true for
the entire network stack and many other functions.

As I said, I have already completed the PSPAT part and tested it to be
working well, but making it work with the existing networking subsystem is
turning out to be increasingly complex. I have no idea how to get around
this problem, but I will keep trying to come up with something. Any
help/ideas will be greatly appreciated.

Thanks and Regards,
Sumit

On Sat, Jul 28, 2018 at 1:52 AM, Sumit Lakra <sumitlakradev@gmail.com>
wrote:

> Hello,
>
> I tried the sysctl and it worked in that I was able to intercept the
> packets with dir == DIR_OUT | PROTO_LAYER2, but I am beginning to face
> some other increasingly difficult and unanticipated problems in trying to
> attach the PSPAT code to work with the present networking system. As you
> mentioned you are a bit busy now, I was hoping maybe Alexander will be able
> to help me a little here. It will be good to hear a different viewpoint as
> well. Also, there are issues I am facing which I believe even you may not
> be aware of, hence I am also sending this mail to the mailing lists in hope
> of getting additional opinions from other experts of dummynet/ipfw and the
> FreeBSD network stack.
>
> PSPAT WIP branch -
> https://github.com/theGodlessLakra/freebsd-pspat/tree/pspat-temp
>
> Firstly, as per our previous ideas we had the plan to intercept the
> packets from dummynet... pass it through PSPAT... and finally dispatch them
> out from the dispatcher queue via the arbiter or a dedicated dispatcher
> thread using functions like ip_output() or ether_output_frame() similar to
> dummynet_send(). I had already spent a good deal of time trying to get
> these working but it failed every time and resulted in kernel panics. My
> first thoughts were that the packets are not complete enough for these
> functions. (net.link.ether.ipfw worked but it also resulted in an error
> when sending the packet to ether_output_frame). So, in order to test it, I
> wrote a simple commit to test whether these packets can really be sent to
> these functions without making them go through PSPAT at all. Turns out,
> they failed.
>
> The first one can be seen here
> <https://github.com/freebsd/freebsd/commit/15c802d6d6a74316e916ac4a4d98648c33dc5b5a>:
> sending DIR_OUT packets to ip_output() directly from dummynet_io(), with
> nothing to do with PSPAT, failed.
> The second one can be seen here
> <https://github.com/freebsd/freebsd/commit/6532de0f04a9d8d2d873a62da2015f879b97daf0>:
> a similar failure with DIR_OUT | PROTO_LAYER2 packets. Both of these
> attempts resulted in kernel panics.
>
> The conclusion was that neither of these is a good match for PSPAT input
> and output interception. (Also, in the case of the DIR_OUT | PROTO_LAYER2
> packets, they were successfully intercepted and put on the PSPAT client
> mailboxes, but when the arbiter scanned them, it somehow returned NULL.
> This did not happen with DIR_OUT packets, which successfully reached the
> PSPAT exit point.)
>
> So, next I tried to check if we can let dummynet tag the packets and then
> call the dummynet_send() function to dispatch them directly. The first try
> with no PSPAT looked like this
> <https://github.com/freebsd/freebsd/commit/d34814d152216778f447492f0ce240052a78e534>
> and it worked without any errors. I am unable to make out anything special
> being done by the code here, but somehow, letting dummynet tag the packets
> and just reading those tags in dummynet_send() before calling ip_output()
> or ether_output_frame() seems to work better than trying to call these
> functions directly.
>
> Anyway, I figured it would then be a good idea to let dummynet tag these
> packets before redirecting them to PSPAT, and then to call dummynet_send()
> itself at the PSPAT exit point pspat_txqs_flush(), and so I did, as can be
> seen here
> <https://github.com/freebsd/freebsd/commit/ebc4de0a3c62bb6d5be4fee74d81adde483f6f11>,
> but again it didn't work. The packets were successfully intercepted and
> reached pspat_txqs_flush(), but when dummynet_send() is called on them they
> result in kernel panics. I can't figure out how or why.
>
> So after all these attempts and many more like them, when I was unable to
> get it working, I decided to intercept the packets in the originally
> planned way that we thought of at the start of GSoC, i.e. to intercept them
> where if_output() is called. I also thought that it would be better to call
> the exact same function during which we intercept it, while dispatching, as
> we don't do anything other than scheduling in PSPAT so the packet remains
> in the same state and calling a lower layer function on the packet may not
> end well. So, I wrote a bunch of printf statements to see which is the most
> commonly used if_output() function call. For testing, I wanted to intercept
> at only one position instead of all the dozen places where this is called
> in the code. The chosen point was ip_output.c line 662, which I guess is
> what is almost always used. I wrote the code to intercept packets here, as
> can be seen here
> <https://github.com/freebsd/freebsd/commit/cdb6b8e30d0886fb36077d3e01126d0c0423e0d4>.
> On testing, I found that the interception was successful and the packets
> were stored in the PSPAT client queues, but the arbiter always returned
> NULL while scanning these queues for packets. This issue was similar to
> intercepting DIR_OUT | PROTO_LAYER2 packets.
>
> Lastly, leaving aside the search for the perfect point for packet
> interception, I decided to try to implement the use of a Scheduler
> Algorithm (SA) in PSPAT. I am yet to use the patch and see how it works. I
> was more keen on trying a different approach, where it would be used
> similarly to the use of SAs in dummynet_io(). This looked like this
> <https://github.com/freebsd/freebsd/commit/80b1b093c11c96dc22a5a7fdea4eda2321557247>.
>
> The idea was that this approach would make it easier for PSPAT to be
> integrated with dummynet, which is the long-term goal. Also, as all 7 SAs
> are loaded when dummynet is loaded into the kernel, it didn't seem to make
> much sense to write all that loading code separately for each SA for PSPAT
> all over again. Another perk would be that the SA to be used with PSPAT
> could be changed with the exact same command with which we change the SA
> used with dummynet in general. Also, before I wrote this code, I tried to
> check if we can send a packet to an SA from within pspat_client_handler()
> after it has been passed the required arguments, and I was glad that it
> was able to enqueue the packets successfully. However, when we try to do
> the same from the arbiter thread, it fails for some reason.
>
> As you can see from this mail, I have been trying out a lot of different
> approaches and ideas to attach PSPAT with the present subsystem but it is
> not working. I haven't been able to make any real progress lately, so I
> made commits of some of those attempts and have tried to explain my
> approach here so someone can help me point out what exactly is wrong and
> how to fix it. I myself have a couple of other ideas to figure out why this
> is not working and I will try them and let you know soon how they go.
>
> As of now, I have two theories on this -
> [1]   The points from where we are trying to intercept packets are all
> called by the client threads themselves, right up to the very last function
> call which actually sends the packet to the NIC. So, when we instead make
> the client thread put the packet in a PSPAT client queue and return with a
> value indicating no error, the thread may consider the packet dispatched on
> its way back and hence free the mbuf as well. This would explain the
> disappearance of packets from the client queues when they are scanned by
> the arbiter.
>
> [2]  The original client threads may have some thread-specific
> allocations/de-allocations or tags etc. somewhere in the network stack which
> get modified when we return from the client thread with no error
> indication, and later, when we call the same function from a different
> thread, this causes a conflict. This would explain why we are not able to
> dispatch using the dispatcher/arbiter thread (dummynet_send), or why the
> arbiter thread fails to enqueue packets to the SA while the client thread
> is able to do so when called from within pspat_client_handler().
>
> To summarize, I was hoping that in this project, the PSPAT would be the
> big deal, and it would only take a little more code to add for
> enqueuing/dequeuing to/from an SA and that the packet interception would
> also be relatively easy, but I have already written and tested the PSPAT
> code and these are the parts which are turning out to be way more
> complicated. I hope I can get some help here.
>
> Thanks and Regards,
> Sumit
>
> On Tue, Jul 24, 2018 at 7:23 PM, Giuseppe Lettieri <
> g.lettieri@iet.unipi.it> wrote:
>
>> Hello Sumit,
>>
>> sorry but I am a bit busy right now. Have you tried playing with the
>> sysctls mentioned in the PACKET FLOW section of the ipfw manpage? You
>> may want to set net.link.ether.ipfw=1 and reset the others.
>>
>> Cheers,
>> Giuseppe
>>
>>
>> Il 24/07/2018 06:26, Sumit Lakra ha scritto:
>>
>>> Hello,
>>>
>>> I was trying to use the scheduler similar to its use in dummynet_io()
>>> where it uses FIFO as the default scheduler. I have been able to enqueue
>>> the packets successfully but there was an error I was facing while
>>> dequeuing. This implementation is not very appropriate though for a few
>>> reasons. I started to try it only because it seemed possible and it would
>>> help me understand the SAs better. I will try to get it working if possible
>>> before using the patch and trying a different approach. I will notify you
>>> soon about the results or issues if any.
>>>
>>> Meanwhile, have you been able to check what it was about dummynet that
>>> caused packet duplication and other unexpected results? The packet
>>> duplication is not a big problem in terms of PSPAT though as I have already
>>> written a simple hack to only let in a packet once. What's more important
>>> is -
>>>
>>> [1] Why are there no packets with dir == DIR_OUT | PROTO_LAYER2 ? This
>>> will help us intercept such packets and send them to ether_output_frame().
>>> or,
>>> [2] Why were the packets with dir == DIR_OUT not ready for ip_output() ?
>>> They caused an error as they had incomplete headers.. How to fix this ?
>>> Should we call some function other than ip_output() ?
>>>
>>> Ignoring the packet duplication, if you could help me with either one of
>>> the above two issues, it will help me complete and test the PSPAT code as a
>>> whole.
>>>
>>> Thanks and Regards,
>>> Sumit
>>>
>>
>>
>> --
>> Dr. Ing. Giuseppe Lettieri
>> Dipartimento di Ingegneria della Informazione
>> Universita' di Pisa
>> Largo Lucio Lazzarino 1, 56122 Pisa - Italy
>> Ph. : (+39) 050-2217.649 (direct) .599 (switch)
>> Fax : (+39) 050-2217.600
>> e-mail: g.lettieri@iet.unipi.it
>>
>
>


