From owner-freebsd-hackers@freebsd.org Thu Jul 26 18:55:56 2018
From: Yuri
Date: Thu, 26 Jul 2018 11:55:52 -0700
Subject: Re: 'scanimage -L' from 'graphics/sane-backends' causes system crashes after a while
To: Freebsd hackers list
Message-ID: <4d7eff74-b657-ab35-c025-4aabbe53cc2f@rawbw.com>
In-Reply-To: <5fe20134-4b4f-4789-fa54-8ce746453130@rawbw.com>

On 2/20/18 10:15 AM, Yuri wrote:
> For my system this works with a 100% reliability: run 'scanimage -L'
> (or just scan something), and system will crash after a few hours or so.
>
> '-L' option calls the function 'sane_get_devices', which has about 90
> implementations there. It calls all of them trying to find a scanner.
> Some of them cause system to crash later.
>
> My real scanner is on the wifi network. I'm not sure if the real
> scanner is what causes the problem, or maybe it's some other test
> among these 90 'sane_get_devices' functions that causes this problem.
>
> What is the easiest way to troubleshoot this? The problem is that the
> crash doesn't come right away.
>
> 11.1-STABLE

For the record: I never found the reason why this command crashes the
system. But there is a workaround: in the file
/usr/local/etc/sane.d/dll.conf comment out all scanners except the one
that you have. This prevents sane from scanning all possible scanners,
and avoids system crashes.

Yuri

From owner-freebsd-hackers@freebsd.org Fri Jul 27 20:23:27 2018
From: Sumit Lakra
Date: Sat, 28 Jul 2018 01:52:45 +0530
Subject: Re: PSPAT subsystem Implementation in FreeBSD - GSoC 2018
To: Giuseppe Lettieri
Cc: "Alexander V.
Chernikov"

Hello,

I tried the sysctl and it worked, in that I was able to intercept the
packets with DIR == DIR_OUT | PROTO_LAYER2, but I am beginning to face
other increasingly difficult and unanticipated problems in attaching
the PSPAT code to the present networking system. As you mentioned you
are a bit busy now, I was hoping Alexander might be able to help me a
little here. It will be good to hear a different viewpoint as well.
Also, there are issues I am facing which I believe even you may not be
aware of, hence I am also sending this mail to the mailing lists in the
hope of getting additional opinions from other experts on dummynet/ipfw
and the FreeBSD network stack.

PSPAT WIP branch -
https://github.com/theGodlessLakra/freebsd-pspat/tree/pspat-temp

Firstly, as per our previous ideas, the plan was to intercept the
packets from dummynet, pass them through PSPAT, and finally dispatch
them from the dispatcher queue via the arbiter or a dedicated
dispatcher thread, using functions like ip_output() or
ether_output_frame(), similar to dummynet_send(). I had already spent a
good deal of time trying to get this working, but it failed every time
and resulted in kernel panics. My first thought was that the packets
are not complete enough for these functions. (net.link.ether.ipfw
worked, but it also resulted in an error when sending the packet to
ether_output_frame().) So, in order to test this, I wrote a simple
commit to check whether these packets can really be sent to these
functions without making them go through PSPAT at all.
Turns out, they failed.

The first one can be seen here .. sending DIR_OUT packets to
ip_output() directly from dummynet_io(), with nothing to do with PSPAT,
failed. The second one can be seen here .. a similar failure with
DIR_OUT | PROTO_LAYER2 packets. Both attempts resulted in kernel
panics.

The conclusion was that neither of these is a good match for PSPAT
input and output interception. (Also, in the case of the
DIR_OUT | PROTO_LAYER2 packets, they were successfully intercepted and
put on the PSPAT client mailboxes, but when the arbiter scanned the
mailboxes, the scan somehow returned NULL. This did not happen with
DIR_OUT packets, which successfully reached the PSPAT exit point.)

So, next I tried to check whether we can let dummynet tag the packets
and then call dummynet_send() to dispatch them directly. The first try,
with no PSPAT, looked like this .. and it worked without any errors. I
am unable to make out anything special being done by the code here, but
somehow letting dummynet tag the packets and just reading those tags in
dummynet_send() before calling ip_output() or ether_output_frame()
seems to work better than trying to call these functions directly.

Anyway, I figured it would then be a good idea to let dummynet tag
these packets before redirecting them to PSPAT, and then to call
dummynet_send() itself at the PSPAT exit point, pspat_txqs_flush(). I
did so, as can be seen here .., but it didn't work out again. The
packets were successfully intercepted and reached pspat_txqs_flush(),
but when dummynet_send() is called on them they cause kernel panics. I
can't figure out how or why.

So after all these attempts and many more like them, when I was unable
to get it working, I decided to intercept the packets in the way we
originally planned at the start of GSoC, i.e. to intercept them where
if_output() is called.
I also thought that it would be better, when dispatching, to call the
exact same function at which we intercepted the packet: we don't do
anything other than scheduling in PSPAT, so the packet remains in the
same state, and calling a lower-layer function on it may not end well.
So, I wrote a bunch of printf statements to see which is the most
commonly used if_output() call site. For testing, I wanted to intercept
at only one position instead of at all the dozen places where this is
called in the code. The chosen point was ip_output.c line 662, which is
what is almost always used, I guess. I wrote the code to intercept
packets there, as can be seen here . On testing, I found that the
interception was successful and the packets were stored in the PSPAT
client queues, but the arbiter always got NULL while scanning these
queues for packets. This issue was similar to the one with intercepting
DIR_OUT | PROTO_LAYER2 packets.

Lastly, leaving aside the search for the perfect point for packet
interception, I decided to try and implement the use of the Scheduler
Algorithm (SA) in PSPAT. I am yet to use the patch and see how it
works. I was more keen on trying a different approach, in which the SA
would be used similarly to how SAs are used in dummynet_io(). This
looked like this ..

The idea was that this approach would make it easier for PSPAT to be
integrated with dummynet, which is the long-term goal. Also, as all 7
SAs are loaded when dummynet is loaded into the kernel, it didn't seem
to make much sense to write all that loading code again, separately for
each SA, for PSPAT. Another perk would be that the SA used with PSPAT
could be changed with the exact same command with which we change the
SA used by dummynet in general. Also, before I wrote this code, I
checked whether we can send a packet to an SA from within
pspat_client_handler() after it has been passed the required arguments,
and I was glad that it was able to enqueue the packets successfully.
However, when we try to do the same from the arbiter thread, it fails
for some reason.

As you can see from this mail, I have been trying out a lot of
different approaches and ideas to attach PSPAT to the present
subsystem, but it is not working. I haven't been able to make any real
progress lately, so I committed some of those attempts and have tried
to explain my approach here so that someone can help me point out what
exactly is wrong and how to fix it. I have a couple of other ideas for
figuring out why this is not working; I will try them and let you know
soon how they go.

As of now, I have two theories on these -
[1] The points where we are trying to intercept packets are all called
by the client thread itself, right up to the very last function call,
which actually sends the packet to the NIC. So, when we instead make
the client thread put the packet in the PSPAT client queue and return
with a value indicating no error, the thread may consider the packet
dispatched on its way back and hence free the mbuf as well. This would
explain the disappearance of packets from the client queues when they
are scanned by the arbiter.

[2] The original client threads may have some thread-specific
allocations/de-allocations, tags, etc. somewhere in the network stack,
which get modified when the client thread returns with no error
indication; later, when we call the same function from a different
thread, this causes a conflict.
This would explain why we are not able to dispatch using the
dispatcher/arbiter thread (dummynet_send()), and why the arbiter thread
fails to enqueue packets to an SA when the client thread is able to do
so from within pspat_client_handler().

To summarize, I was expecting that in this project PSPAT itself would
be the big deal, that it would take only a little more code to
enqueue/dequeue to/from an SA, and that the packet interception would
also be relatively easy. But I have already written and tested the
PSPAT code, and it is these other parts that are turning out to be far
more complicated. I hope I can get some help here.

Thanks and Regards,
Sumit

On Tue, Jul 24, 2018 at 7:23 PM, Giuseppe Lettieri wrote:

> Hello Sumit,
>
> sorry but I am a bit busy right now. Have you tried playing with the
> sysctls mentioned in the PACKET FLOW section of the ipfw manpage? You
> may want to set net.link.ether.ipfw=1 and reset the others.
>
> Cheers,
> Giuseppe
>
>
> On 24/07/2018 06:26, Sumit Lakra wrote:
>
>> Hello,
>>
>> I was trying to use the scheduler similar to its use in dummynet_io()
>> where it uses FIFO as the default scheduler. I have been able to enqueue
>> the packets successfully but there was an error I was facing while
>> dequeuing. This implementation is not very appropriate though for a few
>> reasons. I started to try it only because it seemed possible and it would
>> help me understand the SAs better. I will try to get it working if possible
>> before using the patch and trying a different approach. I will notify you
>> soon about the results or issues if any.
>>
>> Meanwhile, have you been able to check what was it about dummynet that
>> caused packet duplication and other unexpected results ? The packet
>> duplication is not a big problem in terms of PSPAT though as I have already
>> written a simple hack to only let in a packet once. What's more important
>> is -
>>
>> [1] Why are there no packets with dir == DIR_OUT | PROTO_LAYER2 ?
>> This will help us intercept such packets and send them to
>> ether_output_frame().
>> or,
>> [2] Why were the packets with dir == DIR_OUT not ready for ip_output() ?
>> They caused an error as they had incomplete headers.. How to fix this ?
>> Should we call some function other than ip_output() ?
>>
>> Ignoring the packet duplication, if you could help me with either one of
>> the above two issues, it will help me complete and test the PSPAT code as a
>> whole.
>>
>> Thanks and Regards,
>> Sumit
>>
>
>
> --
> Dr. Ing. Giuseppe Lettieri
> Dipartimento di Ingegneria della Informazione
> Universita' di Pisa
> Largo Lucio Lazzarino 1, 56122 Pisa - Italy
> Ph. : (+39) 050-2217.649 (direct) .599 (switch)
> Fax : (+39) 050-2217.600
> e-mail: g.lettieri@iet.unipi.it
>

From owner-freebsd-hackers@freebsd.org Sat Jul 28 11:32:26 2018
From: Sumit Lakra
Date: Sat, 28 Jul 2018 17:01:43 +0530
Subject: Re: PSPAT subsystem Implementation in FreeBSD - GSoC 2018
To: Giuseppe Lettieri
Cc: "Alexander V.
Chernikov", net@freebsd.org, freebsd-hackers@freebsd.org

Hello,

I tried some other simpler tests today. First I tried intercepting the
packets from ip_output.c again and added some printf statements to
track the path of the packets (code ). As before, they were
successfully intercepted and placed in the PSPAT client queues, but
most of the time (not always) the arbiter was unable to find them when
scanning the queues. As per my previous assumption, this was probably
due to the client threads returning early without any error indication,
so that the packet was assumed to have been dispatched. So, to test it
I did this .. I couldn't make the client threads pause, as they
apparently held some non-sleepable locks, so I made them go through a
really long loop before returning, hoping this would allow PSPAT enough
time to pick the packets up and dispatch them. And bingo, it worked.
The packets no longer disappeared from the PSPAT client queues and
reached pspat_txqs_flush(). This could also be the reason why the
packets with PROTO_LAYER2 tags disappeared, although, as I mentioned in
the previous mail, they were really not good for interception anyway.

Next, I uncommented the actual if_output() call in pspat_txqs_flush()
to dispatch the packets that were reaching this point, but somehow the
function call failed again (code ). In order to check whether the
function was called with correct parameters, I used some printf
statements to check them (code ); they were intact. But the function
call was failing when called by the arbiter thread to dispatch packets.
The exact same function, called with the exact same arguments, fails
when called from a thread other than the client thread. Why does this
happen? I can't figure it out! This suggests that my second assumption
from the previous mail may be correct too, and it is probably why
calling dummynet_send() from pspat_txqs_flush() didn't work either. Put
simply, there is some thread-specific state involved with the client
threads, and they don't like any other thread trying to step into their
shoes and dispatch their packets; this is not restricted to
dummynet/ipfw but may be true for the entire network stack and many
other functions.

As I said, I have already completed the PSPAT part and tested it to be
working well, but making it work with the existing networking subsystem
is turning out to be increasingly complex. I have no idea how to get
around this problem, but will keep trying to come up with something.
Any help/ideas will be greatly appreciated.

Thanks and Regards,
Sumit
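
A note on theory [1] from the earlier mail (the client thread freeing
the mbuf on its way back after a successful return): the ownership
problem can be sketched in user space. What follows is a minimal,
hypothetical illustration, not PSPAT or kernel code. struct pkt,
struct mailbox, pkt_alloc(), pkt_unref(), mbox_put() and mbox_get() are
made-up names and do not correspond to the mbuf(9) API. The assumption
is that the mailbox takes its own reference on enqueue, so a later
release by the caller cannot free a packet the arbiter has not yet
picked up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a packet buffer; NOT FreeBSD's struct mbuf. */
struct pkt {
    int  refs;          /* number of owners still holding the packet */
    char payload[32];
};

#define MBOX_SLOTS 8

/* Hypothetical client mailbox: a small ring of packet pointers. */
struct mailbox {
    struct pkt *slot[MBOX_SLOTS];
    size_t head;        /* next slot the arbiter reads */
    size_t tail;        /* next slot the client writes */
};

/* Allocate a packet owned by the caller (one reference). */
struct pkt *pkt_alloc(const char *data)
{
    struct pkt *p = malloc(sizeof(*p));
    if (p == NULL)
        return NULL;
    p->refs = 1;
    strncpy(p->payload, data, sizeof(p->payload) - 1);
    p->payload[sizeof(p->payload) - 1] = '\0';
    return p;
}

/* Drop one reference; returns 1 if this call freed the packet. */
int pkt_unref(struct pkt *p)
{
    assert(p != NULL && p->refs > 0);
    if (--p->refs == 0) {
        free(p);
        return 1;
    }
    return 0;
}

/*
 * Client side: the mailbox takes its own reference, so the caller
 * releasing the packet afterwards does not free it.
 * Returns 0 on success, -1 if the ring is full.
 */
int mbox_put(struct mailbox *mb, struct pkt *p)
{
    size_t next = (mb->tail + 1) % MBOX_SLOTS;
    if (next == mb->head)
        return -1;
    p->refs++;
    mb->slot[mb->tail] = p;
    mb->tail = next;
    return 0;
}

/* Arbiter side: take the oldest packet, or NULL if the ring is empty. */
struct pkt *mbox_get(struct mailbox *mb)
{
    struct pkt *p;
    if (mb->head == mb->tail)
        return NULL;
    p = mb->slot[mb->head];
    mb->slot[mb->head] = NULL;
    mb->head = (mb->head + 1) % MBOX_SLOTS;
    return p;
}
```

With this arrangement, a caller that does mbox_put() followed by
pkt_unref() leaves the packet alive until whoever calls mbox_get()
drops the last reference. Whether the real stack frees the mbuf on the
return path is exactly what still has to be confirmed.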