From: John Baldwin
To: Vincenzo Maffione
Cc: "freebsd-net@freebsd.org", Luigi Rizzo, Adrian Chadd, Giuseppe Lettieri, Navdeep Parhar
Subject: Re: cxgbe's native netmap support broken since r307394
Date: Thu, 29 Dec 2016 10:16:58 -0800

On Thursday, December 29, 2016 10:02:32 AM Vincenzo Maffione wrote:
> Ok, thanks for the clarification.
> I changed the lock type to MTX_DEF
> (and did a test). I attached the new patch.

Looks good to me, thanks!

> Cheers,
> Vincenzo
>
> 2016-12-29 2:06 GMT+01:00 John Baldwin:
> > On Wednesday, December 28, 2016 07:25:22 PM Vincenzo Maffione wrote:
> >> Hi,
> >> The "worker_lock" is taken by nm_os_kthread_wakeup_worker(), which
> >> in turn is called by nm_pt_host_notify(). The latter function is a
> >> callback that may be called by a driver interrupt service routine;
> >> more precisely this happens when the driver calls netmap_tx_irq() or
> >> netmap_rx_irq(). As far as I know in FreeBSD it is not possible to
> >> lock a MTX_DEF mtx inside an ISR. Am I wrong on this?
> >
> > It depends. Most interrupt handlers run in ithreads and can use MTX_DEF.
> > Only interrupt handlers that run from a filter are restricted to using
> > MTX_SPIN. Looking at all the calls to netmap_[tr]x_irq() in the tree,
> > they are all done from contexts that are safe to use MTX_DEF (and in
> > general they have to be as the equivalent code for the non-netmap case
> > is calling routines like if_input or m_freem that can't be invoked from
> > a filter either).
> >
> >>
> >> Cheers,
> >> Vincenzo
> >>
> >> 2016-12-28 19:06 GMT+01:00 John Baldwin:
> >> > Why are you using MTX_SPIN? Changing the lock type to MTX_DEF would seem to
> >> > be a smaller patch and probably more correct for FreeBSD.
> >> >
> >> > On Thursday, December 22, 2016 05:29:41 PM Luigi Rizzo wrote:
> >> >> sure go ahead and thank you!
> >> >>
> >> >> On Thu, Dec 22, 2016 at 5:15 PM, Adrian Chadd wrote:
> >> >> > ok, does anyone mind if I commit it as-is?
> >> >> >
> >> >> > -a
> >> >> >
> >> >> > On 21 December 2016 at 13:37, Vincenzo Maffione wrote:
> >> >> >> Hi Luigi,
> >> >> >> I attached a minimal change containing two fixes:
> >> >> >>
> >> >> >> - change IFNET_WLOCK into IFNET_RLOCK, to fix the cxgbe issue related
> >> >> >> to this thread
> >> >> >> - use the proper locking functions for the "worker_lock", unrelated
> >> >> >> but needed to keep the O.S. from trapping because of a mismatch between
> >> >> >> MTX_SPIN and MTX_DEF.
> >> >> >>
> >> >> >> Cheers,
> >> >> >> Vincenzo
> >> >> >>
> >> >> >> 2016-12-21 20:30 GMT+01:00 Luigi Rizzo:
> >> >> >>> On Wed, Dec 21, 2016 at 11:15 AM, Vincenzo Maffione wrote:
> >> >> >>>> Hi,
> >> >> >>>> There is no commit related to that in the FreeBSD svn or git.
> >> >> >>>>
> >> >> >>>> The fix has been published to the github netmap repository here
> >> >> >>>> (branch master): https://github.com/luigirizzo/netmap
> >> >> >>>>
> >> >> >>>> What we should do is to import all the recent updates from the github
> >> >> >>>> into HEAD. I can prepare a patch for HEAD, if you wish. Just let me
> >> >> >>>> know.
> >> >> >>>
> >> >> >>> I just checked and the diff between FreeBSD head and netmap head
> >> >> >>> in github is almost 3k lines due to a lot of recent refactoring.
> >> >> >>> So, if there is an easy way to extract just the locking change that would
> >> >> >>> be preferable as an interim solution.
> >> >> >>>
> >> >> >>> cheers
> >> >> >>> luigi
> >> >> >>>
> >> >> >>>>
> >> >> >>>> Cheers,
> >> >> >>>> Vincenzo
> >> >> >>>>
> >> >> >>>> 2016-12-20 21:45 GMT+01:00 Adrian Chadd:
> >> >> >>>>> hi,
> >> >> >>>>>
> >> >> >>>>> What's the commit? We should get it into -HEAD asap.
> >> >> >>>>>
> >> >> >>>>> -adrian
> >> >> >>>>>
> >> >> >>>>> On 20 December 2016 at 01:25, Vincenzo Maffione wrote:
> >> >> >>>>>> Ok, applied to the netmap github repo.
> >> >> >>>>>> This fix will be published when Luigi does the next commit on FreeBSD.
> >> >> >>>>>>
> >> >> >>>>>> Cheers,
> >> >> >>>>>> Vincenzo
> >> >> >>>>>>
> >> >> >>>>>> 2016-12-19 20:05 GMT+01:00 Navdeep Parhar:
> >> >> >>>>>>> IFNET_RLOCK will work, thanks.
> >> >> >>>>>>>
> >> >> >>>>>>> Navdeep
> >> >> >>>>>>>
> >> >> >>>>>>> On Mon, Dec 19, 2016 at 3:21 AM, Vincenzo Maffione wrote:
> >> >> >>>>>>>> Hi Navdeep,
> >> >> >>>>>>>>
> >> >> >>>>>>>> Indeed, we have reviewed the code, and we think it is ok to
> >> >> >>>>>>>> implement nm_os_ifnet_lock() with IFNET_RLOCK(), instead of using
> >> >> >>>>>>>> IFNET_WLOCK().
> >> >> >>>>>>>> Since IFNET_RLOCK() results in sx_slock(), this should fix the issue.
> >> >> >>>>>>>>
> >> >> >>>>>>>> On FreeBSD, this locking is needed to protect a flag read by nm_iszombie().
> >> >> >>>>>>>> However, on Linux the same lock is also needed to protect the call to
> >> >> >>>>>>>> the nm_hw_register() callback, so we prefer to have a "unified"
> >> >> >>>>>>>> locking scheme, i.e. always calling nm_hw_register under the lock.
> >> >> >>>>>>>>
> >> >> >>>>>>>> Does this make sense to you? Would it be easy for you to make a quick
> >> >> >>>>>>>> test by replacing IFNET_WLOCK with IFNET_RLOCK?
> >> >> >>>>>>>>
> >> >> >>>>>>>> Thanks,
> >> >> >>>>>>>> Vincenzo
> >> >> >>>>>>>>
> >> >> >>>>>>>> 2016-12-17 23:28 GMT+01:00 Navdeep Parhar:
> >> >> >>>>>>>>> Luigi, Vincenzo,
> >> >> >>>>>>>>>
> >> >> >>>>>>>>> The last major update to netmap (r307394 and followups) broke cxgbe's
> >> >> >>>>>>>>> native netmap support. The problem is that netmap_hw_reg now holds an
> >> >> >>>>>>>>> rw_lock around the driver's netmap_on/off routines. It has always been
> >> >> >>>>>>>>> safe for the driver to sleep during these operations but now it panics
> >> >> >>>>>>>>> instead.
> >> >> >>>>>>>>>
> >> >> >>>>>>>>> Why is IFNET_WLOCK needed here? It seems like a regression to disallow
> >> >> >>>>>>>>> sleep on the control path.
> >> >> >>>>>>>>>
> >> >> >>>>>>>>> Regards,
> >> >> >>>>>>>>> Navdeep
> >> >> >>>>>>>>>
> >> >> >>>>>>>>> begin_synchronized_op with the following non-sleepable locks held:
> >> >> >>>>>>>>> exclusive rw ifnet_rw (ifnet_rw) r = 0 (0xffffffff8271d680) locked @
> >> >> >>>>>>>>> /root/ws/head/sys/dev/netmap/netmap_freebsd.c:95
> >> >> >>>>>>>>> stack backtrace:
> >> >> >>>>>>>>> #0 0xffffffff810837a5 at witness_debugger+0xe5
> >> >> >>>>>>>>> #1 0xffffffff81084d88 at witness_warn+0x3b8
> >> >> >>>>>>>>> #2 0xffffffff83ef2bcc at begin_synchronized_op+0x6c
> >> >> >>>>>>>>> #3 0xffffffff83f14beb at cxgbe_netmap_reg+0x5b
> >> >> >>>>>>>>> #4 0xffffffff809846f1 at netmap_hw_reg+0x81
> >> >> >>>>>>>>> #5 0xffffffff809806de at netmap_do_regif+0x19e
> >> >> >>>>>>>>> #6 0xffffffff8098121d at netmap_ioctl+0x7ad
> >> >> >>>>>>>>> #7 0xffffffff8098682f at freebsd_netmap_ioctl+0x5f
> >> >> >>>>>>>>
> >> >> >>>>>>>> --
> >> >> >>>>>>>> Vincenzo Maffione
> >> >> >>>>>>>> _______________________________________________
> >> >> >>>>>>>> freebsd-net@freebsd.org mailing list
> >> >> >>>>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> >> >> >>>>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> >> >> >>>>>>
> >> >> >>>>>> --
> >> >> >>>>>> Vincenzo Maffione
> >> >> >>>>
> >> >> >>>> --
> >> >> >>>> Vincenzo Maffione
> >> >> >>>
> >> >> >>> --
> >> >> >>> -----------------------------------------+-------------------------------
> >> >> >>> Prof. Luigi RIZZO, rizzo@iet.unipi.it    . Dip. di Ing. dell'Informazione
> >> >> >>> http://www.iet.unipi.it/~luigi/          . Universita` di Pisa
> >> >> >>> TEL +39-050-2217533                      . via Diotisalvi 2
> >> >> >>> Mobile +39-338-6809875                   . 56122 PISA (Italy)
> >> >> >>> -----------------------------------------+-------------------------------
> >> >> >>
> >> >> >> --
> >> >> >> Vincenzo Maffione
> >> >>
> >> >
> >> > --
> >> > John Baldwin
> >>
> >
> > --
> > John Baldwin
>

--
John Baldwin
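
The locking rules discussed in this thread can be summarized in a small
userland model. This is only a sketch, not kernel code: the enums and the
helpers mutex_ok_in(), sleep_ok_holding() and model_mtx_lock() are
hypothetical names made up for illustration. It assumes just the two points
made above: a handler running from a filter may take only MTX_SPIN mutexes
while an ithread may also take MTX_DEF, and sleeping is allowed while holding
an sx lock (which is what IFNET_RLOCK() maps to, per the thread) but not while
holding an rw lock or a mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these names exist in the FreeBSD kernel. */
enum intr_ctx { CTX_FILTER, CTX_ITHREAD };
enum lock_class { LK_MTX_SPIN, LK_MTX_DEF, LK_RW, LK_SX };

/* May a mutex of class 'cls' be acquired from interrupt context 'ctx'? */
static bool
mutex_ok_in(enum intr_ctx ctx, enum lock_class cls)
{
    if (cls != LK_MTX_SPIN && cls != LK_MTX_DEF)
        return (false);                 /* this helper only models mutexes */
    if (ctx == CTX_FILTER)
        return (cls == LK_MTX_SPIN);    /* filters: spin mutexes only */
    return (true);                      /* ithreads: MTX_DEF is fine as well */
}

/* May the holder of a lock of class 'cls' go to sleep? */
static bool
sleep_ok_holding(enum lock_class cls)
{
    /* Only sx locks are sleepable; mutexes and rw locks are not. */
    return (cls == LK_SX);
}

/* Mimic a kernel assertion: abort if the acquisition breaks the rule. */
static void
model_mtx_lock(enum intr_ctx ctx, enum lock_class cls)
{
    assert(mutex_ok_in(ctx, cls));
}
```

In this model the "worker_lock" case corresponds to
model_mtx_lock(CTX_ITHREAD, LK_MTX_DEF), which passes, and the cxgbe report
corresponds to sleep_ok_holding(LK_RW) being false while
sleep_ok_holding(LK_SX) is true.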