From owner-freebsd-current@FreeBSD.ORG Fri Jan 23 22:34:44 2009 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 93E311065689; Fri, 23 Jan 2009 22:34:44 +0000 (UTC) (envelope-from hselasky@c2i.net) Received: from swip.net (mailfe16.swipnet.se [212.247.155.225]) by mx1.freebsd.org (Postfix) with ESMTP id 877D48FC0A; Fri, 23 Jan 2009 22:34:43 +0000 (UTC) (envelope-from hselasky@c2i.net) X-Cloudmark-Score: 0.000000 [] X-Cloudmark-Analysis: v=1.0 c=1 a=nTib0wbUE7AA:10 a=J2nnhgZB8JEA:10 a=6I5d2MoRAAAA:8 a=oWSvtglMsGghnkqxLvYA:9 a=fOFekdCCHLJb-4NVGNQA:7 a=OAMeedZ3YPCsO9wchdcBP6Tzsg0A:4 a=50e4U0PicR4A:10 Received: from [85.19.218.115] (account mc467741@c2i.net HELO [10.37.1.92]) by mailfe16.swip.net (CommuniGate Pro SMTP 5.2.6) with ESMTPA id 441916268; Fri, 23 Jan 2009 23:34:41 +0100 From: Hans Petter Selasky To: Maksim Yevmenkin Date: Fri, 23 Jan 2009 23:37:04 +0100 User-Agent: KMail/1.9.7 References: <20090123154336.GJ60948@e.0x20.net> <200901232146.58360.hselasky@c2i.net> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200901232337.05627.hselasky@c2i.net> Cc: freebsd-current@freebsd.org, current@freebsd.org, Alfred Perlstein Subject: Re: panic: mutex Giant not owned at /usr/src/sys/kern/tty_ttydisc.c:1127 X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Jan 2009 22:34:45 -0000 Hi Maksim, On Friday 23 January 2009, Maksim Yevmenkin wrote: > > i've added taskqueue_drain() to detach() in my "stall" diff that i > posted few hours back. is that addresses some of your concerns? Yes, at detach you always need to drain by principle. > also > keep in mind that while netgraph node is created and destroyed > primarily by the driver in response to hardware arrival and departure, > nothing protects it from the netgraph side. in other words there is > another way to manipulate netgraph node using netgraph api. for > example user can request shutdown of a netgraph node, etc. so all of > this has to be taking into consideration. Ok, I've fixed this: Destroy is now non-blocking USB-wise. See: http://perforce.freebsd.org/chv.cgi?CH=156586 > > in fact, if i understand you correctly, with addition of > taskqueue_drain() to detach(), the later becomes completely > synchronous. that is > > 1) NG_NODE_REF() > 2) ng_rmnode_self (i.e. mark node as possibly dead, detach all the > hooks, etc. etc.) > 3) taskqueue_drain() (i.e. sleep until task has finished) > 4) usb2_transfer_unsetup() (which does "usb2_transfer_drain() and sleeps) > 5) NG_NODE_UNREF() > 6) free everything up > > in theory, items (1) and (5) above just an additional protection > because i was not sure about inner workings of usb2. i think, when we > sort everything out they can be removed. I have a question. What is the following doing at the middle of the ubt_detach(): if (node != NULL) NG_NODE_UNREF(node); If you say that: ng_rmnode_self(node); Cleans up the last reference? > >> >> > 2) You drain all USB transfers: "usb2_transfer_drain()" called > >> >> > unlocked. > >> >> > >> >> yes, usb2_transfer_unsetup() does that, does it not? > >> > > >> > That is correct, but sometimes you need to stop the transfers at an > >> > earlier point. It depends on your design. > > right > > >> when detach() is called its already out of driver's hands. all it can > >> do now is to call unsetup(), or am i missing something here? > > > > The correct thing is to do a usb2_transfer_drain() which will wait until > > the callback has been called back, if a transfer is pending. > > i do not understand. if usb2_transfer_unsetup() calls > usb2_transfer_drain() and sleeps, then what difference does it make in > detach()? You are right that USB-wise it does not matter if you call usb2_transfer_drain() separately or if you call usb2_transfer_unsetup(). I'm just thinking with regard to the state inside bluetooth, because there are two contexts running in parallell. I.E. Callbacks might be executed in paralell to bluetooth tearing down. > i just followed original code that did it this way, i.e. > drain and unsetup all transfers in one call. Yes, that's fine. > > > In USB2 a "usb2_start_hardware()" call is always paired with a USB > > transfer callback, no matter what you do! This way of design has been > > chosen so that you can do refcounts inside the callback! > > ok, and that is exactly what code does. the only thing is that > refcounts are on netgraph node and not on softc structure. again the > reason for that is because netgraph node can be accessed from both > usb2 driver and netgraph subsystem directly. Isn't it enough to increment the refcount at attach and decrement it at detach ???? > > > I see in your code that you try to do things Asynchronously. This > > > > >> > complicates stuff quite a lot. In my patches I've tried to do some > >> > things synchronously. > >> > >> no, No, NO (sorry for shouting). we _CAN_NOT_ sleep in netgraph. > >> period. you _CAN_NOT_ grab usb interface mutexes unless you can > >> absolutely guarantee that they will not sleep for any significant > >> amount of time. > > > > usb2_transfer_start() and usb2_transfer_stop() are non-blocking and will > > never sleep. The exception is "usb2_transfer_drain()" which can sleep for > > a few milliseconds and is only called when the hook is disconnected. I do > > not see that as a problem versus having "X" more checks in the code. > > well, when i looked at the code, i saw that both functions do > USB_BUS_LOCK() and USB_BUS_UNLOCK(). Well, this is the lock of the USB IRQ handler. One per USB controller. There are no more locks than this that will be locked. The IRQ handler is called from a high priority context and will not block very long. Same with USB callbacks. > i do not know enough about those > locks and usb2 in general, so i decided to err on the side of caution > and move all the operations that could potentially sleep into the > taskqueue context. this actually completely decouples netgraph from > usb2, and that is a "good thing (tm)" imo :) a little bit of > complexity is totally justified here imo. Ok, I see. > > also, now that you mentioned it, i should call usb2_transfer_drain() > in ubt_task() instead of usb2_transfer_stop() to make transfer stop > completely synchronous as well. When I think about it, usb2_transfer_stop() is enough. See my latest Patch in P4. > please let me reiterate, sleeping in netgraph is NOT allowed. even if > its for a few milliseconds. please accept it as a fact. think of all > netgraph methods as fast interrupt handlers. it is very typical for > fast interrupt handlers to do minimal amount of work and schedule swi > to do more heavy processing. that is exactly what the code does here. Fixed! > > there is no NG_NODE_REF in attach(). when we create node is comes with > one reference and that reference is removed by ng_rmnode_self(). > please read my comments about detach() above. hopefully it will make > sense now. Yes, I'm starting to understand it. > > i do not really have strong opinion about it. i just thought there > would be less contention between isoc and bulk/intr/ctrl transfers > when they use different locks. probably does not matter, since > everything is going over the same bus. i'm fine with this change utill > someone has credible evidence otherwise. Ok. --HPS