From owner-freebsd-current@FreeBSD.ORG  Fri Jan 23 22:34:44 2009
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 93E311065689;
	Fri, 23 Jan 2009 22:34:44 +0000 (UTC)
	(envelope-from hselasky@c2i.net)
Received: from swip.net (mailfe16.swipnet.se [212.247.155.225])
	by mx1.freebsd.org (Postfix) with ESMTP id 877D48FC0A;
	Fri, 23 Jan 2009 22:34:43 +0000 (UTC)
	(envelope-from hselasky@c2i.net)
X-Cloudmark-Score: 0.000000 []
X-Cloudmark-Analysis: v=1.0 c=1 a=nTib0wbUE7AA:10 a=J2nnhgZB8JEA:10
	a=6I5d2MoRAAAA:8 a=oWSvtglMsGghnkqxLvYA:9
	a=fOFekdCCHLJb-4NVGNQA:7 a=OAMeedZ3YPCsO9wchdcBP6Tzsg0A:4
	a=50e4U0PicR4A:10
Received: from [85.19.218.115] (account mc467741@c2i.net HELO [10.37.1.92])
	by mailfe16.swip.net (CommuniGate Pro SMTP 5.2.6)
	with ESMTPA id 441916268; Fri, 23 Jan 2009 23:34:41 +0100
From: Hans Petter Selasky <hselasky@c2i.net>
To: Maksim Yevmenkin <maksim.yevmenkin@gmail.com>
Date: Fri, 23 Jan 2009 23:37:04 +0100
User-Agent: KMail/1.9.7
References: <20090123154336.GJ60948@e.0x20.net>
	<200901232146.58360.hselasky@c2i.net>
	<bb4a86c70901231345u24cb0786obe9b26174af9ffd3@mail.gmail.com>
In-Reply-To: <bb4a86c70901231345u24cb0786obe9b26174af9ffd3@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200901232337.05627.hselasky@c2i.net>
Cc: freebsd-current@freebsd.org, current@freebsd.org,
	Alfred Perlstein <alfred@freebsd.org>
Subject: Re: panic: mutex Giant not owned at
	/usr/src/sys/kern/tty_ttydisc.c:1127
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 23 Jan 2009 22:34:45 -0000

Hi Maksim,

On Friday 23 January 2009, Maksim Yevmenkin wrote:

>
> i've added taskqueue_drain() to detach() in my "stall" diff that i
> posted few hours back. is that addresses some of your concerns? 

Yes, at detach you always need to drain by principle.

> also 
> keep in mind that while netgraph node is created and destroyed
> primarily by the driver in response to hardware arrival and departure,
> nothing protects it from the netgraph side. in other words there is
> another way to manipulate netgraph node using netgraph api. for
> example user can request shutdown of a netgraph node, etc. so all of
> this has to be taking into consideration.

Ok, I've fixed this: Destroy is now non-blocking USB-wise. See:

http://perforce.freebsd.org/chv.cgi?CH=156586

>
> in fact, if i understand you correctly, with addition of
> taskqueue_drain() to detach(), the later becomes completely
> synchronous. that is
>
> 1) NG_NODE_REF()
> 2) ng_rmnode_self (i.e. mark node as possibly dead, detach all the
> hooks, etc. etc.)
> 3) taskqueue_drain() (i.e. sleep until task has finished)
> 4) usb2_transfer_unsetup() (which does "usb2_transfer_drain() and sleeps)
> 5) NG_NODE_UNREF()
> 6) free everything up
>
> in theory, items (1) and (5) above just an additional protection
> because i was not sure about inner workings of usb2. i think, when we
> sort everything out they can be removed.

I have a question. What is the following doing at the middle of the 
ubt_detach():

	if (node != NULL)
		NG_NODE_UNREF(node);

If you say that:

		ng_rmnode_self(node);

Cleans up the last reference?

> >> >> > 2) You drain all USB transfers: "usb2_transfer_drain()" called
> >> >> > unlocked.
> >> >>
> >> >> yes, usb2_transfer_unsetup() does that, does it not?
> >> >
> >> > That is correct, but sometimes you need to stop the transfers at an
> >> > earlier point. It depends on your design.
>
> right
>
> >> when detach() is called its already out of driver's hands. all it can
> >> do now is to call unsetup(), or am i missing something here?
> >
> > The correct thing is to do a usb2_transfer_drain() which will wait until
> > the callback has been called back, if a transfer is pending.
>
> i do not understand. if usb2_transfer_unsetup() calls
> usb2_transfer_drain() and sleeps, then what difference does it make in
> detach()?

You are right that USB-wise it does not matter if you call 
usb2_transfer_drain() separately or if you call usb2_transfer_unsetup(). I'm 
just thinking with regard to the state inside bluetooth, because there are 
two contexts running in parallell. I.E. Callbacks might be executed in 
paralell to bluetooth tearing down.

> i just followed original code that did it this way, i.e. 
> drain and unsetup all transfers in one call.

Yes, that's fine.

>
> > In USB2 a "usb2_start_hardware()" call is always paired with a USB
> > transfer callback, no matter what you do! This way of design has been
> > chosen so that you can do refcounts inside the callback!
>
> ok, and that is exactly what code does. the only thing is that
> refcounts are on netgraph node and not on softc structure. again the
> reason for that is because netgraph node can be accessed from both
> usb2 driver and netgraph subsystem directly.

Isn't it enough to increment the refcount at attach and decrement it at 
detach ????


> > > I see in your code that you try to do things Asynchronously. This
> > >
> >> > complicates stuff quite a lot. In my patches I've tried to do some
> >> > things synchronously.
> >>
> >> no, No, NO (sorry for shouting). we _CAN_NOT_ sleep in netgraph.
> >> period. you _CAN_NOT_ grab usb interface mutexes unless you can
> >> absolutely guarantee that they will not sleep for any significant
> >> amount of time.
> >
> > usb2_transfer_start() and usb2_transfer_stop() are non-blocking and will
> > never sleep. The exception is "usb2_transfer_drain()" which can sleep for
> > a few milliseconds and is only called when the hook is disconnected. I do
> > not see that as a problem versus having "X" more checks in the code.
>
> well, when i looked at the code, i saw that both functions do
> USB_BUS_LOCK() and USB_BUS_UNLOCK(). 

Well, this is the lock of the USB IRQ handler. One per USB controller. There 
are no more locks than this that will be locked. The IRQ handler is called 
from a high priority context and will not block very long. Same with USB 
callbacks.

> i do not know enough about those 
> locks and usb2 in general, so i decided to err on the side of caution
> and move all the operations that could potentially sleep into the
> taskqueue context. this actually completely decouples netgraph from
> usb2, and that is a "good thing (tm)" imo :) a little bit of
> complexity is totally justified here imo.

Ok, I see.

>
> also, now that you mentioned it, i should call usb2_transfer_drain()
> in ubt_task() instead of usb2_transfer_stop() to make transfer stop
> completely synchronous as well.

When I think about it, usb2_transfer_stop() is enough. See my latest Patch in 
P4.

> please let me reiterate, sleeping in netgraph is NOT allowed. even if
> its for a few milliseconds. please accept it as a fact. think of all
> netgraph methods as fast interrupt handlers. it is very typical for
> fast interrupt handlers to do minimal amount of work and schedule swi
> to do more heavy processing. that is exactly what the code does here.

Fixed!

>
> there is no NG_NODE_REF in attach(). when we create node is comes with
> one reference and that reference is removed by ng_rmnode_self().
> please read my comments about detach() above. hopefully it will make
> sense now.

Yes, I'm starting to understand it.

>
> i do not really have strong opinion about it. i just thought there
> would be less contention between isoc and bulk/intr/ctrl transfers
> when they use different locks. probably does not matter, since
> everything is going over the same bus. i'm fine with this change utill
> someone has credible evidence otherwise.

Ok.

--HPS