Date:      Wed, 21 Oct 2009 10:36:04 +1300
From:      Andrew Thompson <thompsa@FreeBSD.org>
To:        Alexander Motin <mav@FreeBSD.org>
Cc:        FreeBSD-Current <freebsd-current@freebsd.org>, Scott Long <scottl@FreeBSD.org>
Subject:   Re: CAM problem
Message-ID:  <20091020213604.GA63951@citylink.fud.org.nz>
In-Reply-To: <4ADD7683.7040907@FreeBSD.org>
References:  <mailpost.1255999338.6409497.5480.mailing.freebsd.current@FreeBSD.cs.nctu.edu.tw> <4ADD7683.7040907@FreeBSD.org>

On Tue, Oct 20, 2009 at 11:36:19AM +0300, Alexander Motin wrote:
> Andrew Thompson wrote:
> > I have a cam problem that is noticeable with usb devices. It relates to
> > the ordering of xpt_release_device() and the CAM_DEV_UNCONFIGURED flag
> > when yanking a device that has stalled. This then causes a problem with
> > the usb explore thread which will end up waiting on simfree forever,
> > blocking any further usb attach/detach on the controller.
> > 
> 
> As I can see, you are returning CAM_TID_INVALID error here. There is no
> special error handling for this error, comparing to CAM_SEL_TIMEOUT. If
> you return CAM_SEL_TIMEOUT there, device will be killed immediately and
> probably workaround this specific problem.

Setting the error to CAM_SEL_TIMEOUT does indeed work around the problem.
I have committed this so at least it can be merged to 8.0.


> > scsi_dev_async: set dev dev3 unconfigured
> > 
> >  ^^^ dev3 gets the CAM_DEV_UNCONFIGURED flag cleared here
> 
> ... but removing configured status does not call deallocation, as
> unreferencing does.
> 
> > xpt_bus_deregister: xpt_release_bus
> > xpt_release_bus: ref=4 -> 3
> > xpt_release_device dev4 OK 
> > xpt_release_target: xpt_release_bus
> > xpt_release_bus: ref=3 -> 2
> > xpt_release_path: xpt_release_bus
> > xpt_release_bus: ref=2 -> 1
> > umass_cam_detach_sim:
> > umass-sim0: waiting... ref = 1
> > 
> >  ^^^ wait on "simfree" forever.
> 
> I think correct solution will be to additionally increment reference
> counter before clearing CAM_DEV_UNCONFIGURED and decrement it back after
> setting CAM_DEV_UNCONFIGURED back. Check for CAM_DEV_UNCONFIGURED inside
> xpt_release_device() then could be removed or turned into assertion.

I agree, this looks like the best solution.
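
For illustration, here is a toy userland sketch of that reference scheme as
I read it. It is not the CAM code: struct toy_dev, toy_hold(), toy_release(),
toy_configure(), toy_unconfigure() and TOY_DEV_UNCONFIGURED are made-up
stand-ins, and locking is left out entirely. The point is only that a
configured device carries one extra reference, so the release path no longer
needs to test the flag and can simply assert it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CAM_DEV_UNCONFIGURED. */
#define TOY_DEV_UNCONFIGURED 0x01u

/* Toy device: a reference count, a flag word, and a "was freed" marker. */
struct toy_dev {
	int		refcount;
	unsigned	flags;
	bool		freed;
};

static void
toy_hold(struct toy_dev *d)
{
	d->refcount++;
}

/*
 * Drop one reference. The last reference can only go away after the
 * device has been unconfigured, so the flag check becomes an assertion.
 */
static void
toy_release(struct toy_dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0) {
		assert((d->flags & TOY_DEV_UNCONFIGURED) != 0);
		d->freed = true;	/* stands in for the real teardown */
	}
}

/* Take the extra reference first, then clear the unconfigured flag. */
static void
toy_configure(struct toy_dev *d)
{
	toy_hold(d);
	d->flags &= ~TOY_DEV_UNCONFIGURED;
}

/* Set the flag back first, then drop the extra reference. */
static void
toy_unconfigure(struct toy_dev *d)
{
	d->flags |= TOY_DEV_UNCONFIGURED;
	toy_release(d);
}
```

With a device that starts unconfigured with one reference, toy_configure()
raises the count to 2, dropping the other reference leaves it at 1 without
freeing anything, and toy_unconfigure() then drops the last reference and
frees the device.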


Andrew


