Date: Thu, 7 May 2015 11:07:49 +0300
From: Slawa Olhovchenkov <slw@zxy.spb.ru>
To: freebsd-stable@FreeBSD.org
Subject: zfs, cam sticking on failed disk
Message-ID: <20150507080749.GB1394@zxy.spb.ru>

I have a zpool of 12 vdevs (mirrors).
One disk in one vdev went out of service and stopped serving requests:

dT: 1.036s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0    0.0| ada0
    0      0      0      0    0.0      0      0    0.0    0.0| ada1
    1      0      0      0    0.0      0      0    0.0    0.0| ada2
    0      0      0      0    0.0      0      0    0.0    0.0| ada3
    0      0      0      0    0.0      0      0    0.0    0.0| da0
    0      0      0      0    0.0      0      0    0.0    0.0| da1
    0      0      0      0    0.0      0      0    0.0    0.0| da2
    0      0      0      0    0.0      0      0    0.0    0.0| da3
    0      0      0      0    0.0      0      0    0.0    0.0| da4
    0      0      0      0    0.0      0      0    0.0    0.0| da5
    0      0      0      0    0.0      0      0    0.0    0.0| da6
    0      0      0      0    0.0      0      0    0.0    0.0| da7
    0      0      0      0    0.0      0      0    0.0    0.0| da8
    0      0      0      0    0.0      0      0    0.0    0.0| da9
    0      0      0      0    0.0      0      0    0.0    0.0| da10
    0      0      0      0    0.0      0      0    0.0    0.0| da11
    0      0      0      0    0.0      0      0    0.0    0.0| da12
    0      0      0      0    0.0      0      0    0.0    0.0| da13
    0      0      0      0    0.0      0      0    0.0    0.0| da14
    0      0      0      0    0.0      0      0    0.0    0.0| da15
    0      0      0      0    0.0      0      0    0.0    0.0| da16
    0      0      0      0    0.0      0      0    0.0    0.0| da17
    0      0      0      0    0.0      0      0    0.0    0.0| da18
   24      0      0      0    0.0      0      0    0.0    0.0| da19
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    0      0      0      0    0.0      0      0    0.0    0.0| da20
    0      0      0      0    0.0      0      0    0.0    0.0| da21
    0      0      0      0    0.0      0      0    0.0    0.0| da22
    0      0      0      0    0.0      0      0    0.0    0.0| da23
    0      0      0      0    0.0      0      0    0.0    0.0| da24
    0      0      0      0    0.0      0      0    0.0    0.0| da25
    0      0      0      0    0.0      0      0    0.0    0.0| da26
    0      0      0      0    0.0      0      0    0.0    0.0| da27

As a result, ZFS operations on this pool have stopped too. `zpool list -v` does not work, and `zpool detach tank da19` does not work either. Applications using this pool are stuck in the `zfs` wchan and cannot be killed.

# camcontrol tags da19 -v
(pass19:isci0:0:3:0): dev_openings  7
(pass19:isci0:0:3:0): dev_active    25
(pass19:isci0:0:3:0): allocated     25
(pass19:isci0:0:3:0): queued        0
(pass19:isci0:0:3:0): held          0
(pass19:isci0:0:3:0): mintags       2
(pass19:isci0:0:3:0): maxtags       255

How can I cancel these 24 requests? Why do these requests not time out (it has been 3 hours already)? How can I force-detach this disk? (I have already tried `camcontrol reset` and `camcontrol rescan`.) Why does ZFS (or GEOM) not time out the requests and reroute them to da18?
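For reference, the outstanding-command count I am watching can be pulled out of the `camcontrol tags` listing mechanically instead of by eye. A minimal sketch, assuming the output above has been saved to a file (the `/tmp/tags.txt` path is just an illustration):

```shell
# Save the `camcontrol tags da19 -v` output shown above (illustrative copy).
cat > /tmp/tags.txt <<'EOF'
(pass19:isci0:0:3:0): dev_openings  7
(pass19:isci0:0:3:0): dev_active    25
(pass19:isci0:0:3:0): allocated     25
(pass19:isci0:0:3:0): queued        0
(pass19:isci0:0:3:0): held          0
(pass19:isci0:0:3:0): mintags       2
(pass19:isci0:0:3:0): maxtags       255
EOF

# Extract the number of commands the device still holds (dev_active).
awk '/dev_active/ {print $3}' /tmp/tags.txt
```

Watching that `dev_active` value is how I can tell the commands are never completing: it has stayed at 25 for hours.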