Date: Thu, 7 May 2015 11:07:49 +0300
From: Slawa Olhovchenkov <slw@zxy.spb.ru>
To: freebsd-stable@FreeBSD.org
Subject: zfs, cam sticking on failed disk
Message-ID: <20150507080749.GB1394@zxy.spb.ru>

I have a zpool of 12 vdevs (mirrors).
One disk in one vdev went out of service and stopped serving requests:

dT: 1.036s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0    0.0| ada0
    0      0      0      0    0.0      0      0    0.0    0.0| ada1
    1      0      0      0    0.0      0      0    0.0    0.0| ada2
    0      0      0      0    0.0      0      0    0.0    0.0| ada3
    0      0      0      0    0.0      0      0    0.0    0.0| da0
    0      0      0      0    0.0      0      0    0.0    0.0| da1
    0      0      0      0    0.0      0      0    0.0    0.0| da2
    0      0      0      0    0.0      0      0    0.0    0.0| da3
    0      0      0      0    0.0      0      0    0.0    0.0| da4
    0      0      0      0    0.0      0      0    0.0    0.0| da5
    0      0      0      0    0.0      0      0    0.0    0.0| da6
    0      0      0      0    0.0      0      0    0.0    0.0| da7
    0      0      0      0    0.0      0      0    0.0    0.0| da8
    0      0      0      0    0.0      0      0    0.0    0.0| da9
    0      0      0      0    0.0      0      0    0.0    0.0| da10
    0      0      0      0    0.0      0      0    0.0    0.0| da11
    0      0      0      0    0.0      0      0    0.0    0.0| da12
    0      0      0      0    0.0      0      0    0.0    0.0| da13
    0      0      0      0    0.0      0      0    0.0    0.0| da14
    0      0      0      0    0.0      0      0    0.0    0.0| da15
    0      0      0      0    0.0      0      0    0.0    0.0| da16
    0      0      0      0    0.0      0      0    0.0    0.0| da17
    0      0      0      0    0.0      0      0    0.0    0.0| da18
   24      0      0      0    0.0      0      0    0.0    0.0| da19
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    0      0      0      0    0.0      0      0    0.0    0.0| da20
    0      0      0      0    0.0      0      0    0.0    0.0| da21
    0      0      0      0    0.0      0      0    0.0    0.0| da22
    0      0      0      0    0.0      0      0    0.0    0.0| da23
    0      0      0      0    0.0      0      0    0.0    0.0| da24
    0      0      0      0    0.0      0      0    0.0    0.0| da25
    0      0      0      0    0.0      0      0    0.0    0.0| da26
    0      0      0      0    0.0      0      0    0.0    0.0| da27

As a result, ZFS operations on this pool have stopped too. `zpool list -v` does not work, and `zpool detach tank da19` does not work either. Applications using this pool are stuck in the `zfs` wchan and cannot be killed.

# camcontrol tags da19 -v
(pass19:isci0:0:3:0): dev_openings  7
(pass19:isci0:0:3:0): dev_active    25
(pass19:isci0:0:3:0): allocated     25
(pass19:isci0:0:3:0): queued        0
(pass19:isci0:0:3:0): held          0
(pass19:isci0:0:3:0): mintags       2
(pass19:isci0:0:3:0): maxtags       255

How can I cancel these 24 requests? Why do these requests not time out (it has been 3 hours already)? How can I force-detach this disk? (I have already tried `camcontrol reset` and `camcontrol rescan`.) Why does ZFS (or GEOM) not time out the requests and reroute them to da18?
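For reference, the outstanding-command count I am watching can be pulled out of the `camcontrol tags` listing mechanically instead of by eye. A minimal sketch, assuming the output above has been saved to a file (the `/tmp/tags.txt` path is just an illustration):

```shell
# Save the `camcontrol tags da19 -v` output shown above (illustrative copy).
cat > /tmp/tags.txt <<'EOF'
(pass19:isci0:0:3:0): dev_openings  7
(pass19:isci0:0:3:0): dev_active    25
(pass19:isci0:0:3:0): allocated     25
(pass19:isci0:0:3:0): queued        0
(pass19:isci0:0:3:0): held          0
(pass19:isci0:0:3:0): mintags       2
(pass19:isci0:0:3:0): maxtags       255
EOF

# Extract the number of commands the device still holds (dev_active).
awk '/dev_active/ {print $3}' /tmp/tags.txt
```

Watching that `dev_active` value is how I can tell the commands are never completing: it has stayed at 25 for hours.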