From: Matthew Jacob
Organization: Feral Software
Reply-To: mj@feral.com
Date: Sat, 23 Jul 2011 17:58:22 -0700
To: freebsd-scsi@freebsd.org
Subject: Re: No retries after periph invalidation?
Message-ID: <4E2B6E2E.7050507@feral.com>
In-Reply-To: <4E2B4674.8070605@FreeBSD.org>

On 7/23/2011 3:08 PM, Alexander Motin wrote:
> Hi.
>
> I've simulated one real world device failure condition, when SATA disk
> still reports its presence, but doesn't respond to any command.
> I've found that due to multiple command retries, each of which causes a
> 30s timeout, a bus reset, and another retry/requeue, it may take ages to
> eventually drop the failed device. The odd thing is that those retries
> continue even after XPT has considered the device lost and invalidated it.
>
> I've made a patch (http://people.freebsd.org/~mav/periph_noretry.patch)
> for cam_periph_error() to block any retries after the periph has been
> marked as invalid. With that patch all activity completes in 1-2 minutes,
> just after the several timeouts required to consider the device lost.
>
> Can this approach be considered correct?
>

Yes, I like this.
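
For anyone following the thread without the patch open, here is a rough toy model of the idea, not the patch itself and not real CAM code. Every name in it (toy_periph, toy_ccb, toy_periph_error, toy_seconds_to_fail) is hypothetical, and the retry count and the number of timeouts before invalidation are made-up parameters. It only illustrates why refusing retries once the periph is invalid shortens the time until a command on a dead device finally fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real CAM structures. */
enum toy_action { TOY_RETRY, TOY_FAIL };

struct toy_periph {
	bool invalid;		/* set once the device is considered lost */
};

struct toy_ccb {
	int retries_left;	/* remaining retry budget for this command */
};

/*
 * Decide what to do with a command that has just timed out.
 * With block_after_invalid set, an invalidated periph never retries.
 */
static enum toy_action
toy_periph_error(const struct toy_periph *periph, struct toy_ccb *ccb,
    bool block_after_invalid)
{
	assert(ccb->retries_left >= 0);

	if (block_after_invalid && periph->invalid)
		return (TOY_FAIL);
	if (ccb->retries_left == 0)
		return (TOY_FAIL);
	ccb->retries_left--;
	return (TOY_RETRY);
}

/*
 * Seconds until one command finally fails on a device that never
 * answers. The periph is marked invalid after timeouts_to_invalidate
 * timeouts (an assumed parameter of this model).
 */
static int
toy_seconds_to_fail(int retries, int timeout_s, int timeouts_to_invalidate,
    bool block_after_invalid)
{
	struct toy_periph periph = { .invalid = false };
	struct toy_ccb ccb = { .retries_left = retries };
	int timeouts = 0;

	for (;;) {
		timeouts++;	/* every attempt ends in a timeout */
		if (timeouts >= timeouts_to_invalidate)
			periph.invalid = true;
		if (toy_periph_error(&periph, &ccb, block_after_invalid) ==
		    TOY_FAIL)
			return (timeouts * timeout_s);
	}
}
```

In this model, a command with 4 retries, a 30s timeout, and invalidation after 2 timeouts takes 150 seconds to fail when retries are allowed to continue, and 60 seconds when retries stop at invalidation. The real numbers depend on how many commands are queued and how the SIM handles resets.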