From owner-freebsd-scsi@FreeBSD.ORG Mon Mar 29 11:07:04 2010 Return-Path: Delivered-To: freebsd-scsi@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 13A921065673 for ; Mon, 29 Mar 2010 11:07:04 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id ECA988FC26 for ; Mon, 29 Mar 2010 11:07:03 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o2TB73eh058105 for ; Mon, 29 Mar 2010 11:07:03 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o2TB73Af058103 for freebsd-scsi@FreeBSD.org; Mon, 29 Mar 2010 11:07:03 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 29 Mar 2010 11:07:03 GMT Message-Id: <201003291107.o2TB73Af058103@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-scsi@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 29 Mar 2010 11:07:04 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD
users. These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/144648  scsi       [aac] Strange values of speed and bus width in dmesg
o kern/144301  scsi       [ciss] [hang] HP proliant server locks when using ciss
o kern/142351  scsi       [mpt] LSILogic driver performance problems
o kern/141934  scsi       [cam] [patch] add support for SEAGATE DAT Scopion 130
o kern/134488  scsi       [mpt] MPT SCSI driver probes max. 8 LUNs per device
o kern/132250  scsi       [ciss] ciss driver does not support more then 15 drive
o kern/132206  scsi       [mpt] system panics on boot when mirroring and 2nd dri
p kern/130735  scsi       [cam] [patch] pass M_NOWAIT to the malloc() call insid
o kern/130621  scsi       [mpt] tranfer rate is inscrutable slow when use lsi213
o kern/129602  scsi       [ahd] ahd(4) gets confused and wedges SCSI bus
o kern/128452  scsi       [sa] [panic] Accessing SCSI tape drive randomly crashe
o kern/128245  scsi       [scsi] "inquiry data fails comparison at DV1 step" [re
o kern/127927  scsi       [isp] isp(4) target driver crashes kernel when set up
o kern/124667  scsi       [amd] [panic] FreeBSD-7 kernel page faults at amd-scsi
o kern/123674  scsi       [ahc] ahc driver dumping
f kern/123666  scsi       [aac] attach fails with Adaptec SAS RAID 3805 controll
o sparc/121676 scsi       [iscsi] iscontrol do not connect iscsi-target on sparc
o kern/120487  scsi       [sg] scsi_sg incompatible with scanners
o kern/120247  scsi       [mpt] FreeBSD 6.3 and LSI Logic 1030 = only 3.300MB/s
o kern/119668  scsi       [cam] [patch] certain errors are too verbose comparing
o kern/114597  scsi       [sym] System hangs at SCSI bus reset with dual HBAs
o kern/110847  scsi       [ahd] Tyan U320 onboard problem with more than 3 disks
o kern/99954   scsi       [ahc] reading from DVD failes on 6.x [regression]
o kern/94838   scsi       Kernel panic while mounting SD card with lock switch o
o kern/92798   scsi       [ahc] SCSI problem with timeouts
o kern/90282   scsi       [sym] SCSI bus resets cause loss of ch device
o kern/76178   scsi       [ahd] Problem with ahd and large SCSI Raid system
o kern/74627   scsi       [ahc] [hang] Adaptec 2940U2W Can't boot 5.3
s kern/61165   scsi       [panic] kernel page fault after calling cam_send_ccb
o kern/60641   scsi       [sym] Sporadic SCSI bus resets with 53C810 under load
o kern/60598   scsi       wire down of scsi devices conflicts with config
s kern/57398   scsi       [mly] Current fails to install on mly(4) based RAID di
o kern/52638   scsi       [panic] SCSI U320 on SMP server won't run faster than
o kern/44587   scsi       dev/dpt/dpt.h is missing defines required for DPT_HAND
o kern/40895   scsi       wierd kernel / device driver bug
o kern/39388   scsi       ncr/sym drivers fail with 53c810 and more than 256MB m
o kern/35234   scsi       World access to /dev/pass? (for scanner) requires acce

37 problems total.

From owner-freebsd-scsi@FreeBSD.ORG Mon Mar 29 15:36:29 2010 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5703C106564A for ; Mon, 29 Mar 2010 15:36:29 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id 2A13F8FC1B for ; Mon, 29 Mar 2010 15:36:28 +0000 (UTC) Received: from [192.168.221.2] (remotevpn [192.168.221.2]) by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o2TFaSY2009203 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Mon, 29 Mar 2010 07:36:28 -0800 (PST) (envelope-from mj@feral.com) Message-ID: <4BB0C8FC.9020107@feral.com> Date: Mon, 29 Mar 2010 08:36:28 -0700 From: Matthew Jacob Organization: Feral Software User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100301 Fedora/3.0.3-1.fc11 Thunderbird/3.0.3 MIME-Version: 1.0 To: freebsd-scsi@freebsd.org References: <3bbf2fe11002281655i61a5f0a0if3f381ad0c4a1ef8@mail.gmail.com> <3bbf2fe11003031532u2207eb55h19c3a045215a7d84@mail.gmail.com> <4B8EF336.80107@feral.com> <3bbf2fe11003031547kd5f7314t3d83b2bde06c1c2f@mail.gmail.com> <4B8EF990.5030407@feral.com>
<3bbf2fe11003031607wa3727b5ke89bc2a909d4d6a6@mail.gmail.com> <4B901419.8060800@feral.com> <3bbf2fe11003041737p30690522ya81e1b8f4bd6bbf9@mail.gmail.com> <3bbf2fe11003120601y3c403a1ct50f9fc6c1f0903bf@mail.gmail.com> <4B9A91DA.7030107@FreeBSD.org> <3bbf2fe11003200523t60895bfv1fa73d04e58a7838@mail.gmail.com> In-Reply-To: <3bbf2fe11003200523t60895bfv1fa73d04e58a7838@mail.gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.2.3 (ns1.feral.com [192.168.221.1]); Mon, 29 Mar 2010 07:36:28 -0800 (PST) Subject: Re: How is supposed to be protected the units list? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: mj@feral.com List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 29 Mar 2010 15:36:29 -0000

Sorry for being quiet about this. I'm going to look at this again
today. It turns out that where I work there is a tremendous amount of
pain about this also.

> So I made this new patch using the bus lock:
> http://www.freebsd.org/~attilio/Sandvine/pdrv/xpt_lock.diff
>
> I would have preferred to have a dedicated lock for the units lists,
> but as you seem to have a strong opinion, I'm fine with it.
> Maybe Matt wants to add his refcounting modifications using this
> scheme if we come to a consensus?
> > Thanks, > Attilio > > > > From owner-freebsd-scsi@FreeBSD.ORG Mon Mar 29 18:42:18 2010 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3A8991065670 for ; Mon, 29 Mar 2010 18:42:18 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id 004818FC1B for ; Mon, 29 Mar 2010 18:42:17 +0000 (UTC) Received: from [192.168.221.2] (remotevpn [192.168.221.2]) by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o2TIgGoc010445 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Mon, 29 Mar 2010 10:42:17 -0800 (PST) (envelope-from mj@feral.com) Message-ID: <4BB0F488.1050806@feral.com> Date: Mon, 29 Mar 2010 11:42:16 -0700 From: Matthew Jacob Organization: Feral Software User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100301 Fedora/3.0.3-1.fc11 Thunderbird/3.0.3 MIME-Version: 1.0 To: freebsd-scsi@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.2.3 (ns1.feral.com [192.168.221.1]); Mon, 29 Mar 2010 10:42:17 -0800 (PST) Subject: adding a "retry command after a delay" error X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: mj@feral.com List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 29 Mar 2010 18:42:18 -0000 This is something I whipped up for dealing with some active/active-after-failover-time systems. I wanted to have a general facility to say of an ASC/ASCQ- retry the command, but after a period of delay. Wonder if anyone had comments? 
http://people.freebsd.org/~mjacob/delay_diffs.txt From owner-freebsd-scsi@FreeBSD.ORG Tue Mar 30 14:25:31 2010 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2E609106564A; Tue, 30 Mar 2010 14:25:31 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id E2FF48FC15; Tue, 30 Mar 2010 14:25:30 +0000 (UTC) Received: by pwi9 with SMTP id 9so23853pwi.13 for ; Tue, 30 Mar 2010 07:25:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:received:message-id:subject :from:to:cc:content-type; bh=abXEErDqFkhfVqKPxCMTyrCzZ79G4y+IhzIz2SLWEZI=; b=xq2NROGEyLb7v2xTBo2rHhDyBgUseZujD6HFl+UXi8x+uMVQsAxg6R6OvhVrR3JWRd ezkUeIJA5cUaupWyo8qHhTnj+EVFB3K7rXLzvMiNlVNsJrAgbY2THNsneVdzSIbKujKy bsGGbXdwlds6dHEpGNyBhO+lYCWjaZBTMnVIw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=RPjcMAER4Bjbg+49apROy9gz/+R4bOLMF5fudu3Q5bH3kB27ry0+tsg/RPJf998/GE 2RLD8fqKfzi2F/2xKKiyY1aBC3/kX9+TxW92wq8vTMDER/n3X0BX9e/V3ydglOGCfTD5 woHKIs0XUM8bNix9DukYfPvA3vkLe6B0360+I= MIME-Version: 1.0 Sender: asmrookie@gmail.com Received: by 10.231.155.74 with HTTP; Tue, 30 Mar 2010 07:25:30 -0700 (PDT) In-Reply-To: <4BA5C746.7060203@FreeBSD.org> References: <3bbf2fe11002281655i61a5f0a0if3f381ad0c4a1ef8@mail.gmail.com> <3bbf2fe11003031547kd5f7314t3d83b2bde06c1c2f@mail.gmail.com> <4B8EF990.5030407@feral.com> <3bbf2fe11003031607wa3727b5ke89bc2a909d4d6a6@mail.gmail.com> <4B901419.8060800@feral.com> <3bbf2fe11003041737p30690522ya81e1b8f4bd6bbf9@mail.gmail.com> <3bbf2fe11003120601y3c403a1ct50f9fc6c1f0903bf@mail.gmail.com> 
<4B9A91DA.7030107@FreeBSD.org> <3bbf2fe11003200523t60895bfv1fa73d04e58a7838@mail.gmail.com> <4BA5C746.7060203@FreeBSD.org> Date: Tue, 30 Mar 2010 16:25:30 +0200 X-Google-Sender-Auth: 3b3471f80891d3e1 Received: by 10.142.196.10 with SMTP id t10mr2691360wff.223.1269959130360; Tue, 30 Mar 2010 07:25:30 -0700 (PDT) Message-ID: <3bbf2fe11003300725vdb1e4ddrf112778ca2bbbc20@mail.gmail.com> From: Attilio Rao To: Alexander Motin Content-Type: text/plain; charset=UTF-8 Cc: freebsd-scsi@freebsd.org, "Justin T. Gibbs" , mj@feral.com, Ed Maste Subject: Re: How is supposed to be protected the units list? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 30 Mar 2010 14:25:31 -0000

2010/3/21 Alexander Motin :
> Attilio Rao wrote:
>> So I made this new patch using the bus lock:
>> http://www.freebsd.org/~attilio/Sandvine/pdrv/xpt_lock.diff
>
> OK. I've looked at both and I think both have a race window between
> unit number allocation and insertion into the list. I've changed the
> last patch so it does not drop the lock in the meantime. What do you
> think about this:
> http://people.freebsd.org/~mav/unit_lock.patch
> ?
>
> I don't like the scsi_da.c part in either case, as I am not sure that
> locks can't be recursed there in case of some errors. I don't see how
> adding a second lock could solve it.

Is the lock recursion going to happen because of the necessary
refcount acquisition, as Matt pointed out?
Or is there another recursion?
In the former case, the global lock will help because you can just
acquire it, refcount the periphs, cache them, and run lockless.
You can't do this with xpt_lock_bus because of the recursion in
cam_periph_acquire.

So do we want to leave this unprotected and just live with this bug?

Attilio

--
Peace can only be achieved by understanding - A.
Einstein From owner-freebsd-scsi@FreeBSD.ORG Tue Mar 30 14:34:23 2010 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 58ACF1065674 for ; Tue, 30 Mar 2010 14:34:23 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id 2ADEF8FC23 for ; Tue, 30 Mar 2010 14:34:22 +0000 (UTC) Received: from [192.168.0.102] (m206-63.dsl.tsoft.com [198.144.206.63]) by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o2UEYM9G029253 for ; Tue, 30 Mar 2010 06:34:22 -0800 (PST) (envelope-from mj@feral.com) Message-ID: <4BB20BF0.3090605@feral.com> Date: Tue, 30 Mar 2010 07:34:24 -0700 From: Matthew Jacob User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: freebsd-scsi@freebsd.org References: <3bbf2fe11002281655i61a5f0a0if3f381ad0c4a1ef8@mail.gmail.com> <3bbf2fe11003031547kd5f7314t3d83b2bde06c1c2f@mail.gmail.com> <4B8EF990.5030407@feral.com> <3bbf2fe11003031607wa3727b5ke89bc2a909d4d6a6@mail.gmail.com> <4B901419.8060800@feral.com> <3bbf2fe11003041737p30690522ya81e1b8f4bd6bbf9@mail.gmail.com> <3bbf2fe11003120601y3c403a1ct50f9fc6c1f0903bf@mail.gmail.com> <4B9A91DA.7030107@FreeBSD.org> <3bbf2fe11003200523t60895bfv1fa73d04e58a7838@mail.gmail.com> <4BA5C746.7060203@FreeBSD.org> <3bbf2fe11003300725vdb1e4ddrf112778ca2bbbc20@mail.gmail.com> In-Reply-To: <3bbf2fe11003300725vdb1e4ddrf112778ca2bbbc20@mail.gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Default is to whitelist mail, not delayed by milter-greylist-4.2.3 (ns1.feral.com [192.67.166.1]); Tue, 30 Mar 2010 06:34:22 -0800 (PST) Subject: Re: How is supposed to be protected the units list? 
X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 30 Mar 2010 14:34:23 -0000

I was distracted yesterday, so I didn't finish my testing.

There are still lots of issues that can occur. I'm still forming some
thoughts on this, but part of the problem is that there are things
going on with a periph that make this difficult. Pieces of it can
change or be changed, even under a lock, but the code surrounding the
lock isn't aware that it can change *between* locked sections. For
example, I think that in cam_periph_alloc two different arrivals of
something that could be for the same periph can end up as two different
periph structures on the list, with different unit numbers, that point
to the same bus. It doesn't matter that there were locks to provide
some stability, because the locks were dropped in between.

> Is the lock recursion going to happen because of the necessary
> refcount acquisition, as Matt pointed out?
> Or is there another recursion?
> In the former case, the global lock will help because you can just
> acquire it, refcount the periphs, cache them, and run lockless.
> You can't do this with xpt_lock_bus because of the recursion in
> cam_periph_acquire.
>
> So do we want to leave this unprotected and just live with this bug?
> > Attilio > > > From owner-freebsd-scsi@FreeBSD.ORG Tue Mar 30 14:43:55 2010 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AC34A1065670 for ; Tue, 30 Mar 2010 14:43:55 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-pz0-f180.google.com (mail-pz0-f180.google.com [209.85.222.180]) by mx1.freebsd.org (Postfix) with ESMTP id 500578FC18 for ; Tue, 30 Mar 2010 14:43:55 +0000 (UTC) Received: by pzk10 with SMTP id 10so588778pzk.28 for ; Tue, 30 Mar 2010 07:43:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:received:message-id:subject :from:to:cc:content-type; bh=R+zLCHXvQq0rdHGXybmseZIuhmQoBMP2yuAvbR/uSjM=; b=OObqFzJnUIrPl+4Y5q+/2DKdXzV9MW0mDeYb8y573oLTucufVVsfV2mWpLlSbEipuV YI79E/ess/XE7rIXZttx8x44XTRwppNZcHhYwsfPSVJeUWIXwHC9Dv0NvbkL1j+uI2b5 UlmVzi0ZXxzUsVa9Xj8WXCdHiCmThEMncfDE8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=FNVj4zX7X3A3EU5atugzcXAXw7c2q+etfwuit6HCUNn+Ewta+EX+8lLvV70rZJMwwH iJY0fNyDTOJM/BC42FJNSEWQoO6C4TgJjsI0nts1dDFdOJgJK6mwcMpVgTRKIaNKe+Xj yjLe6gzlc60Q0KApSnvrbti3v7gAMD5lzOBGQ= MIME-Version: 1.0 Sender: asmrookie@gmail.com Received: by 10.231.155.74 with HTTP; Tue, 30 Mar 2010 07:43:50 -0700 (PDT) In-Reply-To: <4BB20BF0.3090605@feral.com> References: <3bbf2fe11002281655i61a5f0a0if3f381ad0c4a1ef8@mail.gmail.com> <3bbf2fe11003031607wa3727b5ke89bc2a909d4d6a6@mail.gmail.com> <4B901419.8060800@feral.com> <3bbf2fe11003041737p30690522ya81e1b8f4bd6bbf9@mail.gmail.com> <3bbf2fe11003120601y3c403a1ct50f9fc6c1f0903bf@mail.gmail.com> <4B9A91DA.7030107@FreeBSD.org> <3bbf2fe11003200523t60895bfv1fa73d04e58a7838@mail.gmail.com> 
<4BA5C746.7060203@FreeBSD.org> <3bbf2fe11003300725vdb1e4ddrf112778ca2bbbc20@mail.gmail.com> <4BB20BF0.3090605@feral.com> Date: Tue, 30 Mar 2010 16:43:50 +0200 X-Google-Sender-Auth: 04c913a1c373e0b9 Received: by 10.142.67.22 with SMTP id p22mr2663816wfa.179.1269960230605; Tue, 30 Mar 2010 07:43:50 -0700 (PDT) Message-ID: <3bbf2fe11003300743k7045d986rec439ad292f2743d@mail.gmail.com> From: Attilio Rao To: Matthew Jacob Content-Type: text/plain; charset=UTF-8 Cc: freebsd-scsi@freebsd.org Subject: Re: How is supposed to be protected the units list? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 30 Mar 2010 14:43:55 -0000

2010/3/30 Matthew Jacob :
>
> I was distracted yesterday, so I didn't finish my testing.
>
> There are still lots of issues that can occur. I'm still forming some
> thoughts on this, but part of the problem is that there are things
> going on with a periph that make this difficult. Pieces of it can
> change or be changed, even under a lock, but the code surrounding the
> lock isn't aware that it can change *between* locked sections. For
> example, I think that in cam_periph_alloc two different arrivals of
> something that could be for the same periph can end up as two
> different periph structures on the list, with different unit numbers,
> that point to the same bus. It doesn't matter that there were locks to
> provide some stability, because the locks were dropped in between.

May you please refer precisely to current code snippets (in terms of
functions, file, line, etc) in order to better explain this and the
other issues, when you have time?

Thanks,
Attilio

--
Peace can only be achieved by understanding - A.
Einstein From owner-freebsd-scsi@FreeBSD.ORG Tue Mar 30 16:14:58 2010 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6E1CB106564A for ; Tue, 30 Mar 2010 16:14:58 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id 423F18FC13 for ; Tue, 30 Mar 2010 16:14:57 +0000 (UTC) Received: from [192.168.221.2] (remotevpn [192.168.221.2]) by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o2UGEvQ1029703 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Tue, 30 Mar 2010 08:14:57 -0800 (PST) (envelope-from mj@feral.com) Message-ID: <4BB22381.4010302@feral.com> Date: Tue, 30 Mar 2010 09:14:57 -0700 From: Matthew Jacob Organization: Feral Software User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100301 Fedora/3.0.3-1.fc11 Thunderbird/3.0.3 MIME-Version: 1.0 To: freebsd-scsi@freebsd.org References: <3bbf2fe11002281655i61a5f0a0if3f381ad0c4a1ef8@mail.gmail.com> <3bbf2fe11003031607wa3727b5ke89bc2a909d4d6a6@mail.gmail.com> <4B901419.8060800@feral.com> <3bbf2fe11003041737p30690522ya81e1b8f4bd6bbf9@mail.gmail.com> <3bbf2fe11003120601y3c403a1ct50f9fc6c1f0903bf@mail.gmail.com> <4B9A91DA.7030107@FreeBSD.org> <3bbf2fe11003200523t60895bfv1fa73d04e58a7838@mail.gmail.com> <4BA5C746.7060203@FreeBSD.org> <3bbf2fe11003300725vdb1e4ddrf112778ca2bbbc20@mail.gmail.com> <4BB20BF0.3090605@feral.com> <3bbf2fe11003300743k7045d986rec439ad292f2743d@mail.gmail.com> In-Reply-To: <3bbf2fe11003300743k7045d986rec439ad292f2743d@mail.gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.2.3 (ns1.feral.com [192.168.221.1]); Tue, 30 Mar 2010 08:14:57 -0800 (PST) Subject: Re: How is supposed to be protected the units list? 
X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: mj@feral.com List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 30 Mar 2010 16:14:58 -0000 Absolutely. I might also be insane. Working on finding that out. > May you please refer precisely to current code snippets (in terms of > functions, file, line, etc) in order to better explain this and the > other issues, when you have time? > > Thanks, > Attilio > > > From owner-freebsd-scsi@FreeBSD.ORG Thu Apr 1 02:40:45 2010 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D379C106564A for ; Thu, 1 Apr 2010 02:40:45 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id A6CD48FC0A for ; Thu, 1 Apr 2010 02:40:45 +0000 (UTC) Received: from [192.168.0.102] (m206-63.dsl.tsoft.com [198.144.206.63]) by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o312ejcI034638 for ; Wed, 31 Mar 2010 18:40:45 -0800 (PST) (envelope-from mj@feral.com) Message-ID: <4BB407AF.6020602@feral.com> Date: Wed, 31 Mar 2010 19:40:47 -0700 From: Matthew Jacob User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: freebsd-scsi@freebsd.org References: <3bbf2fe11002281655i61a5f0a0if3f381ad0c4a1ef8@mail.gmail.com> <3bbf2fe11003031607wa3727b5ke89bc2a909d4d6a6@mail.gmail.com> <4B901419.8060800@feral.com> <3bbf2fe11003041737p30690522ya81e1b8f4bd6bbf9@mail.gmail.com> <3bbf2fe11003120601y3c403a1ct50f9fc6c1f0903bf@mail.gmail.com> <4B9A91DA.7030107@FreeBSD.org> <3bbf2fe11003200523t60895bfv1fa73d04e58a7838@mail.gmail.com> <4BA5C746.7060203@FreeBSD.org> <3bbf2fe11003300725vdb1e4ddrf112778ca2bbbc20@mail.gmail.com> <4BB20BF0.3090605@feral.com> 
<3bbf2fe11003300743k7045d986rec439ad292f2743d@mail.gmail.com> <4BB22381.4010302@feral.com> In-Reply-To: <4BB22381.4010302@feral.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Default is to whitelist mail, not delayed by milter-greylist-4.2.3 (ns1.feral.com [192.67.166.1]); Wed, 31 Mar 2010 18:40:45 -0800 (PST) Subject: Re: How is supposed to be protected the units list? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 01 Apr 2010 02:40:45 -0000 On 3/30/2010 9:14 AM, Matthew Jacob wrote: > Absolutely. I might also be insane. Working on finding that out. >> May you please refer precisely to current code snippets (in terms of >> functions, file, line, etc) in order to better explain this and the It partly ends up the same as what Alexander found. I like his unit_lock patch a bit better. In terms of the xpt_lock.diff patches for da- I don't think this is a good idea as they stand. This is part of shutdown code, which can be called as part of panic, so there is no guarantee about what locks might or might not be held, or even whether the list being traversed is intact at all. The problems that I ran into previously in da won't be fixed really by any of this. The problem here for da(4) is that it is scheduling a task to run- if the periph is going away, then the task should be cancelled. Still testing... I'm working on a fault injection case where tons of arrival/departure events can be thrown at this. 
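
For readers following the thread, here is a rough userspace sketch of the race Alexander describes (unit number allocation and list insertion done in separate locked sections) and of the fix of not dropping the lock in between. It is not taken from either patch: struct periph, unit_lock, unit_in_use_locked() and periph_alloc() are hypothetical names, and a pthread mutex stands in for whatever kernel lock the real code would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a peripheral instance; not the real struct cam_periph. */
struct periph {
	int unit;
	struct periph *next;
};

/* One lock covers both the choice of a unit number and the list itself. */
static pthread_mutex_t unit_lock = PTHREAD_MUTEX_INITIALIZER;
static struct periph *units;	/* list head, protected by unit_lock */

/* Returns 1 if some periph on the list already owns 'unit'. Caller holds unit_lock. */
static int
unit_in_use_locked(int unit)
{
	for (const struct periph *p = units; p != NULL; p = p->next)
		if (p->unit == unit)
			return 1;
	return 0;
}

/*
 * Pick the lowest free unit number and publish the new periph in a single
 * critical section. If the lock were dropped between the two steps, a second
 * caller could pick the same number before the first one is on the list.
 */
static struct periph *
periph_alloc(void)
{
	struct periph *p = malloc(sizeof(*p));
	if (p == NULL)
		return NULL;

	pthread_mutex_lock(&unit_lock);
	int unit = 0;
	while (unit_in_use_locked(unit))
		unit++;
	/* Still true at insertion time because the lock has not been released. */
	assert(!unit_in_use_locked(unit));
	p->unit = unit;
	p->next = units;
	units = p;
	pthread_mutex_unlock(&unit_lock);
	return p;
}
```

Calling periph_alloc() twice in this sketch yields units 0 and 1. It deliberately leaves out the refcounting and teardown questions raised elsewhere in the thread, which a single lock does not settle on its own.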
From owner-freebsd-scsi@FreeBSD.ORG Thu Apr 1 03:38:16 2010 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5CE3C106566C for ; Thu, 1 Apr 2010 03:38:16 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id 23F7A8FC0A for ; Thu, 1 Apr 2010 03:38:15 +0000 (UTC) Received: from [192.168.0.102] (m206-63.dsl.tsoft.com [198.144.206.63]) by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o313cFAB035121 for ; Wed, 31 Mar 2010 19:38:15 -0800 (PST) (envelope-from mj@feral.com) Message-ID: <4BB4152B.6030504@feral.com> Date: Wed, 31 Mar 2010 20:38:19 -0700 From: Matthew Jacob User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.9) Gecko/20100317 Thunderbird/3.0.4 MIME-Version: 1.0 To: freebsd-scsi@freebsd.org References: <4BB0F488.1050806@feral.com> In-Reply-To: <4BB0F488.1050806@feral.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Default is to whitelist mail, not delayed by milter-greylist-4.2.3 (ns1.feral.com [192.67.166.1]); Wed, 31 Mar 2010 19:38:15 -0800 (PST) Subject: Re: adding a "retry command after a delay" error X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 01 Apr 2010 03:38:16 -0000 On 3/29/2010 11:42 AM, Matthew Jacob wrote: > This is something I whipped up for dealing with some > active/active-after-failover-time systems. I wanted to have a general > facility to say of an ASC/ASCQ- retry the command, but after a period > of delay. > > Wonder if anyone had comments? > http://people.freebsd.org/~mjacob/delay_diffs.txt Any comments? 
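
To make the proposal easier to comment on, here is a minimal sketch of what a "retry, but only after a delay" entry for an ASC/ASCQ pair could look like. It is not the contents of delay_diffs.txt: the enum, the table layout, sense_lookup() and the delay values are all assumptions made for illustration. The three ASC/ASCQ pairs themselves are standard SPC assignments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical actions; the real CAM sense tables use their own flags. */
enum sense_action {
	ACT_FAIL,		/* give up and return the error */
	ACT_RETRY,		/* retry immediately */
	ACT_RETRY_DELAYED	/* retry, but only after delay_ms has passed */
};

struct sense_entry {
	uint8_t asc;
	uint8_t ascq;
	enum sense_action action;
	unsigned delay_ms;	/* illustrative values, not from the patch */
};

static const struct sense_entry sense_table[] = {
	/* 04h/01h: logical unit is in process of becoming ready */
	{ 0x04, 0x01, ACT_RETRY_DELAYED, 500 },
	/* 04h/0Ah: logical unit not accessible, asymmetric access state transition */
	{ 0x04, 0x0a, ACT_RETRY_DELAYED, 2000 },
	/* 29h/00h: power on, reset, or bus device reset occurred */
	{ 0x29, 0x00, ACT_RETRY, 0 },
};

/* Look up the action for an ASC/ASCQ pair; unknown pairs fail with no delay. */
static enum sense_action
sense_lookup(uint8_t asc, uint8_t ascq, unsigned *delay_ms)
{
	assert(delay_ms != NULL);
	for (size_t i = 0; i < sizeof(sense_table) / sizeof(sense_table[0]); i++) {
		if (sense_table[i].asc == asc && sense_table[i].ascq == ascq) {
			*delay_ms = sense_table[i].delay_ms;
			return sense_table[i].action;
		}
	}
	*delay_ms = 0;
	return ACT_FAIL;
}
```

A caller that gets ACT_RETRY_DELAYED back would requeue the command from a timer rather than resubmitting it straight away. How that timer is driven is left open here.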
From owner-freebsd-scsi@FreeBSD.ORG Fri Apr 2 01:06:05 2010 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A77221065688 for ; Fri, 2 Apr 2010 01:06:05 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id 67E208FC2D for ; Fri, 2 Apr 2010 01:06:05 +0000 (UTC) Received: from [192.168.221.2] (remotevpn [192.168.221.2]) by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o32164td030625 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Thu, 1 Apr 2010 17:06:04 -0800 (PST) (envelope-from mj@feral.com) Message-ID: <4BB542FC.2080601@feral.com> Date: Thu, 01 Apr 2010 18:06:04 -0700 From: Matthew Jacob Organization: Feral Software User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100301 Fedora/3.0.3-1.fc11 Thunderbird/3.0.3 MIME-Version: 1.0 To: freebsd-scsi@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.2.3 (ns1.feral.com [192.168.221.1]); Thu, 01 Apr 2010 17:06:05 -0800 (PST) Subject: crash and burn, but is it a fair test? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: mj@feral.com List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Apr 2010 01:06:05 -0000 Modify an HBA to fail one out of 4096 commands with CAM_SEL_TIMEOUT. Run several shells all doing loops of "camcontrol rescan all". Fairly soon, you get the crash and burn below. 
-------------------------- at Fisap1t bausl 0 t atrrgaept 0 l9u:n 8g dna2e:r Ficxted iDoirecnt fAcacuelsts SCwShI-5 dielvicee idan2 :k e800.r000MnBe/sl trmaondsefers idpa2u: Ciomdm and= Qu e7u;e iangp iEcnab led 1dd a=2 : 0972 x60i0nMsBt r(u188c743t68i0o0 5n12 byptoe siecntotrse: r2 55=H 603Sx/T 1187:487C0) undfa0f aftf fifsp1f b8u0s1 80 0tdafrg9et 0 l 4 sdta0:a ce rFi xed D i r ec t A ccess SCS=I -5 0devxi1ce0 : 0dxaf0: f8f00f.0f00fMB/fsf t7ranseferds Qd5aa0: dCom0mand ouefuering aEnambel edp didnat0e:r 9 21 600M B ( 1 887 4 3 68 0=0 05x12 1b0yte s:e0ctxofrs: 2f55H f6f3Sf/Tf 11f7f4787C)e 15b00 code segmdea3 antt i sp1 =b ubs a0 tarsgee t0 x0 l0u,n l0i Sdai3:t < LS0I IxNfFf-f01-f00f 0,760 > Ftixyed pDeirec t0 xAc1cebss CS I-5 dev ice = odDaP3: L8 00.000,0M B/psr tersa n1s,fe rsl ndga3: 1C,o mdmeandf Q3ueuei2n g En0a,b lged Bdaa3: 9n216 010M e (p18r874o36c8e00 s5s12o rb yete fsectolrsa:g s25 5H= 6i3Sn/tTe r1174r8u7pCt) ndaa4 abtl iesp1d bus, 0 trarget 0e sluumn 9e 0a4: u Fixerd Drireect Acnctess pSrCoScI-e5s sdev i c=e sd a4: 80(0g.00_0MeB/sv etnratnsfe)r [thread pid 2 tid 100021 ] Stopped at daioctl+0x49: movq 0x20(%rax),%rdi db> bt Tracing pid 2 tid 100021 td 0xffffff001042c700 daioctl() at daioctl+0x49 g_disk_ioctl() at g_disk_ioctl+0x72 g_dev_taste() at g_dev_taste+0x15e g_new_provider_event() at g_new_provider_event+0x95 g_run_events() at g_run_events+0x217 g_event_procbody() at g_event_procbody+0x6c fork_exit() at fork_exit+0x12a fork_trampoline() at fork_trampoline+0xe From owner-freebsd-scsi@FreeBSD.ORG Fri Apr 2 19:59:08 2010 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0DF761065688 for ; Fri, 2 Apr 2010 19:59:08 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id D7F4C8FC24 for ; Fri, 2 Apr 2010 19:59:06 +0000 (UTC) Received: from 
[192.168.221.2] (remotevpn [192.168.221.2]) by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o32Jx0e5022079 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Fri, 2 Apr 2010 11:59:03 -0800 (PST) (envelope-from mj@feral.com) Message-ID: <4BB64C83.4050005@feral.com> Date: Fri, 02 Apr 2010 12:58:59 -0700 From: Matthew Jacob Organization: Feral Software User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.9) Gecko/20100330 Fedora/3.0.4-1.fc11 Thunderbird/3.0.4 MIME-Version: 1.0 To: freebsd-scsi@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.2.3 (ns1.feral.com [192.168.221.1]); Fri, 02 Apr 2010 11:59:03 -0800 (PST) Subject: useful tool X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: mj@feral.com List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Apr 2010 19:59:08 -0000 virtual sim I'm using this to find all sorts of race conditions in CAM http://people.freebsd.org/~mjacob/fk If anyone plays with it and sees any really glaring holes, please let me know.