From owner-freebsd-scsi@freebsd.org Mon Aug 31 08:09:47 2015
From: Daniel Braniss
Date: Mon, 31 Aug 2015 11:09:41 +0300
To: FreeBSD-scsi
Subject: mfiutil for mrsas any time soon?

Hi,
Dell’s new servers are arriving with PERC 730/830, which are handled by mrsas(4), but I miss mfiutil! MegaCli is [I can’t put it in words]! So apart from using the standalone tools, is there some other solution?

thanks,
danny

From owner-freebsd-scsi@freebsd.org Wed Sep 2 16:41:12 2015
From: Per olof Ljungmark
Date: Wed, 02 Sep 2015 18:30:56 +0200
To: freebsd-scsi@freebsd.org
Subject: da2:ciss1:0:0:0): Periph destroyed
Message-ID: <55E72440.8070507@intersonic.se>

Hi,

Recent 10-STABLE, HP D2600 with 12 SATA drives in RAID10 via a P812 controller, 7TB capacity as one volume, ZFS.

If I pull a drive from the array, the following occurs, and I am not sure about the logic here because the array is still intact and no data loss occurs.

Despite that, the volume is gone.

# zpool clear imap
cannot clear errors for imap: I/O error

# zpool online imap da2
cannot online da2: pool I/O is currently suspended

Only a reboot helped and then the pool came up just fine, no errors, but that is not exactly what you want on a production box.

Did I miss something?

Would
geli_autodetach="NO"
help?

syslog output:

Sep 2 17:55:19 str kernel: ciss1: *** Hot-plug drive removed, Port=1E Box=1 Bay=2 SN= Z4Z2S9SD
Sep 2 17:55:19 str kernel: ciss1: *** Physical drive failure, Port=1E Box=1 Bay=2
Sep 2 17:55:19 str kernel: ciss1: *** State change, logical drive 0, new state=REGENING
Sep 2 17:55:19 str kernel: ciss1: logical drive 0 (da2) changed status OK->interim recovery, spare status 0x21
Sep 2 17:55:19 str kernel: ciss1: *** State change, logical drive 0, new state=NEEDS_REBUILD
Sep 2 17:55:19 str kernel: ciss1: logical drive 0 (da2) changed status interim recovery->ready for recovery, spare status 0x11
Sep 2 17:55:19 str kernel: da2 at ciss1 bus 0 scbus2 target 0 lun 0
Sep 2 17:55:19 str kernel: da2: s/n PAGXQ0BRH1W0WA detached
Sep 2 17:55:19 str kernel: (da2:ciss1:0:0:0): Periph destroyed
Sep 2 17:55:19 str devd: Executing 'logger -p kern.notice -t ZFS 'vdev is removed, pool_guid=13539160044045520113 vdev_guid=1325849881310347579''
Sep 2 17:55:19 str ZFS: vdev is removed, pool_guid=13539160044045520113 vdev_guid=1325849881310347579
Sep 2 17:55:19 str kernel: (da2:ciss1:0:0:0): fatal error, could not acquire reference count
Sep 2 17:55:23 str kernel: ciss1: *** State change, logical drive 0, new state=REBUILDING
Sep 2 17:55:23 str kernel: ciss1: logical drive 0 (da2) changed status ready for recovery->recovering, spare status 0x13
Sep 2 17:55:23 str kernel: cam_periph_alloc: attempt to re-allocate valid device da2 rejected flags 0x18 refcount 1
Sep 2 17:55:23 str kernel: daasync: Unable to attach to new device due to status 0x6

From owner-freebsd-scsi@freebsd.org Wed Sep 2 17:23:48 2015
From: Sean Bruno
Date: Wed, 2 Sep 2015 10:23:40 -0700
To: freebsd-scsi@freebsd.org
Subject: Re: da2:ciss1:0:0:0): Periph destroyed
Message-ID: <55E7309C.8010406@freebsd.org>
In-Reply-To: <55E72440.8070507@intersonic.se>

On 09/02/15 09:30, Per olof Ljungmark wrote:
> [...]
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"

This looks like a bug I introduced at r249170. Now that I stare deeply into the abyss of ciss(4), I think the entire change is wrong.

Do you want to try and revert that change from your kernel and rebuild for a test? I don't have access to ciss(4) hardware any longer and cannot verify.

sean

ref
https://svnweb.freebsd.org/base/head/sys/dev/ciss/ciss.c?r1=249170&r2=249169&pathrev=249170

From owner-freebsd-scsi@freebsd.org Wed Sep 2 17:59:38 2015
From: Per olof Ljungmark
Date: Wed, 02 Sep 2015 19:59:28 +0200
To: freebsd-scsi@freebsd.org
Subject: Re: da2:ciss1:0:0:0): Periph destroyed
Message-ID: <55E73900.5080302@intersonic.se>
In-Reply-To: <55E7309C.8010406@freebsd.org>

On 2015-09-02 19:23, Sean Bruno wrote:
> This looks like a bug I introduced at r249170. Now that I stare deeply
> into the abyss of ciss(4), I think the entire change is wrong.
>
> Do you want to try and revert that change from your kernel and rebuild
> for a test? I don't have access to ciss(4) hardware any longer and
> cannot verify.

Yes, I can try. The installed rev is 281826 but I assume the change can apply here too?

From owner-freebsd-scsi@freebsd.org Wed Sep 2 18:41:00 2015
From: Sean Bruno
Date: Wed, 2 Sep 2015 11:40:57 -0700
To: freebsd-scsi@freebsd.org
Subject: Re: da2:ciss1:0:0:0): Periph destroyed
Message-ID: <55E742B9.1060002@freebsd.org>
In-Reply-To: <55E73900.5080302@intersonic.se>

On 09/02/15 10:59, Per olof Ljungmark wrote:
> Yes, I can try. The installed rev is 281826 but I assume the change
> can apply here too?

Yeah, I think a "svn merge -c -249170" from /usr/src should do it if you are managing your system from svn.

sean

From owner-freebsd-scsi@freebsd.org Wed Sep 2 19:02:14 2015
From: Per olof Ljungmark
Date: Wed, 02 Sep 2015 21:02:04 +0200
To: freebsd-scsi@freebsd.org
Subject: Re: da2:ciss1:0:0:0): Periph destroyed
Message-ID: <55E747AC.6020302@intersonic.se>
In-Reply-To: <55E742B9.1060002@freebsd.org>

On 2015-09-02 20:40, Sean Bruno wrote:
> Yeah, I think a "svn merge -c -249170" from /usr/src should do it if
> you are managing your system from svn.

Sep 2 20:54:05 str kernel: ciss1: *** Hot-plug drive removed, Port=1E Box=1 Bay=3 SN= W4Z1G4BD
Sep 2 20:54:05 str kernel: ciss1: *** Physical drive failure, Port=1E Box=1 Bay=3
Sep 2 20:54:50 str kernel: ciss1: *** Hot-plug drive inserted, Port=1E Box=1 Bay=3 SN= WD-WMC1P0F66XVC
Sep 2 20:54:50 str kernel: ciss1: *** HP Array Controller Firmware Ver = 6.64, Build Num = 0

Right, this time it survived; the volume did not detach after reverting.

If this change does not cause any other problems, do you think it can go into -STABLE?

Thanks!

//per

From owner-freebsd-scsi@freebsd.org Wed Sep 2 19:05:50 2015
From: Sean Bruno
Date: Wed, 2 Sep 2015 12:05:48 -0700
To: freebsd-scsi@freebsd.org
Subject: Re: da2:ciss1:0:0:0): Periph destroyed
Message-ID: <55E7488C.90405@freebsd.org>
In-Reply-To: <55E747AC.6020302@intersonic.se>

On 09/02/15 12:02, Per olof Ljungmark wrote:
> Right, this time it survived; the volume did not detach after
> reverting.
>
> If this change does not cause any other problems, do you think it
> can go into -STABLE?

Definitely.
I'll yank it out today and set up a 3-day MFC.

sean
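
The revert-and-rebuild test discussed in this thread can be sketched as a small dry-run script. It is only an illustration and was not posted to the list: the function name revert_steps is hypothetical, the use of the default kernel configuration and the final reboot command are assumptions, and the script prints the commands instead of running them. The svn command is the one quoted in the thread.

```sh
#!/bin/sh
# Dry-run sketch: print the steps for backing a revision out of /usr/src
# and rebuilding the kernel. Nothing here touches the running system.
# revert_steps is a hypothetical helper name, not part of FreeBSD.
revert_steps() {
    rev="$1"                 # revision to back out, e.g. 249170
    src="${2:-/usr/src}"     # assumed to be an svn working copy
    echo "cd $src"
    echo "svn merge -c -$rev"    # reverse-merge, as suggested in the thread
    echo "make buildkernel"      # assumption: default kernel configuration
    echo "make installkernel"
    echo "shutdown -r now"       # assumption: reboot to load the rebuilt kernel
}

revert_steps 249170
```

Review the printed commands before running any of them by hand. After the reboot, pulling a drive again and checking `zpool status imap` should show whether the volume stays attached.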