From owner-freebsd-scsi@freebsd.org Sat Dec 12 03:55:40 2015
Date: Fri, 11 Dec 2015 20:55:39 -0700
Subject: Re: Informal(?) sesX messages
From: Alan Somers
To: Mykel@mware.ca
Cc: FreeBSD-scsi
In-Reply-To: <566B8E2A.8070404@mWare.ca>
References: <566B4F68.2040807@mWare.ca> <566B8E2A.8070404@mWare.ca>
List-Id: SCSI subsystem
Content-Type: text/plain; charset=UTF-8

On Fri, Dec 11, 2015 at 8:02 PM, wrote:
> On 15-12-11 17:44, Alan Somers wrote:
>>
>> On Fri, Dec 11, 2015 at 3:34 PM, wrote:
>>>
>>> Hi all, please CC me on reply as I'm not subscribed to this list.
>>>
>>> I've got one of those Supermicro 72-drive monster machines, all ZFS'd up.
>>> https://www.supermicro.com/products/system/4u/6048/SSG-6048R-E1CR72L.cfm
>>>
>>> And before & after replacing a faulty SAS Expander and a pair of cables
>>> (gobs of WRITE/ABORT errors), I'm still occasionally seeing these kernel
>>> messages (in groups), and I'm not sure if they're benign, or pointing to a
>>> SAS expander event... or what. I admit, this is my first time dealing with a
>>> machine with SAS expanders, so I'm a bit out of my depth in diagnosis
>>> thereof.
>>>
>>> Dec 11 16:06:54 ZFS-AF kernel: ses5: da7,pass7: Element descriptor: 'Slot00'
>>> Dec 11 16:06:54 ZFS-AF kernel: ses5: da7,pass7: SAS Device Slot Element: 1 Phys at Slot 0
>>> Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
>>> Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
>>> Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844bd449
>>>
>> These look like device arrival notifications. If you scroll up, do
>> you see any departure notifications?
>> They should look like this:
>>
>> mps0: mpssas_prepare_remove: Sending reset for target ID 10
>> da0 at mps0 bus 0 scbus0 target 10 lun 0
>> da0: s/n JPW930HQ15H26H detached
>> mps0: Unfreezing devq for target ID 10
>> xpt_release_devq(): requested 1 > present 0
>> (da0:mps0:0:10:0): Periph destroyed
>>
>> Also, could you post your HBA and expander firmware versions? For the
>> HBA, use "sysctl dev.mps.0.firmware_version". For the expander,
>> install sg3_utils and do "sg_inq --hex --len=64 ses0". The firmware
>> version is the dotted quad at the end.
>>
>> # sg_inq --hex --len=64 ses0
>> 00  0d 00 05 02 34 00 40 02 41 49 43 20 43 4f 52 50  ....4.@.AIC CORP
>> 10  53 41 53 20 36 47 20 45 78 70 61 6e 64 65 72 20  SAS 6G Expander
>> 20  30 62 30 31 78 33 36 2d 31 2e 31 31 2e 31 2e 31  0b01x36-1.11.1.1
>> 30  00 20 20 20 20 20 20 20
>>
>> -Alan
>
>
> I can say, without doubt, that I do NOT have any preceding detachments...
> which is why I'm so baffled by the messages. If the devices aren't
> de/reattaching, what's the point of these informal/benign ones? I am
> familiar with them from other hot-swap and disk failure scenarios in other
> machines.
>
> Could this be a driver bug not logging the disconnection? But when I
> hot-unplugged them, I do see that in dmesg.
> Or does SAS do something where it might renegotiate or reconfigure the
> lanes, and I'm just seeing it do that?
>
> Thanks,
>
> Myke
>
>
> dev.mpr.0.driver_version: 09.255.01.00-fbsd
> dev.mpr.0.firmware_version: 06.00.00.00
> dev.mpr.1.driver_version: 09.255.01.00-fbsd
> dev.mpr.1.firmware_version: 08.00.00.00
> dev.mpr.2.driver_version: 09.255.01.00-fbsd
> dev.mpr.2.firmware_version: 08.00.00.00
>
> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses0
> 00  0d 00 05 02 33 00 40 02 4c 53 49 20 20 20 20 20  ....3.@.LSI
> 10  53 41 53 33 78 34 38 20 20 20 20 20 20 20 20 20  SAS3x48
> 20  30 37 30 31 78 34 38 2d 36 36 2e 37 2e 31 2e 31  0701x48-66.7.1.1
> 30  37 00 20 20 20 20 20 20                          7.
> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses1
> 00  0d 00 05 02 33 00 40 02 4c 53 49 20 20 20 20 20  ....3.@.LSI
> 10  53 41 53 33 78 33 36 20 20 20 20 20 20 20 20 20  SAS3x36
> 20  30 37 30 31 78 33 36 2d 36 36 2e 37 2e 31 2e 31  0701x36-66.7.1.1
> 30  37 00 20 20 20 20 20 20                          7.
> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses2
> SCSI INQUIRY failed on ses2, res=-1
> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses3
> SCSI INQUIRY failed on ses3, res=-1
> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses4
> 00  0d 00 05 02 33 00 40 02 4c 53 49 20 20 20 20 20  ....3.@.LSI
> 10  53 41 53 33 78 32 38 20 20 20 20 20 20 20 20 20  SAS3x28
> 20  30 37 30 31 78 32 38 2d 36 36 2e 37 2e 31 2e 31  0701x28-66.7.1.1
> 30  37 00 20 20 20 20 20 20                          7.
> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses5
> 00  0d 00 05 02 33 00 40 02 4c 53 49 20 20 20 20 20  ....3.@.LSI
> 10  53 41 53 33 78 34 38 20 20 20 20 20 20 20 20 20  SAS3x48
> 20  30 37 30 31 78 34 38 2d 36 36 2e 37 2e 31 2e 31  0701x48-66.7.1.1
> 30  37 00 20 20 20 20 20 20                          7.
> [root@ZFS-AF ~]#
>
>
> And here's dmesg after fresh reboot:

Well, that's weird. Your firmware versions look OK, though you might
want to upgrade mpr0 just to be consistent. The next thing I would
check, if I were you, would be devctl messages. Edit /etc/syslog.conf
and change devd's loglevel to INFO, then HUP syslogd. Now every
devctl message should get logged in /var/log/devd.log. That will tell
you more precisely than dmesg whether there are any arrival or
departure events.

-Alan
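
Editor's sketch: reading the expander firmware version out of an
"sg_inq --hex" dump by eye is error-prone, so here is a minimal Python
example of how it might be decoded. It assumes the standard SCSI INQUIRY
layout (vendor ID in bytes 8-15, product ID in bytes 16-31, product
revision in bytes 32-35) and, following the remark above, takes the first
dotted quad found from byte 32 onward as the firmware version. The names
parse_hex_dump and parse_inquiry are hypothetical helpers, not part of
sg3_utils, and the sample is the AIC dump quoted earlier in the thread.

```python
import re

# Sample copied from the AIC expander dump quoted in the thread.
SAMPLE = """\
00 0d 00 05 02 34 00 40 02 41 49 43 20 43 4f 52 50 ....4.@.AIC CORP
10 53 41 53 20 36 47 20 45 78 70 61 6e 64 65 72 20 SAS 6G Expander
20 30 62 30 31 78 33 36 2d 31 2e 31 31 2e 31 2e 31 0b01x36-1.11.1.1
30 00 20 20 20 20 20 20 20
"""

HEX_BYTE = re.compile(r"^[0-9a-fA-F]{2}$")


def parse_hex_dump(text):
    """Collect data bytes from 'offset byte byte ... ascii' dump lines.

    Assumption: the first token on each line is the offset and at most
    16 data bytes follow it. A short final row whose ASCII column happens
    to look like a two-digit hex value would be misread by this sketch.
    """
    data = bytearray()
    for line in text.splitlines():
        tokens = line.split()
        for tok in tokens[1:17]:
            if not HEX_BYTE.match(tok):
                break  # reached the ASCII column on a short row
            data.append(int(tok, 16))
    return bytes(data)


def parse_inquiry(data):
    """Extract identification fields from standard INQUIRY data."""

    def text(lo, hi):
        # Fields are space-padded ASCII; strip padding and stray NULs.
        return data[lo:hi].decode("ascii", errors="replace").strip(" \x00")

    # Revision plus vendor-specific bytes; the dotted quad sits in here.
    tail = data[32:].decode("ascii", errors="replace")
    quad = re.search(r"\d+(?:\.\d+){3}", tail)
    return {
        "device_type": data[0] & 0x1F,  # 0x0d = enclosure services
        "vendor": text(8, 16),
        "product": text(16, 32),
        "revision": text(32, 36),
        "firmware": quad.group(0) if quad else None,
    }


if __name__ == "__main__":
    for key, value in parse_inquiry(parse_hex_dump(SAMPLE)).items():
        print(f"{key}: {value}")
```

To try it on a live system, paste the dump rows (without the leading
quote markers) into SAMPLE, or read them from a file and pass the text to
parse_hex_dump.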