Date: Mon, 28 Aug 2017 09:08:09 -0600
From: Stephen Mcconnell <stephen.mcconnell@broadcom.com>
To: Kevin Bowling <kevin.bowling@kev009.com>, FreeBSD-scsi <freebsd-scsi@freebsd.org>
Cc: John Baldwin <jhb@freebsd.org>
Subject: RE: Disk reordering on LSI SAS2008/mps(4)
Message-ID: <63b26ecc55d5f7a3152af6c26aba18a2@mail.gmail.com>
In-Reply-To: <CAK7dMtCqPs4-DW%2Bd0a-mQOs=jnMXHs=qoziR%2BFtsXM=5tyG41Q@mail.gmail.com>
References: <a7ebb9bef2dbb417d72a233706e50da6@nitrology.com> <1112cc2edb666ce7fa9c72c11cdd284c@nitrology.com> <CAK7dMtCqPs4-DW%2Bd0a-mQOs=jnMXHs=qoziR%2BFtsXM=5tyG41Q@mail.gmail.com>
I'm assuming that the debug_level is in hex, right? If it is, then the run
where debug_level is 0x583 should be showing some Mapping debug output, but
I don't see any. Do you have mapping enabled in the controller? You can see
the mapping flags in IOC Page 8 in the Flags field. Do you have a way to
look at the controller pages? You'd need either lsiutil, or maybe mpsutil
will work (Scott Long wrote mpsutil and I don't know anything about it). If
you don't have mapping enabled, there is no guarantee that the devices will
be discovered in the same order across a controller reset or reboot.

Steve

> -----Original Message-----
> From: owner-freebsd-scsi@freebsd.org
> [mailto:owner-freebsd-scsi@freebsd.org] On Behalf Of Kevin Bowling
> Sent: Sunday, August 27, 2017 8:51 PM
> To: FreeBSD-scsi
> Cc: John Baldwin
> Subject: Re: Disk reordering on LSI SAS2008/mps(4)
>
> Note that we only see this bug with EARLY_AP_STARTUP enabled
>
> Regards,
>
> On Fri, Aug 25, 2017 at 2:11 PM, Jason Wolfe <j@nitrology.com> wrote:
>
> > Attachments are useful.
> >
> >
> > On 2017-08-25 13:58, Jason Wolfe wrote:
> >
> >> Hi!
> >>
> >> We've been having an issue where we see some disk reordering on boot
> >> on HEAD from mid July on LSI controllers, maybe 5% of the time. We
> >> brought mps current as of r322364 with no change behavior.
> >>
> >> I have a few logs attached with various debug output. In all cases
> >> I've seen the pass ordering to be proper, and cam does try to resolve
> >> the da ordering, but the device it tries to reassign to is already
> >> taken. Attached is the full output, and listing some relevant bits
> >> below for the casual reader. Being that the functionality in
> >> scsi_da.c has been fairly static, and it's attempting to reassign, it
> >> seems more likely we are running into something in mps here. The
> >> targets always look to be proper.
> >>
> >> The various settings of hw.mps.use_phy_num (-1/0/1) don't change the
> >> behavior, and neither does hw.mps.enable_ssu=0. We have machines over
> >> various FW versions (15/16) that see the issue. I'm wondering if the
> >> fact that we see this issue over soft reboots means that the firmware
> >> isn't coming into play. To confirm, we are booting from the
> >> controller, so the LSI BIOS is enabled.
> >>
> >> mps0@pci0:3:0:0: class=0x010700 card=0x040015d9 chip=0x00721000
> >> rev=0x03 hdr=0x00
> >>     vendor     = 'LSI Logic / Symbios Logic'
> >>     device     = 'SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]'
> >>     class      = mass storage
> >>     subclass   = SAS
> >>
> >> reorder-verbose.txt:
> >> boot_verbose="YES"
> >> hw.mps.debug_level="71"
> >>
> >> da0 at mps0 bus 0 scbus0 target 17 lun 0
> >> cam_periph_alloc: attempt to re-allocate valid device da0 rejected
> >> flags 0x102 refcount 4
> >> da1 at mps0 bus 0 scbus0 target 8 lun 0
> >> daasync: Unable to attach to new device due to status 0x6
> >> da2 at mps0 bus 0 scbus0 target 9 lun 0 ...
> >> da8 at mps0 bus 0 scbus0 target 15 lun 0
> >> da9 at mps0 bus 0 scbus0 target 16 lun 0
> >> da10 at mps0 bus 0 scbus0 target 18 lun 0
> >> da11 at mps0 bus 0 scbus0 target 19 lun 0
> >>
> >> pass0 at mps0 bus 0 scbus0 target 8 lun 0
> >> pass1 at mps0 bus 0 scbus0 target 9 lun 0 ...
> >> pass9 at mps0 bus 0 scbus0 target 17 lun 0
> >> pass10 at mps0 bus 0 scbus0 target 18 lun 0
> >> pass11 at mps0 bus 0 scbus0 target 19 lun 0
> >>
> >>
> >>
> >>
> >> reorder-mps-mapping.txt:
> >> hw.mps.debug_level="583"
> >>
> >> da0 at mps0 bus 0 scbus0 target 19 lun 0
> >> da1 at mps0 bus 0 scbus0 target 8 lun 0
> >> da2 at mps0 bus 0 scbus0 target 9 lun 0
> >> ...
> >> da9 at mps0 bus 0 scbus0 target 16 lun 0
> >> da10 at mps0 bus 0 scbus0 target 17 lun 0
> >> da11 at mps0 bus 0 scbus0 target 18 lun 0
> >> cam_periph_alloc: attempt to re-allocate valid device da0 rejected
> >> flags 0x106 refcount 6
> >> daasync: Unable to attach to new device due to status 0x6
> >>
> >> ses0: da1,pass0: Element descriptor: 'Slot 01'
> >> ses0: da1,pass0: SAS Device Slot Element: 1 Phys at Slot 0
> >> ses0: da0,pass11: Element descriptor: 'Slot 12'
> >> ses0: da0,pass11: SAS Device Slot Element: 1 Phys at Slot 11
> >>
> >>
> >> Luckily we have found a way to fairly easily repro it over a few
> >> hours, so we are open to any suggestions.
> >>
> >> Thanks!
> >> Jason
> >
> >
> > _______________________________________________
> > freebsd-scsi@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"
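
The reply at the top of this message asks whether debug_level was given in hex. A minimal sketch for checking which debug categories a setting selects under either reading follows. The bit table is an assumption written from the list in the mps(4) manual page and should be verified against the driver version in use. The helper name decode_debug_level is hypothetical, and the sketch makes no claim about how the loader or the driver actually parses the tunable.

```python
# Assumed bit meanings for hw.mps.debug_level, as listed in mps(4).
# Verify this table against the manual page for the driver in use.
MPS_DEBUG_BITS = {
    0x0001: "info",
    0x0002: "fault",
    0x0004: "event",
    0x0008: "log",
    0x0010: "recovery",
    0x0020: "error",
    0x0040: "init",
    0x0080: "xinfo",
    0x0100: "user",
    0x0200: "mapping",
    0x0400: "trace",
}


def decode_debug_level(text, base):
    """Parse `text` in the given base and list the debug bits that are set."""
    value = int(text, base)
    names = [name for bit, name in sorted(MPS_DEBUG_BITS.items()) if value & bit]
    return value, names


if __name__ == "__main__":
    # The two settings quoted in the thread, read as decimal and as hex.
    for setting in ("71", "583"):
        for base in (10, 16):
            value, names = decode_debug_level(setting, base)
            print(f"{setting!r} as base {base}: {value:#06x} -> {', '.join(names)}")
```

Running it prints the categories for "71" and "583" under both readings, which makes it easy to see whether the mapping bit is part of the value the driver ended up with.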