Date:      Thu, 12 Jul 2018 12:00:41 +0200
From:      Oliver Sech <crimsonthunder@gmx.net>
To:        Ken Merry <ken@freebsd.org>, Stephen Mcconnell <stephen.mcconnell@broadcom.com>
Cc:        FreeBSD-scsi <freebsd-scsi@freebsd.org>
Subject:   Re: problems with SAS JBODs 2
Message-ID:  <9e0bf18f-0689-b2a0-1da4-b70c497b2f14@gmx.net>
In-Reply-To: <54B10B7C-CDCE-4428-B584-59CE8F38B120@freebsd.org>
References:  <trinity-14d18077-ea73-40f6-9e87-d2d4000b1f7e-1530620937871@3c-app-gmx-bs01> <CAOtMX2h8r31AeNCKyckK2P0VLn1CKFogo9bWom2So1x2ngpa4A@mail.gmail.com> <237f77ab-89e2-188b-b2b1-84c6d88609b0@gmx.net> <b785fe02-9242-c95f-56cb-2130f90e17f5@gmx.net> <3caf8ccd6fde8cfc4db25bae5327c46b@mail.gmail.com> <0af047d477d15ec364140653bd967c89@mail.gmail.com> <54B10B7C-CDCE-4428-B584-59CE8F38B120@freebsd.org>

On 07/11/2018 10:35 PM, Ken Merry wrote:
> Oliver, what happens when you try to do I/O to the devices that don’t go away after you pull the cable?  Does that cause the devices to go away?

I tried to 'dd if=/dev/daX of=/dev/null bs=1k count=1' and at least the "da" device disappears.
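To repeat that read test across several stale devices at once, something like the following could be used. It is only a sketch: `probe_devices` is a made-up helper name, and the device paths passed to it are whatever nodes are still listed after the cable pull.

```sh
#!/bin/sh
# Hypothetical helper (not part of any FreeBSD tool): try to read 1 KiB
# from each device node given as an argument and report whether the read
# succeeded. This is the same dd invocation as above, just in a loop.
probe_devices() {
    for dev in "$@"; do
        # dd prints its transfer statistics on stderr; discard them.
        if dd if="$dev" of=/dev/null bs=1k count=1 2>/dev/null; then
            echo "ok $dev"
        else
            echo "failed $dev"
        fi
    done
}
```

It would be called as, for example, `probe_devices /dev/da12 /dev/da13` (device names here are placeholders). A "failed" line only means the read returned an error; whether the "da" device then disappears still has to be checked separately.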

> Looking at the mprutil output, it also shows the devices sticking around from the adapter’s standpoint.
> 
> You can also try a ‘camcontrol rescan all’ or a ‘camcontrol rescan N’ (where N is the scbus number shown by ‘camcontrol devlist -v’).  That will do some basic probes for each of the devices and should in theory cause them to go away if they aren’t accessible.
> 
> It seems like the adapter may not be recognizing that the devices in question have gone.


I'm pretty sure that I tried 'camcontrol rescan all' a few times. While I am not sure anymore whether that cleans up the non-working devices, I am sure that no new devices were added.

Unfortunately I haven't gotten to Steve's 'clear controller mapping' script yet, but I did a few other things:
* The last time I tried to upgrade the firmware I had all sorts of problems. "sas3flash" reported bad checksums while flashing some of the files.
So I reflashed both controllers with the DOS version of sas3flash. This was basically a challenge in itself because the DOS version of this utility does not seem to run on computers of this decade. (ERROR:  Failed to initialize PAL.  Exiting program.)
The equivalent sas3flash.EFI version seems to be out of date and caused the checksum problems described before.
(This time I wiped them before flashing with "sas3flash -o -e 6".)

* I tried changing the mpr tunable "use_phy_num" after that, but this did not improve the situation. I will retry and collect logs with Steve's script.
* I retried with the latest "mpr.ko" from the Broadcom download page. (Same problems, no "use_phy_num" tunable.)

* I retested this hardware with Linux (4.15 and 4.17)
** Some shelves could be replugged reliably (i.e. 45 disks show up, 45 disks disappear, 45 disks reappear)
** On the newest shelf, 2 disks were missing after replugging (i.e. 44 disks show up, 44 disks disappear, 42 disks reappear) (kernel log mpt3sas_cm0: "device is not present handle)

* I tried a different controller
** So far I had used a Broadcom LSI SAS 9305-16e (Controller: SAS3216) (Firmware 16.00.01.00 or 15.00.00.00)
** Yesterday I switched to a new fresh out-of-the-box Broadcom LSI 9305-24i (Controller: SAS3224) (Firmware 09.00.00.00 (or something similar with 09*))
With the new controller everything seems to work on Linux. It might be the old firmware?...
On FreeBSD it is better with the new controller in the sense that I at least get one out of two /dev/sesX devices back. But disks are still missing and are not getting completely cleaned up...
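One way to pin down exactly which disks fail to come back is to save the output of 'camcontrol devlist' before pulling the cable and again after replugging, then compare the two. The sketch below assumes each devlist line ends with a parenthesized list of peripheral names such as "(pass0,da0)"; the function names and the snapshot file names are made up for illustration.

```sh
#!/bin/sh
# Print the daN names found in a saved `camcontrol devlist` snapshot,
# one per line, sorted. Only the trailing "(...)" list is inspected, so
# "da" followed by digits inside a vendor/model string is not picked up.
list_da() {
    sed -n 's/.*(\([^()]*\))[[:space:]]*$/\1/p' "$1" |
        tr ',' '\n' | grep '^da[0-9]' | sort -u
}

# Print the da devices that are present in the "before" snapshot ($1)
# but absent from the "after" snapshot ($2).
missing_after_replug() {
    before=$(mktemp) && after=$(mktemp) || return 1
    list_da "$1" > "$before"
    list_da "$2" > "$after"
    comm -23 "$before" "$after"   # lines unique to the first file
    rm -f "$before" "$after"
}
```

With snapshots taken via `camcontrol devlist > before.txt` and `camcontrol devlist > after.txt`, running `missing_after_replug before.txt after.txt` would list the disks that did not reappear. Note that device numbers may be reassigned after a replug, so a name-based comparison like this is only a first approximation.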


This whole thing is a bit frustrating, especially since up until now I thought that HBAs were kind of "connect and forget" devices. The next step is to set up a separate test environment and try to get it to work there. I will keep you updated and try to provide logs for all FreeBSD-related problems.


