From: Doug Ambrisko
Date: Thu, 25 Feb 2016 11:45:53 -0800
To: Tinker
Cc: freebsd-scsi@freebsd.org
Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of the Raid's physical drives break, how is it reported in the logs?

On Tue, Feb 16, 2016 at 04:00:02PM -0800, Doug Ambrisko wrote:
| On Sun, Feb 14, 2016 at 10:13:31PM +0700, Tinker wrote:
| | (Will send any followup from now only to freebsd-scsi@ .)
| |
| | Did some additional research and found that the disk failure indeed is
| | reported in MRSAS' "event log".
| |
| | So my final question then is, how do you extract it into userland (in
| | the absence of an "mfiutil" as the MFI driver has)?
|
| I have local changes to print the event log in dmesg which gets sysloged.
| We then watch syslog for issues to report things to our customers
| automatically.  This is similar to mfi(4).

I put up a couple of patches:
	https://people.freebsd.org/~ambrisko/mrsas.patch
	https://people.freebsd.org/~ambrisko/mrsasutil.patch

I made a bunch of changes to the driver to deal with issues we've seen
at work.  I've done light testing and it is working better now.  Most of
my testing is under FreeBSD 9.2, but the code base is from -current.  It
is going through more product testing, which exposed issues with the
ioctl path.

One of the major changes in the ioctl path is to let the OS create the
SG list: user-land doesn't really know what the kernel memory is like,
so this leaves it to the OS to figure out.  It also uses the 64-bit
address range.
Limiting the driver address range was creating problems when system
memory was in use and potentially fragmented, leaving too little memory
that could be allocated.  This occurred after our appliance had been up
for a while and during tests.

It also adds support for displaying event logs to dmesg such as:

mrsas0: 19366 (509744360s/0x0002/info) - State change on PD 16(e0x00/s1) from ONLINE(18) to FAILED(11)
mrsas0: 19367 (509744360s/0x0001/info) - State change on VD 00/0 from OPTIMAL(3) to DEGRADED(2)
mrsas0: 19368 (509744360s/0x0001/CRIT) - VD 00/0 is now DEGRADED
mrsas0: 19369 (509744371s/0x0002/info) - Rebuild started on PD 16(e0x00/s1)
mrsas0: 19370 (509744371s/0x0002/info) - State change on PD 16(e0x00/s1) from FAILED(11) to REBUILD(14)

It only happens at run time, not at boot like mfi.

I also added support for mfiutil and created a patch against mfiutil to
create a hard link to mrsasutil so it will know to automatically use
mrsas0.  I created the above logs via
	mrsasutil fail
	mrsasutil rebuild

Again, this is lightly tested.  I need to test 32 bit emulation and a
32 bit build.  I need to test it with current.  It's a work in progress.

Thanks,

Doug A.
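
For anyone who wants to watch syslog for these events, here is a rough Python sketch of how the dmesg lines quoted above could be split into fields. It is based only on the five sample lines in this message. The names MrsasEvent, parse_event, and needs_attention are hypothetical, the meaning of the hex field is not stated here so it is kept as a plain number, and the alerting rule is just one possible policy, not something the patch defines.

```python
import re
from typing import NamedTuple, Optional


class MrsasEvent(NamedTuple):
    unit: str         # controller instance, e.g. "mrsas0"
    seq: int          # event sequence number
    seconds: int      # timestamp field, printed with a trailing "s"
    code: int         # hex field (meaning not stated in the message)
    severity: str     # e.g. "info" or "CRIT", as printed
    description: str  # free-form event text


# Pattern inferred from the sample lines only; other firmware events
# may use a shape this does not cover.
EVENT_RE = re.compile(
    r"(?P<unit>mrsas\d+): (?P<seq>\d+) "
    r"\((?P<seconds>\d+)s/0x(?P<code>[0-9a-fA-F]+)/(?P<severity>\w+)\) - "
    r"(?P<description>.*)$"
)


def parse_event(line: str) -> Optional[MrsasEvent]:
    """Return the parsed event, or None if the line is not an mrsas event."""
    # search() rather than match() so a syslog prefix before "mrsasN:" is tolerated.
    m = EVENT_RE.search(line.rstrip("\n"))
    if m is None:
        return None
    return MrsasEvent(
        unit=m.group("unit"),
        seq=int(m.group("seq")),
        seconds=int(m.group("seconds")),
        code=int(m.group("code"), 16),
        severity=m.group("severity"),
        description=m.group("description"),
    )


def needs_attention(event: MrsasEvent) -> bool:
    """Hypothetical policy: flag non-info events and anything mentioning FAILED."""
    return event.severity.lower() != "info" or "FAILED" in event.description
```

A watcher could feed each new syslog line to parse_event and report the ones for which needs_attention returns true. Sites will likely want their own rule.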