From owner-freebsd-scsi@FreeBSD.ORG Mon Apr 22 03:00:53 2013 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: by hub.freebsd.org (Postfix, from userid 821) id ADC12A4A; Mon, 22 Apr 2013 03:00:53 +0000 (UTC) Date: Mon, 22 Apr 2013 03:00:53 +0000 From: John To: FreeBSD SCSI Subject: Repeated msgs & kernel panic w/ r246437 (Revamp the CAM enclosure services driver) Message-ID: <20130422030053.GA23186@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Apr 2013 03:00:53 -0000 Hi Folks, After updating one of our servers to the latest stable image, it appears that commit r246437 appears to be causing it to panic. The commit: http://svnweb.freebsd.org/base?view=revision&revision=246437 What one of our servers looks like: http://people.freebsd.org/~jwd/zfsnfsserver.jpg The last known working commit: http://people.freebsd.org/~jwd/r246437/dmesg.r246431.clean.txt With commit r246437: http://people.freebsd.org/~jwd/r246437/dmesg.r246437.log.txt Note, most of the dmesg output is related to the ses devices. It repeats itself multiple times before the panic. ses39: ses0,pass20: Element descriptor: ' ' ses39: ses0,pass20: SAS Expander: 24 Physses39: phy 0: connector 255 other 255 ses39: phy 1: connector 255 other 255 ses39: phy 2: connector 255 other 255 ses39: phy 3: connector 255 other 255 ses39: phy 4: connector 255 other 255 ses39: phy 5: connector 255 other 255 ses39: phy 6: connector 255 other 255 etc, etc... After just a few minutes, the system panics. 
A pair of images of the screen (sorry, no serial console at this time): Panic: http://people.freebsd.org/~jwd/r246437/20130419_160143.jpg bt: http://people.freebsd.org/~jwd/r246437/20130419_110158.jpg We are currently running a test to see if the fact that all our shelves are dual-attached, allowing us to use geom multipath is related. ie: we have disabled the 2nd HBA thus cutting the total number of da & ses devices in half and thus not executing the code in the commit that tracks duplicate ses devices. Note, if we disable both HBA devices and boot the system up it does not panic or print out the repeated messages, but of course we have no disks :-) I am unclear on the "connector 255 other 255" messages and have not taken the time to look into them yet. I would appreciate any insights folks can provide. Many Thanks, John ps: We've had to seriously increase the console buffer size to capture the complete dmesg output... options MSGBUF_SIZE=(32768*32) Can we delay starting the kernel daemon until after the system is up and /var/log/messages is available? Just a thought... 
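Since the same block of ses messages repeats many times before the panic, it can help to count how often each phy line shows up in a saved copy of the dmesg output. The following is only a minimal sketch, assuming the log is available as text and that the lines follow the "sesN: phy M: connector X other Y" shape quoted above. The function name count_phy_reports is hypothetical and not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Matches kernel log fragments such as "ses39: phy 0: connector 255 other 255".
# The pattern is an assumption based on the excerpt quoted in this message.
# It is deliberately unanchored so it still matches when console output is
# interleaved, as in "24 Physses39: phy 0: ...".
PHY_LINE = re.compile(r"(ses\d+): phy (\d+): connector (\d+) other (\d+)")


def count_phy_reports(log_text):
    """Return a Counter keyed by (device, phy) with the number of times
    each phy report appears in the log text."""
    counts = Counter()
    for match in PHY_LINE.finditer(log_text):
        device = match.group(1)
        phy = int(match.group(2))
        counts[(device, phy)] += 1
    return counts
```

A count well above one for the same (device, phy) pair would confirm that the enclosure is being re-reported rather than that there are simply many enclosures.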
From owner-freebsd-scsi@FreeBSD.ORG Mon Apr 22 06:10:01 2013 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7E4B1BF7 for ; Mon, 22 Apr 2013 06:10:01 +0000 (UTC) (envelope-from Kashyap.Desai@lsi.com) Received: from ch1outboundpool.messaging.microsoft.com (ch1ehsobe003.messaging.microsoft.com [216.32.181.183]) by mx1.freebsd.org (Postfix) with ESMTP id 328831218 for ; Mon, 22 Apr 2013 06:10:00 +0000 (UTC) Received: from mail88-ch1-R.bigfish.com (10.43.68.236) by CH1EHSOBE005.bigfish.com (10.43.70.55) with Microsoft SMTP Server id 14.1.225.23; Mon, 22 Apr 2013 06:09:54 +0000 Received: from mail88-ch1 (localhost [127.0.0.1]) by mail88-ch1-R.bigfish.com (Postfix) with ESMTP id 6B1833A0113; Mon, 22 Apr 2013 06:09:54 +0000 (UTC) X-Forefront-Antispam-Report: CIP:192.19.193.42; KIP:(null); UIP:(null); IPV:NLI; H:paledge01.lsi.com; RD:paledge01.lsi.com; EFVD:NLI X-SpamScore: -4 X-BigFish: VPS-4(zz98dI9371I542I1432Izz1f42h1fc6h1ee6h1de0h1fdah1202h1e76h1d1ah1d2ahz8dhzz2fh2a8h668h839h944hd25hf0ah1220h1288h12a5h12a9h12bdh137ah13b6h1441h1504h1537h153bh15d0h162dh1631h1758h18e1h1946h19b5h1b0ah1155h) Received-SPF: pass (mail88-ch1: domain of lsi.com designates 192.19.193.42 as permitted sender) client-ip=192.19.193.42; envelope-from=Kashyap.Desai@lsi.com; helo=paledge01.lsi.com ; ge01.lsi.com ; Received: from mail88-ch1 (localhost.localdomain [127.0.0.1]) by mail88-ch1 (MessageSwitch) id 1366610990114227_29846; Mon, 22 Apr 2013 06:09:50 +0000 (UTC) Received: from CH1EHSMHS033.bigfish.com (snatpool1.int.messaging.microsoft.com [10.43.68.250]) by mail88-ch1.bigfish.com (Postfix) with ESMTP id 1857B30004A; Mon, 22 Apr 2013 06:09:50 +0000 (UTC) Received: from paledge01.lsi.com (192.19.193.42) by CH1EHSMHS033.bigfish.com (10.43.70.33) with Microsoft SMTP Server (TLS) id 14.1.225.23; Mon, 22 Apr 2013 06:09:50 +0000 Received: from PALHUB01.lsi.com (128.94.213.114) by 
PALEDGE01.lsi.com (192.19.193.42) with Microsoft SMTP Server (TLS) id 8.3.298.1; Mon, 22 Apr 2013 02:10:11 -0400 Received: from PALEXCH11.lsi.com (128.94.223.42) by PALHUB01.lsi.com (128.94.213.114) with Microsoft SMTP Server (TLS) id 8.3.298.1; Mon, 22 Apr 2013 02:09:49 -0400 Received: from inbexch02.lsi.com (135.36.98.40) by PALEXCH11.lsi.com (128.94.223.42) with Microsoft SMTP Server (TLS) id 14.2.309.2; Mon, 22 Apr 2013 02:09:49 -0400 Received: from inbmail01.lsi.com ([135.36.98.64]) by inbexch02.lsi.com ([135.36.98.40]) with mapi; Mon, 22 Apr 2013 11:39:46 +0530 From: "Desai, Kashyap" To: Konstantin Belousov Date: Mon, 22 Apr 2013 11:39:45 +0530 Subject: RE: New Driver for MegaRaid 6Gb/s and 12Gb/s Card Thread-Topic: New Driver for MegaRaid 6Gb/s and 12Gb/s Card Thread-Index: Ac49ItS58RhLr1ljSwCbQDyNJSpHogB/Oyww Message-ID: References: <1366236517.1499.31.camel@localhost> <517004AE.7070409@gmail.com> <20130419172450.GB67273@kib.kiev.ua> In-Reply-To: <20130419172450.GB67273@kib.kiev.ua> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: lsi.com Cc: "freebsd-scsi@freebsd.org" , "McConnell, Stephen" X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Apr 2013 06:10:01 -0000 > -----Original Message----- > From: Konstantin Belousov [mailto:kostikbel@gmail.com] > Sent: Friday, April 19, 2013 10:55 PM > To: Desai, Kashyap > Cc: freebsd-scsi@freebsd.org; McConnell, Stephen > Subject: Re: New Driver for MegaRaid 6Gb/s and 12Gb/s Card > > On Fri, Apr 19, 2013 at 12:30:09PM +0530, Desai, Kashyap wrote: > > > What about other 6Gb/s controllers that are supported under mfi, > > > such as Skinny Drake? 
> >
> > Skinny Drake will be supported by mfi.
> Will be, or is it ?
>
> I have two Skinny Drake cards, one single-port, and one dual-port, both
> of which are dead with mfi(4) from HEAD. I can provide the debugging
> data if this is something which interests you.

You can send me the details for your issue. I will work on those in background.
We may have to check first that you are using latest FW for your Skinny Drake.

` Kashyap

From owner-freebsd-scsi@FreeBSD.ORG Mon Apr 22 07:14:07 2013 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 418D130A for ; Mon, 22 Apr 2013 07:14:07 +0000 (UTC) (envelope-from Kashyap.Desai@lsi.com) Received: from co1outboundpool.messaging.microsoft.com (co1ehsobe005.messaging.microsoft.com [216.32.180.188]) by mx1.freebsd.org (Postfix) with ESMTP id 097F214BE for ; Mon, 22 Apr 2013 07:14:06 +0000 (UTC) Received: from mail126-co1-R.bigfish.com (10.243.78.229) by CO1EHSOBE029.bigfish.com (10.243.66.94) with Microsoft SMTP Server id 14.1.225.23; Mon, 22 Apr 2013 07:14:00 +0000 Received: from mail126-co1 (localhost [127.0.0.1]) by mail126-co1-R.bigfish.com (Postfix) with ESMTP id 9A99CB4008F; Mon, 22 Apr 2013 07:14:00 +0000 (UTC) X-Forefront-Antispam-Report: CIP:192.19.193.42; KIP:(null); UIP:(null); IPV:NLI; H:paledge01.lsi.com; RD:paledge01.lsi.com; EFVD:NLI X-SpamScore: -6 X-BigFish: VPS-6(zz98dI9371I148cI542I1432I4015Izz1f42h1fc6h1ee6h1de0h1fdah1202h1e76h1d1ah1d2ahzz8275bh8275dhz2fh2a8h668h839h944hd25hf0ah1220h1288h12a5h12a9h12bdh137ah13b6h1441h1504h1537h153bh15d0h162dh1631h1758h18e1h1946h19b5h1b0ah1155h) Received-SPF: pass (mail126-co1: domain of lsi.com designates 192.19.193.42 as permitted sender) client-ip=192.19.193.42; envelope-from=Kashyap.Desai@lsi.com; helo=paledge01.lsi.com ; ge01.lsi.com ; Received: from mail126-co1 (localhost.localdomain [127.0.0.1]) by mail126-co1 (MessageSwitch) id 
1366614838325855_14867; Mon, 22 Apr 2013 07:13:58 +0000 (UTC) Received: from CO1EHSMHS003.bigfish.com (unknown [10.243.78.230]) by mail126-co1.bigfish.com (Postfix) with ESMTP id 4D676980047; Mon, 22 Apr 2013 07:13:58 +0000 (UTC) Received: from paledge01.lsi.com (192.19.193.42) by CO1EHSMHS003.bigfish.com (10.243.66.13) with Microsoft SMTP Server (TLS) id 14.1.225.23; Mon, 22 Apr 2013 07:13:58 +0000 Received: from PALCAS01.lsi.com (128.94.213.117) by PALEDGE01.lsi.com (192.19.193.42) with Microsoft SMTP Server (TLS) id 8.3.298.1; Mon, 22 Apr 2013 03:14:19 -0400 Received: from PALEXCH11.lsi.com (128.94.223.42) by PALCAS01.lsi.com (128.94.213.117) with Microsoft SMTP Server (TLS) id 8.3.298.1; Mon, 22 Apr 2013 03:13:57 -0400 Received: from inbexch02.lsi.com (135.36.98.40) by PALEXCH11.lsi.com (128.94.223.42) with Microsoft SMTP Server (TLS) id 14.2.309.2; Mon, 22 Apr 2013 03:13:56 -0400 Received: from inbmail01.lsi.com ([135.36.98.64]) by inbexch02.lsi.com ([135.36.98.40]) with mapi; Mon, 22 Apr 2013 12:43:53 +0530 From: "Desai, Kashyap" To: Kevin Day , Scott Long Date: Mon, 22 Apr 2013 12:43:52 +0530 Subject: RE: New Driver for MegaRaid 6Gb/s and 12Gb/s Card Thread-Topic: New Driver for MegaRaid 6Gb/s and 12Gb/s Card Thread-Index: Ac49J3VH72FxgNncTG2733epxozH7QCAGSww Message-ID: References: <1366236517.1499.31.camel@localhost> <516FF87E.8030702@freebsd.org> <37C12059-AF0B-47C1-AD8E-1D3B4663CD3D@your.org> In-Reply-To: <37C12059-AF0B-47C1-AD8E-1D3B4663CD3D@your.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: lsi.com Cc: "freebsd-scsi@freebsd.org" X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Apr 2013 07:14:07 -0000 > 
-----Original Message-----
> From: Kevin Day [mailto:kevin@your.org]
> Sent: Friday, April 19, 2013 11:28 PM
> To: Scott Long
> Cc: Desai, Kashyap; freebsd-scsi@freebsd.org
> Subject: Re: New Driver for MegaRaid 6Gb/s and 12Gb/s Card
>
>
> On Apr 19, 2013, at 10:59 AM, Scott Long wrote:
> >
> > What will the exposed device names of the arrays be under the new
> driver? Will it still be /dev/mfid* ? If it's not, then this is where
> the problem lies. Many users still use device names in their /etc/fstab
> for mounting all of their filesystems at boot. If you have two drivers
> that will compete for the same hardware and give that hardware different
> names, they will break the fstab files for those users and they upgrade
> over time. A similar situation occurred several years ago with the
> Intel e1000 driver; it was split into two drivers, with certain hardware
> that was supported by the old hardware going to the new driver. That
> broke the network configuration for many users and caused years of pain
> and unhappiness as users upgraded and were hit by the switch. We don't
> want that to happen here.
> >
> > The real solution is that we need to have a single common naming
> convention for all disks (and for network interfaces), and leave the
> details of individual driver names out of the configuration part of the
> system. That's not likely to happen any time soon. The other solution
> is to mandate that users use volume labels for mounting their
> filesystems, but that's not likely to happen either, and even if it did,
> it present challenges for migrating existing users. The only remaining
> solution that I can think of is to have the mfi and mrsas drivers share
> the same devclass for their disk interfaces (mfid*), but that's a hack
> that has not been fully explored in FreeBSD. Still, I'd encourage you
> to try it and see if you can make it work. If you have any problems,
> email me directly.
>
> Some Linux distributions had a flag day where upgrading beyond a certain
> point caused a one-time popup asking if you wanted to convert /etc/fstab
> to volume labels instead of device names. You could say no and proceed
> normally, but if you said Yes it rewrote fstab to use labels. I'm not
> sure where we could hook this so it happened both with freebsd-update
> and source upgrades, but it would be nice to make it painless to switch.
>
>

Thanks for sharing your early feedback. Discussing this on the FreeBSD mailing list before I submit the "mrsas" driver was a main goal.

Let me give more details about our design w.r.t. mfi and mrsas (especially for the Thunderbolt controller). We had three different choices, as described below.

#1. We could have removed Thunderbolt support from mfi and added the same support to mrsas, forcing all customers to move to mrsas.
This was really good from a technical aspect, because having two drivers for a single PCI ID is not a perfect solution. But it does not work in the real world, since a few customers may want to continue with "mfi" (at least for the short to medium term, until they find "mrsas" reliable for their production environment).

#2. AS A DEFAULT CHOICE, THE "MRSAS" DRIVER WILL DETECT THE THUNDERBOLT CONTROLLER.
LSI decided to give customers the option to choose the existing mfi driver for Thunderbolt (those who are already using the mfi driver). Customers who want to use a Thunderbolt card for the first time in the next FreeBSD-RELEASE can opt for mrsas (because mrsas is a redesigned driver and will have better support for all upcoming PCI ID additions).

Considering the above, we may have three typical use cases.

#2 (a) Existing customers who do not want to switch to mrsas.
These customers will have the choice to continue with "mfi".
(As of now, this setting is tunable through device.hints, and it is a one-time effort.) We can find the most suitable default behavior.
Is there any way to communicate this behavior to customers through the release notes of the next FreeBSD release? In any case, this behavior will be well documented in the mrsas man page.

#2 (b) Existing customers who want to switch to mrsas.
This is the typical worst case. Here the user may need to make a few manual changes in /etc/fstab.

#2 (c) New customers using Thunderbolt for the first time on FreeBSD.
These customers will never see the "mfi" conflict with the "mrsas" driver for the Thunderbolt card, and for them it will be a painless installation process.

#3. AS A DEFAULT CHOICE, THE "MFI" DRIVER WILL DETECT THE THUNDERBOLT CONTROLLER.
Same as #2 above, except the default choice will be "mfi".
With this choice we would not have any technical issue, but it would not expose our new driver "mrsas" to end users, and there is a high possibility that "mrsas" would never be chosen for a Thunderbolt card even if it is available in the upstream kernel.
(LSI recommends the "mrsas" driver, since it will have a longer maintenance cycle and support compared to mfi.) For that reason, we have decided to go with #2.

The mrsas driver will attach devices to the CAM layer, unlike the mfi driver, which talks directly to the block layer. That was one of the main reasons to redesign mrsas to use the CAM layer. The mrsas driver will expose drives as "/dev/daX".

In summary, I agree that the device names /dev/mfidX and /dev/daX will conflict, but that was treated as a slight disadvantage (for a very specific use case) compared with the other benefits we will offer to end users with this new driver submission.
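The manual /etc/fstab change mentioned for case #2 (b) could be scripted along the lines suggested earlier in the thread, by switching entries from driver-specific device names to volume labels. What follows is only a sketch under stated assumptions: the filesystems are UFS and have already been labeled (for example with tunefs -L), so they appear under /dev/ufs/, and the device-to-label map is supplied by the administrator. The function name relabel_fstab and the sample label are hypothetical.

```python
def relabel_fstab(fstab_text, labels):
    """Rewrite the device field of fstab entries using a device-to-label map.

    labels maps a device path such as "/dev/mfid0p2" to a UFS label such as
    "rootfs". Comment lines, blank lines, and devices that are not in the map
    are returned unchanged.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        if fields and not fields[0].startswith("#") and fields[0] in labels:
            # The device is the first field, so replacing the first
            # occurrence keeps the rest of the line's layout intact.
            line = line.replace(fields[0], "/dev/ufs/" + labels[fields[0]], 1)
        out.append(line)
    return "\n".join(out) + "\n"
```

Because the rewritten entries no longer mention mfid or da, the same fstab would mount correctly whichever driver attaches to the controller. Swap entries are left alone here, since they are not UFS filesystems and would need a different labeling mechanism.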
As mentioned by Scott, if we can keep the same devclass for "mrsas" and "mfi", I am OK to explore that area, but it does not look possible, because our important goal in mrsas was to give all control to the CAM layer instead of keeping logic that belongs to the CAM layer in the low-level driver.

Please let me know your thoughts!

Thanks, Kashyap

From owner-freebsd-scsi@FreeBSD.ORG Mon Apr 22 11:06:51 2013 Return-Path: Delivered-To: freebsd-scsi@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4466A5FD for ; Mon, 22 Apr 2013 11:06:51 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 1CBF3108B for ; Mon, 22 Apr 2013 11:06:51 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r3MB6orj089251 for ; Mon, 22 Apr 2013 11:06:50 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r3MB6oeX089249 for freebsd-scsi@FreeBSD.org; Mon, 22 Apr 2013 11:06:50 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 22 Apr 2013 11:06:50 GMT Message-Id: <201304221106.r3MB6oeX089249@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-scsi@FreeBSD.org Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Apr 2013 11:06:51 -0000

Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).
The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker       Resp.  Description
--------------------------------------------------------------------------------
o kern/165982   scsi   [mpt] mpt instability, drive resets, and losses on Fre
o kern/165740   scsi   [cam] SCSI code must drain callbacks before free
o kern/162256   scsi   [mpt] QUEUE FULL EVENT and 'mpt_cam_event: 0x0'
o kern/153514   scsi   [cam] [panic] CAM related panic
o docs/151336   scsi   Missing documentation of scsi_ and ata_ functions in c
s kern/149927   scsi   [cam] hard drive not stopped before removing power dur
o kern/148083   scsi   [aac] Strange device reporting
o kern/147704   scsi   [mpt] sys/dev/mpt: new chip revision, partially unsupp
o kern/144648   scsi   [aac] Strange values of speed and bus width in dmesg
o kern/142351   scsi   [mpt] LSILogic driver performance problems
o kern/134488   scsi   [mpt] MPT SCSI driver probes max. 8 LUNs per device
o kern/132206   scsi   [mpt] system panics on boot when mirroring and 2nd dri
o kern/130621   scsi   [mpt] tranfer rate is inscrutable slow when use lsi213
o kern/129602   scsi   [ahd] ahd(4) gets confused and wedges SCSI bus
o kern/128452   scsi   [sa] [panic] Accessing SCSI tape drive randomly crashe
o kern/128245   scsi   [scsi] "inquiry data fails comparison at DV1 step" [re
o kern/127927   scsi   [isp] isp(4) target driver crashes kernel when set up
o kern/123674   scsi   [ahc] ahc driver dumping
o sparc/121676  scsi   [iscsi] iscontrol do not connect iscsi-target on sparc
o kern/120487   scsi   [sg] scsi_sg incompatible with scanners
o kern/120247   scsi   [mpt] FreeBSD 6.3 and LSI Logic 1030 = only 3.300MB/s
o kern/114597   scsi   [sym] System hangs at SCSI bus reset with dual HBAs

22 problems total.
From owner-freebsd-scsi@FreeBSD.ORG Tue Apr 23 06:38:46 2013 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C74C720F; Tue, 23 Apr 2013 06:38:46 +0000 (UTC) (envelope-from Kashyap.Desai@lsi.com) Received: from co9outboundpool.messaging.microsoft.com (co9ehsobe005.messaging.microsoft.com [207.46.163.28]) by mx1.freebsd.org (Postfix) with ESMTP id 8FA981B34; Tue, 23 Apr 2013 06:38:46 +0000 (UTC) Received: from mail10-co9-R.bigfish.com (10.236.132.226) by CO9EHSOBE003.bigfish.com (10.236.130.66) with Microsoft SMTP Server id 14.1.225.23; Tue, 23 Apr 2013 06:38:39 +0000 Received: from mail10-co9 (localhost [127.0.0.1]) by mail10-co9-R.bigfish.com (Postfix) with ESMTP id 135D0E04DF; Tue, 23 Apr 2013 06:38:39 +0000 (UTC) X-Forefront-Antispam-Report: CIP:192.19.193.42; KIP:(null); UIP:(null); IPV:NLI; H:paledge01.lsi.com; RD:paledge01.lsi.com; EFVD:NLI X-SpamScore: -4 X-BigFish: VPS-4(zz98dI9371I542I1432Izz1f42h1fc6h1ee6h1de0h1fdah1202h1e76h1d1ah1d2ahzz8275bh8275dhz2fh2a8h668h839h944hd25hf0ah1220h1288h12a5h12a9h12bdh137ah13b6h1441h1504h1537h153bh15d0h162dh1631h1758h18e1h1946h19b5h1b0ah1155h) Received-SPF: pass (mail10-co9: domain of lsi.com designates 192.19.193.42 as permitted sender) client-ip=192.19.193.42; envelope-from=Kashyap.Desai@lsi.com; helo=paledge01.lsi.com ; ge01.lsi.com ; Received: from mail10-co9 (localhost.localdomain [127.0.0.1]) by mail10-co9 (MessageSwitch) id 1366699116903585_17045; Tue, 23 Apr 2013 06:38:36 +0000 (UTC) Received: from CO9EHSMHS011.bigfish.com (unknown [10.236.132.231]) by mail10-co9.bigfish.com (Postfix) with ESMTP id D09784027E; Tue, 23 Apr 2013 06:38:36 +0000 (UTC) Received: from paledge01.lsi.com (192.19.193.42) by CO9EHSMHS011.bigfish.com (10.236.130.21) with Microsoft SMTP Server (TLS) id 14.1.225.23; Tue, 23 Apr 2013 06:38:36 +0000 Received: from PALHUB01.lsi.com (128.94.213.114) by PALEDGE01.lsi.com 
(192.19.193.42) with Microsoft SMTP Server (TLS) id 8.3.298.1; Tue, 23 Apr 2013 02:38:55 -0400 Received: from PALEXCH11.lsi.com (128.94.223.42) by PALHUB01.lsi.com (128.94.213.114) with Microsoft SMTP Server (TLS) id 8.3.298.1; Tue, 23 Apr 2013 02:38:35 -0400 Received: from inbexch02.lsi.com (135.36.98.40) by PALEXCH11.lsi.com (128.94.223.42) with Microsoft SMTP Server (TLS) id 14.2.309.2; Tue, 23 Apr 2013 02:38:35 -0400 Received: from inbmail01.lsi.com ([135.36.98.64]) by inbexch02.lsi.com ([135.36.98.40]) with mapi; Tue, 23 Apr 2013 12:08:31 +0530 From: "Desai, Kashyap" To: Gary Palmer , "kpneal@pobox.com" Date: Tue, 23 Apr 2013 12:08:30 +0530 Subject: RE: New Driver for MegaRaid 6Gb/s and 12Gb/s Card Thread-Topic: New Driver for MegaRaid 6Gb/s and 12Gb/s Card Thread-Index: Ac49KhDyvCCrpayUTUumlVnRql0X9QCwrqJg Message-ID: References: <1366236517.1499.31.camel@localhost> <517004AE.7070409@gmail.com> <20130419161809.GA76332@neutralgood.org> <20130419181631.GE96431@in-addr.com> In-Reply-To: <20130419181631.GE96431@in-addr.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: lsi.com Cc: "freebsd-scsi@freebsd.org" X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Apr 2013 06:38:46 -0000 > -----Original Message----- > From: Gary Palmer [mailto:gpalmer@freebsd.org] > Sent: Friday, April 19, 2013 11:47 PM > To: kpneal@pobox.com > Cc: Desai, Kashyap; freebsd-scsi@freebsd.org > Subject: Re: New Driver for MegaRaid 6Gb/s and 12Gb/s Card > > On Fri, Apr 19, 2013 at 12:18:09PM -0400, kpneal@pobox.com wrote: > > On Fri, Apr 19, 2013 at 12:30:09PM +0530, Desai, Kashyap wrote: > > > > > > > > > > -----Original Message----- > > > > From: 
matt [mailto:sendtomatt@gmail.com] > > > > Sent: Thursday, April 18, 2013 8:05 PM > > > > To: Desai, Kashyap > > > > Cc: sbruno@freebsd.org; freebsd-scsi@freebsd.org; Kenneth D. > > > > Merry; McConnell, Stephen; jhb@freebsd.org > > > > Subject: Re: New Driver for MegaRaid 6Gb/s and 12Gb/s Card > > > > > > > > > > Would those only be supported by mfi, or by mrsas and mfi? > > > > Is there already a mrsasutil being planned to replace mfiutil? > > > > > > > > > New management interface will be expose by mrsas as "/dev/mrsasX". > If mfiutil has flexibility to work on "/dev/mrsasX" instead of > "/dev/mfiX" we can extend the same utility for mrsas. > > > > It would be confusing to have to use "mfi"util to control "mrsas" > devices. > > > > At the least there needs to be a different name for the utility that > > either reflects the /dev/ device name or otherwise applies to some > > kind of name for the utility. A new "mrsasutil" utility would fit the > > bill. A combined mfi/mrsas utility would as well if it had a name > > that is more generic but not too broad. > > > > mrutil? (MegaRaid ...) > > lsiutil? > > > > I'm fresh out of other ideas for names, sorry. > > Make mrsasutil a hard link to mfiutil and then alter the code to DTRT > depending on which link was used to call the program. There is > precedence for this in the tree already (maybe not for this kind of > device management utility, but other bits)

Changing "mfiutil" to work with the "mrsas" driver is possible with a few devclass changes in mfiutil along with other IOCTL code changes, but it needs significant time to analyze the mfiutil code.
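The hard-link idea quoted above (one binary that behaves differently depending on the name it was invoked as) can be illustrated with a short sketch. This is Python for brevity, not mfiutil's actual C code. It assumes the control nodes are named /dev/mfiX and /dev/mrsasX as stated in this thread, and that "mrsasutil" is the proposed link name. The function control_device is hypothetical.

```python
import os

# Program name -> control device prefix. The prefixes come from the thread
# ("/dev/mfiX", "/dev/mrsasX"); "mrsasutil" is the proposed link name and
# does not exist as a shipped utility as far as this thread shows.
DEVICE_PREFIX = {
    "mfiutil": "/dev/mfi",
    "mrsasutil": "/dev/mrsas",
}


def control_device(argv0, unit=0):
    """Pick the control device from the name the program was invoked as."""
    name = os.path.basename(argv0)
    if name not in DEVICE_PREFIX:
        raise ValueError("unknown program name: " + name)
    return "%s%d" % (DEVICE_PREFIX[name], unit)
```

In a real program the first argument would be sys.argv[0], so a hard link named mrsasutil pointing at the same binary would open the mrsas node without any extra command-line flag. The IOCTL differences Kashyap mentions would still need separate handling.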
> > Gary From owner-freebsd-scsi@FreeBSD.ORG Tue Apr 23 08:09:47 2013 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E65FB6FC; Tue, 23 Apr 2013 08:09:47 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from mail-ea0-x22c.google.com (mail-ea0-x22c.google.com [IPv6:2a00:1450:4013:c01::22c]) by mx1.freebsd.org (Postfix) with ESMTP id 55FD01EC6; Tue, 23 Apr 2013 08:09:47 +0000 (UTC) Received: by mail-ea0-f172.google.com with SMTP id g14so118857eak.3 for ; Tue, 23 Apr 2013 01:09:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:sender:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to:content-type :content-transfer-encoding; bh=v2yxHfIvLDSvtogL8lpTVmmFjb+ZQBJ+ApiDmYGoCjE=; b=D7TtcvfyWFYXW59vVt4Trx/8QAiYKZcRusSozwWfS30igluc2jKZjhQzJ9yQX4+Qfc +GqiSgIjni6oGfRfuzwHu+K0cMA2JRPz5zV0dhnA3Eoc7pyqhZON+f6Sch3+zPyI9cCk Ry2mdGbTumg/4vAqeG4YMqBTsO8wSDLaMDYJkKTDPu00L/1H1AQSwGZ8hwPfOInZTKq6 /byjx+p5kQAOiS7jHEnhTm7j+l2L94IykQ34CHkIPpnsnNoo5bvultvVkbkdXiIVrGe+ vpbIkn7+F7seEbvZgveYXlIWPwXyV/jw9/XXQQlN26iEv4t4Gsg3CYRZqvRhN2rUKVMB lQHA== X-Received: by 10.14.214.65 with SMTP id b41mr6618922eep.37.1366704586366; Tue, 23 Apr 2013 01:09:46 -0700 (PDT) Received: from mavbook.mavhome.dp.ua (mavhome.mavhome.dp.ua. 
[213.227.240.37]) by mx.google.com with ESMTPS id s47sm45202086eeg.8.2013.04.23.01.09.44 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Tue, 23 Apr 2013 01:09:45 -0700 (PDT) Sender: Alexander Motin Message-ID: <517641C6.7010905@FreeBSD.org> Date: Tue, 23 Apr 2013 11:09:42 +0300 From: Alexander Motin User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130413 Thunderbird/17.0.5 MIME-Version: 1.0 To: John Subject: Re: Repeated msgs & kernel panic w/ r246437 (Revamp the CAM enclosure services driver) References: <20130422030053.GA23186@FreeBSD.org> In-Reply-To: <20130422030053.GA23186@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD SCSI X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Apr 2013 08:09:48 -0000 On 22.04.2013 06:00, John wrote: > Hi Folks, > > After updating one of our servers to the latest stable image, > it appears that commit r246437 appears to be causing it to panic. > > The commit: > > http://svnweb.freebsd.org/base?view=revision&revision=246437 > > What one of our servers looks like: > > http://people.freebsd.org/~jwd/zfsnfsserver.jpg > > The last known working commit: > > http://people.freebsd.org/~jwd/r246437/dmesg.r246431.clean.txt > > With commit r246437: > > http://people.freebsd.org/~jwd/r246437/dmesg.r246437.log.txt > > Note, most of the dmesg output is related to the ses devices. It > repeats itself multiple times before the panic. 
> > ses39: ses0,pass20: Element descriptor: ' ' > ses39: ses0,pass20: SAS Expander: 24 Physses39: phy 0: connector 255 other 255 > ses39: phy 1: connector 255 other 255 > ses39: phy 2: connector 255 other 255 > ses39: phy 3: connector 255 other 255 > ses39: phy 4: connector 255 other 255 > ses39: phy 5: connector 255 other 255 > ses39: phy 6: connector 255 other 255 > > etc, etc... That is not my part of code, but I think it is just too verbose debug messages, that should be hidden. > After just a few minutes, the system panics. A pair of images > of the screen (sorry, no serial console at this time): > > Panic: http://people.freebsd.org/~jwd/r246437/20130419_160143.jpg > > bt: http://people.freebsd.org/~jwd/r246437/20130419_110158.jpg Despite that you are talking about "latest stable image", I believe your kernel is not latest 9-STABLE. Your backtrace reminds me about locking problems that should be already fixed from several sides. For example, on present 9-STABLE ses_path_iter_devid_callback() doesn't call xpt_create_path(), but calls xpt_create_path_unlocked() instead. If you can reproduce the issue with latest 9-STABLE, please provide respective information. > We are currently running a test to see if the fact that all our > shelves are dual-attached, allowing us to use geom multipath is > related. ie: we have disabled the 2nd HBA thus cutting the total > number of da & ses devices in half and thus not executing the > code in the commit that tracks duplicate ses devices. > > Note, if we disable both HBA devices and boot the system up it > does not panic or print out the repeated messages, but of course > we have no disks :-) > > I am unclear on the "connector 255 other 255" messages and have not > taken the time to look into them yet. > > I would appreciate any insights folks can provide. > > Many Thanks, > John > > ps: We've had to seriously increase the console buffer size to > capture the complete dmesg output... 
> > options MSGBUF_SIZE=(32768*32) > > Can we delay starting the kernel daemon until after the system > is up and /var/log/messages is available? Just a thought... The goal of this code was to create persistent location-dependent names for devices. It may be better to have them earlier. -- Alexander Motin From owner-freebsd-scsi@FreeBSD.ORG Tue Apr 23 14:18:48 2013 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 3701B34E; Tue, 23 Apr 2013 14:18:48 +0000 (UTC) (envelope-from ken@kdm.org) Received: from nargothrond.kdm.org (nargothrond.kdm.org [70.56.43.81]) by mx1.freebsd.org (Postfix) with ESMTP id 0902A1636; Tue, 23 Apr 2013 14:18:47 +0000 (UTC) Received: from nargothrond.kdm.org (localhost [127.0.0.1]) by nargothrond.kdm.org (8.14.2/8.14.2) with ESMTP id r3NE2bsh051266; Tue, 23 Apr 2013 08:02:37 -0600 (MDT) (envelope-from ken@nargothrond.kdm.org) Received: (from ken@localhost) by nargothrond.kdm.org (8.14.2/8.14.2/Submit) id r3NE2b7o051265; Tue, 23 Apr 2013 08:02:37 -0600 (MDT) (envelope-from ken) Date: Tue, 23 Apr 2013 08:02:37 -0600 From: "Kenneth D. 
Merry" To: Alexander Motin Subject: Re: Repeated msgs & kernel panic w/ r246437 (Revamp the CAM enclosure services driver) Message-ID: <20130423140237.GA50775@nargothrond.kdm.org> References: <20130422030053.GA23186@FreeBSD.org> <517641C6.7010905@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <517641C6.7010905@FreeBSD.org> User-Agent: Mutt/1.4.2i Cc: John , FreeBSD SCSI X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Apr 2013 14:18:48 -0000 On Tue, Apr 23, 2013 at 11:09:42 +0300, Alexander Motin wrote: > On 22.04.2013 06:00, John wrote: > >Hi Folks, > > > > After updating one of our servers to the latest stable image, > >it appears that commit r246437 appears to be causing it to panic. > > > >The commit: > > > >http://svnweb.freebsd.org/base?view=revision&revision=246437 > > > >What one of our servers looks like: > > > >http://people.freebsd.org/~jwd/zfsnfsserver.jpg > > > >The last known working commit: > > > >http://people.freebsd.org/~jwd/r246437/dmesg.r246431.clean.txt > > > >With commit r246437: > > > >http://people.freebsd.org/~jwd/r246437/dmesg.r246437.log.txt > > > >Note, most of the dmesg output is related to the ses devices. It > >repeats itself multiple times before the panic. > > > >ses39: ses0,pass20: Element descriptor: ' ' > >ses39: ses0,pass20: SAS Expander: 24 Physses39: phy 0: connector 255 > >other 255 > >ses39: phy 1: connector 255 other 255 > >ses39: phy 2: connector 255 other 255 > >ses39: phy 3: connector 255 other 255 > >ses39: phy 4: connector 255 other 255 > >ses39: phy 5: connector 255 other 255 > >ses39: phy 6: connector 255 other 255 > > > >etc, etc... > > That is not my part of code, but I think it is just too verbose debug > messages, that should be hidden. 
Yes, it is probably too verbose, especially on such a large system. > >After just a few minutes, the system panics. A pair of images > >of the screen (sorry, no serial console at this time): > > > >Panic: http://people.freebsd.org/~jwd/r246437/20130419_160143.jpg > > > >bt: http://people.freebsd.org/~jwd/r246437/20130419_110158.jpg > > Despite that you are talking about "latest stable image", I believe your > kernel is not latest 9-STABLE. Your backtrace reminds me about locking > problems that should be already fixed from several sides. For example, > on present 9-STABLE ses_path_iter_devid_callback() doesn't call > xpt_create_path(), but calls xpt_create_path_unlocked() instead. If you > can reproduce the issue with latest 9-STABLE, please provide respective > information. I agree. I added the xpt_create_path_unlocked() call to fix a panic with a stack trace just like the one above. It looks like a problem due to running r246437 exactly. > >We are currently running a test to see if the fact that all our > >shelves are dual-attached, allowing us to use geom multipath is > >related. ie: we have disabled the 2nd HBA thus cutting the total > >number of da & ses devices in half and thus not executing the > >code in the commit that tracks duplicate ses devices. > > > >Note, if we disable both HBA devices and boot the system up it > >does not panic or print out the repeated messages, but of course > >we have no disks :-) > > > >I am unclear on the "connector 255 other 255" messages and have not > >taken the time to look into them yet. > > > >I would appreciate any insights folks can provide. > > > >Many Thanks, > >John > > > >ps: We've had to seriously increase the console buffer size to > >capture the complete dmesg output... > > > >options MSGBUF_SIZE=(32768*32) > > > >Can we delay starting the kernel daemon until after the system > >is up and /var/log/messages is available? Just a thought... 
> > The goal of this code was to create persistent location-dependent names > for devices. It may be better to have them earlier. Yes, I agree. Ken -- Kenneth Merry ken@FreeBSD.ORG