From: Vincent Hoffman
Date: Wed, 09 Nov 2011 15:35:35 +0000
To: John Baldwin
Cc: Jan Mikkelsen, freebsd-stable@freebsd.org, Jeremy Chadwick
Subject: Re: mfi timeouts
Message-ID: <4EBA9DC7.8090708@unsane.co.uk>
In-Reply-To: <201111090939.14177.jhb@freebsd.org>
References: <4EA9E0C3.5080306@unsane.co.uk> <4EB9AC0F.2040209@unsane.co.uk>
 <4EB9BD9B.8080604@unsane.co.uk> <201111090939.14177.jhb@freebsd.org>

On 09/11/2011 14:39, John Baldwin wrote:
> On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
>> On 08/11/2011 22:24, Vincent Hoffman wrote:
>>> On 08/11/2011 19:50, John Baldwin wrote:
>>>> On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
>>>>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>>>>> Hi,
>>>>>>
>>>>>> There is a patch linked to from this PR, which seems very similar:
>>>>>>
>>>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>>>>
>>>>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>>>>
>>>>>> The problem is also consistent with running mfiutil clearing the problem.
>>>>>>
>>>>>> I'm about to deploy mfi controllers in a similar configuration, so I'd be
>>>>>> very curious about whether the patch fixes the problem for you.
>>>>> The patch you linked to seems to have removed the stalls, although I
>>>>> have only had it running for a day. I'll post if it stalls again though.
>>>>>
>>>>> I did manage to scrounge the use of a Dell r410 with a
>>>>> LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
>>>>> Badged as Dell PERC H700 Adapter
>>>>>
>>>>> to test out the patch I originally found but had the same issue as this post
>>>>>
>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html
>>>>>
>>>>>
>>>>> I couldn't get the dell to stall in the first place either though so it
>>>>> could be a specific firmware version that has the issue.
>>>>>
>>>>> Anyway thanks for the pointers.
>>>> Hmm, did you try the patch I had posted from that earlier thread? It had
>>>> two changes in it, one was similar to the patch in the PR, the second added
>>>> MSI-X support. I've since tweaked it to make the MSI-X support off by
>>>> default but possible to enable via loader.conf. Would you be willing to
>>>> try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
>>> Hi,
>>> yes I tried the patch you posted originally, unfortunately the dell
>>> never finished booting either. The Supermicro is now in production but
>>> I'll take the dell up to 9-STABLE and try your updated patch.
>>>
>> The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had
>> already been applied.
> Odd, it's against stock head, so I don't know why it would have failed to
> apply.
>
>> I have rebooted the dell and it seems happy with the new patch (msi
>> disabled.)
> Okay, good. I'll commit the non-MSI bits at least and get them merged into
> 9.0 if possible.
>
>> Booting with
>> hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops
>> the boot from completing.
> Ok. Can you try changing it to use MSI instead of MSI-X? Just edit the
> mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.
>
Much better. It boots and says:

Nov 9 15:25:45 zfstest kernel: mfi0: port 0xfc00-0xfcff mem 0xdf1bc000-0xdf1bffff,0xdf1c0000-0xdf1fffff irq 38 at device 0.0 on pci3
Nov 9 15:25:45 zfstest kernel: mfi0: Using MSI-X
Nov 9 15:25:45 zfstest kernel: mfi0: Megaraid SAS driver Ver 3.00
Nov 9 15:25:45 zfstest kernel: mfi0: 2004 (374167405s/0x0020/info) - Shutdown command received from host
Nov 9 15:25:45 zfstest kernel: mfi0: 2005 (boot + 34s/0x0020/info) - Firmware initialization started (PCI ID 0079/1000/1f16/1028)
Nov 9 15:25:45 zfstest kernel: mfi0: 2006 (boot + 34s/0x0020/info) - Firmware version 2.100.03-1046
Nov 9 15:25:45 zfstest kernel: mfi0: 2007 (boot + 36s/0x0008/info) - Battery Present
Nov 9 15:25:45 zfstest kernel: mfi0: 2008 (boot + 36s/0x0020/info) - Package version 12.10.0-0025
Nov 9 15:25:45 zfstest kernel: mfi0: 2009 (boot + 36s/0x0020/info) - Board Revision A00
Nov 9 15:25:45 zfstest kernel: mfi0: 2010 (boot + 61s/0x0002/info) - Inserted: PD 00(e0xff/s0)
Nov 9 15:25:45 zfstest kernel: mfi0: 2011 (boot + 61s/0x0002/info) - Inserted: PD 00(e0xff/s0) Info: enclPd=ffff, scsiType=0, portMap=01, sasAddr=4433221107000000,0000000000000000
Nov 9 15:25:45 zfstest kernel: mfi0: 2012 (boot + 61s/0x0002/info) - Inserted: PD 01(e0xff/s1)
Nov 9 15:25:45 zfstest kernel: mfi0: 2013 (boot + 61s/0x0002/info) - Inserted: PD 01(e0xff/s1) Info: enclPd=ffff, scsiType=0, portMap=00, sasAddr=4433221106000000,0000000000000000
Nov 9 15:25:45 zfstest kernel: mfi0: 2014 (374167491s/0x0020/info) - Time established as 11/09/11 15:24:51; (63 seconds since power on)
Nov 9 15:25:45 zfstest kernel: mfi0: 2015 (374167529s/0x0008/info) - Battery temperature is normal
Nov 9 15:25:45 zfstest kernel: mfi0: 2016 (374167529s/0x0008/info) - Battery started charging

More info as required.

Vince