From: John Baldwin
To: Guy Helmer
Cc: freebsd-stable@freebsd.org
Date: Wed, 24 Oct 2007 13:14:43 -0400
Subject: Re: Supermicro X7DBR-8+ hang at boot

On Wednesday 24 October 2007 09:49:06 am Guy Helmer wrote:
> John Baldwin wrote:
> > On Tuesday 23 January 2007 01:17:57 pm Guy Helmer wrote:
> >
> >> Jack Vogel wrote:
> >>
> >>> On 1/23/07, Guy Helmer wrote:
> >>>
> >>>> Using FreeBSD 6.2, I'm having trouble with the Supermicro X7DBR-8+
> >>>> motherboard (dual Xeon 5130 CPUs on the Blackford chipset -
> >>>> http://www.supermicro.com/products/motherboard/Xeon1333/5000P/X7DBR-8+.cfm)
> >>>>
> >>>> hanging after printing the "Waiting 5 seconds for SCSI devices to
> >>>> settle" message. The hang doesn't always happen - sometimes we have to
> >>>> go through several reboot cycles for it to happen - but sometimes it
> >>>> happens with every reboot. For those who would suggest that this
> >>>> happens because I'm using Seagate drives, it happens even if we totally
> >>>> remove the SCSI drive (but leave the aic7902 SCSI interfaces enabled)
> >>>> and boot from a SATA disk. Using FreeBSD 6.1, the Intel gigabit
> >>>> ethernet NICs aren't found but the hang doesn't occur.
> >>>>
> >>> ...
> >>> If that isn't it, I would suggest installing using ACPI disabled or
> >>> SAFE if needed, and then tweak the kernel after.
> >>>
> >> hint.apic.0.disabled=1 helped - it hasn't hung yet in several boot
> >> cycles. New dmesg is attached below in case it helps anyone see a
> >> better fix than disabling the APICs.
> >>
> >
> > So you got an interrupt storm on IRQ 18 when ahd0 tried to probe and ahd0 got
> > interrupt timeouts. This indicates that ahd0 really lives on IRQ 18, not IRQ
> > 30. Your BIOS is likely busted since ACPI hardcodes these sort of IRQs.
> >
> > You can override the BIOS by doing:
> >
> > set hw.pci5.2.INTA.irq=18
> >
> > in the loader (or adding a line to loader.conf) and seeing if that fixes the
> > boot with APIC enabled.
> >
> >
> I'm trying to resolve what looks like a similar problem with an IBM
> Blade Server unit. I'm reviewing my previous emails on this subject
> with the verbose boot messages to try to learn what led you to
> determine the correct interrupt would be 18, but I can't seem to figure
> out what data leads to this conclusion. Any hints?

He got an interrupt storm on IRQ 18 while the ahd0 device on IRQ 30 was timing
out.

-- 
John Baldwin
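
For anyone applying the same reasoning to another machine, here is a rough
Python sketch of the cross-check: find a device that reports timeouts while an
interrupt storm is logged on an IRQ other than the one the device was assigned.
The regular expressions and the sample log lines are assumptions made up for
illustration (they are not taken from the dmesg in this thread), and the exact
kernel message wording may differ between releases, so adjust the patterns to
match your own verbose boot output.

```python
import re

# Assumed message shapes; verify against your own verbose boot log.
STORM_RE = re.compile(r'interrupt storm detected on "irq(\d+):')
ATTACH_RE = re.compile(r'^(\w+): .*\birq (\d+) at device')
TIMEOUT_RE = re.compile(r'^(\w+): .*timed?[ -]?out', re.IGNORECASE)

# Hypothetical excerpt, invented for this example only.
SAMPLE_DMESG = """\
ahd0: <Adaptec AIC7902 Ultra320 SCSI adapter> irq 30 at device 2.0 on pci5
interrupt storm detected on "irq18:"; throttling interrupt source
ahd0: timeout while waiting for interrupt
"""


def find_irq_mismatches(dmesg_text):
    """Return (device, assigned_irq, storming_irq) tuples worth investigating."""
    assigned = {}      # device name -> IRQ the kernel routed it to
    storms = set()     # IRQs on which a storm was reported
    timed_out = set()  # devices that logged a timeout

    for line in dmesg_text.splitlines():
        match = STORM_RE.search(line)
        if match:
            storms.add(int(match.group(1)))
            continue
        match = ATTACH_RE.match(line)
        if match:
            assigned[match.group(1)] = int(match.group(2))
            continue
        match = TIMEOUT_RE.match(line)
        if match:
            timed_out.add(match.group(1))

    suspects = []
    for dev in sorted(timed_out):
        if dev not in assigned:
            continue
        for irq in sorted(storms):
            # A storm on a different IRQ hints that the device really
            # interrupts there and the routing table is wrong.
            if irq != assigned[dev]:
                suspects.append((dev, assigned[dev], irq))
    return suspects


if __name__ == "__main__":
    for dev, have, storm in find_irq_mismatches(SAMPLE_DMESG):
        print(f"{dev}: assigned IRQ {have}, storm seen on IRQ {storm}")
```

A match is only a hint. The storming IRQ still has to be tried as an override
in the loader, as described earlier in the thread, to confirm that it is the
right one.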