Date:      Thu, 13 Nov 2003 11:22:33 -0500 (EST)
From:      John Baldwin <jhb@FreeBSD.org>
To:        Benjamin Lewis <bhlewis@wossname.net>
Cc:        bhlewis@purdue.edu
Subject:   RE: Recent -current hangs on Tyan S2460 before finishing boot
Message-ID:  <XFMail.20031113112233.jhb@FreeBSD.org>
In-Reply-To: <3FB31128.1030709@wossname.net>

On 13-Nov-2003 Benjamin Lewis wrote:
> Hello,
> 
> I'm having trouble getting recent (post- "device apic", pre- turnstile) 
> kernels to boot on my Tyan S2460 (Tiger MP) system with dual AMD 
> Athlons.  What happens is that the machine seems to get "stuck" soon
> after the "Waiting for SCSI devices to settle" message is printed -- it
> appears to be willing to wait forever rather than the SCSI_DELAY time.
> 
> Disabling ACPI in the BIOS has no apparent effect on the hang.  Using
> SCHED_4BSD or SCHED_ULE likewise makes no difference.  I've been 
> following the current@ list hoping to see someone else report a problem
> similar to mine but haven't seen anything yet.
> 
> I do have a serial console attached to the machine and DDB enabled so
> I'm able to provide some information and get more if needed.  I'm
> including a copy of the boot messages from my last attempt to boot
> "FreeBSD 5.1-CURRENT #2: Tue Nov 11 17:35:40 EST 2003" which was 
> cvsup'ed shortly prior to the build date.  Included in the messages are
> the output of "ps" and "trace" once I broke into ddb.
> 
> I'm also including output from "acpidump -t" and "mptable -verbose" 
> since I've seen that information requested in the past.
> 
> Some details about the system that may be pertinent:
>       1. It has two 1GHz Athlon "Thunderbird" (Not MP) processors.
>          That hasn't been a problem so far.
>       2. The BIOS is version 1.04 (latest is 1.05).  The last time I
>          tried updating to 1.05 (some time ago) I saw lots of error
>          messages complaining about undefined ACPI stuff so I
>          reverted.
>       3. There is a Tekram 390F (I think that's the model -- it uses
>          the sym driver) and an Adaptec 3944 SCSI controller.  A
>          single internal SCSI drive is connected to the Tekram and 10
>          external drives are connected to the two ports on the 3944.
>          The external drives are configured as a Vinum Raid10 array.
>          There's also a single IDE drive connected to one of the
>          built-in IDE controllers.
> 
> Please let me know if there is anything more you want to know.
> 
> Thanks,

Can you do a 'show intrcnt' from the ddb prompt?  It sounds like you
may be getting an interrupt storm due to a mis-routed PCI interrupt.
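
For reference, a storm shows up as one interrupt source whose count climbs
far faster than the rest between two looks at the counters.  A minimal
Python sketch of that comparison, assuming two snapshots of per-source
counts taken a known number of seconds apart.  The source names, the
counts, and the 10000/s threshold are all made up for illustration and
are not taken from the reporter's machine:

```python
def storming_sources(before, after, interval_s, threshold_per_s=10000):
    """Return sources whose interrupt rate between two samples exceeds the threshold.

    before/after map a source name to its cumulative interrupt count.
    The default threshold is an arbitrary assumption, not a kernel constant.
    """
    suspects = {}
    for name, count in after.items():
        # A source missing from the first sample is treated as starting at 0.
        rate = (count - before.get(name, 0)) / interval_s
        if rate > threshold_per_s:
            suspects[name] = rate
    return suspects


# Hypothetical snapshots taken 10 seconds apart.
before = {"irq14: ata0": 1200, "irq16: sym0": 5000, "irq0: clk": 10000}
after = {"irq14: ata0": 1210, "irq16: sym0": 905000, "irq0: clk": 11000}

print(storming_sources(before, after, interval_s=10.0))
```

Anything it flags is only a candidate; a legitimately busy device can
also post a high rate.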

Actually, I think the problem is in the ata driver.  There may also be
bugs in the interrupt code, in that interrupts that don't exist in the
mptable (IRQs 11 and 15) still get created.  But to me, the fact that
the mptable has no IRQ 15 means that there is no IRQ 15, and thus
there should not be an ata1.  Note that in your dmesg, ata1 does say
that it doesn't do DMA because it has been disabled.  Perhaps the ata
driver needs to disable ata1 altogether on that chipset if it sees that
condition.  My guess is that the ata driver is waiting forever for an
interrupt from ata1 that is never going to arrive, hence the hang.
Do you have a boot -v dmesg from a working kernel?
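
The check suggested above (don't bring up a channel whose IRQ the MP
table doesn't list) can be sketched in a few lines of Python.  This is
only an illustration of the idea, not the ata driver's code: the
function name is hypothetical, and the routed set is assumed to be
IRQs 0-15 minus the two (11 and 15) noted as missing.  ata0 on IRQ 14
and ata1 on IRQ 15 are the conventional legacy assignments.

```python
def devices_without_irq(routed_irqs, device_irqs):
    """Return, sorted, the devices whose IRQ is absent from the routed set."""
    return sorted(dev for dev, irq in device_irqs.items()
                  if irq not in routed_irqs)


# Assumed for illustration: only IRQs 11 and 15 are missing from the MP table.
routed = set(range(16)) - {11, 15}

# Legacy IDE channel assignments.
devices = {"ata0": 14, "ata1": 15}

# Channels listed here would wait on an interrupt that can never be delivered.
print(devices_without_irq(routed, devices))
```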

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


