From: John Baldwin <jhb@FreeBSD.org>
To: Matthew Dillon <dillon@apollo.backplane.com>
Date: Mon, 11 Apr 2005 20:17:17 -0400
cc: freebsd-current@FreeBSD.org
Subject: Re: Potential source of interrupt aliasing

On Sunday 10 April 2005 10:31 pm, Matthew Dillon wrote:
> :>     *BUT* it *IS* possible that the wrong APIC vector is being masked
> :> (and not because of an interrupt alias, but because the actual hard
> :> interrupt is misrouted).
> :
> :I don't think this is the case. Somehow the vector would have to get
> :corrupted during this function call, which is line 609 in
> :src/sys/i386/i386/local_apic.c:
> :
> :isrc = intr_lookup_source(apic_idt_to_irq(frame.if_vec));
>
>     The vector is not being corrupted at all.  Just put that out of your
>     mind... the APIC is working just fine.  The problem is most likely
>     that the device is asserting the interrupt on the WRONG PIN.  Since
>     the wrong IRQ is asserted, the wrong APIC vector is dispatched, the
>     wrong interrupt handler and ithread is run, and the source from the
>     device that actually generated the interrupt is NOT cleared (because
>     it isn't the device that the system thinks generated the interrupt).

That's not the case here.  This only happens on specific systems, and Linux on
the same systems does not see the aliasing, presumably because Linux doesn't
use interrupt threads and thus doesn't have to mask interrupt lines in the
APIC.

>     You do route interrupts in APIC mode.  I wish it were a flat space!  It
>     isn't.

Err, no, Doug is right.  Except for some nForce chipsets, all APIC interrupts
are hardwired.  The ACPI _PRT entry has a null source, meaning that the
associated index is a global interrupt number; global interrupts are
allocated consecutively across APICs.  The MADT contains the base
interrupt number for pin 0 on each I/O APIC.  This really does work fine for
almost all systems out there now.  For the !ACPI case we actually emulate the
ACPI model by assigning similar global interrupt numbers to the APIC pins for
the APICs listed in the MP Table.
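
To make the numbering concrete, here is a minimal sketch of turning a
global interrupt number back into an (APIC, pin) pair, given the pin 0
base for each I/O APIC.  The struct and function names are made up for
illustration; this is not the actual FreeBSD code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: one entry per I/O APIC, as a MADT might list it. */
struct ioapic_desc {
    int id;        /* APIC ID */
    int gsi_base;  /* global interrupt number assigned to pin 0 */
    int npins;     /* number of input pins on this APIC */
};

/*
 * Find the I/O APIC and pin that a global interrupt number refers to.
 * Returns 0 and fills *apic_id and *pin on success, or -1 if no listed
 * APIC covers that number.
 */
static int
gsi_to_pin(const struct ioapic_desc *apics, size_t napics, int gsi,
    int *apic_id, int *pin)
{
    assert(apics != NULL && apic_id != NULL && pin != NULL);
    for (size_t i = 0; i < napics; i++) {
        if (gsi >= apics[i].gsi_base &&
            gsi < apics[i].gsi_base + apics[i].npins) {
            *apic_id = apics[i].id;
            *pin = gsi - apics[i].gsi_base;
            return (0);
        }
    }
    return (-1);
}
```

With two 24-pin APICs based at 0 and 24, global interrupt 28 resolves to
pin 4 on the second APIC.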

>     I think you are forgetting a couple of things here:
>
>     * PCI busses only have 4 interrupt lines (A, B, C, and D).
>
>     * Motherboards often have anywhere from 3 to 6 PCI or PCI-like busses,
>       connected to the APICs via bridge chips.
>
>     * The bridge chips have a limited number of IRQ pins.
>
>     * Sometimes you have several bridges connected to another bridge
>       before it gets to the APIC.
>
>     So the answer is... regardless of the capabilities of the APIC(s)
>     devices still often have limited choices that require IRQ sharing
>     simply due to the PCI BUS and BRIDGE configuration of the motherboard.

Not always.  For example, Intel's host-to-PCI-X bridges include their own
I/OxAPIC in the bridge itself, so each slot gets its own pins on that APIC.
The I/OxAPIC then sends interrupt requests as messages to the CPUs (sort of
like MSI).

>     But even more to the point, BIOSes (ACPI, etc.) often get really
>     confused about routing IRQs through bridges.  They will for example
>     believe that two devices that share a *PHYSICAL* IRQ line through a
>     bridge are capable of being assigned different IRQs when, in fact,
>     they aren't.  They will get confused about how some of the PCI IRQ
>     lines are routed to the bridges (so line 'B' on PCI bus #1 might be
>     misconfigured, for example).  All sorts of bad things can happen.

I haven't seen this.  Note that you have to handle bridges carefully.  For
example, if the bridge's bus is included in one of the tables ($PIR, MP
Table, or ACPI _PRT), you just use the associated info directly for the device
on that bus.  However, if the bus isn't listed (as with add-on cards, and
also some other bridges in various chipsets), then you instead have to do the
defined "barber-pole" swizzle and pass the request up to your parent.  The
generic PCI-PCI bridge's route_interrupt method does this.
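
For reference, the swizzle itself is tiny.  A minimal sketch follows,
assuming pins are numbered 1 (INTA#) through 4 (INTD#) and the slot is the
device number on the bridge's secondary bus.  The function name is
hypothetical, and the real method also has to hand the result to the
parent bus for routing.

```c
#include <assert.h>

/*
 * Standard PCI-PCI bridge interrupt swizzle.
 * slot: device number of the child on the bridge's secondary bus (0..31).
 * pin:  the child's interrupt pin, 1 = INTA# ... 4 = INTD#.
 * Returns the pin (1..4) on which the request appears at the bridge
 * itself, which is what gets passed up to the parent for routing.
 */
static int
pcib_swizzle(int slot, int pin)
{
    assert(pin >= 1 && pin <= 4);
    assert(slot >= 0 && slot < 32);
    return (((slot + (pin - 1)) % 4) + 1);
}
```

So INTA# from the device in slot 1 behind the bridge shows up as INTB# on
the bridge, slot 2 as INTC#, and so on around the pole.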

>     The only way for an operating system to figure this stuff out on its
>     own is to understand the umpteen different bridge chips out there,
>     test physical interrupt sources (which is not always possible) to see
>     how they are actually routed, and ignore the BIOS completely.

Nope.  ACPI provides an abstraction for the link devices used for !APIC mode
(and rarely for APIC mode) that works fine.  Conceptually, though, the code
needs to route the link devices, and when you want to get the IRQ for a PCI
device that's hooked up to a link, you ask the link for its IRQ.  For the
!ACPI case you can use the PCI BIOS to do this, which is what the $PIR-driven
driver does.
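
A rough sketch of the "ask the link" idea, with every name here invented
for illustration.  Real link devices pick from a set of allowed IRQs
rather than taking whatever the caller suggests; that is glossed over
here.

```c
#include <assert.h>
#include <stddef.h>

#define IRQ_UNROUTED (-1)

/* Hypothetical link device: a shared routing point with one IRQ. */
struct pci_link {
    const char *name;
    int irq;              /* IRQ_UNROUTED until the link is routed */
};

/* Hypothetical routing table entry: (slot, pin) is wired to a link. */
struct prt_entry {
    int slot;
    int pin;
    struct pci_link *link;
};

/* Route the link on first use; afterwards just report its IRQ. */
static int
link_route(struct pci_link *link, int preferred_irq)
{
    if (link->irq == IRQ_UNROUTED)
        link->irq = preferred_irq;
    return (link->irq);
}

/* Look up the device's link and ask the link for its IRQ. */
static int
device_irq(const struct prt_entry *prt, size_t n, int slot, int pin,
    int preferred_irq)
{
    assert(prt != NULL);
    for (size_t i = 0; i < n; i++) {
        if (prt[i].slot == slot && prt[i].pin == pin)
            return (link_route(prt[i].link, preferred_irq));
    }
    return (IRQ_UNROUTED);
}
```

The point is that the IRQ belongs to the link, not to the device: two
devices wired to the same link always get the same answer, whichever one
asks first.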

>     Wasn't it something like NetBSD or OpenBSD that was thinking about
>     doing that?  Not trying to figure out the routing but instead just
>     figure out which vector was being asserted for a device?  I'm beginning
>     to think that that may be the ONLY solution.

NetBSD does have drivers for some of the chipsets that basically just 
implement what the PCI BIOS call does.

>     Intel really screwed up big time.  Motorola had a much, much, MUCH
>     better mechanism where the actual devices generated the actual vector
>     number on the interrupt bus and the only thing you might have hardwired
>     would have been the IPL.  But Intel doesn't work that way.  Their stuff
>     is just totally screwed when it comes to handling interrupts.  It's
>     completely 100% guaranteed pungent crapola to anyone who has ever
>     built hardware with a *REAL* interrupt subsystem.

Hence MSI, which PCI-E will use and PCI already has support for (and which
FreeBSD will grow support for in the not too distant future).

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org