From: John Baldwin
To: Matthew Dillon
Cc: freebsd-current@FreeBSD.org
Date: Mon, 11 Apr 2005 20:17:17 -0400
Subject: Re: Potential source of interrupt aliasing
Message-Id: <200504112017.18815.jhb@FreeBSD.org>
In-Reply-To: <200504110231.j3B2VOYr047361@apollo.backplane.com>
References: <20050406233405.O47071@carver.gumbysoft.com> <20050410172818.D82708@carver.gumbysoft.com> <200504110231.j3B2VOYr047361@apollo.backplane.com>

On Sunday 10 April 2005 10:31 pm, Matthew Dillon wrote:
>
> :> *BUT* it *IS* possible that the wrong APIC vector is being masked
> :> (and not because of an interrupt alias, but because the actual hard
> :> interrupt is misrouted).
> :
> :I don't think this is the case. Somehow the vector would have to get
> :corrupted during this function call, which is line 609 in
> :src/sys/i386/i386/local_apic.c:
> :
> :isrc = intr_lookup_source(apic_idt_to_irq(frame.if_vec));
>
> The vector is not being corrupted at all. Just put that out of your
> mind... the APIC is working just fine. The problem is most likely
> that the device is asserting the interrupt on the WRONG PIN. Since
> the wrong IRQ is asserted, the wrong APIC vector is dispatched, the
> wrong interrupt handler and ithread is run, and the source from the
> device that actually generated the interrupt is NOT cleared (because
> it isn't the device that the system thinks generated the interrupt).

That's not the case here. This only happens on specific systems and
Linux on the same systems does not see the aliasing presumably because
Linux doesn't use interrupt threads and thus doesn't have to mask
interrupt lines in the APIC.

> You do route interrupts in APIC mode. I wish it were a flat space! It
> isn't.

Err, no, Doug is right. Except for some nForce chipsets, all APIC
interrupts are hardwired. The ACPI _PRT entry has a null source meaning
that the associated index is a global interrupt number where global
interrupts are allocated consecutively across APICs. The MADT table
contains the base interrupt number for pin 0 on each I/O APIC. This
really does work fine for almost all systems out there now. For the
!ACPI case we actually emulate the ACPI model by assigning similar
global interrupt numbers to the APIC pins for the APICs listed in the
MP Table.

> I think you are forgetting a couple of things here:
>
> * PCI busses only have 4 interrupt lines (A, B, C, and D).
>
> * Motherboards often have anywhere from 3 to 6 PCI or PCI-like busses,
>   connected to the APICs via bridge chips.
>
> * The bridge chips have a limited number of IRQ pins.
>
> * Sometimes you have several bridges connected to another bridge
>   before it gets to the APIC.
>
> So the answer is... regardless of the capabilities of the APIC(s)
> devices still often have limited choices that require IRQ sharing
> simply due to the PCI BUS and BRIDGE configuration of the motherboard.

Not always. For example, Intel's host to PCI-X bridges include their
own I/OxAPIC in the bridge itself, so each slot gets its own pins on
that APIC. The I/OxAPIC then sends interrupt requests as messages to
the CPUs (sort of like MSI).

> But even more to the point, BIOSes (ACPI, etc.) often get really
> confused about routing IRQs through bridges. They will for example
> believe that two devices that share a *PHYSICAL* IRQ line through a
> bridge are capable of being assigned different IRQs when, in fact,
> they aren't. They will get confused about how some of the PCI IRQ
> lines are routed to the bridges (so line 'B' on PCI bus #1 might be
> misconfigured, for example). All sorts of bad things can happen.

I haven't seen this. Note that you have to handle bridges carefully.
For example, if the bridge's bus is included in one of the tables
($PIR, MP Table, or ACPI _PRT) you just use the associated info
directly for the device on that bus. However, if the bus isn't listed
(such as in add-on cards, and also some other bridges in various
chipsets), then instead you have to do the defined "barber-pole"
swizzle and pass the request up to your parent. The generic PCI-PCI
bridge's route_interrupt method does this.

> The only way for an operating system to figure this stuff out on its
> own is to understand the umpteen different bridge chips out there,
> test physical interrupt sources (which is not always possible) to see
> how they are actually routed, and ignore the BIOS completely.

Nope.
ACPI provides an abstraction for the link devices used for !APIC mode
(and rarely for APIC mode) that works fine. Conceptually, the code
needs to route link devices, though, and when you want to get the IRQ
for a PCI device that's hooked up to a link, you ask the link for its
IRQ. For the !ACPI case you can use the PCI BIOS to do this, which is
what the $PIR-driven driver does.

> Wasn't it something like NetBSD or OpenBSD that was thinking about
> doing that? Not trying to figure out the routing but instead just
> figure out which vector was being asserted for a device? I'm beginning
> to think that that may be the ONLY solution.

NetBSD does have drivers for some of the chipsets that basically just
implement what the PCI BIOS call does.

> Intel really screwed up big time. Motorola had a much, much, MUCH
> better mechanism where the actual devices generated the actual vector
> number on the interrupt bus and the only thing you might have hardwired
> would have been the IPL. But Intel doesn't work that way. Their stuff
> is just totally screwed when it comes to handling interrupts. It's
> completely 100% guaranteed pungent crapola to anyone who has ever
> built hardware with a *REAL* interrupt subsystem.

Hence MSI, which PCI-E will use and PCI already has support for (and
FreeBSD will grow support for in the not too distant future).

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
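
For reference, the two mechanical rules mentioned in this message (global
interrupt numbers formed from a per-APIC base plus the pin, and the
"barber-pole" swizzle applied when a bridge's bus is not listed in any
routing table) can be sketched roughly as below. This is only an
illustration under stated assumptions: the struct and function names are
hypothetical and are not FreeBSD's, the interrupt pin is taken to be
1-based (1 = INTA ... 4 = INTD), and the swizzle formula is the usual
PCI-PCI bridge convention rather than code quoted from the tree.

```c
#include <stddef.h>

/*
 * Hypothetical record for one I/O APIC, as an OS might keep it after
 * reading the MADT (or after emulating the same layout from the MP
 * Table). Field names are illustrative only.
 */
struct ioapic_desc {
	unsigned int gsi_base;	/* global interrupt number of pin 0 */
	unsigned int npins;	/* number of input pins on this APIC */
};

/*
 * Map a pin on a given I/O APIC to its global interrupt number.
 * Global interrupts are allocated consecutively across APICs, so the
 * result is simply base + pin. Returns -1 for an invalid pin.
 */
static int
ioapic_pin_to_gsi(const struct ioapic_desc *io, unsigned int pin)
{
	if (io == NULL || pin >= io->npins)
		return (-1);
	return ((int)(io->gsi_base + pin));
}

/*
 * "Barber-pole" swizzle: given a device's slot number behind a bridge
 * and its interrupt pin (1 = INTA ... 4 = INTD), compute the pin to
 * request from the parent bus. Assumes the conventional rotation by
 * slot number; returns -1 for an out-of-range argument.
 */
static int
swizzle_intpin(int slot, int intpin)
{
	if (slot < 0 || intpin < 1 || intpin > 4)
		return (-1);
	return (((slot + (intpin - 1)) % 4) + 1);
}
```

Under these assumptions, a device in slot 1 using INTA behind an unlisted
bridge would ask the parent to route INTB, and pin 3 on a second I/O APIC
whose base is 24 would be global interrupt 27.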