From owner-freebsd-current@FreeBSD.ORG Tue Nov 29 21:35:13 2005
From: Julian Elischer
To: FreeBSD current mailing list
Date: Tue, 29 Nov 2005 13:34:32 -0800
Subject: Re: em interrupt storm
Message-ID: <438CC968.6090203@elischer.org>
List-Id: Discussions about the use of FreeBSD-current

This is not really a -current question as I'm seeing it on 4.x on a Dell 2850 with a PCI-express card, but the previous discussion was here so I thought I'd put it here to continue the thread.

The system locks up when the em driver's em_intr() is called from irq 2 (em3) but the interrupt was actually generated by em4, as you can see from the vmstat output below:

:root 27] vmstat -i
interrupt        total   rate
amr0 irq10       46825     52
em0 irq11        68724     76
em3 irq2        186292    207
em4 irq14       186509    207
atkbd0 irq1          1      0
sio0 irq4         2205      2
clk irq0         89622     99
rtc irq8        114722    127
Total           694900    774
:root 28]

em3 and em4 have basically the same interrupt count and rate; however, em3 is not active and is not up. The interrupts are coming from em4, which is a standard em type chip on the Dell 2850 motherboard. Because em3 didn't raise the interrupt, calling its interrupt routine doesn't clear the interrupt, and so it hits again as soon as the interrupt routine returns. Thus the system locks up, spinning in and out of the interrupt handler for em3 on irq2.

However, there is something a bit strange about it. If it were as simple as this, and irq2 always copied irq14, then one would expect the system to freeze up immediately upon activating em4, but that is not the case. It only seems to freeze up if the system is already in a disk driver (bio mask) when the interrupt happens. (?)

em3 and em4 are not connected in any way I know of. em4 is on the motherboard; em3 is on an Intel 4-port PCI-express card that is not being used.

I can make the system work reliably by adding code to the em driver so that when any of the em interrupts happens it checks ALL the em interfaces. But this is not the answer, and if there were some OTHER driver on irq2 I'd still be just as hosed.

I include the dmesg. Just for fun I might see what happens with DragonFly, though as the machine has no removable media I need to do that over the net, so it may take some setting up.

Anyone who has any ideas as to why irq14 is being delivered on irq2, let me know! And let me know why the presence of disk I/O makes a difference.
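
To make the failure mode and the workaround described above concrete, here is a minimal user-space sketch in C. It is not the em driver: struct fake_em, fake_em_intr() and the two dispatch functions are made-up names for illustration, and the sketch assumes a level-triggered line that stays asserted until the device that raised it is serviced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one em adapter; NOT the real driver softc. */
struct fake_em {
    bool pending;   /* device is asserting its (level-triggered) interrupt */
    int  serviced;  /* number of times the handler found real work */
};

/* Like a real handler, it can only acknowledge an interrupt that
 * this particular device raised. */
static bool fake_em_intr(struct fake_em *sc)
{
    if (!sc->pending)
        return false;       /* not ours: nothing gets cleared */
    sc->pending = false;    /* acknowledge, line deasserts */
    sc->serviced++;
    return true;
}

/* Failure mode: only the adapter registered on the vector is called.
 * Returns how many times the handler ran before the line deasserted,
 * capped at max_spins so the simulated storm terminates. */
int dispatch_registered_only(struct fake_em *registered,
                             const struct fake_em *source, int max_spins)
{
    int spins = 0;
    while (source->pending && spins < max_spins) {
        (void)fake_em_intr(registered);
        spins++;
    }
    return spins;
}

/* Workaround: on any em interrupt, check every em adapter. */
int dispatch_poll_all(struct fake_em *all, size_t n,
                      const struct fake_em *source, int max_spins)
{
    int spins = 0;
    while (source->pending && spins < max_spins) {
        for (size_t i = 0; i < n; i++)
            (void)fake_em_intr(&all[i]);
        spins++;
    }
    return spins;
}

/* Self-check: ems[0] plays the idle "em3", ems[1] the active "em4". */
void self_check(void)
{
    struct fake_em ems[2] = { { false, 0 }, { true, 0 } };
    assert(dispatch_registered_only(&ems[0], &ems[1], 1000) == 1000);
    assert(dispatch_poll_all(ems, 2, &ems[1], 1000) == 1);
    assert(ems[1].serviced == 1 && ems[0].serviced == 0);
}
```

Calling self_check() exercises both paths: the registered-only dispatcher hits the spin cap because the idle adapter never clears the line, while polling every adapter services the real source on the first pass. As noted above, this only hides the misrouting; a different driver sharing irq2 would not be covered.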

julian

Attachment: dmesg

Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.10-RELEASE #5: Tue Nov 29 12:41:48 GMT 2005
    root@trafmon1.wga:/usr/build/godspeed/freebsd/mods/src/sys/compile/MESSAGING_GATEWAY
Timecounter "i8254" frequency 1193182 Hz
CPU: Intel(R) Xeon(TM) CPU 3.60GHz (3591.25-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf43  Stepping = 3
  Features=0xbfebfbff
  Hyperthreading: 2 logical CPUs
real memory  = 3489398784 (3407616K bytes)
avail memory = 3400192000 (3320500K bytes)
Changing APIC ID for IO APIC #0 from 0 to 8 on chip
Changing APIC ID for IO APIC #1 from 0 to 9 on chip
Changing APIC ID for IO APIC #2 from 0 to 10 on chip
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
Programming 24 pins in IOAPIC #1
Programming 24 pins in IOAPIC #2
SCI INT 9
Set apic 0 pin 9, level, active low
Set apic 0 pin 9, level, active high
FreeBSD/SMP: Multiprocessor motherboard: 4 CPUs
 cpu0 (BSP): apic id: 0, version: 0x00050014, at 0xfee00000
 cpu1 (AP): apic id: 1, version: 0x00050014, at 0xfee00000
 cpu2 (AP): apic id: 6, version: 0x00050014, at 0xfee00000
 cpu3 (AP): apic id: 7, version: 0x00050014, at 0xfee00000
 io0 (APIC): apic id: 8, version: 0x00178020, at 0xfec00000
 io1 (APIC): apic id: 9, version: 0x00178020, at 0xfec80000
 io2 (APIC): apic id: 10, version: 0x00178020, at 0xfec83000
Preloaded elf kernel "k2" at 0xc0397000.
Warning: Pentium 4 CPU: PSE disabled
Pentium Pro MTRR support enabled
md0: Malloc disk
Using $PIR table, 18 entries at 0xc00fb6c0
acpi0: on motherboard
acpi0: power button is handled as a fixed feature programming model.
Timecounter "ACPI-fast" frequency 3579545 Hz
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
acpi_cpu0: on acpi0
acpi_cpu1: on acpi0
acpi_cpu2: on acpi0
acpi_cpu3: on acpi0
npx0: on motherboard
npx0: INT 16 interface
dell_bios0: Found Dell signature
dell_bios0: at iomem 0xf99f0-0xf9a0e on motherboard
dell_bios0: Version: 2.03, Revision: 2.03
dell_bios0: Enable 0
dell_bios0: Disable 1
dell_bios0: Size 1025K
dell_bios0: Completion Code 0x0000 Success
dell_bios0: Updated on: 8/15/05 at 23:44
dell_bios0: Version A03
dell_bios0: Min. Version A03
dell_bios0: Manufacturer IronPort
dell_bios0: System ID 32824 [8038]
ipmi0: Found Dell signature
ipmi0: at iomem 0xf99f0-0xf9a0e on motherboard
ipmi0: Version: 2.03, Revision: 2.03
ipmi0: KCS mode found
ipmi0: Address 0xca8
ipmi0: Allignment 0x4
ipmi0: I/O mode
ipmi0: Device Rev. 0
ipmi0: Firmware Rev. 1.81
ipmi0: Version 1.5
ipmi0: Number of channels 4
pcib0: on motherboard
IOAPIC #0 intpin 16 -> irq 2
Hello 8f6f
Hello 8f6f
pci0: on pcib0
pcib1: irq 2 at device 2.0 on pci0
pci1: on pcib1
pcib2: at device 0.0 on pci1
IOAPIC #1 intpin 14 -> irq 10
pci2: on pcib2
amr0: mem 0xdfec0000-0xdfefffff,0xd80f0000-0xd80fffff irq 10 at device 14.0 on pci2
amr0: delete logical drives supported by controller
created DEVICE*****************
amr0: Firmware 516A, BIOS H418, 256MB RAM
pcib3: at device 0.2 on pci1
pci3: on pcib3
pcib4: irq 2 at device 4.0 on pci0
pci4: on pcib4
pcib5: mem 0xdf6e0000-0xdf6fffff irq 2 at device 0.0 on pci4
pci5: on pcib5
pcib6: irq 0 at device 1.0 on pci5
pci6: on pcib6
pcib7: irq 0 at device 2.0 on pci5
IOAPIC #0 intpin 18 -> irq 11
IOAPIC #0 intpin 19 -> irq 13
pci7: on pcib7
em0: port 0xece0-0xecff mem 0xdfbc0000-0xdfbdffff,0xdfbe0000-0xdfbfffff irq 11 at device 0.0 on pci7
em0 00:0e:0c:a1:6a:28,
em0: Speed:N/A Duplex:N/A
em1: port 0xecc0-0xecdf mem 0xdfb80000-0xdfb9ffff,0xdfba0000-0xdfbbffff irq 13 at device 0.1 on pci7
em0 00:0e:0c:a1:6a:28,em1 00:0e:0c:a1:6a:29,
em1: Speed:N/A Duplex:N/A
pcib8: irq 0 at device 3.0 on pci5
pci8: on pcib8
em2: port 0xdce0-0xdcff mem 0xdf9c0000-0xdf9dffff,0xdf9e0000-0xdf9fffff irq 13 at device 0.0 on pci8
em0 00:0e:0c:a1:6a:28,em1 00:0e:0c:a1:6a:29,em2 00:0e:0c:a1:6a:2a,
em2: Speed:N/A Duplex:N/A
em3: port 0xdcc0-0xdcdf mem 0xdf980000-0xdf99ffff,0xdf9a0000-0xdf9bffff irq 2 at device 0.1 on pci8
em0 00:0e:0c:a1:6a:28,em1 00:0e:0c:a1:6a:29,em2 00:0e:0c:a1:6a:2a,em3 00:0e:0c:a1:6a:2b,
em3: Speed:N/A Duplex:N/A
pcib9: irq 2 at device 5.0 on pci0
pci9: on pcib9
pcib10: at device 0.0 on pci9
IOAPIC #2 intpin 0 -> irq 14
pci10: on pcib10
em4: port 0xccc0-0xccff mem 0xdf4e0000-0xdf4fffff irq 14 at device 7.0 on pci10
em0 00:0e:0c:a1:6a:28,em1 00:0e:0c:a1:6a:29,em2 00:0e:0c:a1:6a:2a,em3 00:0e:0c:a1:6a:2b,em4 00:14:22:0f:45:2f,
em4: Speed:N/A Duplex:N/A
pcib11: at device 0.2 on pci9
IOAPIC #2 intpin 1 -> irq 15
pci11: on pcib11
em5: port 0xbcc0-0xbcff mem 0xdf2e0000-0xdf2fffff irq 15 at device 8.0 on pci11
em0 00:0e:0c:a1:6a:28,em1 00:0e:0c:a1:6a:29,em2 00:0e:0c:a1:6a:2a,em3 00:0e:0c:a1:6a:2b,em4 00:14:22:0f:45:2f,em5 00:14:22:0f:45:30,
em5: Speed:N/A Duplex:N/A
pcib12: irq 2 at device 6.0 on pci0
pci12: on pcib12
pcib13: at device 30.0 on pci0
pci13: on pcib13
pci13: at 13.0 irq 11
isab0: at device 31.0 on pci0
isa0: on isab0
orm0: