From owner-freebsd-current@FreeBSD.ORG Thu Jun 3 19:56:04 2004
Message-Id: <6.0.3.0.0.20040603220621.045655e0@64.7.153.2>
Date: Thu, 03 Jun 2004 22:58:32 -0400
To: freebsd-current@freebsd.org
From: Mike Tancsa
In-Reply-To: <20040518132157.B8772@gamplex.bde.org>
References: <6.0.3.0.0.20040517154946.06d23d60@64.7.153.2> <20040518132157.B8772@gamplex.bde.org>
Subject: Re: sio / puc wedging on both -current and -stable

Just a followup to my previous post.
Still haven't quite figured out exactly what is going on, but perhaps it is
something in the USB code? This is on a number of ICH4 boards I have, across
at least 3 different chipset variants. The MB BIOSes are also all up to date.

Just a quick recap. I can fairly easily trigger an interrupt storm on these
machines with USB enabled in the BIOS. If I disable it, I don't have a
problem and all works well.

However, what I accidentally came across today was that if I load the USB
drivers as a kld, I can *not* wedge the machine. Note the bottom of the
following diff:

diff dmesg.kld dmesg.static
< pci0: at 29.0 irq 10
< pci0: at 29.1 irq 5
< pci0: at 29.2 irq 12
---
> uhci0: port 0xb800-0xb81f irq 10 at device 29.0 on pci0
> usb0: on uhci0
> usb0: USB revision 1.0
> uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub0: 2 ports with 2 removable, self powered
> uhci1: port 0xb000-0xb01f irq 5 at device 29.1 on pci0
> usb1: on uhci1
> usb1: USB revision 1.0
> uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub1: 2 ports with 2 removable, self powered
> uhci2: port 0xb400-0xb41f irq 12 at device 29.2 on pci0
> usb2: on uhci2
> usb2: USB revision 1.0
> uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub2: 2 ports with 2 removable, self powered
67,83d78
< uhci0: port 0xb800-0xb81f irq 10 at device 29.0 on pci0
< usb0: on uhci0
< usb0: USB revision 1.0
< uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
< uhub0: 2 ports with 2 removable, self powered
< uhci1: port 0xb000-0xb01f irq 5 at device 29.1 on pci0
< usb1: on uhci1
< usb1: USB revision 1.0
< uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
< uhub1: 2 ports with 2 removable, self powered
< uhci2: port 0xb400-0xb41f irq 12 at device 29.2 on pci0
< uhci2: Could not allocate irq
< device_probe_and_attach: uhci2 attach returned 6
< uhci2: port 0xb400-0xb41f irq 12 at device 29.2 on pci0
< uhci2: Could not allocate irq
< device_probe_and_attach: uhci2 attach returned 6
< stray irq 7
arnold2%

Why, when I load the USB drivers as a kld, does uhci2 get an error about
allocating an IRQ? When I have the drivers statically compiled, there is no
error. Furthermore, when they are loaded as a kld, I am *not* able to wedge
the box with an interrupt storm.

arnold2% diff pci.kld pci.static
21c21
< none0@pci0:29:2: class=0x0c0300 card=0x0074a0a0 chip=0x24c78086 rev=0x02 hdr=0x00
---
> uhci2@pci0:29:2: class=0x0c0300 card=0x0074a0a0 chip=0x24c78086 rev=0x02 hdr=0x00
41c41
< none1@pci0:31:3: class=0x0c0500 card=0x0074a0a0 chip=0x24c38086 rev=0x02 hdr=0x00
---
> none0@pci0:31:3: class=0x0c0500 card=0x0074a0a0 chip=0x24c38086 rev=0x02 hdr=0x00
arnold2%
arnold2% cat pci.static
chip0@pci0:0:0: class=0x060000 card=0x0074a0a0 chip=0x25608086 rev=0x03 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82845G/GL/GV/GE/PE DRAM Controller / Host-Hub I/F Bridge'
    class    = bridge
    subclass = HOST-PCI
agp0@pci0:2:0: class=0x030000 card=0x3402a0a0 chip=0x25628086 rev=0x03 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82845G/GL/GV/GE/PE Integrated Graphics Device'
    class    = display
    subclass = VGA
uhci0@pci0:29:0: class=0x0c0300 card=0x0074a0a0 chip=0x24c28086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82801DB/DBM (ICH4/M) USB UHCI Controller #1'
    class    = serial bus
    subclass = USB
uhci1@pci0:29:1: class=0x0c0300 card=0x0074a0a0 chip=0x24c48086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82801DB/DBM (ICH4/M) USB UHCI Controller #2'
    class    = serial bus
    subclass = USB
uhci2@pci0:29:2: class=0x0c0300 card=0x0074a0a0 chip=0x24c78086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82801DB/DBM (ICH4/M) USB UHCI Controller #3'
    class    = serial bus
    subclass = USB
pcib1@pci0:30:0: class=0x060400 card=0x00000000 chip=0x244e8086 rev=0x82 hdr=0x01
    vendor   = 'Intel Corporation'
    device   = '82801BA/CA/DB/EB/ER (ICH2/3/4/5/5R) Hub Interface to PCI Bridge'
    class    = bridge
    subclass = PCI-PCI
isab0@pci0:31:0: class=0x060100 card=0x00000000 chip=0x24c08086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82801DB (ICH4) LPC Interface Bridge'
    class    = bridge
    subclass = PCI-ISA
atapci0@pci0:31:1: class=0x01018a card=0x0074a0a0 chip=0x24cb8086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82801DB (ICH4) UltraATA/100 EIDE Controller'
    class    = mass storage
    subclass = ATA
none0@pci0:31:3: class=0x0c0500 card=0x0074a0a0 chip=0x24c38086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82801DB/DBM (ICH4/M) SMBus Controller'
    class    = serial bus
    subclass = SMBus
puc0@pci1:0:0: class=0x070002 card=0x00000000 chip=0x01201407 rev=0x00 hdr=0x00
    vendor   = 'Lava Computer Manufacturing Inc'
    class    = simple comms
    subclass = UART
puc1@pci1:0:1: class=0x070002 card=0x00000000 chip=0x01211407 rev=0x00 hdr=0x00
    vendor   = 'Lava Computer Manufacturing Inc'
    class    = simple comms
    subclass = UART
rl0@pci1:1:0: class=0x020000 card=0x813910ec chip=0x813910ec rev=0x10 hdr=0x00
    vendor   = 'Realtek Semiconductor'
    device   = 'RT8139 (A/B/C/813x/C+) Fast Ethernet Adapter'
    class    = network
    subclass = ethernet
sio8@pci1:5:0: class=0x078000 card=0x0000151f chip=0x0000151f rev=0x00 hdr=0x00
    vendor   = 'Topic Semiconductor Corp'
    class    = simple comms
fxp0@pci1:8:0: class=0x020000 card=0x0317a0a0 chip=0x103a8086 rev=0x82 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82801DB LAN Controller with 82562ET/EZ (CNR) PHY'
    class    = network
    subclass = ethernet

        ---Mike

At 11:50 PM 17/05/2004, Bruce Evans wrote:
>On Mon, 17 May 2004, Mike Tancsa wrote:
>
> > We are building a box that needs many serial ports to talk to some legacy
> > low speed (9600) serial devices. Our application (a small daemon written
> > in c) happily talks to the devices and all works well. However, if one of
> > the external devices die or is unplugged, the FreeBSD box will at seemingly
> > irregular intervals lockup hard.
> > The only way to unlock the machine is to
> > either hit the reset button (the keyboard is locked solid-- not even num
> > lock works) *or* if I jiggle the DB9 connector enough so that enough noise
> > shorts across the serial port *or* plug the serial port into a working
> > device that I imagine sends some data on the serial port. The machine then
> > returns to a normal state and all is well. This does NOT happen with the
> > onboard serial ports. Only with a PUC device (we have tried several and
> > it's the same result)
> >
> > Does this jog anyone's memory as to what the problem might be ?
>
>It's an interrupt storm of some sort. PCI interrupts are more likely to
>cause one than ISA interrupts because they are more likely to be level
>triggered.
>
> > I have a remote debugger setup and I can send a break and drop the unit
> > into debugger, but kernel debugging is a little beyond our skillset.
>
>Does this break into the locked machine? If so...
>
> > db> trace
> > siointr1(c11d0000,d56dacb0,c02b49e6,c11d0000,10) at siointr1+0xc5
> > siointr(c11d0000,10,a005,c,10060) at siointr+0xc
> > Xfastintr4(c11d0c00,d56dacd8,c02a741a,c11d0c00,c0a3f240) at Xfastintr4+0x16
> > siointr(c11d0c00) at siointr+0xc
>
>... Type "s", then hold down the Enter key to repeat the "s" command until
>control returns here, then keep holding down the Enter key until something
>loops (may take many hundreds of commands). Record all the output using
>a serial console (don't type it in) and send it to me.
>
> > puc_intr(c11af000,63103a,c11d0c00,0,d56dad68) at puc_intr+0x4e
>
>If control returns here, then siointr hasn't looped internally; keep
>going.
>
> > intr_mux(c0a3f240,0,630010,c1360010,c0170010) at intr_mux+0x1f
>
>If control returns here, then the loop is external so it is harder to
>debug (but this is the most likely case).
>
>Going through intr_mux() means that the interrupt is not fast
>(options PUC_FASTINTR). Try that.
>
> > Xresume12() at Xresume12+0x2b
>
>Stop if it gets back here.
>
> > --- interrupt, eip = 0xc02b5b2a, esp = 0xd56dad38, ebp = 0xd56dad68 ---
> > vec12(c11ce980,3,2000,cbf03a00,d56634c0) at vec12+0x2
> > cnopen(c11ce980,3,2000,cbf03a00,0) at cnopen+0x6a
>
>It may be significant that the hang seems to occur while opening the console
>device. Do you have a serial console on the puc device? I thought that
>this doesn't work.
>
> > Any pointers on how to track this down ? It happens both in RELENG_4 from
> > May 12th and 5.2-CURRENT FreeBSD 5.2-CURRENT #1: Thu May 13
>
>Did it work before then? The driver hasn't changed since long before then.
>
>Bruce
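
For anyone reading the pciconf listings in this message: the six hex digits of class= are the PCI base class, subclass and programming interface (0x0c0300 is serial bus / USB / UHCI), and chip= packs the device ID in the upper 16 bits with the vendor ID in the lower 16 (0x8086 is Intel). A driver name of "none" means no driver is attached to that slot, which is what pci.kld shows for 29:2 after uhci2 fails with "Could not allocate irq". The following is only a rough sketch for picking those fields apart. The helper names (parse_pciconf_line, unattached) are made up for illustration and are not part of any FreeBSD tool, and the regex is assumed to cover only the line shape shown in this thread.

```python
import re

# First line of a pciconf entry as it appears in this thread, e.g.
#   none0@pci0:29:2: class=0x0c0300 card=0x0074a0a0 chip=0x24c78086 rev=0x02 hdr=0x00
# (Assumption: other pciconf versions may print a different layout.)
ENTRY_RE = re.compile(
    r"^(?P<name>[a-z][a-z0-9_]*?)(?P<unit>\d+)"
    r"@pci(?P<bus>\d+):(?P<slot>\d+):(?P<func>\d+):\s+"
    r"class=0x(?P<cls>[0-9a-fA-F]{6})\s+"
    r"card=0x(?P<card>[0-9a-fA-F]{8})\s+"
    r"chip=0x(?P<chip>[0-9a-fA-F]{8})"
)


def parse_pciconf_line(line):
    """Decode one pciconf entry line; return None if it does not match."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    cls = int(m.group("cls"), 16)
    chip = int(m.group("chip"), 16)
    return {
        "driver": m.group("name"),
        "unit": int(m.group("unit")),
        "location": (int(m.group("bus")), int(m.group("slot")), int(m.group("func"))),
        "base_class": cls >> 16,          # 0x0c = serial bus
        "subclass": (cls >> 8) & 0xFF,    # 0x03 = USB, 0x05 = SMBus
        "prog_if": cls & 0xFF,            # 0x00 = UHCI for a USB controller
        "vendor_id": chip & 0xFFFF,       # low 16 bits of chip=
        "device_id": chip >> 16,          # high 16 bits of chip=
        "attached": m.group("name") != "none",
    }


def unattached(lines):
    """Return (bus, slot, func) for every entry with no driver attached."""
    entries = (parse_pciconf_line(line) for line in lines)
    return [e["location"] for e in entries if e is not None and not e["attached"]]


if __name__ == "__main__":
    # The two lines that differ in pci.kld, copied from the diff in this message.
    pci_kld = [
        "none0@pci0:29:2: class=0x0c0300 card=0x0074a0a0 chip=0x24c78086 rev=0x02 hdr=0x00",
        "none1@pci0:31:3: class=0x0c0500 card=0x0074a0a0 chip=0x24c38086 rev=0x02 hdr=0x00",
    ]
    for line in pci_kld:
        e = parse_pciconf_line(line)
        print(e["location"], hex(e["vendor_id"]), hex(e["device_id"]),
              hex(e["base_class"]), hex(e["subclass"]), hex(e["prog_if"]))
    print("no driver attached at:", unattached(pci_kld))
```

Running the same helper over full pci.kld and pci.static files would give a quick list of slots that lose their driver in one configuration but not the other, without reading the raw diff by eye.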