From owner-freebsd-bugs@FreeBSD.ORG Mon Dec 29 08:59:11 2003
From: "Alan Lew"
Date: Mon, 29 Dec 2003 11:58:25 -0500
Message-ID: <000001c3ce2c$f99091b0$162da8c0@gemini>
cc: freebsd-hackers@freebsd.org
cc: alan@canweb.ca
Subject: Re: kern/37043: Latest stable causes SCSI bus freeze on sym0 when running SMP

Gentlemen,

Since upgrading our kernels to 4.x, we’ve noticed this problem. After installing Gérard’s Sym workaround (http://docs.freebsd.org/cgi/mid.cgi?200208222210.g7MMABwT084798), the issue seems to have vanished on 3 of our 5 affected servers. The 5 servers are all Asus AP1400 boxes (CUR-DLSR mobo, ServerWorks 3 LE chipset) running various stable 4.x kernels, all containing the LSI Logic 53C1010-33 chipset.
Of the two remaining affected boxes, one runs merrily for a few weeks with minimal load (serving two static web pages) and then dies, spitting out messages like “sym0:0:control msgout 80 22 25d”. The other box runs with no load (completely idle) and hard locks after 2-3 days with no error or panic messages written anywhere. The latter box was recently sent to Asus in Taiwan for repairs to the SCSI backplane board, apparently a known Asus AP1400 issue (don’t know if this info helps, but...)

Below is the dmesg output of the two affected boxes (in the order described above) as well as the “pciconf -l -v” output. I hope this helps.

If anyone wishes to experiment with one of these affected boxes, we will make it available remotely for testing.

Any thoughts?

Regards,

...alan lew
alan@canweb.ca

-- SNIP --

[ Box #1 DMESG ]

Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 4.7-RELEASE-p3 #3: Tue Jan 7 11:33:58 EST 2003
    alan@rigel.canweb.ca:/usr/obj/usr/src/sys/RIGEL
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (1000.04-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x686 Stepping = 6
  Features=0x383fbff
real memory = 1073721344 (1048556K bytes)
avail memory = 1041727488 (1017312K bytes)
Changing APIC ID for IO APIC #1 from 3 to 1 in MP table
Changing APIC ID for IO APIC #1 from 3 to 1 on chip
Programming 16 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
Programming 16 pins in IOAPIC #1
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 3, version: 0x00040011, at 0xfee00000
 cpu1 (AP): apic id: 0, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id: 2, version: 0x000f0011, at 0xfec00000
 io1 (APIC): apic id: 1, version: 0x000f0011, at 0xfec01000
Preloaded elf kernel "kernel" at 0xc0356000.
Pentium Pro MTRR support enabled
md0: Malloc disk
Using $PIR table, 8 entries at 0xc00f1010
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
IOAPIC #1 intpin 6 -> irq 2
IOAPIC #1 intpin 7 -> irq 10
IOAPIC #1 intpin 22 -> irq 11
pci0: on pcib0
fxp0: port 0xd800-0xd83f mem 0xfd800000-0xfd8fffff,0xfe000000-0xfe000fff irq 2 at device 2.0 on pci0
fxp0: Ethernet address 00:e0:18:0a:b1:14
inphy0: on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pci0: at 7.0
fxp1: port 0xd000-0xd03f mem 0xfa800000-0xfa8fffff,0xfb000000-0xfb000fff irq 10 at device 8.0 on pci0
fxp1: Ethernet address 00:e0:18:0a:b1:15
inphy1: on miibus1
inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
isab0: at device 15.0 on pci0
isa0: on isab0
atapci0: port 0xb800-0xb80f at device 15.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
ohci0: mem 0xfa000000-0xfa000fff irq 11 at device 15.2 on pci0
usb0: OHCI version 1.0, legacy support
usb0: on ohci0
usb0: USB revision 1.0
uhub0: (0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
ufm0: GemTek Corp USB FM Radio, rev 1.00/4.10, addr 2
pcib1: on motherboard
IOAPIC #1 intpin 8 -> irq 12
IOAPIC #1 intpin 9 -> irq 15
pci1: on pcib1
sym0: <1010-33> port 0xb400-0xb4ff mem 0xf9000000-0xf9001fff,0xf9800000-0xf98003ff irq 12 at device 5.0 on pci1
sym0: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: open drain IRQ line driver, using on-chip SRAM
sym0: using LOAD/STORE-based firmware.
sym0: handling phase mismatch from SCRIPTS.
sym1: <1010-33> port 0xb000-0xb0ff mem 0xf8000000-0xf8001fff,0xf8800000-0xf88003ff irq 15 at device 5.1 on pci1
sym1: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: open drain IRQ line driver, using on-chip SRAM
sym1: using LOAD/STORE-based firmware.
sym1: handling phase mismatch from SCRIPTS.
orm0: