From owner-freebsd-current@FreeBSD.ORG Mon Apr 25 18:37:42 2005
Return-Path:
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 1852116A4CE
    for ; Mon, 25 Apr 2005 18:37:42 +0000 (GMT)
Received: from nala.dohd.org (xaa.demon.nl [83.160.166.71])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 7D30843D53
    for ; Mon, 25 Apr 2005 18:37:40 +0000 (GMT)
    (envelope-from freebsd@dohd.org)
Received: from localhost (localhost.local.dohd.org [127.0.0.1])
    by nala.dohd.org (Postfix) with ESMTP id 2E597117CC
    for ; Mon, 25 Apr 2005 20:37:39 +0200 (CEST)
Received: from nala.dohd.org ([127.0.0.1])
    by localhost (eeyore.local.dohd.org [127.0.0.1]) (amavisd-new, port 10024)
    with LMTP id 25780-03-10 for ; Mon, 25 Apr 2005 20:37:33 +0200 (CEST)
Received: by nala.dohd.org (Postfix, from userid 1008)
    id E32CF117CB; Mon, 25 Apr 2005 20:37:33 +0200 (CEST)
Date: Mon, 25 Apr 2005 20:37:33 +0200
From: Mark Huizer
To: current@freebsd.org
Message-ID: <20050425183733.GB24146@eeyore.local.dohd.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="oLBj+sq0vYjzfsbl"
Content-Disposition: inline
User-Agent: Mutt/1.5.9i
X-Virus-Scanned: amavisd-new at dohd.org
Subject: fxp0: device timeout
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 25 Apr 2005 18:37:42 -0000

--oLBj+sq0vYjzfsbl
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

My server in the attic was recently reinstalled from scratch with FreeBSD 5.stable (actually: RC2, and now upgraded to 5-stable), see the attached dmesg.boot for the dmesg info. This machine in the exact hardware configuration was running fine in its 6-current days.
The reason for 'down'grading was that over time a subtle filesystem problem seems to have arisen that led to random crashes (well, random... take a lot of vm activity and at some point vnlru would lead to a kernel panic), but the network was perfectly fine.

Now, with the 5.x install, I have regular problems with fxp0. In /var/log/messages and dmesg I get this:
(The last line comes from ifconfig fxp0 link0, which I added yesterday to see if it would solve the problems; the same symptoms occur without that option)

Apr 25 18:15:08 eeyore kernel: fxp0: device timeout
Apr 25 18:15:08 eeyore kernel: fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 0

The symptoms:
* the network freezes for 30 seconds or something like that; after a while pings start to work again and everything moves on.

When:
* combinations of network activity and/or disk activity. Starting bacula will guarantee device timeouts, as will browsing from a workstation combined with e.g. using a samba share

The interface:
fxp0: port 0xe000-0xe01f mem 0xe3000000-0xe30fffff,0xe3101000-0xe3101fff irq 5 at device 17.0 on pci0
miibus1: on fxp0

fxp0: flags=9843 mtu 1500
        options=8
        inet6 fe80::208:c7ff:fe25:7560%fxp0 prefixlen 64 scopeid 0x2
        ether 00:50:da:3d:d7:f6
        media: Ethernet autoselect (100baseTX )
        status: active

There are 2 'special' things with this interface:
* the MAC address is changed from /etc/start_if.fxp0
* it is used with VLANs (802.1q tagging)

I looked in the mail archives, and of course it is clearly an interrupt problem. I did the usual stuff: put the card in different PCI slots, force it to a different IRQ in the BIOS, but still no improvement. Furthermore I don't believe that hardware should change that much just by reinstalling FreeBSD, so I tend to believe that something is different between 5.x and 6.x.

Does anyone recognize these symptoms and have a nice solution for me? Or if people decide I should PR this: what kind of info would make debugging easy?
vmstat 1 info would take a little work to collect for this kind of outage, but I can do some scripting to fix that of course :-)

Greetings,

Mark

--
Nice testing in little China...

--oLBj+sq0vYjzfsbl
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="dmesg.boot"

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 5.4-STABLE #1: Thu Apr 21 02:25:25 CEST 2005
    xaa@eeyore.local.dohd.org:/usr/obj/usr/src/sys/eeyore
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) XP (1660.52-MHz 686-class CPU)
  Origin = "AuthenticAMD" Id = 0x681 Stepping = 1
  Features=0x383f9ff
  AMD Features=0xc0400000
real memory = 536805376 (511 MB)
avail memory = 515637248 (491 MB)
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
cpu0: on acpi0
acpi_throttle0: on cpu0
acpi_button0: on acpi0
acpi_button1: on acpi0
pcib0: port 0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
pci0: on pcib0
agp0: mem 0xd0000000-0xd7ffffff at device 0.0 on pci0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pci1: at device 0.0 (no driver attached)
isab0: at device 7.0 on pci0
isa0: on isab0
atapci0: port 0xc000-0xc00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 7.1 on pci0
atapci0: Correcting VIA config for southbridge data corruption bug
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
uhci0: port 0xc400-0xc41f irq 10 at device 7.2 on pci0
usb0: on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0xc800-0xc81f irq 10 at device 7.3 on pci0
usb1: on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pci0: at device 7.4 (no driver attached)
sym0: <810a> port 0xd800-0xd8ff mem 0xe3100000-0xe31000ff irq 11 at device 15.0 on pci0
sym0: Symbios NVRAM, ID 7, Fast-10, SE, parity checking
sym0: open drain IRQ line driver
sym0: using LOAD/STORE-based firmware.
xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xdc00-0xdc7f mem 0xe3102000-0xe310207f irq 10 at device 16.0 on pci0
miibus0: on xl0
ukphy0: on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl0: Ethernet address: 00:04:75:c4:f4:43
fxp0: port 0xe000-0xe01f mem 0xe3000000-0xe30fffff,0xe3101000-0xe3101fff irq 5 at device 17.0 on pci0
miibus1: on fxp0
inphy0: on miibus1
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:08:c7:25:75:60
rl0: port 0xe400-0xe4ff mem 0xe3103000-0xe31030ff irq 12 at device 18.0 on pci0
miibus2: on rl0
rlphy0: on miibus2
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl0: Ethernet address: 00:50:fc:0b:28:ec
pci0: at device 19.0 (no driver attached)
fdc0: port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0: port 0x378-0x37f irq 7 on acpi0
ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
atkbdc0: port 0x64,0x60 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
orm0: at iomem 0xd1000-0xd17ff,0xd0000-0xd07ff on isa0
pmtimer0 on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 1660518050 Hz quality 800
Timecounters tick every 10.000 msec
IP Filter: v3.4.35 initialized. Default = block all, Logging = enabled
ad0: 38166MB [77545/16/63] at ata0-master UDMA100
acd0: CDROM at ata1-slave PIO4
Waiting 10 seconds for SCSI devices to settle
(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
sa0 at sym0 bus 0 target 0 lun 0
sa0: Removable Sequential Access SCSI-2 device
sa0: 10.000MB/s transfers (10.000MHz, offset 8)
Mounting root from ufs:/dev/ad0s1a

--oLBj+sq0vYjzfsbl--