Date:      Thu, 14 Jan 2010 21:48:56 +0100
From:      Floris Bos <info@je-eigen-domein.nl>
To:        pyunyh@gmail.com
Cc:        freebsd-net@freebsd.org
Subject:   Re: kern/92090: [bge] bge: watchdog timeout -- resetting
Message-ID:  <201001142148.56444.info@je-eigen-domein.nl>
In-Reply-To: <20100114201144.GA1228@michelle.cdnetworks.com>
References:  <201001140140.o0E1e5hr072464@freefall.freebsd.org> <201001142108.02941.info@je-eigen-domein.nl> <20100114201144.GA1228@michelle.cdnetworks.com>

On Thursday 14 January 2010 09:11:44 pm Pyun YongHyeon wrote:
> On Thu, Jan 14, 2010 at 09:08:02PM +0100, Floris Bos wrote:
> > On Thursday 14 January 2010 06:56:03 pm Pyun YongHyeon wrote:
> > > On Thu, Jan 14, 2010 at 04:33:19AM +0100, Floris Bos wrote:
> > > > Hi,
> > > > 
> > > > On Thursday 14 January 2010 03:54:52 am Pyun YongHyeon wrote:
> > > > > >  ==
> > > > > >  bge0: <HP NC107i PCIe Gigabit Server Adapter, ASIC rev. 0x5784100> mem 0xdf900000-0xdf90ffff irq 16 at device 0.0 on pci32
> > > > > >  ==
> > > > > >  
> > > > > >  After boot, the network works for about 5 seconds, barely enough time to get an IP by DHCP and send a ping or two.
> > > > > >  Then network connectivity goes down, and after some time there is a "bge0: watchdog timeout -- resetting" message.
> > > > > >  
> > > > > >  Then network works again for 5 seconds, and goes down again. All the time, repeatedly.
> > > > > >  
> > > > > >  The system works fine under Ubuntu. So I assume the hardware is ok.
> > > > > >  
> > > > > 
> > > > > I'm not sure but it looks like you have a BCM5784 controller. What is
> > > > > the output of "devinfo -rv | grep phy"?
> > > > 
> > > > ==
> > > > ukphy0 pnpinfo oui=0x50ef model=0x3a rev=0x4 at phyno=1
> > > > ukphy1 pnpinfo oui=0x50ef model=0x3a rev=0x4 at phyno=1
> > > > ==
> > > 
> > > Support for the PHY was added in r202269.
> > > Please try again after applying the change. Or you can download
> > > sys/dev/mii/miidevs and sys/dev/mii/brgphy.c from HEAD and rebuild
> > > kernel.
> > 
> > Fetched the latest source using CVS on another computer, and transferred it to the affected system on a USB stick.
> > Rebuilt the kernel, but the problem is still there.
> > 
> Would you show me full dmesg output including "watchdog timeout"
> messages?

===
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.0-CURRENT #0: Thu Jan 14 20:12:47 CET 2010
    root@db3.xxxxxxx.xx:/usr/obj/usr/src/sys/GENERIC amd64
WARNING: WITNESS option enabled, expect reduced performance.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU           X3430  @ 2.40GHz (2394.00-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x106e5  Stepping = 5
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x98e3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 17179869184 (16384 MB)
avail memory = 16533999616 (15768 MB)
ACPI APIC Table: <HP     ProLiant>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  2
 cpu2 (AP): APIC ID:  4
 cpu3 (AP): APIC ID:  6
ioapic0 <Version 2.0> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <HP ProLiant> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 3.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pci0: <base peripheral> at device 8.0 (no driver attached)
pci0: <base peripheral> at device 8.1 (no driver attached)
pci0: <base peripheral> at device 8.2 (no driver attached)
pci0: <base peripheral> at device 8.3 (no driver attached)
pci0: <base peripheral> at device 16.0 (no driver attached)
pci0: <base peripheral> at device 16.1 (no driver attached)
ehci0: <Intel PCH USB 2.0 controller USB-B> mem 0xdfd02000-0xdfd023ff irq 16 at device 26.0 on pci0
ehci0: [ITHREAD]
usbus0: EHCI version 1.0
usbus0: <Intel PCH USB 2.0 controller USB-B> on ehci0
pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.0 on pci0
pci16: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 17 at device 28.4 on pci0
pci32: <ACPI PCI bus> on pcib3
bge0: <HP NC107i PCIe Gigabit Server Adapter, ASIC rev. 0x5784100> mem 0xdf900000-0xdf90ffff irq 16 at device 0.0 on pci32
miibus0: <MII bus> on bge0
brgphy0: <BCM5784 10/100/1000baseTX PHY> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge0: Ethernet address: f4:ce:46:0f:2a:2c
bge0: [FILTER]
pcib4: <ACPI PCI-PCI bridge> irq 16 at device 28.5 on pci0
pci34: <ACPI PCI bus> on pcib4
bge1: <HP NC107i PCIe Gigabit Server Adapter, ASIC rev. 0x5784100> mem 0xdfa00000-0xdfa0ffff irq 17 at device 0.0 on pci34
miibus1: <MII bus> on bge1
brgphy1: <BCM5784 10/100/1000baseTX PHY> PHY 1 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge1: Ethernet address: f4:ce:46:0f:2a:2d
bge1: [FILTER]
pcib5: <ACPI PCI-PCI bridge> irq 18 at device 28.6 on pci0
pci36: <ACPI PCI bus> on pcib5
vgapci0: <VGA-compatible display> mem 0xde000000-0xdeffffff,0xdf800000-0xdf803fff,0xdf000000-0xdf7fffff irq 18 at device 0.0 on pci36
pcib6: <ACPI PCI-PCI bridge> irq 19 at device 28.7 on pci0
pci38: <ACPI PCI bus> on pcib6
ehci1: <Intel PCH USB 2.0 controller USB-A> mem 0xdfd02400-0xdfd027ff irq 23 at device 29.0 on pci0
ehci1: [ITHREAD]
usbus1: EHCI version 1.0
usbus1: <Intel PCH USB 2.0 controller USB-A> on ehci1
pcib7: <PCI-PCI bridge> at device 30.0 on pci0
pci48: <PCI bus> on pcib7
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel AHCI controller> port 0x1830-0x1837,0x1824-0x1827,0x1828-0x182f,0x1820-0x1823,0x1800-0x181f mem 0xdfd01000-0xdfd017ff irq 18 at device 31.2 on pci0
atapci0: [ITHREAD]
atapci0: AHCI v1.30 controller with 6 3Gbps ports, PM supported
ata2: <ATA channel 0> on atapci0
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci0
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci0
ata4: [ITHREAD]
ata5: <ATA channel 3> on atapci0
ata5: [ITHREAD]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71 on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
cpu0: <ACPI CPU> on acpi0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
cpu1: <ACPI CPU> on acpi0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
cpu2: <ACPI CPU> on acpi0
est2: <Enhanced SpeedStep Frequency Control> on cpu2
p4tcc2: <CPU Frequency Thermal Control> on cpu2
cpu3: <ACPI CPU> on acpi0
est3: <Enhanced SpeedStep Frequency Control> on cpu3
p4tcc3: <CPU Frequency Thermal Control> on cpu3
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xdc000-0xdffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd: unable to set the command byte.
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: unable to set the command byte.
ppc0: cannot reserve I/O port range
ZFS filesystem version 3
ZFS storage pool version 14
Timecounters tick every 1.000 msec
usbus0: 480Mbps High Speed USB v2.0
usbus1: 480Mbps High Speed USB v2.0
ad4: 152627MB <INTEL SSDSA2M160G2GC 2CV102HD> at ata2-master UDMA100 SATA 3Gb/s
ad6: 152627MB <INTEL SSDSA2M160G2GC 2CV102HD> at ata3-master UDMA100 SATA 3Gb/s
ad8: 152627MB <INTEL SSDSA2M160G2GC 2CV102HD> at ata4-master UDMA100 SATA 3Gb/s
ad10: 152627MB <INTEL SSDSA2M160G2GC 2CV102HD> at ata5-master UDMA100 SATA 3Gb/s
SMP: AP CPU #3 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
WARNING: WITNESS option enabled, expect reduced performance.
ugen1.1: <Intel> at usbus1
ugen0.1: <Intel> at usbus0
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
Root mount waiting for: usbus1 usbus0
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
Root mount waiting for: usbus1 usbus0
ugen1.2: <vendor 0x8087> at usbus1
uhub2: <vendor 0x8087 product 0x0020, class 9/0, rev 2.00/0.00, addr 2> on usbus1
ugen0.2: <vendor 0x8087> at usbus0
uhub3: <vendor 0x8087 product 0x0020, class 9/0, rev 2.00/0.00, addr 2> on usbus0
Root mount waiting for: usbus1 usbus0
uhub3: 6 ports with 6 removable, self powered
uhub2: 8 ports with 8 removable, self powered
Root mount waiting for: usbus1 usbus0
ugen0.3: <Logitech> at usbus0
ums0: <Logitech USB-PS/2 Optical Mouse, class 0/0, rev 2.00/27.00, addr 3> on usbus0
ums0: 8 buttons and [XYZ] coordinates ID=0
ugen1.3: <ServerEngines> at usbus1
ukbd0: <ServerEngines SE USB Device, class 0/0, rev 1.10/0.01, addr 3> on usbus1
kbd2 at ukbd0
ums1: <ServerEngines SE USB Device, class 0/0, rev 1.10/0.01, addr 3> on usbus1
ums1: 8 buttons and [XYZ] coordinates ID=0
ugen0.4: <SanDisk> at usbus0
umass0: <SanDisk Cruzer Micro, class 0/0, rev 2.00/2.00, addr 4> on usbus0
umass0:  SCSI over Bulk-Only; quirks = 0x0000
Root mount waiting for: usbus0
umass0:0:0:-1: Attached to scbus0
Trying to mount root from zfs:zroot
da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
da0: <SanDisk Cruzer Micro 8.01> Removable Direct Access SCSI-0 device 
da0: 40.000MB/s transfers
da0: 3839MB (7862911 512 byte sectors: 255H 63S/T 489C)
GEOM: da0: partition 1 does not end on a track boundary.
lock order reversal:
 1st 0xffffff000a372bd8 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:1058
 2nd 0xffffff000a5bc9f8 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2091
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x81e
__lockmgr_args() at __lockmgr_args+0xd10
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x47
vget() at vget+0x7b
devfs_allocv() at devfs_allocv+0x100
devfs_root() at devfs_root+0x48
vfs_donmount() at vfs_donmount+0xfb2
nmount() at nmount+0x63
syscall() at syscall+0x1ae
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007afeac, rsp = 0x7fffffffdd28, rbp = 0x800a06048 ---
bge0: link state changed to UP
bge0: link state changed to DOWN
bge0: watchdog timeout -- resetting
bge0: link state changed to UP
bge0: link state changed to DOWN
bge0: watchdog timeout -- resetting
bge0: link state changed to UP
bge0: watchdog timeout -- resetting
bge0: link state changed to DOWN
bge0: link state changed to UP
===

Seconds after the link goes up, connectivity is gone, but it takes minutes before this actually shows up as "link state changed to DOWN" in dmesg.
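
For anyone comparing runs, the bge0 events at the end of the dmesg output above can be tallied with a few lines of Python. This is only a rough sketch: the function name summarize_iface and the DMESG_TAIL constant are made up for illustration, and the sample text is just the last ten lines of the dmesg paste.

```python
from collections import Counter

# Tail of the dmesg output pasted above (bge0 lines only).
DMESG_TAIL = """\
bge0: link state changed to UP
bge0: link state changed to DOWN
bge0: watchdog timeout -- resetting
bge0: link state changed to UP
bge0: link state changed to DOWN
bge0: watchdog timeout -- resetting
bge0: link state changed to UP
bge0: watchdog timeout -- resetting
bge0: link state changed to DOWN
bge0: link state changed to UP
"""

def summarize_iface(text, ifname="bge0"):
    """Count each distinct message logged for one interface (hypothetical helper)."""
    prefix = ifname + ": "
    counts = Counter()
    for line in text.splitlines():
        if line.startswith(prefix):
            # Use the message after the "bge0: " prefix as the key.
            counts[line[len(prefix):]] += 1
    return counts

if __name__ == "__main__":
    for message, n in summarize_iface(DMESG_TAIL).items():
        print(n, message)
```

In a real run the text would presumably come from the output of dmesg rather than a pasted constant.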


According to the log file of the switch the server is connected to, the link goes up and down every 3 seconds or so.

==
Log Index 	Message Text 	Severity 	Log Time 	Component 	Description
1700	<14> Jan 01 09:27:45 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1701 %% Interface 9 is Link Up	Info	Jan 01 09:27:45	NIM	Interface 9 is Link Up
1701	<14> Jan 01 09:27:48 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1702 %% Interface 9 is Link Down	Info	Jan 01 09:27:48	NIM	Interface 9 is Link Down
1702	<14> Jan 01 09:27:51 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1703 %% Interface 9 is Link Up	Info	Jan 01 09:27:51	NIM	Interface 9 is Link Up
1703	<14> Jan 01 09:27:54 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1704 %% Interface 9 is Link Down	Info	Jan 01 09:27:54	NIM	Interface 9 is Link Down
1704	<14> Jan 01 09:27:57 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1705 %% Interface 9 is Link Up	Info	Jan 01 09:27:57	NIM	Interface 9 is Link Up
1705	<14> Jan 01 09:28:00 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1706 %% Interface 9 is Link Down	Info	Jan 01 09:28:00	NIM	Interface 9 is Link Down
1706	<14> Jan 01 09:28:03 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1707 %% Interface 9 is Link Up	Info	Jan 01 09:28:03	NIM	Interface 9 is Link Up
1707	<14> Jan 01 09:28:06 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1708 %% Interface 9 is Link Down	Info	Jan 01 09:28:06	NIM	Interface 9 is Link Down
1708	<14> Jan 01 09:28:09 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1709 %% Interface 9 is Link Up	Info	Jan 01 09:28:09	NIM	Interface 9 is Link Up
1709	<14> Jan 01 09:28:12 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1710 %% Interface 9 is Link Down	Info	Jan 01 09:28:12	NIM	Interface 9 is Link Down
1710	<14> Jan 01 09:28:15 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1711 %% Interface 9 is Link Up	Info	Jan 01 09:28:15	NIM	Interface 9 is Link Up
1711	<14> Jan 01 09:28:17 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1712 %% Interface 9 is Link Down	Info	Jan 01 09:28:17	NIM	Interface 9 is Link Down
1712	<14> Jan 01 09:28:20 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1713 %% Interface 9 is Link Up	Info	Jan 01 09:28:20	NIM	Interface 9 is Link Up
1713	<14> Jan 01 09:28:24 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1714 %% Interface 9 is Link Down	Info	Jan 01 09:28:24	NIM	Interface 9 is Link Down
1714	<14> Jan 01 09:28:26 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1715 %% Interface 9 is Link Up	Info	Jan 01 09:28:26	NIM	Interface 9 is Link Up
1715	<14> Jan 01 09:28:30 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1716 %% Interface 9 is Link Down	Info	Jan 01 09:28:30	NIM	Interface 9 is Link Down
1716	<14> Jan 01 09:28:32 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1717 %% Interface 9 is Link Up	Info	Jan 01 09:28:32	NIM	Interface 9 is Link Up
1717	<14> Jan 01 09:28:36 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1718 %% Interface 9 is Link Down	Info	Jan 01 09:28:36	NIM	Interface 9 is Link Down
1718	<14> Jan 01 09:28:39 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1719 %% Interface 9 is Link Up	Info	Jan 01 09:28:39	NIM	Interface 9 is Link Up
1719	<14> Jan 01 09:28:42 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1720 %% Interface 9 is Link Down	Info	Jan 01 09:28:42	NIM	Interface 9 is Link Down
1720	<14> Jan 01 09:28:45 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1721 %% Interface 9 is Link Up	Info	Jan 01 09:28:45	NIM	Interface 9 is Link Up
1721	<14> Jan 01 09:28:48 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1722 %% Interface 9 is Link Down	Info	Jan 01 09:28:48	NIM	Interface 9 is Link Down
1722	<14> Jan 01 09:28:51 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1723 %% Interface 9 is Link Up	Info	Jan 01 09:28:51	NIM	Interface 9 is Link Up
1723	<14> Jan 01 09:28:54 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1724 %% Interface 9 is Link Down	Info	Jan 01 09:28:54	NIM	Interface 9 is Link Down
==
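
The "every 3 seconds or so" figure can be checked against the switch log above with a short script. This is a minimal sketch, not something the switch provides: the names parse_events and flap_intervals are made up, the sample holds only the first four log entries, and it assumes all events fall on the same day so that only the time of day needs to be compared.

```python
import re

# First four entries of the switch log above (message text column only).
SWITCH_LOG = """\
<14> Jan 01 09:27:45 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1701 %% Interface 9 is Link Up
<14> Jan 01 09:27:48 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1702 %% Interface 9 is Link Down
<14> Jan 01 09:27:51 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1703 %% Interface 9 is Link Up
<14> Jan 01 09:27:54 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1704 %% Interface 9 is Link Down
"""

# Time of day, then the interface number and the new link state.
EVENT = re.compile(r"(\d{2}):(\d{2}):(\d{2}) .*Interface (\d+) is Link (Up|Down)")

def parse_events(text):
    """Return (seconds_since_midnight, interface, state) for each matching line."""
    events = []
    for line in text.splitlines():
        m = EVENT.search(line)
        if m:
            h, mi, s = (int(m.group(i)) for i in (1, 2, 3))
            events.append((h * 3600 + mi * 60 + s, int(m.group(4)), m.group(5)))
    return events

def flap_intervals(events):
    """Seconds between consecutive link events (assumes same-day timestamps)."""
    return [b[0] - a[0] for a, b in zip(events, events[1:])]

if __name__ == "__main__":
    print(flap_intervals(parse_events(SWITCH_LOG)))
```

Feeding in the full log instead of the four-line sample would give the interval for every transition.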


Yours sincerely,

Floris Bos


