Date:      Fri, 17 Jul 2020 17:11:49 +1000
From:      Aristedes Maniatis <ari@ish.com.au>
To:        freebsd-stable <freebsd-stable@freebsd.org>
Subject:   Ethernet interface Watchdog timeout
Message-ID:  <2931240e-45c2-93e3-4746-48d4f566bd9f@ish.com.au>

Last night I needed to reboot switches connected to a FreeBSD server.
There are two igb interfaces, bound via lagg0 as an LACP pair. Each is
connected to a different switch and those switches support mlag (LAG
distributed across more than one switch unit). One of the interfaces
came back fine when its switch rebooted, but when the second switch was
rebooted several hours later the other interface didn't. Both igb0 and
igb1 interfaces are on the motherboard itself.

This has happened once before, and rebooting the FreeBSD server resolved
it. Before I reboot again, I'd like to understand the problem better. Is
there more debugging information I could collect while the server is in
this state?

Physically removing the ethernet cable and plugging it back in does not
bring the interface up. Bringing the interface down and up again with
ifconfig does not help either.

What is this watchdog timeout that we are seeing in the logs?


Ari



# ifconfig igb0
igb0: flags=8c03<UP,BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
    ether ac:1f:6b:00:ea:b2
    media: Ethernet autoselect
    status: no carrier
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>


# uname -a
FreeBSD lash.internal 12.1-RELEASE-p2 FreeBSD 12.1-RELEASE-p2 GENERIC  amd64


# grep igb0 /var/log/messages
Jul  8 23:00:43 lash kernel: igb0: Watchdog timeout (TX: 0 desc avail: 42 pidx: 1003) -- resetting
Jul  8 23:00:43 lash kernel: igb0: link state changed to DOWN
Jul  8 23:00:44 lash kernel: igb0: Watchdog timeout (TX: 7 desc avail: 1024 pidx: 0) -- resetting
Jul  9 00:00:01 lash kernel: igb0: Watchdog timeout (TX: 7 desc avail: 1024 pidx: 0) -- resetting
Jul  9 05:01:12 lash kernel: igb0: Watchdog timeout (TX: 7 desc avail: 1024 pidx: 0) -- resetting
Jul  9 05:06:56 lash kernel: igb0: Watchdog timeout (TX: 7 desc avail: 1024 pidx: 0) -- resetting
Jul  9 14:25:33 lash kernel: igb0: Watchdog timeout (TX: 7 desc avail: 1024 pidx: 0) -- resetting
Jul  9 14:44:30 lash kernel: igb0: Watchdog timeout (TX: 7 desc avail: 1024 pidx: 0) -- resetting
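
To make the pattern in these messages easier to see, here is a minimal
sketch in Python that pulls the timestamp and the three numbers out of
each watchdog line. It is only an illustration: the field names (txq,
avail, pidx) and the reading of "TX" as a transmit queue index are
assumptions, and the sample lines are copied from the log excerpt above
with the mail line wrapping undone.

```python
import re

# Lines copied from the /var/log/messages excerpt in this message,
# with the mail client's line wrapping undone.
SAMPLE = """\
Jul  8 23:00:43 lash kernel: igb0: Watchdog timeout (TX: 0 desc avail: 42 pidx: 1003) -- resetting
Jul  8 23:00:43 lash kernel: igb0: link state changed to DOWN
Jul  8 23:00:44 lash kernel: igb0: Watchdog timeout (TX: 7 desc avail: 1024 pidx: 0) -- resetting
"""

# Assumption: "TX" is the transmit queue index, "desc avail" the number of
# free descriptors, and "pidx" the producer index. The group names are
# illustrative and not taken from the driver source.
WATCHDOG_RE = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+(?P<ifname>\w+): "
    r"Watchdog timeout \(TX:\s*(?P<txq>\d+)\s+desc avail:\s*(?P<avail>\d+)\s+"
    r"pidx:\s*(?P<pidx>\d+)\)"
)


def parse_watchdog(text):
    """Return one dict per watchdog timeout line found in text."""
    events = []
    for line in text.splitlines():
        m = WATCHDOG_RE.match(line)
        if m is None:
            continue  # skip other lines, e.g. "link state changed to DOWN"
        events.append({
            "when": m.group("when"),
            "ifname": m.group("ifname"),
            "txq": int(m.group("txq")),
            "avail": int(m.group("avail")),
            "pidx": int(m.group("pidx")),
        })
    return events


if __name__ == "__main__":
    for ev in parse_watchdog(SAMPLE):
        print(ev["when"], ev["ifname"], "txq", ev["txq"],
              "avail", ev["avail"], "pidx", ev["pidx"])
```

Run against the full log, this separates the first event (42 descriptors
available, pidx 1003) from the later ones, which all report 1024
available and pidx 0.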


igb0@pci0:1:0:0:    class=0x020000 card=0x152115d9 chip=0x15218086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I350 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
    cap 11[70] = MSI-X supports 10 messages, enabled
                 Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR NS
                 link x4(x4) speed 5.0(5.0) ASPM disabled(L0s/L1)
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 0003[140] = Serial 1 ac1f6bffff00eab2
    ecap 000e[150] = ARI 1
    ecap 0010[160] = SR-IOV 1 IOV disabled, Memory Space disabled, ARI disabled
                     0 VFs configured out of 8 supported
                     First VF RID Offset 0x0180, VF RID Stride 0x0004
                     VF Device ID 0x1520
                     Page Sizes: 4096 (enabled), 8192, 65536, 262144, 1048576, 4194304
    ecap 0017[1a0] = TPH Requester 1
    ecap 0018[1c0] = LTR 1
    ecap 000d[1d0] = ACS 1


# dmidecode -t baseboard
# dmidecode 3.2
Scanning /dev/mem for entry point.
SMBIOS 3.0 present.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
    Manufacturer: Supermicro
    Product Name: X10DRW-i
    Version: 1.02
    Serial Number: NM173S002991
    Asset Tag: Default string
    Features:
        Board is a hosting board
        Board is replaceable
    Location In Chassis: Default string
    Chassis Handle: 0x0003
    Type: Motherboard
    Contained Object Handles: 0

Handle 0x0021, DMI type 41, 11 bytes
Onboard Device
    Reference Designation: ASPEED Video AST2400
    Type: Video
    Status: Enabled
    Type Instance: 1
    Bus Address: 0000:05:00.0

Handle 0x0022, DMI type 41, 11 bytes
Onboard Device
    Reference Designation: Intel Ethernet i350 #1
    Type: Ethernet
    Status: Enabled
    Type Instance: 1
    Bus Address: 0000:01:00.0

Handle 0x0023, DMI type 41, 11 bytes
Onboard Device
    Reference Designation: Intel Ethernet i350 #2
    Type: Ethernet
    Status: Enabled
    Type Instance: 2
    Bus Address: 0000:01:00.1



