Date:      Fri, 19 Nov 2010 23:01:20 +0100
From:      Ivan Voras <ivoras@freebsd.org>
To:        freebsd-hardware@freebsd.org
Cc:        freebsd-net@freebsd.org
Subject:   em card wedging
Message-ID:  <ic6s3h$mtn$1@dough.gmane.org>

This problem is separate, on a separate system, from those I've been 
reporting the last few days, just in case someone read them all :)

An on-board em card in a server (Supermicro motherboard) wedges after a
couple of minutes of operation. There are continuous "watchdog timeout"
messages on the console, but they don't help the card and it stays wedged
forever. When this problem happens, monitoring the network state with
"netstat 1" suddenly starts outputting garbage values (large 64-bit
numbers, always constant) for incoming and outgoing packet counts,
as if there were some kind of kernel memory corruption.
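
For anyone who wants to catch the moment the counters go bad without
watching the console, here is a minimal sketch. It assumes that "garbage"
means a per-second packet count far above what a gigabit port can deliver
(roughly 1.5 million packets per second at minimum frame size). The
function name is_garbage and the 10 million threshold are illustrative
choices of mine, not part of netstat or the em driver.

```sh
#!/bin/sh
# Hypothetical helper: flag per-second packet counts that cannot be real.
# Assumption: anything above LIMIT packets in one second is garbage on a
# 1 Gbit/s port (the real ceiling is about 1.5 million per second).
LIMIT=10000000

is_garbage() {
    # $1 = a per-second packet count as printed by netstat
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a plain number: ignore
    esac
    # Compare digit counts first, so that very large 64-bit values are
    # never handed to the shell's integer comparison.
    if [ "${#1}" -gt "${#LIMIT}" ]; then
        return 0
    fi
    [ "${#1}" -eq "${#LIMIT}" ] && [ "$1" -gt "$LIMIT" ]
}
```

One possible use, assuming the first column of the per-interface output
is the input packet count: pipe "netstat -w 1 -I em0" into a
"while read ipkts rest" loop and print a timestamp whenever
is_garbage "$ipkts" succeeds. Header lines are skipped because they are
not numeric.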

This can be quickly provoked on-demand by doing flood-ping (ping -f).

There are two ports to the card, em0 and em1. If I move the Ethernet
cable from em0 to em1 and bring em1 up, then *both* cards indicate in
ifconfig status that they have signal (active), but after a few packets
are exchanged over em1 (DHCP) it also hangs.

This is 8-stable amd64 (the behaviour was much worse on 8.0-release
and 8.1-release - the card stopped working after a few seconds) with
this hardware:


em0: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0xdc00-0xdc1f mem 0xfb5e0000-0xfb5fffff,0xfb5dc000-0xfb5dffff irq 16 at device 0.0 on pci3
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:25:90:0b:77:5c
em1: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0xec00-0xec1f mem 0xfb6e0000-0xfb6fffff,0xfb6dc000-0xfb6dffff irq 17 at device 0.0 on pci4
em1: Using MSI interrupt
em1: [FILTER]
em1: Ethernet address: 00:25:90:0b:77:5d


em0@pci0:3:0:0: class=0x020000 card=0x040d15d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfb5e0000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xdc00, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xfb5dc000, size 16384, enabled
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
    cap 11[a0] = MSI-X supports 5 messages in map 0x1c
em1@pci0:4:0:0: class=0x020000 card=0x040d15d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfb6e0000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xec00, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xfb6dc000, size 16384, enabled
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
    cap 11[a0] = MSI-X supports 5 messages in map 0x1c

Interestingly, IPMI, which also works over the same port (and is in
fact on the same subnet as the "main" port) continues working
while all this is happening.

The BIOS configuration doesn't contain anything directly related to
advanced NIC settings, but it does contain several PCI-E settings, in
case there is a chance that toggling them would help.

While the card is wedged like this, the server cannot be shut down or
restarted by software - the whole machine hangs after flushing vnodes
& buffers and has to be cold-cycled.




