From: Ivan Voras
To: freebsd-hardware@freebsd.org
Cc: freebsd-net@freebsd.org
Date: Fri, 19 Nov 2010 23:01:20 +0100
Subject: em card wedging

This problem is separate, and on a separate system, from those I've been reporting over the last few days, just in case someone read them all :)

An on-board em card in a server (Supermicro motherboard) wedges after a couple of minutes of
operation, and while there are continuous "watchdog timeout" messages on the console, this doesn't help the card and it stays wedged forever.

When this problem happens, monitoring the network state with "netstat 1" suddenly starts outputting garbage values (large 64-bit numbers, always constant) for incoming and outgoing packet counts, as if there were some kind of kernel memory corruption. This can be quickly provoked on demand by doing a flood ping (ping -f).

The card has two ports, em0 and em1. If I move the Ethernet cable from em0 to em1 and bring em1 up, then *both* ports indicate in ifconfig status that they have signal (active), but after a few packets are exchanged over em1 (DHCP) it also hangs.

This is 8-stable amd64 (the behaviour was much worse on 8.0-release and 8.1-release, where the card stopped working after a few seconds) with this hardware:

em0: port 0xdc00-0xdc1f mem 0xfb5e0000-0xfb5fffff,0xfb5dc000-0xfb5dffff irq 16 at device 0.0 on pci3
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:25:90:0b:77:5c
em1: port 0xec00-0xec1f mem 0xfb6e0000-0xfb6fffff,0xfb6dc000-0xfb6dffff irq 17 at device 0.0 on pci4
em1: Using MSI interrupt
em1: [FILTER]
em1: Ethernet address: 00:25:90:0b:77:5d

em0@pci0:3:0:0: class=0x020000 card=0x040d15d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfb5e0000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xdc00, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xfb5dc000, size 16384, enabled
    cap 01[c8] = powerspec 2 supports D0 D3 current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
    cap 11[a0] = MSI-X supports 5 messages in map 0x1c
em1@pci0:4:0:0: class=0x020000 card=0x040d15d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfb6e0000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xec00, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xfb6dc000, size 16384, enabled
    cap 01[c8] = powerspec 2 supports D0 D3 current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
    cap 11[a0] = MSI-X supports 5 messages in map 0x1c

Interestingly, IPMI, which also works over the same port (and is in fact on the same subnet as the "main" port), continues working while all this is happening.

The BIOS configuration doesn't contain anything directly related to advanced NIC settings, but it does contain several PCI-E settings, in case there is a chance that toggling them will help.

While the card is wedged like this, the server cannot be shut down or restarted by software: the whole machine hangs after flushing vnodes & buffers and has to be cold-cycled.
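
To pin down the moment the "netstat 1" counters turn to garbage, something like the following could flag the bad samples automatically. This is only a rough sketch: the function name and the threshold are my own assumptions, and it assumes only traffic on gigabit ports is being counted. It relies on the fact that a 1 Gb/s link cannot move more than 125,000,000 bytes per second, so any per-second counter above that is implausible whichever column it is in.

```python
# Upper bound for any per-second counter on a 1 Gb/s link:
# 1,000,000,000 bits/s divided by 8 bits per byte.
# Assumption: only gigabit traffic is being counted.
MAX_PER_SECOND_1GBE = 1_000_000_000 // 8


def implausible_samples(lines, limit=MAX_PER_SECOND_1GBE):
    """Yield (line_number, values) for numeric rows with a counter above limit.

    `lines` is any iterable of text lines in a whitespace-separated,
    one-row-per-second layout such as the one "netstat 1" prints.
    """
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        # Header rows contain words (e.g. "packets"); skip anything non-numeric.
        if not fields or not all(f.isdigit() for f in fields):
            continue
        values = [int(f) for f in fields]
        if any(v > limit for v in values):
            yield lineno, values
```

Feeding it the captured output line by line (from a saved log or from standard input) yields only the rows where at least one counter exceeds what the link could physically carry, which should include the constant large 64-bit values described above.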