From: Lars Wilke
To: freebsd-net@freebsd.org
Date: Wed, 18 Apr 2012 14:01:48 +0000
Subject: Watchdog timeout em driver 8.2-R

Hi,

I first posted the following to the -stable list but got no reply.
Maybe someone here has some advice for me.

Switch: HP ProCurve 2910al
        The switch does passive LACP
Motherboard: Supermicro X8DTN+-F
NIC: Quad Port Card, i.e.
em1:
em1@pci0:6:0:1: class=0x020000 card=0x125e15d9 chip=0x105e8086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'HP NC360T PCIe DP Gigabit Server Adapter (n1e5132)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfb9e0000, size 131072, enabled
    bar   [14] = type Memory, range 32, base 0xfb9c0000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xcc00, size 32, enabled
    cap 01[c8] = powerspec 2 supports D0 D3 current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 256(256) link x4(x4)
    ecap 0001[100] = AER 1 0 fatal 1 non-fatal 0 corrected
    ecap 0003[140] = Serial 1 002590ffff0484d8

I use CAT 6 cables, and the switch and server are in the same cabinet.

OS: FreeBSD 8.2-RELEASE

rc.conf:
    ifconfig_em0="up"
    ifconfig_em1="up"
    ifconfig_em2="up"
    ifconfig_em3="up"
    cloned_interfaces="lagg0"
    ifconfig_lagg0="laggproto lacp laggport em0 laggport em1 laggport em2 laggport em3"
    ipv4_addrs_lagg0="192.168.80.20/24"

Hm, what sysctls might be interesting? I use:
    net.inet.tcp.sendbuf_max=16777216
    net.inet.tcp.recvbuf_max=16777216
    net.inet.tcp.sendspace=65536
    net.inet.tcp.recvspace=131072
    kern.ipc.nmbclusters=230400
    kern.maxvnodes=250000
    kern.maxfiles=65536
    kern.maxfilesperproc=32768
    vfs.read_max=32

loader.conf: contains only ZFS-related settings.

Except for swap, the whole system uses ZFS; swap is on a geom mirror.

Once in a while I see these messages in /var/log/messages:

    Apr 13 08:53:07 san02 kernel: em1: Watchdog timeout -- resetting
    Apr 13 08:53:07 san02 kernel: em1: Queue(0) tdh = 232, hw tdt = 190
    Apr 13 08:53:07 san02 kernel: em1: TX(0) desc avail = 31,Next TX to Clean = 221
    Apr 13 08:53:07 san02 kernel: em1: Link is Down
    Apr 13 08:53:07 san02 kernel: em1: link state changed to DOWN

Sometimes nothing happens for days; sometimes it occurs under high network
load (NFSv3), sometimes multiple times a day.
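
To put numbers on "sometimes nothing for days, sometimes multiple times a day", the watchdog lines can be tallied per day and interface. This is only a minimal sketch, assuming syslog lines in exactly the format shown above; the function name count_watchdog_resets and the regular expression are illustrative and not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Assumed line format, taken from the log excerpt above:
#   "Apr 13 08:53:07 san02 kernel: em1: Watchdog timeout -- resetting"
WATCHDOG_RE = re.compile(
    r"^(\w{3}\s+\d{1,2}) \d{2}:\d{2}:\d{2} \S+ kernel: (\w+): "
    r"Watchdog timeout -- resetting"
)


def count_watchdog_resets(lines):
    """Count watchdog resets, keyed by (day, interface name)."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.match(line)
        if match:
            # Collapse the double space syslog uses for single-digit days.
            day = " ".join(match.group(1).split())
            counts[(day, match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # The five lines quoted in this post; only the first is a watchdog reset.
    sample = [
        "Apr 13 08:53:07 san02 kernel: em1: Watchdog timeout -- resetting",
        "Apr 13 08:53:07 san02 kernel: em1: Queue(0) tdh = 232, hw tdt = 190",
        "Apr 13 08:53:07 san02 kernel: em1: TX(0) desc avail = 31,Next TX to Clean = 221",
        "Apr 13 08:53:07 san02 kernel: em1: Link is Down",
        "Apr 13 08:53:07 san02 kernel: em1: link state changed to DOWN",
    ]
    for (day, ifname), n in sorted(count_watchdog_resets(sample).items()):
        print(day, ifname, n)
```

Feeding it the lines of /var/log/messages instead of the sample list would show whether the resets cluster on particular days or interfaces.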
I see this message and behaviour always on the same two of the four
interfaces (em1 and em3). Afterwards the NIC no longer has the ACTIVE flag;
an "ifconfig em1 up" solves the issue. But why does it lose the ACTIVE state,
and why does the NIC reset itself in the first place?

On the switch I see that the port matching em1 on the server has left the
trunk, so the missing ACTIVE flag is not lying 8-/

Googling turned up many postings with the same problem, and one site
suggested that this might be an ACPI problem, but there was nothing concrete,
and the postings I found were mostly about FreeBSD 7 and older.

Any pointers would be appreciated.

Thank you
   --lars