From: Harry Schmalzbauer
Organization: OmniLAN
Date: Wed, 07 Jan 2015 06:39:46 +0100
To: FreeBSD Stable
Subject: igb(4) watchdog timeout, lagg(4) fails
Message-ID: <54ACC6A2.1050400@omnilan.de>
List-Id: Production branch of FreeBSD source code

Hello,

recently I upgraded one server from 9.1 to 10.1. It has two 82576 ports (one port from each of two Intel ET Dual-Port GbE cards [Kawela]), driven by igb(4). I never saw a watchdog timeout with FreeBSD-9.1, but suddenly (with 10-stable) I see:

igb0: Watchdog timeout -- resetting
igb0: Queue(0) tdh = 2974, hw tdt = 2973
igb0: TX(0) desc avail = 0,Next TX to Clean = 0

My biggest problem is that lagg(4) doesn't detect the problem with igb0. It's configured with 'lagghash l2', and most connections stayed interrupted until I manually ran 'ifconfig igb0 down'. Then lagg did its job and connectivity was restored via the remaining igb1.

Is there a way to automatically bring down an interface that suffers from watchdog timeouts? And is there any way to really reset it without rebooting the machine?

Thanks,

-Harry

P.S.: FreeBSD runs as an ESXi guest and has one port of each card as PCIe passthrough; the second port belongs to the hypervisor. This had been working fine for more than one year.

Here's some sysctl info from when igb0 hung:

hw.igb.rxd: 4096
hw.igb.txd: 4096
hw.igb.enable_aim: 1
hw.igb.enable_msix: 1
hw.igb.max_interrupt_rate: 8000
hw.igb.buf_ring_size: 4096
hw.igb.header_split: 0
hw.igb.num_queues: 0
hw.igb.rx_process_limit: 100
dev.igb.%parent:
dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.4.0
dev.igb.0.%driver: igb
dev.igb.0.%location: slot=0 function=0 handle=\_SB_.PCI0.PE60.S1F0
dev.igb.0.%pnpinfo: vendor=0x8086 device=0x10c9 subvendor=0x8086 subdevice=0xa03c class=0x020000
dev.igb.0.%parent: pci7
dev.igb.0.nvm: -1
dev.igb.0.enable_aim: 1
dev.igb.0.fc: 3
dev.igb.0.rx_processing_limit: 100
dev.igb.0.link_irq: 7
dev.igb.0.dropped: 0
dev.igb.0.tx_dma_fail: 0
dev.igb.0.rx_overruns: 0
dev.igb.0.watchdog_timeouts: 1
dev.igb.0.device_control: 1488978497
dev.igb.0.rx_control: 67272738
dev.igb.0.interrupt_mask: 4
dev.igb.0.extended_int_mask: 2147483679
dev.igb.0.tx_buf_alloc: 0
dev.igb.0.rx_buf_alloc: 0
dev.igb.0.fc_high_water: 47488
dev.igb.0.fc_low_water: 47472
dev.igb.0.queue0.interrupt_rate: 0
dev.igb.0.queue0.txd_head: 0
dev.igb.0.queue0.txd_tail: 0
dev.igb.0.queue0.no_desc_avail: 29
dev.igb.0.queue0.tx_packets: 89463
dev.igb.0.queue0.rxd_head: 0
dev.igb.0.queue0.rxd_tail: 0
dev.igb.0.queue0.rx_packets: 419144
dev.igb.0.queue0.rx_bytes: 0
dev.igb.0.queue0.lro_queued: 0
dev.igb.0.queue0.lro_flushed: 0
dev.igb.0.queue1.interrupt_rate: 0
…
dev.igb.0.queue2.interrupt_rate: 0
…
dev.igb.0.queue3.interrupt_rate: 0
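
On the auto-down question, one possible stopgap (only a sketch, untested on this setup, and not a fix for the underlying timeout) would be to poll the dev.igb.0.watchdog_timeouts counter from the sysctl output and run 'ifconfig igb0 down' whenever it increases, so that lagg fails over to igb1 without manual intervention. The function names, the IFACE/UNIT/INTERVAL variables, the 5-second interval, and the "monitor" argument are all made up for illustration; only the sysctl name and the ifconfig command come from the message itself.

```shell
#!/bin/sh
# Hypothetical helper: take igb0 down when its watchdog counter grows.
# Assumptions (not from the driver docs): polling every 5 seconds is
# frequent enough, and downing the port is the desired reaction.
# Must run as root for ifconfig to succeed.

IFACE="${IFACE:-igb0}"       # interface to take down
UNIT="${UNIT:-0}"            # igb unit number used in the sysctl name
INTERVAL="${INTERVAL:-5}"    # seconds between polls (arbitrary choice)

# Print the current watchdog timeout count, or 0 if it cannot be read.
read_timeouts() {
    sysctl -n "dev.igb.${UNIT}.watchdog_timeouts" 2>/dev/null || echo 0
}

# Succeed (exit status 0) only if the new count ($2) exceeds the old ($1).
counter_increased() {
    [ "$2" -gt "$1" ]
}

# Poll forever; on each increase, log it and bring the interface down.
monitor() {
    last=$(read_timeouts)
    while :; do
        sleep "$INTERVAL"
        now=$(read_timeouts)
        if counter_increased "$last" "$now"; then
            logger -t igb-watchdog "$IFACE: watchdog_timeouts $last -> $now, bringing interface down"
            ifconfig "$IFACE" down
        fi
        last=$now
    done
}

# Start polling only when explicitly requested, so the file can also be
# sourced without side effects.
if [ "${1:-}" = "monitor" ]; then
    monitor
fi
```

Saved under some name such as igb-watchdog.sh (hypothetical), it would be started as root with 'sh igb-watchdog.sh monitor'. The port stays down until someone runs 'ifconfig igb0 up' again, which seems the safer default while the cause of the timeouts is unknown.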