From owner-freebsd-stable@FreeBSD.ORG Tue Aug 2 06:42:05 2011
Delivered-To: stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2E36D1065670;
	Tue, 2 Aug 2011 06:42:05 +0000 (UTC)
	(envelope-from wjw@digiware.nl)
Received: from mail.digiware.nl (mail.ip6.digiware.nl [IPv6:2001:4cb8:1:106::2])
	by mx1.freebsd.org (Postfix) with ESMTP id AD6A78FC16;
	Tue, 2 Aug 2011 06:42:04 +0000 (UTC)
Received: from rack1.digiware.nl (localhost.digiware.nl [127.0.0.1])
	by mail.digiware.nl (Postfix) with ESMTP id 15FDC153434;
	Tue, 2 Aug 2011 08:42:04 +0200 (CEST)
X-Virus-Scanned: amavisd-new at digiware.nl
Received: from mail.digiware.nl ([127.0.0.1])
	by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id e+UNAnFf3QEH; Tue, 2 Aug 2011 08:42:01 +0200 (CEST)
Received: from [IPv6:2001:4cb8:3:1:8964:6f2b:1e23:8042]
	(unknown [IPv6:2001:4cb8:3:1:8964:6f2b:1e23:8042])
	by mail.digiware.nl (Postfix) with ESMTP id 8CEFC153433;
	Tue, 2 Aug 2011 08:42:01 +0200 (CEST)
Message-ID: <4E379C38.4000109@digiware.nl>
Date: Tue, 02 Aug 2011 08:42:00 +0200
From: Willem Jan Withagen
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20110624 Thunderbird/5.0
MIME-Version: 1.0
To: Jeremy Chadwick, "stable@freebsd.org"
References: <4E37286D.5070203@digiware.nl> <20110801230028.GA83293@icarus.home.lan>
In-Reply-To: <20110801230028.GA83293@icarus.home.lan>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: em0 timeout disconnects server
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Tue, 02 Aug 2011 06:42:05 -0000

On 2011-08-02 1:00, Jeremy Chadwick wrote:
> On Tue, Aug 02, 2011 at 12:27:57AM +0200, Willem Jan Withagen wrote:
>> A server just all of a sudden dropped from the network.
>> uptime was 26days.
>>
>> This got my ZFS server hanging:
>>
>> Aug 1 23:39:58 zfs kernel: em0: Watchdog timeout -- resetting
>> Aug 1 23:39:58 zfs kernel: em0: Queue(0) tdh = 942, hw tdt = 977
>> Aug 1 23:39:58 zfs kernel: em0: TX(0) desc avail = 985,Next TX to Clean
>> = 938
>> Aug 1 23:43:24 zfs kernel: em0: Watchdog timeout -- resetting
>> Aug 1 23:43:24 zfs kernel: em0: Queue(0) tdh = 147, hw tdt = 163
>> Aug 1 23:43:24 zfs kernel: em0: TX(0) desc avail = 1006,Next TX to
>> Clean = 145
>>
>> ifconfig down/up did not fix anything, un/plugging the ethernet did not
>> do anything either. rebooting did fix it.
>>
>> Serious maintenance jobs only starts after 0:00.
>>
>> --WjW
>>
>> uname -a:
>> FreeBSD zfs.digiware.nl 8.2-STABLE FreeBSD 8.2-STABLE #10: Wed Jul 6
>> 21:57:36 CEST 2011
>> root@zfs.digiware.nl:/home/obj/usr/src/src8/src/sys/ZFS amd64
>
> Please provide "dmesg" output pertaining to the NIC (dmesg | grep em0
> would be sufficient).

em0: port 0x1820-0x183f mem 0xdf900000-0xdf91ffff,0xdf924000-0xdf924fff irq 16 at device 25.0 on pci0
em0: Using an MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:30:48:de:97:cd

>
>> pciconf -lv:
>
> Please re-run this with -lvcb and include only the Intel NIC (em0).

em0@pci0:0:25:0: class=0x020000 card=0x10bd15d9 chip=0x10bd8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 82566DM Gigabit Ethernet Adapter (82566DM)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xdf900000, size 131072, enabled
    bar   [14] = type Memory, range 32, base 0xdf924000, size 4096, enabled
    bar   [18] = type I/O Port, range 32, base 0x1820, size 32, enabled
    cap 01[c8] = powerspec 2 supports D0 D3 current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 13[e0] = PCI Advanced Features: FLR TP

> Also please provide output from command "sysctl dev.em.0".

dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.0.%driver: em
dev.em.0.%location: slot=25 function=0 handle=\_SB_.PCI0.LAN_
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10bd subvendor=0x15d9 subdevice=0x10bd class=0x020000
dev.em.0.%parent: pci0
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 3
dev.em.0.eee_control: 0
dev.em.0.link_irq: 0
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1074790976
dev.em.0.rx_control: 67141634
dev.em.0.fc_high_water: 8192
dev.em.0.fc_low_water: 6692
dev.em.0.queue0.txd_head: 710
dev.em.0.queue0.txd_tail: 712
dev.em.0.queue0.tx_irq: 0
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 558
dev.em.0.queue0.rxd_tail: 556
dev.em.0.queue0.rx_irq: 0
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.missed_packets: 0
dev.em.0.mac_stats.recv_no_buff: 0
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.xon_recvd: 0
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xoff_recvd: 0
dev.em.0.mac_stats.xoff_txd: 0
dev.em.0.mac_stats.total_pkts_recvd: 10942830
dev.em.0.mac_stats.good_pkts_recvd: 10942830
dev.em.0.mac_stats.bcast_pkts_recvd: 83730
dev.em.0.mac_stats.mcast_pkts_recvd: 861
dev.em.0.mac_stats.rx_frames_64: 0
dev.em.0.mac_stats.rx_frames_65_127: 0
dev.em.0.mac_stats.rx_frames_128_255: 0
dev.em.0.mac_stats.rx_frames_256_511: 0
dev.em.0.mac_stats.rx_frames_512_1023: 0
dev.em.0.mac_stats.rx_frames_1024_1522: 0
dev.em.0.mac_stats.good_octets_recvd: 8091153275
dev.em.0.mac_stats.good_octets_txd: 5995000222
dev.em.0.mac_stats.total_pkts_txd: 10302157
dev.em.0.mac_stats.good_pkts_txd: 10302157
dev.em.0.mac_stats.bcast_pkts_txd: 1240
dev.em.0.mac_stats.mcast_pkts_txd: 41
dev.em.0.mac_stats.tx_frames_64: 0
dev.em.0.mac_stats.tx_frames_65_127: 0
dev.em.0.mac_stats.tx_frames_128_255: 0
dev.em.0.mac_stats.tx_frames_256_511: 0
dev.em.0.mac_stats.tx_frames_512_1023: 0
dev.em.0.mac_stats.tx_frames_1024_1522: 0
dev.em.0.mac_stats.tso_txd: 179532
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.0.interrupts.asserts: 10151890
dev.em.0.interrupts.rx_pkt_timer: 0
dev.em.0.interrupts.rx_abs_timer: 0
dev.em.0.interrupts.tx_pkt_timer: 0
dev.em.0.interrupts.tx_abs_timer: 0
dev.em.0.interrupts.tx_queue_empty: 0
dev.em.0.interrupts.tx_queue_min_thresh: 0
dev.em.0.interrupts.rx_desc_min_thresh: 0
dev.em.0.interrupts.rx_overrun: 0
dev.em.0.wake: 0

Note that all of this output is from the rebooted server, so the counters
do not cover the period of the hang. This is my main filestore, so I cannot
leave it offline for too long.

--WjW
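
PS: for comparing several of these watchdog events, here is a rough sh/awk
sketch that pulls the head and tail indices out of the "Queue(N) tdh = X,
hw tdt = Y" lines and prints the difference. The function name
em_watchdog_summary is made up for this mail. Reading Y - X as "TX descriptors
handed to the NIC but not yet fetched" is my interpretation of the indices, not
something the driver prints, and ring wrap-around is not handled. It expects
raw syslog lines on stdin, without the ">> " quoting.

```shell
#!/bin/sh
# Hypothetical helper, not part of FreeBSD: summarise em(4) watchdog
# "Queue(N) tdh = X, hw tdt = Y" lines read from stdin.
# Assumption: tdt >= tdh (no ring wrap-around), as in the two events
# quoted earlier in this thread.
em_watchdog_summary() {
    awk '/em[0-9]+: Queue\([0-9]+\) tdh = / {
        for (i = 1; i <= NF; i++) {
            # The value sits two fields after the name ("tdh = 942,").
            if ($i == "tdh") { tdh = $(i + 2); sub(/,/, "", tdh) }
            if ($i == "tdt") { tdt = $(i + 2) }
        }
        # $1 $2 $3 are the syslog month, day and time.
        printf "%s %s %s tdh=%d tdt=%d pending=%d\n", $1, $2, $3, tdh, tdt, tdt - tdh
    }'
}
```

Usage would be something like "em_watchdog_summary < /var/log/messages"
(the log path is an assumption, adjust to wherever kernel messages end up).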