From: Xu Zhe
Date: Thu, 17 Jul 2014 15:49:02 +0800
To: freebsd-net@freebsd.org
Cc: Javen Wu, Jason Zhang
Subject: Network unstability issue with ixgbe driver (ix0 local_faults non zero)
Message-ID: <53C77FEE.9000707@cyphy.net>

Hi, FreeBSD
developers,

We are encountering a network problem on FreeBSD 8.2 with an Intel X540T 10G card and the ixgbe 2.5.15 driver (we also tried an older version, 2.5.8).

We first noticed the problem when SSH connections kept failing with timeouts. We then found that it is probably a general network issue rather than an SSH problem: sysctl shows non-zero local_faults and remote_faults.

# sysctl -a | grep ix.0
dev.ix.0.%desc: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.8
dev.ix.0.%driver: ix
dev.ix.0.%location: slot=0 function=0 handle=\_SB_.PCI0.NPE3.TGBE
dev.ix.0.%pnpinfo: vendor=0x8086 device=0x1528 subvendor=0x152d subdevice=0x899f class=0x020000
dev.ix.0.%parent: pci3
dev.ix.0.fc: 3
dev.ix.0.enable_aim: 1
dev.ix.0.advertise_speed: 0
dev.ix.0.ts: 0
dev.ix.0.dropped: 0
dev.ix.0.mbuf_defrag_failed: 0
dev.ix.0.watchdog_events: 0
dev.ix.0.link_irq: 4
dev.ix.0.queue0.interrupt_rate: 55555
dev.ix.0.queue0.irqs: 1491075
dev.ix.0.queue0.txd_head: 604
dev.ix.0.queue0.txd_tail: 604
dev.ix.0.queue0.tso_tx: 154
dev.ix.0.queue0.no_tx_dma_setup: 0
dev.ix.0.queue0.no_desc_avail: 0
dev.ix.0.queue0.tx_packets: 948089
dev.ix.0.queue0.rxd_head: 620
dev.ix.0.queue0.rxd_tail: 619
dev.ix.0.queue0.rx_packets: 7799404
dev.ix.0.queue0.rx_bytes: 11075537104
dev.ix.0.queue0.rx_copies: 111468
dev.ix.0.queue0.lro_queued: 7788218
dev.ix.0.queue0.lro_flushed: 968958
dev.ix.0.queue1.interrupt_rate: 100000
dev.ix.0.queue1.irqs: 90817
dev.ix.0.queue1.txd_head: 1800
dev.ix.0.queue1.txd_tail: 1800
dev.ix.0.queue1.tso_tx: 2
dev.ix.0.queue1.no_tx_dma_setup: 0
dev.ix.0.queue1.no_desc_avail: 0
dev.ix.0.queue1.tx_packets: 32468
dev.ix.0.queue1.rxd_head: 1802
dev.ix.0.queue1.rxd_tail: 1801
dev.ix.0.queue1.rx_packets: 40714
dev.ix.0.queue1.rx_bytes: 4527395
dev.ix.0.queue1.rx_copies: 38784
dev.ix.0.queue1.lro_queued: 38668
dev.ix.0.queue1.lro_flushed: 38486
dev.ix.0.queue2.interrupt_rate: 71428
dev.ix.0.queue2.irqs: 28625
dev.ix.0.queue2.txd_head: 349
dev.ix.0.queue2.txd_tail: 349
dev.ix.0.queue2.tso_tx: 1
dev.ix.0.queue2.no_tx_dma_setup: 0
dev.ix.0.queue2.no_desc_avail: 0
dev.ix.0.queue2.tx_packets: 6981
dev.ix.0.queue2.rxd_head: 1952
dev.ix.0.queue2.rxd_tail: 1951
dev.ix.0.queue2.rx_packets: 6048
dev.ix.0.queue2.rx_bytes: 947930
dev.ix.0.queue2.rx_copies: 5241
dev.ix.0.queue2.lro_queued: 4846
dev.ix.0.queue2.lro_flushed: 4760
dev.ix.0.queue3.interrupt_rate: 500000
dev.ix.0.queue3.irqs: 54879
dev.ix.0.queue3.txd_head: 504
dev.ix.0.queue3.txd_tail: 504
dev.ix.0.queue3.tso_tx: 10
dev.ix.0.queue3.no_tx_dma_setup: 0
dev.ix.0.queue3.no_desc_avail: 0
dev.ix.0.queue3.tx_packets: 18406
dev.ix.0.queue3.rxd_head: 449
dev.ix.0.queue3.rxd_tail: 448
dev.ix.0.queue3.rx_packets: 20929
dev.ix.0.queue3.rx_bytes: 2572540
dev.ix.0.queue3.rx_copies: 20297
dev.ix.0.queue3.lro_queued: 19218
dev.ix.0.queue3.lro_flushed: 19102
dev.ix.0.queue4.interrupt_rate: 500000
dev.ix.0.queue4.irqs: 22609
dev.ix.0.queue4.txd_head: 1370
dev.ix.0.queue4.txd_tail: 1370
dev.ix.0.queue4.tso_tx: 1
dev.ix.0.queue4.no_tx_dma_setup: 0
dev.ix.0.queue4.no_desc_avail: 0
dev.ix.0.queue4.tx_packets: 3518
dev.ix.0.queue4.rxd_head: 1622
dev.ix.0.queue4.rxd_tail: 1621
dev.ix.0.queue4.rx_packets: 3670
dev.ix.0.queue4.rx_bytes: 474745
dev.ix.0.queue4.rx_copies: 3014
dev.ix.0.queue4.lro_queued: 2174
dev.ix.0.queue4.lro_flushed: 2171
dev.ix.0.queue5.interrupt_rate: 100000
dev.ix.0.queue5.irqs: 366375
dev.ix.0.queue5.txd_head: 833
dev.ix.0.queue5.txd_tail: 833
dev.ix.0.queue5.tso_tx: 326797
dev.ix.0.queue5.no_tx_dma_setup: 0
dev.ix.0.queue5.no_desc_avail: 0
dev.ix.0.queue5.tx_packets: 531092
dev.ix.0.queue5.rxd_head: 57
dev.ix.0.queue5.rxd_tail: 56
dev.ix.0.queue5.rx_packets: 796729
dev.ix.0.queue5.rx_bytes: 108295068
dev.ix.0.queue5.rx_copies: 582757
dev.ix.0.queue5.lro_queued: 795369
dev.ix.0.queue5.lro_flushed: 258290
dev.ix.0.queue6.interrupt_rate: 100000
dev.ix.0.queue6.irqs: 26775
dev.ix.0.queue6.txd_head: 1146
dev.ix.0.queue6.txd_tail: 1146
dev.ix.0.queue6.tso_tx: 13
dev.ix.0.queue6.no_tx_dma_setup: 0
dev.ix.0.queue6.no_desc_avail: 0
dev.ix.0.queue6.tx_packets: 5469
dev.ix.0.queue6.rxd_head: 1077
dev.ix.0.queue6.rxd_tail: 1076
dev.ix.0.queue6.rx_packets: 9269
dev.ix.0.queue6.rx_bytes: 6631479
dev.ix.0.queue6.rx_copies: 4878
dev.ix.0.queue6.lro_queued: 8054
dev.ix.0.queue6.lro_flushed: 4260
dev.ix.0.queue7.interrupt_rate: 55555
dev.ix.0.queue7.irqs: 243399
dev.ix.0.queue7.txd_head: 66
dev.ix.0.queue7.txd_tail: 66
dev.ix.0.queue7.tso_tx: 5
dev.ix.0.queue7.no_tx_dma_setup: 0
dev.ix.0.queue7.no_desc_avail: 0
dev.ix.0.queue7.tx_packets: 121101
dev.ix.0.queue7.rxd_head: 130
dev.ix.0.queue7.rxd_tail: 129
dev.ix.0.queue7.rx_packets: 127106
dev.ix.0.queue7.rx_bytes: 15197119
dev.ix.0.queue7.rx_copies: 118192
dev.ix.0.queue7.lro_queued: 125622
dev.ix.0.queue7.lro_flushed: 125138
dev.ix.0.mac_stats.crc_errs: 0
dev.ix.0.mac_stats.ill_errs: 0
dev.ix.0.mac_stats.byte_errs: 0
dev.ix.0.mac_stats.short_discards: 0
dev.ix.0.mac_stats.local_faults: 7 <=============== HERE
dev.ix.0.mac_stats.remote_faults: 1
dev.ix.0.mac_stats.rec_len_errs: 0
dev.ix.0.mac_stats.xon_txd: 0
dev.ix.0.mac_stats.xon_recvd: 0
dev.ix.0.mac_stats.xoff_txd: 0
dev.ix.0.mac_stats.xoff_recvd: 0
dev.ix.0.mac_stats.total_octets_rcvd: 11249450018
dev.ix.0.mac_stats.good_octets_rcvd: 11249396646
dev.ix.0.mac_stats.total_pkts_rcvd: 8804445
dev.ix.0.mac_stats.good_pkts_rcvd: 8803850
dev.ix.0.mac_stats.mcast_pkts_rcvd: 9311
dev.ix.0.mac_stats.bcast_pkts_rcvd: 1908
dev.ix.0.mac_stats.rx_frames_64: 18132
dev.ix.0.mac_stats.rx_frames_65_127: 759186
dev.ix.0.mac_stats.rx_frames_128_255: 116641
dev.ix.0.mac_stats.rx_frames_256_511: 686728
dev.ix.0.mac_stats.rx_frames_512_1023: 67041
dev.ix.0.mac_stats.rx_frames_1024_1522: 7156122
dev.ix.0.mac_stats.recv_undersized: 0
dev.ix.0.mac_stats.recv_fragmented: 0
dev.ix.0.mac_stats.recv_oversized: 0
dev.ix.0.mac_stats.recv_jabberd: 0
dev.ix.0.mac_stats.management_pkts_rcvd: 11219
dev.ix.0.mac_stats.management_pkts_drpd: 0
dev.ix.0.mac_stats.checksum_errs: 0
dev.ix.0.mac_stats.good_octets_txd: 20162287794
dev.ix.0.mac_stats.total_pkts_txd: 14419225
dev.ix.0.mac_stats.good_pkts_txd: 14419225
dev.ix.0.mac_stats.bcast_pkts_txd: 621
dev.ix.0.mac_stats.mcast_pkts_txd: 0
dev.ix.0.mac_stats.management_pkts_txd: 0
dev.ix.0.mac_stats.tx_frames_64: 12833
dev.ix.0.mac_stats.tx_frames_65_127: 549847
dev.ix.0.mac_stats.tx_frames_128_255: 80184
dev.ix.0.mac_stats.tx_frames_256_511: 631975
dev.ix.0.mac_stats.tx_frames_512_1023: 116264
dev.ix.0.mac_stats.tx_frames_1024_1522: 13028122

Does anyone know what local_faults/remote_faults mean here? Does this mean there is a hardware error? (We tried to find the adapter manual, but it gives no detail on the IXGBE_MLFC [0x04034] register.)

Any suggestions on how to diagnose this problem are welcome too.

Thanks in advance!

Peter
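
P.S. For anyone comparing counters between runs, here is a rough sketch of how the non-zero fault and error counters could be pulled out of a saved sysctl dump like the one above. It is only an illustration: the function name nonzero_error_counters and the ERROR_HINTS list of name fragments are my own assumptions, not part of the ixgbe driver or of sysctl, and the list may well miss counters that matter.

```python
# Name fragments treated as "error-like" counters. This list is an assumption
# made for illustration; it is not defined by the ixgbe driver.
ERROR_HINTS = ("faults", "_errs", "discards", "drpd", "dropped")


def nonzero_error_counters(sysctl_text):
    """Return {sysctl name: value} for error-like counters that are non-zero."""
    found = {}
    for line in sysctl_text.splitlines():
        # Each line is expected to look like "dev.ix.0.mac_stats.local_faults: 7".
        name, sep, value = line.partition(": ")
        if not sep:
            continue
        # Use only the first token of the value, so hand-added annotations
        # after the number are ignored and non-numeric values are skipped.
        tokens = value.split()
        if not tokens or not tokens[0].isdigit():
            continue
        count = int(tokens[0])
        leaf = name.strip().rsplit(".", 1)[-1]
        if count != 0 and any(hint in leaf for hint in ERROR_HINTS):
            found[name.strip()] = count
    return found


sample = """dev.ix.0.queue0.interrupt_rate: 55555
dev.ix.0.mac_stats.crc_errs: 0
dev.ix.0.mac_stats.local_faults: 7 <=============== HERE
dev.ix.0.mac_stats.remote_faults: 1
"""
print(nonzero_error_counters(sample))
```

Running this on dumps taken a few minutes apart and comparing the results would show whether the fault counters are still increasing or were only bumped once, for example at link-up.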