Date: Sun, 16 Nov 2014 19:29:11 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 195078] New: em tx_dma_fails and dropped packets
Message-ID: <bug-195078-8@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195078

Bug ID: 195078
Summary: em tx_dma_fails and dropped packets
Product: Base System
Version: 9.2-RELEASE
Hardware: Any
OS: Any
Status: Needs Triage
Severity: Affects Many People
Priority: ---
Component: kern
Assignee: freebsd-bugs@FreeBSD.org
Reporter: fusionfoto@gmail.com

It looks like FreeBSD may be a victim of the Intel erratum quoted below. This
likely affects every FreeBSD version that defaults to a larger hw.em.rxd,
which could be several releases.

I turned TSO off on my running machine because I didn't want to reboot, which
solved one set of problems. I then had to increase rx_processing_limit to
hopefully solve the remaining packet drops.

I have another couple of machines scheduled to reboot with hw.em.rxd and
hw.em.txd set to 256, which I think is the old value, and hopefully I'll be
able to set the rest of the sysctls back to normal.

Hope this helps.

---
http://www.intel.com.au/content/dam/www/public/us/en/documents/specification-updates/82574-gbe-controller-spec-update.pdf

17. Tx Data Corruption When Using TCP Segmentation Offload

Problem:
When using TSO, a situation can occur where a PCIe MRd request is repeated
with the same address, resulting in data corruption. At the end of the TCP
packet, the Tx DMA hangs because the length doesn't match. This can only
occur when the following are true:
• The first buffer of the packet is larger than [3 * (max_read_request - 4)].
• There is a 4 KB boundary within 64 bytes following the end of the header
  bytes in the buffer

Implication:
Possible data corruption since a TCP packet is transmitted containing the
wrong data but with the correct checksum. Data transmission halts as the Tx
DMA module enters a hang state.

Workaround:
The failure can be avoided by ensuring at least one of the following:
• The buffer containing the headers should not be larger than
  [3 * (max_read_request - 4)].
  To meet this requirement even for the minimum value of 128 bytes for
  max_read_request, the buffer should not be larger than 372 bytes.
• The alignment of the buffer containing the headers should be such that
  there is no 4 KB boundary within 64 bytes following the end of the header
  bytes. Assuming standard Ethernet/IP/TCP headers of 54 bytes, this means
  that the buffer should not start 54-118 bytes before a 4 KB boundary. For
  example, 128-byte alignment for this buffer could be used to fulfill this
  condition.

This problem has not been reported when using an Intel Linux* or Windows*
drivers. Current analysis shows it is very unlikely for a situation to exist
that would cause the 82574 to be at risk for the errata when using the Intel
Linux or Windows drivers.

Linux and other systems seem to have fixed it. This could be getting
exercised on FreeBSD because the default buffer size for this driver was
recently raised above 256.

**** my comments below ****

Since I didn't want to reboot to try the lower buffer size, I turned off TSO
on every machine I had checked whose em interfaces were actively incrementing
tx_dma_fail, then re-enabled their membership in the LACP bundle. In brief
testing (a few gigabits for a few minutes), tx_dma_fail has not incremented
and throughput has not been negatively impacted (before vs. after
re-enabling).

On Thu, Nov 13, 2014 at 1:52 PM, FF <fusionfoto@gmail.com> wrote:

What knob do I need to turn to address this?

This em0 is in an LACP bundle with an igb0 that isn't showing this problem.

dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.3.8
dev.em.0.%driver: em
dev.em.0.%location: slot=25 function=0 handle=\_SB_.PCI0.GLAN
dev.em.0.%pnpinfo: vendor=0x8086 device=0x153b subvendor=0x15d9 subdevice=0x153b class=0x020000
dev.em.0.%parent: pci0
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.fc: 3
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.itr: 488
dev.em.0.rx_processing_limit: 100
dev.em.0.eee_control: 1
dev.em.0.link_irq: 0
dev.em.0.mbuf_alloc_fail: 52
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
** dev.em.0.tx_dma_fail: 1834648
dev.em.0.rx_overruns: 3109 **
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1209532992
dev.em.0.rx_control: 67141634
dev.em.0.fc_high_water: 23584
dev.em.0.fc_low_water: 20552
dev.em.0.queue0.txd_head: 577
dev.em.0.queue0.txd_tail: 577
dev.em.0.queue0.tx_irq: 0
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 967
dev.em.0.queue0.rxd_tail: 966
dev.em.0.queue0.rx_irq: 0
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.missed_packets: 61094
dev.em.0.mac_stats.recv_no_buff: 60008
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.xon_recvd: 40226659
dev.em.0.mac_stats.xon_txd: 2132
dev.em.0.mac_stats.xoff_recvd: 40241216
dev.em.0.mac_stats.xoff_txd: 2073563
dev.em.0.mac_stats.total_pkts_recvd: 3219537541
dev.em.0.mac_stats.good_pkts_recvd: 3139008594
dev.em.0.mac_stats.bcast_pkts_recvd: 3953817
dev.em.0.mac_stats.mcast_pkts_recvd: 607157
dev.em.0.mac_stats.rx_frames_64: 0
dev.em.0.mac_stats.rx_frames_65_127: 0
dev.em.0.mac_stats.rx_frames_128_255: 0
dev.em.0.mac_stats.rx_frames_256_511: 0
dev.em.0.mac_stats.rx_frames_512_1023: 0
dev.em.0.mac_stats.rx_frames_1024_1522: 0
dev.em.0.mac_stats.good_octets_recvd: 3527296369841
dev.em.0.mac_stats.good_octets_txd: 14348531993101
dev.em.0.mac_stats.total_pkts_txd: 10735190291
dev.em.0.mac_stats.good_pkts_txd: 10733114595
dev.em.0.mac_stats.bcast_pkts_txd: 14
dev.em.0.mac_stats.mcast_pkts_txd: 54334
dev.em.0.mac_stats.tx_frames_64: 0
dev.em.0.mac_stats.tx_frames_65_127: 0
dev.em.0.mac_stats.tx_frames_128_255: 0
dev.em.0.mac_stats.tx_frames_256_511: 0
dev.em.0.mac_stats.tx_frames_512_1023: 0
dev.em.0.mac_stats.tx_frames_1024_1522: 0
dev.em.0.mac_stats.tso_txd: 902605586
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.0.interrupts.asserts: 1392541431
dev.em.0.interrupts.rx_pkt_timer: 0
dev.em.0.interrupts.rx_abs_timer: 0
dev.em.0.interrupts.tx_pkt_timer: 0
dev.em.0.interrupts.tx_abs_timer: 0
dev.em.0.interrupts.tx_queue_empty: 0
dev.em.0.interrupts.tx_queue_min_thresh: 0
dev.em.0.interrupts.rx_desc_min_thresh: 0
dev.em.0.interrupts.rx_overrun: 0
dev.em.0.wake: 0

dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.10
dev.igb.0.%driver: igb
dev.igb.0.%location: slot=0 function=0 handle=\_SB_.PCI0.RP04.PXSX
dev.igb.0.%pnpinfo: vendor=0x8086 device=0x1533 subvendor=0x15d9 subdevice=0x1533 class=0x020000
dev.igb.0.%parent: pci5
dev.igb.0.nvm: -1
dev.igb.0.enable_aim: 1
dev.igb.0.fc: 3
dev.igb.0.rx_processing_limit: 100
dev.igb.0.dmac: 0
dev.igb.0.eee_disabled: 0
dev.igb.0.link_irq: 33
dev.igb.0.dropped: 0
dev.igb.0.tx_dma_fail: 0
dev.igb.0.rx_overruns: 0
dev.igb.0.watchdog_timeouts: 0
dev.igb.0.device_control: 1209795137
dev.igb.0.rx_control: 71335938
dev.igb.0.interrupt_mask: 4
dev.igb.0.extended_int_mask: 2147483679
dev.igb.0.tx_buf_alloc: 0
dev.igb.0.rx_buf_alloc: 0
dev.igb.0.fc_high_water: 31328
dev.igb.0.fc_low_water: 31312
dev.igb.0.queue0.no_desc_avail: 0
dev.igb.0.queue0.tx_packets: 62464141
dev.igb.0.queue0.rx_packets: 73012939
dev.igb.0.queue0.rx_bytes: 22529663814
dev.igb.0.queue0.lro_queued: 0
dev.igb.0.queue0.lro_flushed: 0
dev.igb.0.queue1.no_desc_avail: 0
dev.igb.0.queue1.tx_packets: 404298046
dev.igb.0.queue1.rx_packets: 307675818
dev.igb.0.queue1.rx_bytes: 185919902229
dev.igb.0.queue1.lro_queued: 0
dev.igb.0.queue1.lro_flushed: 0
dev.igb.0.queue2.no_desc_avail: 0
dev.igb.0.queue2.tx_packets: 3441053015
dev.igb.0.queue2.rx_packets: 5511826751
dev.igb.0.queue2.rx_bytes: 3054219311510
dev.igb.0.queue2.lro_queued: 0
dev.igb.0.queue2.lro_flushed: 0
dev.igb.0.queue3.no_desc_avail: 0
dev.igb.0.queue3.tx_packets: 1047838830
dev.igb.0.queue3.rx_packets: 1987495318
dev.igb.0.queue3.rx_bytes: 2696179247028
dev.igb.0.queue3.lro_queued: 0
dev.igb.0.queue3.lro_flushed: 0
dev.igb.0.mac_stats.excess_coll: 0
dev.igb.0.mac_stats.single_coll: 0
dev.igb.0.mac_stats.multiple_coll: 0
dev.igb.0.mac_stats.late_coll: 0
dev.igb.0.mac_stats.collision_count: 0
dev.igb.0.mac_stats.symbol_errors: 0
dev.igb.0.mac_stats.sequence_errors: 0
dev.igb.0.mac_stats.defer_count: 283811
dev.igb.0.mac_stats.missed_packets: 9449
dev.igb.0.mac_stats.recv_no_buff: 340
dev.igb.0.mac_stats.recv_undersize: 0
dev.igb.0.mac_stats.recv_fragmented: 0
dev.igb.0.mac_stats.recv_oversize: 0
dev.igb.0.mac_stats.recv_jabber: 0
dev.igb.0.mac_stats.recv_errs: 0
dev.igb.0.mac_stats.crc_errs: 0
dev.igb.0.mac_stats.alignment_errs: 0
dev.igb.0.mac_stats.coll_ext_errs: 0
dev.igb.0.mac_stats.xon_recvd: 46255557
dev.igb.0.mac_stats.xon_txd: 261
dev.igb.0.mac_stats.xoff_recvd: 46255994
dev.igb.0.mac_stats.xoff_txd: 7027
dev.igb.0.mac_stats.total_pkts_recvd: 7975033582
dev.igb.0.mac_stats.good_pkts_recvd: 7880001465
dev.igb.0.mac_stats.bcast_pkts_recvd: 5783868
dev.igb.0.mac_stats.mcast_pkts_recvd: 563315
dev.igb.0.mac_stats.rx_frames_64: 28412906
dev.igb.0.mac_stats.rx_frames_65_127: 3310187919
dev.igb.0.mac_stats.rx_frames_128_255: 784920450
dev.igb.0.mac_stats.rx_frames_256_511: 17225962
dev.igb.0.mac_stats.rx_frames_512_1023: 73415350
dev.igb.0.mac_stats.rx_frames_1024_1522: 3665838878
dev.igb.0.mac_stats.good_octets_recvd: 5990356613544
dev.igb.0.mac_stats.good_octets_txd: 46326753008181
dev.igb.0.mac_stats.total_pkts_txd: 33016014138
dev.igb.0.mac_stats.good_pkts_txd: 33016006850
dev.igb.0.mac_stats.bcast_pkts_txd: 834
dev.igb.0.mac_stats.mcast_pkts_txd: 54331
dev.igb.0.mac_stats.tx_frames_64: 30741691
dev.igb.0.mac_stats.tx_frames_65_127: 2174824217
dev.igb.0.mac_stats.tx_frames_128_255: 139804927
dev.igb.0.mac_stats.tx_frames_256_511: 59190261
dev.igb.0.mac_stats.tx_frames_512_1023: 386886648
dev.igb.0.mac_stats.tx_frames_1024_1522: 30224559106
dev.igb.0.mac_stats.tso_txd: 2384636909
dev.igb.0.mac_stats.tso_ctx_fail: 0
dev.igb.0.interrupts.asserts: 4556119857
dev.igb.0.interrupts.rx_pkt_timer: 7879778770
dev.igb.0.interrupts.rx_abs_timer: 0
dev.igb.0.interrupts.tx_pkt_timer: 0
dev.igb.0.interrupts.tx_abs_timer: 0
dev.igb.0.interrupts.tx_queue_empty: 33015268817
dev.igb.0.interrupts.tx_queue_min_thresh: 7880001470
dev.igb.0.interrupts.rx_desc_min_thresh: 0
dev.igb.0.interrupts.rx_overrun: 0
dev.igb.0.host.breaker_tx_pkt: 0
dev.igb.0.host.host_tx_pkt_discard: 0
dev.igb.0.host.rx_pkt: 222702
dev.igb.0.host.breaker_rx_pkts: 0
dev.igb.0.host.breaker_rx_pkt_drop: 0
dev.igb.0.host.tx_good_pkt: 738033
dev.igb.0.host.breaker_tx_pkt_drop: 0
dev.igb.0.host.rx_good_bytes: 5990357073320
dev.igb.0.host.tx_good_bytes: 46326753008181
dev.igb.0.host.length_errors: 0
dev.igb.0.host.serdes_violation_pkt: 0
dev.igb.0.host.header_redir_missed: 0
dev.igb.0.wake: 0

hw.em.eee_setting: 1
hw.em.rx_process_limit: 100
hw.em.enable_msix: 1
hw.em.sbp: 0
hw.em.smart_pwr_down: 0
hw.em.txd: 1024
hw.em.rxd: 1024
hw.em.rx_abs_int_delay: 66
hw.em.tx_abs_int_delay: 66
hw.em.rx_int_delay: 0
hw.em.tx_int_delay: 66
hw.igb.rx_process_limit: 100
hw.igb.num_queues: 0
hw.igb.header_split: 0
hw.igb.buf_ring_size: 4096
hw.igb.max_interrupt_rate: 8000
hw.igb.enable_msix: 1
hw.igb.enable_aim: 1
hw.igb.txd: 1024
hw.igb.rxd: 1024

FreeBSD systemname.com 9.2-RELEASE-p10 FreeBSD 9.2-RELEASE-p10 #0 r270148M: Mon Aug 18 23:14:36 EDT 2014 root@peta108:/usr/obj/usr/src/sys/CUSTOM10  amd64

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
        ether 00:25:90:f2:2d:24
        inet6 fe80::225:90ff:fef2:2d24%em0 prefixlen 64 scopeid 0x2
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
        ether 00:25:90:f2:2d:24
        inet6 fe80::225:90ff:fef2:2d25%igb0 prefixlen 64 scopeid 0x4
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x7
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
        ether 00:25:90:f2:2d:24
        inet 192.168.0.108 netmask 0xffffff00 broadcast 192.168.0.255
        inet6 fe80::225:90ff:fef2:2d24%lagg0 prefixlen 64 scopeid 0x8
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
        laggproto lacp lagghash l2,l3,l4
        laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: em0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>

Thanks in advance!
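
For anyone who wants to check the arithmetic in the quoted erratum, here is a small Python sketch of its two conditions. It is only an illustration and is not taken from the em driver or from Intel's documentation: the function names are made up, the default header length of 54 bytes is the "standard Ethernet/IP/TCP headers" figure from the erratum, and treating both ends of the "54-118 bytes" range as inclusive is my assumption.

```python
PAGE = 4096    # the 4 KB boundary named in the erratum
WINDOW = 64    # bytes following the end of the headers


def max_first_buffer(max_read_request: int) -> int:
    """Largest first-buffer size allowed by the workaround: 3 * (max_read_request - 4)."""
    return 3 * (max_read_request - 4)


def boundary_after_headers(buf_start: int, header_len: int = 54) -> bool:
    """True if a 4 KB boundary lies within WINDOW bytes after the header bytes.

    Assumption: both ends of the window count, which matches the erratum's
    "54-118 bytes before a 4 KB boundary" for 54-byte headers.
    """
    header_end = buf_start + header_len
    to_boundary = -header_end % PAGE   # distance to the next boundary at or after header_end
    return to_boundary <= WINDOW


def at_risk(buf_start: int, buf_len: int, max_read_request: int,
            header_len: int = 54) -> bool:
    """Both erratum conditions hold for this (hypothetical) first buffer."""
    too_large = buf_len > max_first_buffer(max_read_request)
    return too_large and boundary_after_headers(buf_start, header_len)


if __name__ == "__main__":
    # Minimum max_read_request of 128 bytes gives the 372-byte limit from the erratum.
    print(max_first_buffer(128))
    # A buffer starting 100 bytes before a 4 KB boundary is inside the 54-118 range.
    print(at_risk(buf_start=PAGE - 100, buf_len=1448, max_read_request=128))
    # 128-byte-aligned buffers never fall in that range.
    print(any(boundary_after_headers(a) for a in range(0, 2 * PAGE, 128)))
```

The last check shows why the erratum suggests 128-byte alignment: the distance from an aligned start to any 4 KB boundary is a multiple of 128, so it can never land between 54 and 118.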