Date:      Fri, 7 Oct 2011 11:59:14 -0700
From:      Jason Wolfe <nitroboost@gmail.com>
To:        freebsd-net@freebsd.org
Subject:   Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
Message-ID:  <CAAAm0r2JH43Rct7UxQK2duH1p43Nepnj5mpb6bXo==DPayhJLg@mail.gmail.com>
In-Reply-To: <4E8F157A.40702@sentex.net>
References:  <CAAAm0r0RXEJo4UiKS=Ui0e5OQTg6sg-xcYf3mYB5%2Bvk8i8557w@mail.gmail.com> <4E8F157A.40702@sentex.net>

Mike,

I had a large pool of servers running 7.2.3 with MSI-X enabled during my
testing, but it didn't resolve the issue. I just pulled back the
sys/dev/e1000 directory from 8-STABLE and ran it on 8-RELEASE-p2, though, so
if changes made outside of the actual driver code helped, I may not have
seen the benefit. It's possible the lagg is adding some complication, but
when one of the interfaces wedges, the lagg continues to operate over the
other link (though half of the traffic simply fails). It appears the
interface just runs out of one of its buffers and cannot recover without a
bounce.
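
A minimal sketch of how the buffer theory could be checked, assuming the
txd_head/txd_tail and rxd_head/rxd_tail sysctls report the head and tail of
a conventional descriptor ring in which the entries from head up to tail are
owned by the NIC. The ring size of 1024 and the helper name ring_pending are
assumptions for illustration; the head/tail values are taken from the
dev.em.1 output quoted further down.

```python
def ring_pending(head: int, tail: int, ring_size: int) -> int:
    """Number of descriptors between head and tail, wrapping at ring_size."""
    return (tail - head) % ring_size


# Assumed ring size; the thread does not state the configured value.
RING_SIZE = 1024

# TX: head == tail means nothing is waiting to be transmitted.
tx_pending = ring_pending(754, 754, RING_SIZE)

# RX: tail one slot behind head means nearly every buffer is available.
rx_available = ring_pending(304, 303, RING_SIZE)

print(tx_pending, rx_available)  # prints: 0 1023
```

On a healthy interface TX pending stays small and RX available stays close
to the ring size. A TX count that only grows, or an RX count near zero,
would be consistent with the interface running out of buffers.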

I do recall coming across the ASPM threads, but my Supermicro boards didn't
have the option and many people claimed disabling it didn't resolve the
issue, so I didn't follow through. I'll do a bit more digging there, thanks.

Disabling MSI-X has without a doubt completely resolved my problem, though.
I was receiving about 30 failure reports a day from my servers while running
with it enabled; since disabling it I haven't received a single one in ~40
days.  The servers are currently running the 7.2.3 driver as well, so if
nothing jumps out from my original email I'm happy to re-enable it on a
handful of servers and collect some fresh reports.
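
When collecting fresh reports, two snapshots of the dev.em sysctl tree taken
a few seconds apart could be compared automatically. This is a rough sketch
only: the stall heuristic (work queued on the TX ring, with neither txd_head
nor tx_irq advancing) and the function names are assumptions made for
illustration, not the driver's own watchdog logic, and the sample values are
made up.

```python
import re

# Matches integer-valued dev.em lines, tolerating leading "> " quote marks.
LINE = re.compile(r"^(?:>\s*)*(dev\.em\.\d+\.[\w.%]+):\s*(-?\d+)\s*$")


def parse_sysctl(text):
    """Return {oid: int} for every integer-valued dev.em line in text."""
    values = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def tx_looks_stalled(prev, cur, unit=0, queue=0):
    """Heuristic: descriptors are queued, but head and tx_irq did not move."""
    base = f"dev.em.{unit}.queue{queue}."
    pending = cur[base + "txd_head"] != cur[base + "txd_tail"]
    head_stuck = cur[base + "txd_head"] == prev[base + "txd_head"]
    irq_stuck = cur[base + "tx_irq"] == prev[base + "tx_irq"]
    return pending and head_stuck and irq_stuck


# Made-up snapshots for illustration only.
before = parse_sysctl(
    "dev.em.0.queue0.txd_head: 100\n"
    "dev.em.0.queue0.txd_tail: 180\n"
    "dev.em.0.queue0.tx_irq: 5000\n"
)
after = parse_sysctl(
    "dev.em.0.queue0.txd_head: 100\n"
    "dev.em.0.queue0.txd_tail: 400\n"
    "dev.em.0.queue0.tx_irq: 5000\n"
)
print(tx_looks_stalled(before, after))  # prints: True
```

The parser skips string-valued entries such as %desc, so the full output of
sysctl dev.em can be fed in unfiltered.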

Jason

On Fri, Oct 7, 2011 at 8:06 AM, Mike Tancsa <mike@sentex.net> wrote:

> On 10/6/2011 7:15 PM, Jason Wolfe wrote:
> > I'm seeing the interface wedge on a good number of systems with Intel
> > 82574L chips under FBSD8.2 _only when MSI-X is enabled_, running either
> > 7.1.9 from 8.2-RELEASE or 7.2.3 from 8.2-STABLE.  I have em0 and em1 in a
> > lagg, but only one side would fail, and a few systems that didn't have a
> > lagg also saw the issue.  Higher traffic did seem to increase the
> > likelihood of it
> > dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.1.9
>
>
> Hi,
>        This sure sounds like the issue I was seeing with the 7.1.9
> driver... However, it has been fixed for me by going to 7.2.3, which is in
> RELENG_8.  Is it possible you have a couple of issues going on since you
> are using lagg as well?  Another problem some folks have reported is
> that in the BIOS, if you have an option for ASPM, make sure it's disabled.
>
> Google around for ASPM and 82574L for a discussion about it.
>
> If I recall correctly, disabling MSI-X just reduces the chance of the
> problem happening, but it's been a while since I ran into this issue.
>
> But for sure you want to be running 7.2.3 from stable.
>
> This server used to see this issue:
>
> dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
> dev.em.1.%driver: em
> dev.em.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PEX4.HART
> dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0x34ec class=0x020000
> dev.em.1.%parent: pci11
> dev.em.1.nvm: -1
> dev.em.1.debug: -1
> dev.em.1.rx_int_delay: 0
> dev.em.1.tx_int_delay: 66
> dev.em.1.rx_abs_int_delay: 66
> dev.em.1.tx_abs_int_delay: 66
> dev.em.1.rx_processing_limit: 100
> dev.em.1.flow_control: 3
> dev.em.1.eee_control: 0
> dev.em.1.link_irq: 0
> dev.em.1.mbuf_alloc_fail: 0
> dev.em.1.cluster_alloc_fail: 0
> dev.em.1.dropped: 0
> dev.em.1.tx_dma_fail: 0
> dev.em.1.rx_overruns: 0
> dev.em.1.watchdog_timeouts: 0
> dev.em.1.device_control: 1209008712
> dev.em.1.rx_control: 67141634
> dev.em.1.fc_high_water: 18432
> dev.em.1.fc_low_water: 16932
> dev.em.1.queue0.txd_head: 754
> dev.em.1.queue0.txd_tail: 754
> dev.em.1.queue0.tx_irq: 251430977
> dev.em.1.queue0.no_desc_avail: 0
> dev.em.1.queue0.rxd_head: 304
> dev.em.1.queue0.rxd_tail: 303
> dev.em.1.queue0.rx_irq: 295670362
> dev.em.1.mac_stats.excess_coll: 0
> dev.em.1.mac_stats.single_coll: 0
> dev.em.1.mac_stats.multiple_coll: 0
> dev.em.1.mac_stats.late_coll: 0
> dev.em.1.mac_stats.collision_count: 0
> dev.em.1.mac_stats.symbol_errors: 0
> dev.em.1.mac_stats.sequence_errors: 0
> dev.em.1.mac_stats.defer_count: 0
> dev.em.1.mac_stats.missed_packets: 0
> dev.em.1.mac_stats.recv_no_buff: 0
> dev.em.1.mac_stats.recv_undersize: 0
> dev.em.1.mac_stats.recv_fragmented: 0
> dev.em.1.mac_stats.recv_oversize: 0
> dev.em.1.mac_stats.recv_jabber: 0
> dev.em.1.mac_stats.recv_errs: 0
> dev.em.1.mac_stats.crc_errs: 0
> dev.em.1.mac_stats.alignment_errs: 0
> dev.em.1.mac_stats.coll_ext_errs: 0
> dev.em.1.mac_stats.xon_recvd: 0
> dev.em.1.mac_stats.xon_txd: 0
> dev.em.1.mac_stats.xoff_recvd: 0
> dev.em.1.mac_stats.xoff_txd: 0
> dev.em.1.mac_stats.total_pkts_recvd: 712410384
> dev.em.1.mac_stats.good_pkts_recvd: 712410384
> dev.em.1.mac_stats.bcast_pkts_recvd: 52263
> dev.em.1.mac_stats.mcast_pkts_recvd: 24921
> dev.em.1.mac_stats.rx_frames_64: 170050
> dev.em.1.mac_stats.rx_frames_65_127: 32571360
> dev.em.1.mac_stats.rx_frames_128_255: 19796510
> dev.em.1.mac_stats.rx_frames_256_511: 6283830
> dev.em.1.mac_stats.rx_frames_512_1023: 7922330
> dev.em.1.mac_stats.rx_frames_1024_1522: 645666304
> dev.em.1.mac_stats.good_octets_recvd: 988128549661
> dev.em.1.mac_stats.good_octets_txd: 48849605092
> dev.em.1.mac_stats.total_pkts_txd: 501680484
> dev.em.1.mac_stats.good_pkts_txd: 501680484
> dev.em.1.mac_stats.bcast_pkts_txd: 4266
> dev.em.1.mac_stats.mcast_pkts_txd: 8
> dev.em.1.mac_stats.tx_frames_64: 134256137
> dev.em.1.mac_stats.tx_frames_65_127: 291152180
> dev.em.1.mac_stats.tx_frames_128_255: 67219002
> dev.em.1.mac_stats.tx_frames_256_511: 5935140
> dev.em.1.mac_stats.tx_frames_512_1023: 812920
> dev.em.1.mac_stats.tx_frames_1024_1522: 2305105
> dev.em.1.mac_stats.tso_txd: 366978
> dev.em.1.mac_stats.tso_ctx_fail: 0
> dev.em.1.interrupts.asserts: 2
> dev.em.1.interrupts.rx_pkt_timer: 0
> dev.em.1.interrupts.rx_abs_timer: 0
> dev.em.1.interrupts.tx_pkt_timer: 0
> dev.em.1.interrupts.tx_abs_timer: 0
> dev.em.1.interrupts.tx_queue_empty: 0
> dev.em.1.interrupts.tx_queue_min_thresh: 0
> dev.em.1.interrupts.rx_desc_min_thresh: 0
> dev.em.1.interrupts.rx_overrun: 0
>
> interrupt                          total       rate
> irq4: uart0                        44896          0
> irq16: bge0                     19753077         32
> irq18: arcmsr0                  37518694         62
> irq19: twa0                       556664          0
> irq21: ehci0                     2149928          3
> irq23: ehci1                     1209435          2
> cpu0: timer                   1209274084       2000
> irq256: siis0                   65793731        108
> irq257: em0                    504313285        834
> irq258: em1:rx 0               295681170        489
> irq259: em1:tx 0               251430780        415
> irq261: ahci0                   71285304        117
> cpu1: timer                   1209264969       2000
> cpu3: timer                   1209266038       2000
> cpu2: timer                   1209265460       2000
> Total                         6086807515      10067
>
>    vendor     = 'Intel Corporation'
>    device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
>    class      = network
>    subclass   = ethernet
>    bar   [10] = type Memory, range 32, base 0xb4100000, size 131072, enabled
>    bar   [18] = type I/O Port, range 32, base 0x2000, size 32, enabled
>    bar   [1c] = type Memory, range 32, base 0xb4120000, size 16384, enabled
>    cap 01[c8] = powerspec 2  supports D0 D3  current D0
>    cap 05[d0] = MSI supports 1 message, 64 bit
>    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>    cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
> ecap 0003[140] = Serial 1 001517ffffed68a4
>
>
>        ---Mike
>
>
> --
> -------------------
> Mike Tancsa, tel +1 519 651 3400
> Sentex Communications, mike@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada   http://www.tancsa.com/
>


