Date:      Fri, 7 Oct 2011 15:55:49 -0400
From:      Arnaud Lacombe <lacombar@gmail.com>
To:        Mike Tancsa <mike@sentex.net>
Cc:        freebsd-net@freebsd.org, Jason Wolfe <nitroboost@gmail.com>
Subject:   Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
Message-ID:  <CACqU3MVwLaepFymZJkaVk6p=SpykGhqs=VYFjLh9fP9S=AxDhg@mail.gmail.com>
In-Reply-To: <4E8F51D4.1060509@sentex.net>
References:  <CAAAm0r0RXEJo4UiKS=Ui0e5OQTg6sg-xcYf3mYB5%2Bvk8i8557w@mail.gmail.com> <4E8F157A.40702@sentex.net> <CAAAm0r2JH43Rct7UxQK2duH1p43Nepnj5mpb6bXo==DPayhJLg@mail.gmail.com> <4E8F51D4.1060509@sentex.net>

Hi,

On Fri, Oct 7, 2011 at 3:24 PM, Mike Tancsa <mike@sentex.net> wrote:
> On 10/7/2011 2:59 PM, Jason Wolfe wrote:
>> Mike,
>>
>> I had a large pool of servers running 7.2.3 with MSI-X enabled during my
>> testing, but it didn't resolve the issue. I just pulled back the
>> sys/dev/e1000 directory from 8-STABLE and ran it on 8-RELEASE-p2 though, so
>> if there were changes made outside of the actual driver code that helped I
>> may have not seen the benefit. It's possible the lagg is adding some
>> complication, but when one of the interfaces wedges the lagg continues to
>> operate over the other link (though half of the traffic simply fails). It
>> appears the interface just runs out of one of its buffers, and is helpless
>> to resolve it without a bounce.
>>
>> I do recall coming across the ASPM threads, but my Supermicro boards didn't
>> have the option and many people claimed it didn't resolve it, so I didn't
>> follow through. I'll do a bit more digging there, thanks.
>>
>> Disabling MSI-X has without a doubt completely resolved my problem though. I
>> would receive about 30 reports/failures a day from my servers when I was
>> running with it; since disabling it I haven't received a single one in ~40
>> days. The servers are currently running with the 7.2.3 driver also, so if
>> nothing jumps out from my original email I'm happy to re-enable it on a
>> handful of servers and collect some fresh reports.
>
> Hi Jason,
>         This sounds like a real drag :(  You certainly have WAY more servers to
> sample from than I do/did (a couple). The problem on my boxes was not
> very frequent to start with, so it would take a while. But the symptoms
> were very similar in that I would see queue overruns in the stats when
> things were wedged. I have other em NICs (non-82574) that get the odd
> overrun when they are busy, but they seem to recover from the situation
> just fine. The 82574 did not.
>
> When you disable MSI-X, do you mean via hw.pci.enable_msix=0 across the
> board, or do you disable multi-queue for the NIC, so it uses just one
> interrupt rather than separate ones for xmit and recv?
>
em(4)'s multiqueue is misleading. By default, with MSI-X enabled,
before (AFAIK) April 2010 it used 2 RX+TX queue pairs + 1 link vector,
i.e. 5 MSI-X vectors[0]. After April 2010, it uses 1 RX+TX queue pair
+ 1, i.e. 3 MSI-X vectors. There is no logic in the driver to use a
single vector with MSI-X enabled.
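For reference, the global switch Mike mentioned is a loader tunable; a minimal sketch of disabling it and checking the result (the interrupt names shown are illustrative and may differ by version):

```shell
# /boot/loader.conf -- disable MSI-X allocation system-wide
hw.pci.enable_msix="0"

# After a reboot, the per-NIC vector count can be inspected with:
#   vmstat -i | grep em0
# With MSI-X active, an 82574 shows several vectors; with plain MSI
# or INTx, a single "irqNNN: em0" line.
```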

As a side note, the only gain of EM_MULTIQUEUE now is to allow the
driver to use the buf_ring(9) lockless queue API instead of the locked
ifq. Today, em(4) should waste about 16k of memory in the
!EM_MULTIQUEUE case: this is the memory, 4096 * sizeof(void *),
allocated for the buf_ring(9) structure even though it is not used
without EM_MULTIQUEUE.

> Also, what is the purpose of
> hw.pci.do_power_nodriver=3 vs 0 (3 means put absolutely everything
> in D3 state.)
>
> net.link.ifqmaxlen 1024 vs 50 (does anything else need to be adjusted if
> this value is increased?)
>
He might as well try to enable EM_MULTIQUEUE.

> hw.em.rxd="2048"
> hw.em.txd="2048"
>
As is starting to be well known here, I am not a fan of bumping a
limit to hide a bug. So I'd rather lower this to 512 or 256 and hope
it triggers the issue more often, so that it can be diagnosed and
fixed for good.
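A sketch of what that suggestion looks like as loader tunables (the 512 value is only the illustrative lower bound proposed above):

```shell
# /boot/loader.conf -- shrink the descriptor rings instead of enlarging
# them, to make the wedge reproduce faster for diagnosis
hw.em.rxd="512"
hw.em.txd="512"
```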

 - Arnaud

[0]: actually it depends on a field in the chip's NVM, which can be up
to 4 (0-based accounting, which translates to 5 vectors), but
happened to be 2 (3 vectors) in the 82574s I've had access to. Last
time I checked, this setting could not be seen with the standard NVM
dump sysctl, which limits the output's size. On those chips, the
pre-April-2010 code would fall back on MSI even if 3 vectors were
available.

> Have you tried leaving these two at the default on 7.2.3 ?
> if_em.h implies 1024 for each.
>
>         ---Mike
>
>
>
>
> --
> -------------------
> Mike Tancsa, tel +1 519 651 3400
> Sentex Communications, mike@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada   http://www.tancsa.com/
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>


