Date:      Wed, 26 Oct 2011 14:18:14 -0700
From:      Jason Wolfe <nitroboost@gmail.com>
To:        Hooman Fazaeli <hoomanfazaeli@gmail.com>
Cc:        freebsd-net@freebsd.org, Hooman Fazaeli <fazaeli@sepehrs.com>
Subject:   Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
Message-ID:  <CAAAm0r0BmUzO84QyjjgK=csG2c-OtU6XavGuB0txBMghJ9grew@mail.gmail.com>
In-Reply-To: <4EA82610.6090207@gmail.com>
References:  <CAAAm0r0RXEJo4UiKS=Ui0e5OQTg6sg-xcYf3mYB5%2Bvk8i8557w@mail.gmail.com> <4E8F157A.40702@sentex.net> <CAAAm0r2JH43Rct7UxQK2duH1p43Nepnj5mpb6bXo==DPayhJLg@mail.gmail.com> <4E8F51D4.1060509@sentex.net> <CACqU3MVwLaepFymZJkaVk6p=SpykGhqs=VYFjLh9fP9S=AxDhg@mail.gmail.com> <CAAAm0r1DKvoL9=Ket9up=4%2B5xiCzTTZJK99FhF9jcCA28B0M%2BA@mail.gmail.com> <CAAAm0r3XdsMHZh%2BP_NF-txZasdExzwZ8ymmGQgGhJQds0fOiBQ@mail.gmail.com> <CAAAm0r1iS3z-7CBJ=xYDf%2BJOA1Q2nU0O54Twbyb7FjvgWHjKVw@mail.gmail.com> <4EA7E203.3020306@sepehrs.com> <CAAAm0r3Nr2t8cCetPkFnLQ-3KwqHw_0SpqbtvYPRUkSP=9n8CA@mail.gmail.com> <4EA82610.6090207@gmail.com>

Hooman,

I've added ifconfig (as well as sysctl hw.em) to my collection script, which
will run at the time of the next wedge. The ifconfig lines in my email were
just to show the before and after, as I've also modified the running options
on emX in this test group as you proposed.

Thank you for the details.  I know the team has fixed many bugs related to
this chip in the past year or so and most folks are running great, so I'm
hoping we can squash this seemingly last and rare one related to interface
hangs.  If not a bug and rather something about my implementation, I can at
least serve as a cautionary tale :)

I've also added the device_printf you proposed; as of today the test pool is
running with that as well as the changes mentioned above.

Thanks,
Jason

On Wed, Oct 26, 2011 at 8:24 AM, Hooman Fazaeli <hoomanfazaeli@gmail.com> wrote:

>
> Dear Jason
>
> I was actually interested in the "status:" and "flags=" lines from the
> ifconfig output _at_ the time the problem happens.
>
> From your previous post, it is clear that when the driver stops
> transmitting, output errors are zero and "Drops" is increasing. This happens
> when the stack uses IFQ_ENQUEUE to put a packet in the interface output
> queue and the queue is full. There are a few possible reasons for the
> interface queue being full:
>
> - if_start has not been called regularly by the stack. This is unlikely in
>  your case since it would affect all drivers, not just em.
>
> - if_em has not been fast enough to IFQ_DRV_DEQUEUE the packets. This is
>  also unlikely because you would then see occasional drops, not a total
>  TX hang.
>
> The other reasons may be found by looking at the em_start() function:
>
> - The driver may be OACTIVE, that is, it is busy sending packets. This may
>  not be your case because 'sysctl dev.em.1.debug=1' shows that the driver
>  is INACTIVE.
>
> - The number of available TX descriptors falls below EM_MAX_SCATTER (64)
>  and the driver never recovers. This possibility is also ruled out because
>  the em debug output shows that 1014 descriptors are available.
>
> - The link may not be active (or at least, the PHY may believe so).
>  In this case, IFQ_DRV_DEQUEUE is never called and the interface queue
>  becomes full. This was the reason I asked you for the ifconfig output.
>  You may add the following line to the end of the em_print_debug_info
>  function to report the link status when the problem happens:
>
>        device_printf(dev, "Link state: %s\n",
>                adapter->link_active ? "active" : "inactive");
>
>
>
>
> On 10/26/2011 3:03 PM, Jason Wolfe wrote:
>
>> Hooman,
>>
>> I have run with dev.em.X.flow_control=0, which should have the same result
>> as hw.em.fc_setting=0, and net.inet.tcp.tso is also 0.  I'm not sure the
>> remaining options would be able to produce the scenario I'm seeing, but
>> I'm open to giving it a try with no options on the interfaces.  I've also
>> added ifconfig output to the collection.
>>
>>
>> options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
>> ifconfig emX -rxcsum -txcsum -vlanhwtag -tso -wol
>> options=88<VLAN_MTU,VLAN_HWCSUM>
>>
>> It's always TX, but these servers push ~12x what they receive, so I'm
>> guessing it could happen to either buffer given the right traffic
>> patterns.  While looking through commits I also found a patch to add a
>> couple of sysctls for em, which I'm adding -
>> http://freshbsd.org/commit/freebsd/r223676
>>
>> Thanks,
>> Jason
>>
>> On Wed, Oct 26, 2011 at 3:33 AM, Hooman Fazaeli <fazaeli@sepehrs.com> wrote:
>>
>>> Hi Jason
>>>
>>> Have you tried:
>>>
>>> hw.em.fc_setting="0" (in loader.conf)
>>> ifconfig emX -tso -lro -rxcsum -txcsum -vlanhwtag -wol
>>>
>>> with MSIX and no multiqueue.
>>>
>>> Advanced features have always been a source of problems.
>>> It is worth a try and would help to narrow down the possibilities.
>>>
>>> It would also be helpful if you provide 'ifconfig' output
>>> when the problem happens.
>>>
>>> And a question: does the interface RX also hang, or is it just TX?
>>>
>> _______________________________________________
>>
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>
>


