Date:      Thu, 28 Jan 2016 19:06:50 +0100
From:      Harry Schmalzbauer <freebsd@omnilan.de>
To:        Marius Strobl <marius@freebsd.org>
Cc:        FreeBSD Stable <freebsd-stable@freebsd.org>
Subject:   em(4) watchdog timeout redemption [Was: Re: svn commit: r294958 - in stable/10: share/man/man4 sys/dev/e1000 sys/dev/ixgb sys/dev/netmap]
Message-ID:  <56AA58BA.8070404@omnilan.de>
In-Reply-To: <201601272231.u0RMV8LW019394@repo.freebsd.org>
References:  <201601272231.u0RMV8LW019394@repo.freebsd.org>

Regarding Marius Strobl's message of 27.01.2016 23:31 (localtime):
> Author: marius
> Date: Wed Jan 27 22:31:08 2016
> New Revision: 294958
> URL: https://svnweb.freebsd.org/changeset/base/294958
>
> Log:
>   Sync the e1000 drivers with what's in head as of r294327, modulo parts
>   that don't apply to stable/10 (driver API, if_inc_counter(), RSS changes
>   etc.) and modulo r287465 (which reportedly breaks igb(4)), i. e. assorted
>   fixes and improvements only:
>
>   o MFC r267385 (partial):
>     - Don't compare bus_dma map pointers for static DMA allocations against
>       NULL to determine if bus_dmamap_unload() or bus_dmamem_free() should be
>       called. Instead, check the associated bus and virtual addresses.
>     - Don't clear static DMA maps to NULL.
>   o MFC r284933:
>     Delete the refernce to VLAN handling being disabled by default. This is
>     no longer the case. [1]
>   o MFC r285639:
>     Add an adapter CORE lock in the DDB hook em_dump_queue to avoid WITNESS
>     panic in em_init_locked() while debugging.
>   o MFC r285879:
>     - Remove unused txd_saved.
>     - Intialize txd_upper, txd_lower and txd_used at declaration.
>   o MFC r286162:
>     Free mbufs when busdma loading fails.
>   o MFC r286829:
>     Add capability to disable CRC stripping as it breaks IPMI/BMC capabilities
>     on certain adatpers. [2]
>   o MFC r286831: [3]
>     - Increase EM_MAX_SCATTER to 64 such that the size of em_xmit()::
>       segs[EM_MAX_SCATTER] doesn't get overrun by things like NFS that can
>       and do shove more than 32 segs when being used with em(4) and TSO4.
>     - Update tso handling code in em_xmit() with update from jhb@
>     - Set if_hw_tsomax, if_hw_tsomaxsegcount and if_hw_tsomaxsegsize to
>       appropriate values.
>     - Define a TSO workaround "magic" number of 4 that is used to avoid an
>       alignment issue in hardware.
>     - Change a couple of integer values that were used as booleans to actual
>       bool types.
>     - Ensure that em_enable_intr() enables the appropriate mask of interrupts
>       and not just a hardcoded define of values.
>   o MFC r286832:
>     e1000/if_lem.c bump to 1.1.0
>   o MFC r286833:
>     Bump all copywrite dates to 2015.
>   o MFC r287112:
>     Style/whitespace cleanup in shared/common code.
>   o MFC r293331:
>     - Switch em(4) to the extended RX descriptor format.
>     - Split rxbuffer and txbuffer apart to support the new RX descriptor
>       format structures. Move rxbuffer manipulation to em_setup_rxdesc() to
>       unify the new behavior changes.
>     - Add a RSSKEYLEN macro for help in generating the RSSKEY data structures
>       in the card.
>     - Change em_receive_checksum() to process the new rxdescriptor format
>       status bit.
>   o MFC r293332:
>     Disable the reuse of checksum offload context descriptors in the case
>     of multiple queues in em(4). Document errata in the code.
>   o MFC r293854:
>     Given that em(4), lem(4) and igb(4) hardware doesn't require the
>     alignment guarantees provided by m_defrag(9), use m_collapse(9)
>     instead for performance reasons.
>     While at it, sanitize the statistics softc members, i. e. retire
>     unused ones and add SYSCTL nodes missing for actually used ones.
>
>   PR:	118693 [1], 161277 [2], 195078 [3], 199174 [3], 200221 [3]

Thanks, especially to sbruno@.
I'd like to confirm that r294958 fixes multiple em(4) problems I observed up
to r294156, especially with EM_MULTIQUEUE support on Hartwell (82574). I
haven't filed a bug report since I haven't had time to analyze them, but PRs
199174 and 200221 seem to match well.

Glad to see 10.3 will ship with em(4) able to sustain GbE with one NFS
transfer (111.3 MiB/s) while keeping latency low for additional
(low-throughput) connections, without unrecoverable watchdog timeouts
anymore (adding a 2nd queue to em(4) reduces latency from 10ms to ~3ms on
new sockets).
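
(Side note on the throughput figure: converting 111.3 MiB/s to megabits per
second shows how close it is to the 1000 Mbit/s line rate of GbE. A minimal
sketch in Python; the function name is made up for illustration, and it
assumes the figure is payload throughput in binary mebibytes.)

```python
def mib_per_s_to_mbit_per_s(mib_per_s: float) -> float:
    """Convert MiB/s (2**20 bytes per second) to Mbit/s (10**6 bits per second)."""
    bytes_per_s = mib_per_s * 1024 * 1024
    return bytes_per_s * 8 / 1_000_000

# The NFS transfer rate quoted above.
rate = mib_per_s_to_mbit_per_s(111.3)
print(f"{rate:.1f} Mbit/s")  # roughly 934 Mbit/s
```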

For the record, this kind of watchdog timeout with unsuccessful
interface resets is fixed for me:
em0: Watchdog timeout Queue[0]-- resetting
Interface is RUNNING and ACTIVE
em0: TX Queue 0 ------
em0: hw tdh = 210, hw tdt = 674
em0: Tx Queue Status = -2147483648
em0: TX descriptors avail = 3632
em0: Tx Descriptors avail failure = 0
em0: RX Queue 0 ------
em0: hw rdh = 896, hw rdt = 895
em0: RX discarded packets = 0
em0: RX Next to Check = 896
em0: RX Next to Refresh = 895
em0: TX Queue 1 ------
em0: hw tdh = 575, hw tdt = 716
em0: Tx Queue Status = -2147483648
em0: TX descriptors avail = 3937
em0: Tx Descriptors avail failure = 0
em0: RX Queue 1 ------
em0: hw rdh = 192, hw rdt = 191
em0: RX discarded packets = 0
em0: RX Next to Check = 192
em0: RX Next to Refresh = 191
em0: link state changed to DOWN
em0: link state changed to UP
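
(Side note on reading the dump above: a "Tx Queue Status" of -2147483648 is
what a 32-bit value with only its top bit set looks like when printed as a
signed decimal, so it presumably is a flag rather than a count. A minimal
sketch in Python; the helper name as_u32_hex is made up for illustration, and
that the driver prints this field with a signed format is an assumption drawn
from the output, not checked against the source.)

```python
def as_u32_hex(value: int) -> str:
    """Reinterpret a signed 32-bit integer as unsigned and format it as hex."""
    return f"0x{value & 0xFFFFFFFF:08x}"

# The "Tx Queue Status" value reported for both TX queues in the dump.
print(as_u32_hex(-2147483648))  # 0x80000000
```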

-Harry



