Date:      Thu, 15 Feb 2018 02:12:23 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-net@FreeBSD.org
Subject:   [Bug 221919] ixl: TX queue hang when using TSO and having a high and mixed network load
Message-ID:  <bug-221919-2472-T8qk53nF1e@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-221919-2472@https.bugs.freebsd.org/bugzilla/>
References:  <bug-221919-2472@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221919

Jason Tubnor <jason@tubnor.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason@tubnor.net

--- Comment #13 from Jason Tubnor <jason@tubnor.net> ---
I am also seeing this on our Lenovo SR650 7x06 servers.  We too are using
10GbE XL710 cards:

Intel(R) Ethernet Controller X710 for 10GbE SFP+

# pciconf -l | grep ixl
ixl0@pci0:10:0:0:       class=0x020000 card=0x402117aa chip=0x37d18086 rev=0x09 hdr=0x00
ixl1@pci0:10:0:1:       class=0x020000 card=0x402117aa chip=0x37d18086 rev=0x09 hdr=0x00
ixl2@pci0:10:0:2:       class=0x020000 card=0x402117aa chip=0x37d18086 rev=0x09 hdr=0x00
ixl3@pci0:10:0:3:       class=0x020000 card=0x402117aa chip=0x37d18086 rev=0x09 hdr=0x00
ixl4@pci0:174:0:0:      class=0x020000 card=0x000a8086 chip=0x15728086 rev=0x01 hdr=0x00
ixl5@pci0:174:0:1:      class=0x020000 card=0x00008086 chip=0x15728086 rev=0x01 hdr=0x00

snip from /var/log/messages:

Feb 15 09:50:53 server01 kernel: ixl5: Malicious Driver Detection event 2 on TX queue 769, pf number 1
Feb 15 09:50:53 server01 kernel: ixl5: MDD TX event is for this function!
Feb 15 09:50:54 server01 kernel: ixl5: WARNING: queue 0 appears to be hung!
Feb 15 09:50:54 server01 kernel: ixl5: WARNING: Resetting!
Feb 15 09:50:57 server01 kernel: WARNING: 192.168.1.14 (iqn.1998-01.com.vmware:HOST-00000000): no ping reply (NOP-Out) after 5 seconds; dropping connection
Feb 15 09:51:25 server01 kernel: ixl5: Malicious Driver Detection event 2 on TX queue 775, pf number 1
Feb 15 09:51:25 server01 kernel: ixl5: MDD TX event is for this function!
Feb 15 09:51:29 server01 kernel: WARNING: 192.168.1.14 (iqn.1998-01.com.vmware:HOST-00000000): no ping reply (NOP-Out) after 5 seconds; dropping connection
Feb 15 09:51:53 server01 kernel: ixl5: WARNING: queue 7 appears to be hung!
Feb 15 09:51:53 server01 kernel: ixl5: WARNING: Resetting!
Feb 15 09:51:55 server01 kernel: ixl5: Malicious Driver Detection event 2 on TX queue 768, pf number 1
Feb 15 09:51:55 server01 kernel: ixl5: MDD TX event is for this function!
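
To see how often this recurs on a given host, the relevant messages can be
tallied per interface.  The following is only a rough sketch: it assumes syslog
lines shaped like the excerpt above, and the function name
summarize_ixl_events is made up for illustration.

```sh
#!/bin/sh
# Count "Malicious Driver Detection" events and hung-queue warnings per
# ixl interface.  summarize_ixl_events is a hypothetical helper name.
# Reads the log files given as arguments (or stdin if none are given).
summarize_ixl_events() {
    awk '
    {
        # Find the "ixlN:" token that names the interface on this line.
        name = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /^ixl[0-9]+:$/) {
                name = substr($i, 1, length($i) - 1)
                break
            }
        if (name == "") next
    }
    /Malicious Driver Detection event/ { mdd[name]++;  seen[name] = 1 }
    /appears to be hung/               { hung[name]++; seen[name] = 1 }
    END {
        for (n in seen)
            printf "%s mdd=%d hung=%d\n", n, mdd[n], hung[n]
    }
    ' "$@" | sort
}
```

Usage would be along the lines of `summarize_ixl_events /var/log/messages`,
which prints one line per affected interface.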

This is easily reproduced when 10GbE VMware ESXi hosts are connected to
these storage servers via iSCSI.  We could trigger it by performing a
vMotion move from one datastore to another.

I do not have a test server on which to try patches: three of these servers
are in production running 11.1-RELEASE, and we cannot afford to take them
offline or deviate from the standard supported freebsd-update mechanism.

I hope something can be worked out soon and rolled into an update, as for us
this issue can't wait for 11.2 or 12.

I will be trying out -tso, but was trying to avoid that for performance
reasons.
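
For anyone wanting to try the same workaround, a minimal sketch follows.  It
assumes the affected port is ixl5 (the interface named in the log above) and
that the commands are run as root.  The -tso flag is the stock ifconfig(8)
option; a change made this way does not persist across reboots unless the flag
is also added to that interface's ifconfig_ixl5 line in /etc/rc.conf.

```sh
# Disable TCP segmentation offload on the affected port (assumed: ixl5).
ifconfig ixl5 -tso

# Check the result: TSO4/TSO6 should no longer be listed in the options line.
ifconfig ixl5 | grep options
```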

Thanks!

-- 
You are receiving this mail because:
You are on the CC list for the bug.


