Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 26 Sep 2016 11:33:12 +0200
From:      Julien Charbon <julien.charbon@gmail.com>
To:        Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@FreeBSD.org, hiren panchasara <hiren@strugglingcoder.info>
Subject:   Re: 11.0 stuck on high network load
Message-ID:  <dc2798ff-2ace-81f7-a563-18ffa1ace990@gmail.com>
In-Reply-To: <20160925124626.GI2840@zxy.spb.ru>
References:  <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <f644cd52-4377-aa90-123a-3a2887972bbc@freebsd.org> <20160921195155.GW2840@zxy.spb.ru> <e4e0188c-b22b-29af-ed15-b650c3ec4553@gmail.com> <20160923200143.GG2840@zxy.spb.ru> <20160925124626.GI2840@zxy.spb.ru>

next in thread | previous in thread | raw e-mail | index | archive | help
This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--12H9K1a6tfXsjU4KKUelKedFmxPMXjVBt
Content-Type: multipart/mixed; boundary="oWtTp0oI5DUmUuqawNkH1QRl3r9fRm6UV";
 protected-headers="v1"
From: Julien Charbon <julien.charbon@gmail.com>
To: Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc: Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@FreeBSD.org,
 hiren panchasara <hiren@strugglingcoder.info>
Message-ID: <dc2798ff-2ace-81f7-a563-18ffa1ace990@gmail.com>
Subject: Re: 11.0 stuck on high network load
References: <20160916183053.GL9397@strugglingcoder.info>
 <20160916190330.GG2840@zxy.spb.ru>
 <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org>
 <20160919204328.GN2840@zxy.spb.ru>
 <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org>
 <20160920202633.GQ2840@zxy.spb.ru>
 <f644cd52-4377-aa90-123a-3a2887972bbc@freebsd.org>
 <20160921195155.GW2840@zxy.spb.ru>
 <e4e0188c-b22b-29af-ed15-b650c3ec4553@gmail.com>
 <20160923200143.GG2840@zxy.spb.ru> <20160925124626.GI2840@zxy.spb.ru>
In-Reply-To: <20160925124626.GI2840@zxy.spb.ru>

--oWtTp0oI5DUmUuqawNkH1QRl3r9fRm6UV
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable


 Hi Slawa,

On 9/25/16 2:46 PM, Slawa Olhovchenkov wrote:
> On Fri, Sep 23, 2016 at 11:01:43PM +0300, Slawa Olhovchenkov wrote:
>>> On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote:
>>>> On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
>>>>>  You can also use Dtrace and lockstat (especially with the lockstat=
 -s
>>>>> option):
>>>>>
>>>>> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
>>>>> https://www.freebsd.org/cgi/man.cgi?query=3Dlockstat&manpath=3DFree=
BSD+11.0-RELEASE
>>>>>
>>>>>  But I am less familiar with Dtrace/lockstat tools.
>>>>
>>>> I am still use old kernel and got lockdown again.
>>>> Try using lockstat (I am save more output), interesting may be next:=

>>>>
>>>> R/W writer spin on writer: 190019 events in 1.070 seconds (177571 ev=
ents/sec)
>>>>
>>>> --------------------------------------------------------------------=
-----------
>>>> Count indv cuml rcnt     nsec Lock                   Caller         =
        =20
>>>> 140839  74%  74% 0.00    24659 tcpinp                 tcp_tw_2msl_sc=
an+0xc6  =20
>>>>
>>>>       nsec ------ Time Distribution ------ count     Stack          =
        =20
>>>>       4096 |                               913       tcp_twstart+0xa=
3       =20
>>>>       8192 |@@@@@@@@@@@@                   58191     tcp_do_segment+=
0x201f  =20
>>>>      16384 |@@@@@@                         29594     tcp_input+0xe1c=
        =20
>>>>      32768 |@@@@                           23447     ip_input+0x15f =
        =20
>>>>      65536 |@@@                            16197    =20
>>>>     131072 |@                              8674     =20
>>>>     262144 |                               3358     =20
>>>>     524288 |                               456      =20
>>>>    1048576 |                               9        =20
>>>> --------------------------------------------------------------------=
-----------
>>>> Count indv cuml rcnt     nsec Lock                   Caller         =
        =20
>>>> 49180  26% 100% 0.00    15929 tcpinp                 tcp_tw_2msl_sca=
n+0xc6  =20
>>>>
>>>>       nsec ------ Time Distribution ------ count     Stack          =
        =20
>>>>       4096 |                               157       pfslowtimo+0x54=
        =20
>>>>       8192 |@@@@@@@@@@@@@@@                24796     softclock_call_=
cc+0x179=20
>>>>      16384 |@@@@@@                         11223     softclock+0x44 =
        =20
>>>>      32768 |@@@@                           7426      intr_event_exec=
ute_handlers+0x95
>>>>      65536 |@@                             3918     =20
>>>>     131072 |                               1363     =20
>>>>     262144 |                               278      =20
>>>>     524288 |                               19       =20
>>>> --------------------------------------------------------------------=
-----------
>>>
>>>  This is interesting, it seems that you have two call paths competing=

>>> for INP locks here:
>>>
>>>  - pfslowtimo()/tcp_tw_2msl_scan(reuse=3D0) and
>>>
>>>  - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=3D1)
>>
>> My current hypothesis:
>>
>> nginx do write() (or may be close()?) to socket, kernel lock
>> first inp in V_twq_2msl, happen callout for pfslowtimo() on the same
>> CPU core and tcp_tw_2msl_scan infinity locked on same inp.
>>
>> In this case you modification can't help, before next try we need some=

>> like yeld().
>=20
> Or may be locks leaks.
> Or both.

 You are totally right, pfslowtimo()/tcp_tw_2msl_scan(reuse=3D0) is
infinitely blocked on INP_WLOCK() by "something" (that could be related
to write()).

 As I reached my limit of debugging without WITNESS, could you share
your /etc/sysctl.conf, /boot/loader.conf files?  And any specific
configuration you have (like having a Nginx workers affinity, Nginx
special options, etc.).  Like that I can try to reproduce it on releng/11=
=2E0.

 Thanks.

--
Julien


--oWtTp0oI5DUmUuqawNkH1QRl3r9fRm6UV--

--12H9K1a6tfXsjU4KKUelKedFmxPMXjVBt
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - https://gpgtools.org

iQEcBAEBCgAGBQJX6OtnAAoJEKVlQ5Je6dhxlpsIAJ/qGR/MYHOe1EiiGqpVtL+s
Wduh6AxK3/14NBvdjmxEU5RwfKc9tsP+wobaNJ6VZBSr7qqWXZgYM+E76iFiFgKE
GuylXa4Rxb55F8W40pQko4YfYRkfrKc5GwthFtamDS3ASduASIUzDMY0Hh7EGoQD
QyofDSTn8UEQmw8e4K3ryZ/qEExRh29lCSZBVtQMynoNSPQZf4lRhkM+paZoLiZW
cfhw0xSuz9vzP4JbRW2UK0ZfLFzkjCBgUbcslRlc7Rf/969a39e3JxC5NTOnKejx
PN7hPQYvLoaVZu6p5JmAuUD9k3y9RvOxxl0xLE/WPmAuhpi8ROKbW7Q+TGVoLrs=
=xhuC
-----END PGP SIGNATURE-----

--12H9K1a6tfXsjU4KKUelKedFmxPMXjVBt--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?dc2798ff-2ace-81f7-a563-18ffa1ace990>