Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 21 Sep 2016 23:25:18 +0200
From:      Julien Charbon <julien.charbon@gmail.com>
To:        Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@FreeBSD.org, hiren panchasara <hiren@strugglingcoder.info>
Subject:   Re: 11.0 stuck on high network load
Message-ID:  <e4e0188c-b22b-29af-ed15-b650c3ec4553@gmail.com>
In-Reply-To: <20160921195155.GW2840@zxy.spb.ru>
References:  <20160915085938.GN38409@kib.kiev.ua> <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru> <20160916183053.GL9397@strugglingcoder.info> <20160916190330.GG2840@zxy.spb.ru> <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org> <20160919204328.GN2840@zxy.spb.ru> <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org> <20160920202633.GQ2840@zxy.spb.ru> <f644cd52-4377-aa90-123a-3a2887972bbc@freebsd.org> <20160921195155.GW2840@zxy.spb.ru>

next in thread | previous in thread | raw e-mail | index | archive | help
This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--PMIcB2RuhcdqOnqGRO71Tkne4QgWf9AS6
Content-Type: multipart/mixed; boundary="31rbt2q54Ws8RPnw0ruU6la6KWx342Bne";
 protected-headers="v1"
From: Julien Charbon <julien.charbon@gmail.com>
To: Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc: Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@FreeBSD.org,
 hiren panchasara <hiren@strugglingcoder.info>
Message-ID: <e4e0188c-b22b-29af-ed15-b650c3ec4553@gmail.com>
Subject: Re: 11.0 stuck on high network load
References: <20160915085938.GN38409@kib.kiev.ua>
 <20160915090633.GS2840@zxy.spb.ru> <20160916181839.GC2960@zxy.spb.ru>
 <20160916183053.GL9397@strugglingcoder.info>
 <20160916190330.GG2840@zxy.spb.ru>
 <78cbcdc9-f565-1046-c157-2ddd8fcccc62@freebsd.org>
 <20160919204328.GN2840@zxy.spb.ru>
 <8ba75d6e-4f01-895e-0aed-53c6c6692cb9@freebsd.org>
 <20160920202633.GQ2840@zxy.spb.ru>
 <f644cd52-4377-aa90-123a-3a2887972bbc@freebsd.org>
 <20160921195155.GW2840@zxy.spb.ru>
In-Reply-To: <20160921195155.GW2840@zxy.spb.ru>

--31rbt2q54Ws8RPnw0ruU6la6KWx342Bne
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable


 Hi Slawa,

On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote:
> On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
>>  You can also use Dtrace and lockstat (especially with the lockstat -s=

>> option):
>>
>> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
>> https://www.freebsd.org/cgi/man.cgi?query=3Dlockstat&manpath=3DFreeBSD=
+11.0-RELEASE
>>
>>  But I am less familiar with Dtrace/lockstat tools.
>=20
> I am still use old kernel and got lockdown again.
> Try using lockstat (I am save more output), interesting may be next:
>=20
> R/W writer spin on writer: 190019 events in 1.070 seconds (177571 event=
s/sec)
>=20
> -----------------------------------------------------------------------=
--------
> Count indv cuml rcnt     nsec Lock                   Caller            =
     =20
> 140839  74%  74% 0.00    24659 tcpinp                 tcp_tw_2msl_scan+=
0xc6  =20
>=20
>       nsec ------ Time Distribution ------ count     Stack             =
     =20
>       4096 |                               913       tcp_twstart+0xa3  =
     =20
>       8192 |@@@@@@@@@@@@                   58191     tcp_do_segment+0x2=
01f  =20
>      16384 |@@@@@@                         29594     tcp_input+0xe1c   =
     =20
>      32768 |@@@@                           23447     ip_input+0x15f    =
     =20
>      65536 |@@@                            16197    =20
>     131072 |@                              8674     =20
>     262144 |                               3358     =20
>     524288 |                               456      =20
>    1048576 |                               9        =20
> -----------------------------------------------------------------------=
--------
> Count indv cuml rcnt     nsec Lock                   Caller            =
     =20
> 49180  26% 100% 0.00    15929 tcpinp                 tcp_tw_2msl_scan+0=
xc6  =20
>=20
>       nsec ------ Time Distribution ------ count     Stack             =
     =20
>       4096 |                               157       pfslowtimo+0x54   =
     =20
>       8192 |@@@@@@@@@@@@@@@                24796     softclock_call_cc+=
0x179=20
>      16384 |@@@@@@                         11223     softclock+0x44    =
     =20
>      32768 |@@@@                           7426      intr_event_execute=
_handlers+0x95
>      65536 |@@                             3918     =20
>     131072 |                               1363     =20
>     262144 |                               278      =20
>     524288 |                               19       =20
> -----------------------------------------------------------------------=
--------

 This is interesting, it seems that you have two call paths competing
for INP locks here:

 - pfslowtimo()/tcp_tw_2msl_scan(reuse=3D0) and

 - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=3D1)

 These paths can indeed compete for the same INP lock, as both
tcp_tw_2msl_scan() calls always start with the first inp found in
twq_2msl list.  But in both cases, this first inp should be quickly used
and its lock released anyway, thus that could explain your situation it
that the TCP stack is doing that all the time, for example:

 - Let say that you are running out completely and constantly of tcptw,
and then all connections transitioning to TIME_WAIT state are competing
with the TIME_WAIT timeout scan that tries to free all the expired
tcptw.  If the stack is doing that all the time, it can appear like
"live" locked.

 This is just an hypothesis and as usual might be a red herring.
Anyway, could you run:

$ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock'

 Ideally, once when everything is ok, and once when you have the issue
to see the differences (if any).

 If it appears your are quite low in tcptw, and if you have enough
memory, could you try increase the tcptw limit using sysctl
net.inet.tcp.maxtcptw?  And actually see if it improve (or not) your
performance.

 My 2 cents.

--
Julien


--31rbt2q54Ws8RPnw0ruU6la6KWx342Bne--

--PMIcB2RuhcdqOnqGRO71Tkne4QgWf9AS6
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - https://gpgtools.org

iQEcBAEBCgAGBQJX4vrEAAoJEKVlQ5Je6dhxDDAH/1ZVvUOHAOrR83zbYHmPgaGZ
NgNAj8jsXSf8Q37bnl/m4QwF25Dg9srpEv8VXmgKFO7MlpoIvNcf9I/exILbAJTO
TnHB48CKL82SvXtVdb7DBmRC51fOZZJCZ5FQwjAuJSz9GOxbrxrARjWXH+jrNEx6
GjA/4CfcnaM7Uq5Y7jsAnxjYyBDdBdaTPqc03asUySKtdEEPHa9bonK/YTsHElik
W8V4OIsarjeLrF6RLaxW9oYTTIWLsDEG8Hvw/M/PEFvxggHYNXfOx78L24PW/twF
cXLoh0+U1UOUR4dob+fprdRARbDcTm53VS16DSSn/0IZ4jLcbPeKt0Ng/UgJh0g=
=aRb8
-----END PGP SIGNATURE-----

--PMIcB2RuhcdqOnqGRO71Tkne4QgWf9AS6--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?e4e0188c-b22b-29af-ed15-b650c3ec4553>