Date:      Mon, 10 Oct 2016 16:03:39 +0200
From:      Julien Charbon <jch@freebsd.org>
To:        Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@FreeBSD.org, hiren panchasara <hiren@strugglingcoder.info>
Subject:   Re: 11.0 stuck on high network load
Message-ID:  <23f1200e-383e-befb-b76d-c88b3e1287b0@freebsd.org>
In-Reply-To: <20161010133220.GU54003@zxy.spb.ru>
References:  <e4e0188c-b22b-29af-ed15-b650c3ec4553@gmail.com> <20160923200143.GG2840@zxy.spb.ru> <20160925124626.GI2840@zxy.spb.ru> <dc2798ff-2ace-81f7-a563-18ffa1ace990@gmail.com> <20160926172159.GA54003@zxy.spb.ru> <62453d9c-b1e4-1129-70ff-654dacea37f9@gmail.com> <20160928115909.GC54003@zxy.spb.ru> <a0425aad-a421-05bc-c1a8-c6fe06b83833@freebsd.org> <20161006111043.GH54003@zxy.spb.ru> <1431484c-c00e-24c5-bd76-714be8ae5ed5@freebsd.org> <20161010133220.GU54003@zxy.spb.ru>


 Hi Slawa,

On 10/10/16 3:32 PM, Slawa Olhovchenkov wrote:
> On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote:
>> On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote:
>>> On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:
>>>
>>>> 2. thread1:  In tcp_close() the inp is marked with INP_DROPPED flag, the
>>>> process continues and calls INP_WUNLOCK() here:
>>>>
>>>> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568
>>>
>>> Look also to sys/netinet/tcp_timewait.c:488
>>>
>>> And check other locks from r160549
>>
>>  You are right, and here is a fix proposal for this issue:
>>
>> Fix a double-free when an inp transitions to INP_TIMEWAIT state after
>> having been dropped
>> https://reviews.freebsd.org/D8211
>>
>>  It basically enforces in_pcbdrop() logic in tcp_input():  An INP_DROPPED
>> inpcb should never be processed further.
>>
>>  Slawa, as you are the only one to reproduce this issue currently, could
>> you test this patch?  (And remove the temporary patch I provided to you
>> before.)
>>
>>  I will wait for your tests results before pushing further.
>>
>>  Thanks!
>>
>> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c
>> index c72f01f..37f27e0 100644
>> --- a/sys/netinet/tcp_input.c
>> +++ b/sys/netinet/tcp_input.c
>> @@ -921,6 +921,16 @@ findpcb:
>>                 goto dropwithreset;
>>         }
>>         INP_WLOCK_ASSERT(inp);
>> +       /*
>> +        * While waiting for inp lock during the lookup, another thread
>> +        * can have dropped the inpcb, in which case we need to loop back
>> +        * and try to find a new inpcb to deliver to.
>> +        */
>> +       if (inp->inp_flags & INP_DROPPED) {
>> +               INP_WUNLOCK(inp);
>> +               inp = NULL;
>> +               goto findpcb;
>
> Are you sure about this goto?
> Could this cause an infinite loop by finding the same inpcb again?
> Maybe dropping the packet is more correct?

 Good question:  An infinite loop is not possible here, because the next
TCP hash lookup will return either NULL or a fresh inp that has not been
dropped.  You can check the other existing uses of goto findpcb in
tcp_input().  The rationale here is:

 - Behavior before the patch:  If the inp we found was deleted then goto
findpcb.
 - Behavior after the patch:  If the inp we found was deleted or dropped
then goto findpcb.

 I just prefer having the same behavior applied everywhere:  If
tcp_input() loses the inp lock race and the inp was deleted or dropped,
then it retries the lookup to find a new inpcb to deliver to.

 But you are right:  dropping the packet here would also fix the issue.

 This is where the review process becomes quite helpful, because people
can argue:  Dropping here is better because "blah", or goto findpcb is
better because "bluh", etc.  And at the end of the review you have a nice
final patch.
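
 To compare the two options discussed above (retry the lookup versus
drop the packet), here is a minimal user-space sketch.  It is not kernel
code:  every toy_* name is hypothetical, the "hash table" is a single
slot, and the lock race is simulated by dropping the pcb right after the
first lookup.  It assumes that the drop path both sets the dropped flag
and unlinks the pcb from the lookup table, which is the property the
"no infinite loop" argument relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_DROPPED 0x1         /* stands in for INP_DROPPED */

struct toy_pcb {
    int  flags;
    bool in_hash;               /* still reachable through the lookup? */
};

/* Hypothetical one-slot "hash table". */
static struct toy_pcb *toy_slot;

static struct toy_pcb *
toy_lookup(void)
{
    return (toy_slot != NULL && toy_slot->in_hash) ? toy_slot : NULL;
}

/* Assumed drop behavior: mark the pcb dropped and unlink it. */
static void
toy_drop(struct toy_pcb *p)
{
    p->flags |= TOY_DROPPED;
    p->in_hash = false;
}

enum toy_result { TOY_DELIVERED, TOY_NO_PCB, TOY_PACKET_DROPPED };

/*
 * retry:        true  -> loop back to the lookup (goto findpcb policy)
 *               false -> give up on the packet (drop policy)
 * replacement:  pcb installed by the "other thread" after the drop,
 *               or NULL if nothing replaces the dropped pcb
 * lookups:      out parameter, number of lookups performed
 */
static enum toy_result
toy_input(bool retry, struct toy_pcb *replacement, int *lookups)
{
    struct toy_pcb *p;
    bool raced = false;

    assert(lookups != NULL);
    *lookups = 0;
findpcb:
    p = toy_lookup();
    (*lookups)++;
    if (p == NULL)
        return TOY_NO_PCB;
    if (!raced) {
        /* Simulate losing the lock race once: the pcb gets dropped. */
        toy_drop(p);
        toy_slot = replacement;
        raced = true;
    }
    if (p->flags & TOY_DROPPED) {
        if (!retry)
            return TOY_PACKET_DROPPED;
        p = NULL;
        goto findpcb;
    }
    return TOY_DELIVERED;
}
```

 In this model the retry policy cannot find the same pcb twice, because
the dropped pcb is no longer reachable through toy_lookup():  the second
lookup returns either NULL or the replacement.  In a real kernel each
retry could lose another race, but each extra iteration would require a
different pcb to have been dropped.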

https://reviews.freebsd.org/D8211

--
Julien

