Date:      Mon, 10 Oct 2016 17:44:21 +0200
From:      Julien Charbon <jch@freebsd.org>
To:        Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@FreeBSD.org, hiren panchasara <hiren@strugglingcoder.info>
Subject:   Re: 11.0 stuck on high network load
Message-ID:  <52d634aa-639c-bef7-1f10-c46dbadc4d85@freebsd.org>
In-Reply-To: <20161010142941.GV54003@zxy.spb.ru>
References:  <20160925124626.GI2840@zxy.spb.ru> <dc2798ff-2ace-81f7-a563-18ffa1ace990@gmail.com> <20160926172159.GA54003@zxy.spb.ru> <62453d9c-b1e4-1129-70ff-654dacea37f9@gmail.com> <20160928115909.GC54003@zxy.spb.ru> <a0425aad-a421-05bc-c1a8-c6fe06b83833@freebsd.org> <20161006111043.GH54003@zxy.spb.ru> <1431484c-c00e-24c5-bd76-714be8ae5ed5@freebsd.org> <20161010133220.GU54003@zxy.spb.ru> <23f1200e-383e-befb-b76d-c88b3e1287b0@freebsd.org> <20161010142941.GV54003@zxy.spb.ru>


 Hi,

On 10/10/16 4:29 PM, Slawa Olhovchenkov wrote:
> On Mon, Oct 10, 2016 at 04:03:39PM +0200, Julien Charbon wrote:
>> On 10/10/16 3:32 PM, Slawa Olhovchenkov wrote:
>>> On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote:
>>>> On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote:
>>>>> On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:
>>>>>
>>>>>> 2. thread1:  In tcp_close() the inp is marked with INP_DROPPED flag, the
>>>>>> process continues and calls INP_WUNLOCK() here:
>>>>>>
>>>>>> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568
>>>>>
>>>>> Look also to sys/netinet/tcp_timewait.c:488
>>>>>
>>>>> And check other locks from r160549
>>>>
>>>>  You are right, and here is a fix proposal for this issue:
>>>>
>>>> Fix a double-free when an inp transitions to INP_TIMEWAIT state after
>>>> having been dropped
>>>> https://reviews.freebsd.org/D8211
>>>>
>>>>  It basically enforces in_pcbdrop() logic in tcp_input():  An INP_DROPPED
>>>> inpcb should never be processed further.
>>>>
>>>>  Slawa, as you are the only one able to reproduce this issue currently,
>>>> could you test this patch?  (And remove the temporary patch I provided to
>>>> you before).
>>>>
>>>>  I will wait for your tests results before pushing further.
>>>>
>>>>  Thanks!
>>>>
>>>> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c
>>>> index c72f01f..37f27e0 100644
>>>> --- a/sys/netinet/tcp_input.c
>>>> +++ b/sys/netinet/tcp_input.c
>>>> @@ -921,6 +921,16 @@ findpcb:
>>>>                 goto dropwithreset;
>>>>         }
>>>>         INP_WLOCK_ASSERT(inp);
>>>> +       /*
>>>> +        * While waiting for inp lock during the lookup, another thread
>>>> +        * can have dropped the inpcb, in which case we need to loop back
>>>> +        * and try to find a new inpcb to deliver to.
>>>> +        */
>>>> +       if (inp->inp_flags & INP_DROPPED) {
>>>> +               INP_WUNLOCK(inp);
>>>> +               inp = NULL;
>>>> +               goto findpcb;
>>>
>>> Are you sure about this goto?
>>> Can this cause infinite loop by found same inpcb?
>>> May be drop packet is more correct?
>>
>>  Good question:  Infinite loop is not possible here, as the next TCP
>> hash lookup will return NULL or a fresh new and not dropped inp.  You
>
> I am not expert in this api and don't see cause of this: I am assume
> hash lookup don't remove from hash returned args and I am don't see
> any removing of this inp. Why hash lookup don't return same inp?
>
> (assume this input patch interrupt callout code on the same CPU core).
>
>> can check the current other usages of goto findpcb in tcp_input().  The
>> rationale here being:
>>
>>  - Behavior before the patch:  If the inp we found was deleted then goto
>> findpcb.
>>  - Behavior after the patch:  If the inp we found was deleted or dropped
>> then goto findpcb.
>>
>>  I just prefer having the same behavior applied everywhere:  If
>> tcp_input() loses the inp lock race and the inp was deleted or dropped
>> then retry to find a new inpcb to deliver to.
>>
>>  But you are right, dropping the packet here will also fix the issue.
>>
>>  Then the review process becomes quite helpful because people can argue:
>>  Dropping here is better because "blah", or goto findpcb is better
>> because "bluh", etc.  And at the review end you have a nice final patch.
>>
>> https://reviews.freebsd.org/D8211
>
> I am not sure, I am see to
>
> sys/netinet/in_pcb.h:#define    INP_DROPPED             0x04000000 /* protocol drop flag */
>
> and think this is a flag 'all packets must be dropped'

 Hm, I believe this flag means "this inp has been dropped by the TCP
stack, so don't use it anymore".  Actually this flag is better described
in the function that sets it:

"(INP_DROPPED) is used by TCP to mark an inpcb as unused and avoid
future packet delivery or event notification when a socket remains open
but TCP has closed."

https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1320

/*
 * in_pcbdrop() removes an inpcb from hashed lists, releasing its address and
 * port reservation, and preventing it from being returned by inpcb lookups.
 *
 * It is used by TCP to mark an inpcb as unused and avoid future packet
 * delivery or event notification when a socket remains open but TCP has
 * closed.  This might occur as a result of a shutdown()-initiated TCP close
 * or a RST on the wire, and allows the port binding to be reused while still
 * maintaining the invariant that so_pcb always points to a valid inpcb until
 * in_pcbdetach().
 *
 */
void
in_pcbdrop(struct inpcb *inp)
{
  inp->inp_flags |= INP_DROPPED;
  ...
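
 To make the effect on lookups concrete, here is a minimal user-space
sketch, not the kernel code.  The names toy_inp, toy_bucket, toy_insert(),
toy_pcbdrop() and toy_lookup() are hypothetical, and locking and port
release are left out entirely.  It assumes only what the comment above
states:  a dropped inpcb is flagged and unlinked from the hashed list, so
a later lookup cannot return it.

```c
#include <assert.h>
#include <stddef.h>

/* Same bit value as the INP_DROPPED define quoted in this thread. */
#define TOY_INP_DROPPED 0x04000000

/* Hypothetical, heavily simplified stand-in for struct inpcb. */
struct toy_inp {
	int		 flags;
	unsigned short	 lport;
	struct toy_inp	*hash_next;	/* singly linked bucket chain */
};

/* A single hash bucket is enough for the illustration. */
struct toy_bucket {
	struct toy_inp	*head;
};

static void
toy_insert(struct toy_bucket *b, struct toy_inp *inp)
{
	inp->hash_next = b->head;
	b->head = inp;
}

/* Models in_pcbdrop(): set the flag and unlink from the hashed list. */
static void
toy_pcbdrop(struct toy_bucket *b, struct toy_inp *inp)
{
	struct toy_inp **pp;

	inp->flags |= TOY_INP_DROPPED;
	for (pp = &b->head; *pp != NULL; pp = &(*pp)->hash_next) {
		if (*pp == inp) {
			*pp = inp->hash_next;
			inp->hash_next = NULL;
			break;
		}
	}
}

/* Models the hash lookup: only inps still on the chain can be returned. */
static struct toy_inp *
toy_lookup(const struct toy_bucket *b, unsigned short lport)
{
	struct toy_inp *inp;

	for (inp = b->head; inp != NULL; inp = inp->hash_next) {
		/* Invariant of this model: dropped inps are never hashed. */
		assert((inp->flags & TOY_INP_DROPPED) == 0);
		if (inp->lport == lport)
			return (inp);
	}
	return (NULL);
}
```

 With one inp inserted on port 80, toy_lookup() returns it; after
toy_pcbdrop() the same lookup returns NULL.  In this model, at least, a
retried lookup cannot hand back the inp that was just dropped.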

 The classic example where "goto findpcb" is useful:  You receive a
new connection request with a TCP SYN packet, and this packet is unlucky
and reaches an inp that is being dropped:

 - with the "goto findpcb" approach, the next lookup will most likely find
the LISTEN inp and start the TCP handshake as usual
 - with the "drop the packet" approach, the TCP client will need to
re-transmit a TCP SYN packet

 Just because a packet was unlucky once does not mean it deserves to be
dropped. :)
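
 The two approaches can be compared with a small user-space model.  This
is only a sketch under stated assumptions, not tcp_input() itself:
toy_table, toy_lookup(), toy_pcbdrop(), toy_input_syn() and the
race_victim parameter are all hypothetical, and the lost lock race is
simulated by dropping the inp right after the first lookup returns it.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_INP_DROPPED 0x04000000

enum toy_policy { TOY_RETRY_LOOKUP, TOY_DROP_PACKET };

/* Hypothetical, heavily simplified stand-in for struct inpcb. */
struct toy_inp {
	int	flags;
	bool	listening;
	bool	hashed;		/* still reachable by lookups */
};

struct toy_table {
	struct toy_inp	*conn;		/* exact match, may be NULL */
	struct toy_inp	*listener;	/* LISTEN match, may be NULL */
};

/* Exact match first, then the listener; unhashed inps are never returned. */
static struct toy_inp *
toy_lookup(const struct toy_table *t)
{
	if (t->conn != NULL && t->conn->hashed)
		return (t->conn);
	if (t->listener != NULL && t->listener->hashed)
		return (t->listener);
	return (NULL);
}

/* Models the other thread dropping the inp: flag it and unhash it. */
static void
toy_pcbdrop(struct toy_inp *inp)
{
	inp->flags |= TOY_INP_DROPPED;
	inp->hashed = false;
}

/*
 * Returns the inp that receives the SYN, or NULL if the segment is
 * discarded.  If race_victim is non-NULL and the lookup returns it, it is
 * dropped before we "hold its lock", which simulates losing the race.
 * The loop terminates only because toy_pcbdrop() also unhashes the inp.
 */
static struct toy_inp *
toy_input_syn(const struct toy_table *t, enum toy_policy policy,
    struct toy_inp *race_victim)
{
	struct toy_inp *inp;

	for (;;) {
		inp = toy_lookup(t);
		if (inp == NULL)
			return (NULL);
		if (inp == race_victim) {
			toy_pcbdrop(race_victim);
			race_victim = NULL;
		}
		if ((inp->flags & TOY_INP_DROPPED) == 0)
			return (inp);
		if (policy == TOY_DROP_PACKET)
			return (NULL);
		/* TOY_RETRY_LOOKUP: the equivalent of "goto findpcb". */
	}
}
```

 With a connection inp and a listener both hashed, and the connection
inp as race victim, TOY_RETRY_LOOKUP delivers the SYN to the listener on
the second pass, while TOY_DROP_PACKET returns NULL and leaves the client
to retransmit.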

--
Julien


