Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 11 Dec 2015 13:12:31 +0100
From:      Hans Petter Selasky <hps@selasky.org>
To:        "Alexander V. Chernikov" <melifaro@ipfw.ru>, Adrian Chadd <adrian@freebsd.org>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Randall Stewart <rrs@netflix.com>, Gleb Smirnoff <glebius@FreeBSD.org>
Subject:   Re: Race between arptimer() and lle removal [WAS: panic in arptimer in r289937]
Message-ID:  <566ABDAF.7060208@selasky.org>
In-Reply-To: <566AB081.8050100@selasky.org>
References:  null <CAJ-VmonvVyTNuYv_as41yPCFdfR5T3FE45DP9MKAc-eyzXzPUg@mail.gmail.com> <2739461446298483@web2h.yandex.ru> <566A94A1.60400@selasky.org> <2850091449828775@web21o.yandex.ru> <566AB081.8050100@selasky.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On 12/11/15 12:16, Hans Petter Selasky wrote:
> On 12/11/15 11:12, Alexander V. Chernikov wrote:
>> 11.12.2015, 12:15, "Hans Petter Selasky" <hps@selasky.org>:
>>> Hi,
>>>
>>> Pulling the nail out of the haystack hopefully.
>>>
>>>>>   Any ideas on where next to look?
>>>
>>> Adrian: In your dump aswell I see:
>>>
>>> la_flags = 1
>>>
>>> That means there was a race calling arptimer() and removing the "lle".
>> Yes. The interesting part here is why lle is removed. There are quite
>> a few reasons: either interface address deleted or interface going
>> down, or explicit delete request.
>> That's why I asked Adrian about interface stuff (and haven't got a
>> reply).
>>>
>>> Alexander: Can you comment on the following patch:
>>>
>>>   > Index: netinet/if_ether.c
>>>   > ===================================================================
>>>   > --- netinet/if_ether.c (revision 291256)
>>>   > +++ netinet/if_ether.c (working copy)
>>>   > @@ -185,7 +185,13 @@
>>>   > LLE_WUNLOCK(lle);
>>>   > return;
>>>   > }
>>>   > - ifp = lle->lle_tbl->llt_ifp;
>>>   > + if (lle->la_flags & LLE_LINKED) {
>>>   > + ifp = lle->lle_tbl->llt_ifp;
>>>   > + } else {
>>>   > + /* XXX RACE entry has been freed */
>>>   > + llentry_free(lle);
>>>   > + return;
>>>   > + }
>>>   > CURVNET_SET(ifp->if_vnet);
>>>   >
>>>   > if ((lle->la_flags & LLE_DELETED) == 0) {
>>>
>>> We need a check in arptimer() that the lle is still linked before
>> Yes, I had exactly that approach in mind. (And nd6_llinfo_timer()
>> needs the same fix).
>> So, would you commit it or should I?
>>> proceeding, in there from what I can see. Because the callback is not
>>> protected by a mutex, it is not atomically stopped by callout_stop().
>
> Hi,
>
> Talking to Randall offlist, I see some more trouble. Let's get
> everything straight before making a fix. There is one more race I see:
>
> The start of the arptimer() callback looks like this:
>
>  > static void
>  > arptimer(void *arg)
>  > {
> POINT0
>  >         LLE_WLOCK(lle);
>  >         if (callout_pending(&lle->lle_timer)) {
> POINT1
>  >                 LLE_WUNLOCK(lle);
>  >                 return;
>  >         }
>
> The code starting the callback looks like this:
>
>  >                 LLE_ADDREF(la);
>  >                 la->la_expire = time_uptime;
>  >                 canceled = callout_reset(&la->lle_timer, hz *
> V_arpt_down,
>  >                     arptimer, la);
>  >                 if (canceled)
>  >                         LLE_REMREF(la);
>
> Which can be written like this:
>
>  > la->la_expire = time_uptime;
>  > canceled = callout_reset(&la->lle_timer, hz * V_arpt_down,
>  >     arptimer, la);
>  > if (canceled == 0)
>  >     LLE_ADDREF(la);
>
> In case we are at POINT0 in arptimer, callout_reset() will not be able
> to cancel the callout and will return 0. We should also drop one ref at
> POINT1, or rewrite the code a bit, which might need Randall's help in
> the callout subsystem area.
>

Hi,

Looking at the version history, I see Gleb Smirnoff and Randall, heavily 
involved with these code paths. Gleb and Randall, do you have any 
comments on the above findings?

Gleb+Randall: Do you agree or disagree there is a race as pointed out above?

Ways forward:

1) Revert r278472 (done by Randall) and fix r238990 (done by Gleb) to 
use the new callout_async_drain() instead of callout_stop() to avoid 
using the rm lock after free.

2) Use callout_stop() before callout_reset() and add a reference when 
the callout was not pending nor completing. We need to use 
callout_stop() in this case because callout_reset() only has two return 
values and cannot be used to distinguish the three callout states in use.

3) Modify callout_reset() to have three return values and fix the 
arptimer code to not leak references.

--HPS



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?566ABDAF.7060208>