Date:      Fri, 11 Dec 2015 13:12:55 +0300
From:      Alexander V. Chernikov <melifaro@ipfw.ru>
To:        Hans Petter Selasky <hps@selasky.org>, Adrian Chadd <adrian@freebsd.org>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject:   Re: Race between arptimer() and lle removal [WAS: panic in arptimer in r289937]
Message-ID:  <2850091449828775@web21o.yandex.ru>
In-Reply-To: <566A94A1.60400@selasky.org>
References:  null <CAJ-VmonvVyTNuYv_as41yPCFdfR5T3FE45DP9MKAc-eyzXzPUg@mail.gmail.com> <2739461446298483@web2h.yandex.ru> <566A94A1.60400@selasky.org>

11.12.2015, 12:15, "Hans Petter Selasky" <hps@selasky.org>:
> Hi,
>
> Pulling the nail out of the haystack hopefully.
>
> >> Any ideas on where next to look?
>
> Adrian: In your dump as well I see:
>
> la_flags = 1
>
> That means there was a race calling arptimer() and removing the "lle".
Yes. The interesting part here is why the lle is removed. There are several possible reasons: the interface address was deleted, the interface went down, or there was an explicit delete request.
That's why I asked Adrian what was happening with the interface (and haven't got a reply).
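
As a rough illustration of how la_flags = 1 reads, here is a small sketch. The two flag values are recalled from net/if_llatbl.h and are an assumption to be checked against the tree in question; lle_looks_unlinked() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/* Assumed values, recalled from net/if_llatbl.h; verify against your tree. */
#define LLE_DELETED 0x0001 /* entry must be deleted */
#define LLE_LINKED  0x0040 /* entry is linked to the lookup structure */

/*
 * Hypothetical helper: returns 1 when the flags describe an entry that is
 * marked deleted and is no longer linked into its table, 0 otherwise.
 */
int
lle_looks_unlinked(int la_flags)
{
	return ((la_flags & LLE_DELETED) != 0 && (la_flags & LLE_LINKED) == 0);
}
```

Under those assumed values, la_flags = 1 has only LLE_DELETED set, so the entry had already been unlinked by the time the timer ran.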
>
> Alexander: Can you comment on the following patch:
>
> > Index: netinet/if_ether.c
> > ===================================================================
> > --- netinet/if_ether.c (revision 291256)
> > +++ netinet/if_ether.c (working copy)
> > @@ -185,7 +185,13 @@
> > LLE_WUNLOCK(lle);
> > return;
> > }
> > - ifp = lle->lle_tbl->llt_ifp;
> > + if (lle->la_flags & LLE_LINKED) {
> > + ifp = lle->lle_tbl->llt_ifp;
> > + } else {
> > + /* XXX RACE entry has been freed */
> > + llentry_free(lle);
> > + return;
> > + }
> > CURVNET_SET(ifp->if_vnet);
> >
> > if ((lle->la_flags & LLE_DELETED) == 0) {
>
> We need a check in arptimer() that the lle is still linked before
> proceeding, from what I can see. Because the callback is not
> protected by a mutex, it is not atomically stopped by callout_stop().
Yes, I had exactly that approach in mind. (And nd6_llinfo_timer() needs the same fix).
So, would you commit it or should I?
>
> --HPS
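
For readers following the thread, the check being discussed can be sketched in userspace. This is only an analogy under stated assumptions, not the kernel code: struct entry, struct table, ENTRY_LINKED and the three functions are hypothetical names, a pthread mutex stands in for the lle lock, and the "timer" is called directly instead of from a callout.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ENTRY_LINKED 0x0040 /* illustrative stand-in for LLE_LINKED */

struct table {
	int ifindex;             /* stands in for the table's interface */
};

struct entry {
	pthread_mutex_t lock;
	int refcnt;
	int flags;
	struct table *tbl;       /* only valid while ENTRY_LINKED is set */
};

void
entry_init(struct entry *e, struct table *t)
{
	pthread_mutex_init(&e->lock, NULL);
	e->refcnt = 2;           /* one for the table, one for the pending timer */
	e->flags = ENTRY_LINKED;
	e->tbl = t;
}

/* Removal path: unlink the entry and drop the table's reference. */
void
entry_unlink(struct entry *e)
{
	pthread_mutex_lock(&e->lock);
	e->flags &= ~ENTRY_LINKED;
	e->tbl = NULL;
	e->refcnt--;
	pthread_mutex_unlock(&e->lock);
}

/*
 * Timer callback. Because stopping the timer is not atomic with respect to
 * a callback that is already running, the callback itself must re-check,
 * under the entry lock, that the entry is still linked before it touches
 * e->tbl. Returns the ifindex it used, or -1 if it lost the race.
 */
int
entry_timer(struct entry *e)
{
	int ifindex;

	pthread_mutex_lock(&e->lock);
	if ((e->flags & ENTRY_LINKED) == 0) {
		e->refcnt--;     /* drop the timer's reference and bail out */
		pthread_mutex_unlock(&e->lock);
		return (-1);
	}
	ifindex = e->tbl->ifindex;
	pthread_mutex_unlock(&e->lock);
	return (ifindex);
}
```

Calling entry_timer() after entry_unlink() takes the early-return branch instead of dereferencing a NULL table pointer, which is the same shape as the LLE_LINKED test in the patch above.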


