From owner-freebsd-net@freebsd.org Thu Jun 30 08:41:44 2016 Return-Path: Delivered-To: freebsd-net@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2FDC2B879E2 for ; Thu, 30 Jun 2016 08:41:44 +0000 (UTC) (envelope-from mmacy@nextbsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 17E752262 for ; Thu, 30 Jun 2016 08:41:44 +0000 (UTC) (envelope-from mmacy@nextbsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id 13D13B879E0; Thu, 30 Jun 2016 08:41:44 +0000 (UTC) Delivered-To: net@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 13272B879DE; Thu, 30 Jun 2016 08:41:44 +0000 (UTC) (envelope-from mmacy@nextbsd.org) Received: from sender163-mail.zoho.com (sender163-mail.zoho.com [74.201.84.163]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7CFE6225F; Thu, 30 Jun 2016 08:41:43 +0000 (UTC) (envelope-from mmacy@nextbsd.org) Received: from mail.zoho.com by mx.zohomail.com with SMTP id 1467276095001550.584576301839; Thu, 30 Jun 2016 01:41:35 -0700 (PDT) Date: Thu, 30 Jun 2016 01:41:34 -0700 From: Matthew Macy To: "Matthew Macy" Cc: "Julien Charbon" , "Randall Stewart" , "Hans Petter Selasky" , "current@freebsd.org" , "freebsd-net@freebsd.org" Message-ID: <155a0786ddb.11b26639c42755.8788446779628644237@nextbsd.org> In-Reply-To: <1559ad03918.b0d7215a52810.5433014980746638496@nextbsd.org> References: <20160617045319.GE1076@FreeBSD.org> <1f28844b-b4ea-b544-3892-811f2be327b9@freebsd.org> <20160620073917.GI1076@FreeBSD.org> <1d18d0e2-3e42-cb26-928c-2989d0751884@freebsd.org> <20160620095822.GJ1076@FreeBSD.org> <74bb31b7-a9f5-3d0c-eea0-681872e6f09b@freebsd.org> <18D94615-810E-4E79-A889-4B0CC70F9E45@netflix.com> <6E52CA6A-2153-4EF9-A3E1-97CB0D07EB28@freebsd.org> <15598235139.12175f84421756.2471769249719458878@nextbsd.org> <1559ad03918.b0d7215a52810.5433014980746638496@nextbsd.org> Subject: Re: EBR fix for life cycle races was Re: panic with tcp timers MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Priority: Medium User-Agent: Zoho Mail X-Mailer: Zoho Mail X-ZohoMail: Z_57973067 SPT_1 Z_57973066 SPT_1 SLF_D X-Zoho-Virus-Status: 2 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Jun 2016 08:41:44 -0000 ---- On Tue, 28 Jun 2016 23:19:45 -0700 Matthew Macy wrote ---- > > > > ---- On Tue, 28 Jun 2016 15:51:57 -0700 K. Macy wrote ---- > > On Tue, Jun 28, 2016 at 10:51 AM, Matthew Macy wrote: > > > You guys should really look at Samy Bahra's epoch based reclamation. I solved a similar problem in drm/linuxkpi using it. > > > > The point being that this is a bug in the TCP life cycle handling > > _not_ in callouts. Churning the callout interface is not the best / > > only solution. > > -M > > Please see see D7017/D7018 as an example for an ultimately more robust solution than continued fiddling with the callout interface. > > https://reviews.freebsd.org/D7018 I realized that this shortens the race but still leaves one open from the time the callout lock is dropped to the time the epoch begins. I have a proposed fix to make read locks never block and to essentially close the race window. The next issue that comes up is that synchronize is called too often. I'll talk to Samy about it in a few hours and come up with a better design. -M > > > > ---- On Tue, 28 Jun 2016 02:58:56 -0700 Julien Charbon wrote ---- > > > > > > > > Hi Randall, > > > > > > > > On 6/25/16 4:41 PM, Randall Stewart via freebsd-net wrote: > > > > > Ok > > > > > > > > > > Lets try this again with my source changed to my @freebsd.net :-) > > > > > > > > > > Now I am also attaching a patch for you Gleb, this will take some poking to > > > > > get in to your NF-head since it incorporates some changes we made earlier. > > > > > > > > > > I think this will fix the problem.. i.e. dealing with two locks in the callout system (which it was > > > > > never meant to have done).. > > > > > > > > > > Note we probably can move the code to use the callout lock init now.. but lets see if this works > > > > > on your setup on c096 and if so we can think about doing that. > > > > > > > > Thanks for proposing a patch. I believe your patch will work with > > > > callout lock init, but not without: You still have a use-after-free > > > > issue on the tcpcb without callout lock init. > > > > > > > > The case being subtle as usual, let me try to describe that could happen: > > > > > > > > With your patch we have: > > > > > > > > void > > > > tcp_timer_keep(void *xtp) > > > > { > > > > struct tcpcb *tp = xtp; > > > > struct tcptemp *t_template; > > > > struct inpcb *inp; > > > > CURVNET_SET(tp->t_vnet); > > > > #ifdef TCPDEBUG > > > > int ostate; > > > > > > > > ostate = tp->t_state; > > > > #endif > > > > inp = tp->t_inpcb; > > > > KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, > > > > tp)); > > > > INP_WLOCK(inp); > > > > if (callout_pending(&tp->t_timers->tt_keep) ### Use after free > > > > of tp here > > > > !callout_active(&tp->t_timers->tt_keep)) { > > > > INP_WUNLOCK(inp); > > > > CURVNET_RESTORE(); > > > > return; > > > > } > > > > ... > > > > > > > > The use-after-free scenario: > > > > > > > > [CPU 1] the callout fires, tcp_timer_keep entered > > > > [CPU 1] blocks on INP_WLOCK(inp); > > > > [CPU 2] schedules tcp_timer_keep with callout_reset() > > > > [CPU 2] tcp_discardcb called > > > > [CPU 2] tcp_timer_keep callout successfully canceled > > > > [CPU 2] tcpcb freed > > > > [CPU 1] unblocks, the tcpcb is used > > > > > > > > Then the tcpcb will used just after being freed... Might also crash or > > > > not depending in the case. > > > > > > > > Extra notes: > > > > > > > > o The invariant I see here is: The "callout successfully canceled" > > > > step should never happen when "the callout is currently being executed". > > > > > > > > o Solutions I see to enforce this invariant: > > > > > > > > - First solution: Use callout lock init with inp lock, your patch > > > > seems to permit that now. > > > > > > > > - Second solution: Change callout_async_drain() behavior: It can > > > > return 0 (fail) when the callout is currently being executed (no matter > > > > what). > > > > > > > > - Third solution: Don't trust callout_async_drain(callout) return > > > > value of 1 (success) if the previous call of callout_reset(callout) > > > > returned 0 (fail). That was the exact purpose of r284261 change, but > > > > this solution is also step backward in modernization of TCP > > > > timers/callout... > > > > > > > > https://svnweb.freebsd.org/base/stable/10/sys/netinet/tcp_timer.c?r1=284261&r2=284260&pathrev=284261 > > > > > > > > Hopefully my description is clear enough... > > > > > > > > -- > > > > Julien > > > > > > > > > > > > > > _______________________________________________ > > > freebsd-current@freebsd.org mailing list > > > https://lists.freebsd.org/mailman/listinfo/freebsd-current > > > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" > > _______________________________________________ > > freebsd-current@freebsd.org mailing list > > https://lists.freebsd.org/mailman/listinfo/freebsd-current > > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" > > > > _______________________________________________ > freebsd-current@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" >