From: Hans Petter Selasky <hps@selasky.org>
To: Matthew Macy, freebsd-current@freebsd.org, freebsd-net@freebsd.org, Gleb Smirnoff
Subject: Re: callout_drain either broken or man page needs updating
Date: Fri, 15 Jul 2016 06:25:41 +0200

On 07/15/16 05:45, Matthew Macy wrote:
> glebius last commit needs some further re-work.

Hi,

Glebius' commit needs to be backed out, at least the API change that alters the return value of callout_stop() when the callout is both scheduled and being serviced. The reason is simply that there is code out there, as Matthew and others have discovered, which is "refcounting" on callout_reset() and expects a subsequent callout_stop() to return 1 to "unref".

If you consider this impossible, maybe a fourth return value is needed for CANCELLED and DRAINING.
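Roughly, the pattern in question looks like the following. This is only a hypothetical sketch with made-up foo_* names, not code from any particular driver, and it assumes the callout was initialized with callout_init_mtx():

> #include <sys/param.h>
> #include <sys/systm.h>
> #include <sys/kernel.h>
> #include <sys/lock.h>
> #include <sys/mutex.h>
> #include <sys/callout.h>
>
> struct foo_softc {
>         struct mtx      fs_mtx;
>         struct callout  fs_callout;
>         u_int           fs_refs;        /* one reference per pending callout */
> };
>
> static void
> foo_attach(struct foo_softc *sc)
> {
>         mtx_init(&sc->fs_mtx, "foo", NULL, MTX_DEF);
>         callout_init_mtx(&sc->fs_callout, &sc->fs_mtx, 0);
>         sc->fs_refs = 0;
> }
>
> static void
> foo_timeout(void *arg)
> {
>         struct foo_softc *sc = arg;
>
>         /* fs_mtx is held here because of callout_init_mtx(). */
>         /* ... do the periodic work ... */
>         sc->fs_refs--;          /* the scheduled handler has run */
> }
>
> static void
> foo_arm(struct foo_softc *sc)
> {
>         mtx_assert(&sc->fs_mtx, MA_OWNED);
>         /* A return value of 0 means no pending callout was rescheduled. */
>         if (callout_reset(&sc->fs_callout, hz, foo_timeout, sc) == 0)
>                 sc->fs_refs++;
> }
>
> static void
> foo_disarm(struct foo_softc *sc)
> {
>         mtx_assert(&sc->fs_mtx, MA_OWNED);
>         /*
>          * The "unref" relies on callout_stop() returning 1 exactly
>          * when a scheduled handler was removed and will never run.
>          * If that return value changes for a callout which is
>          * scheduled while being serviced, the reference taken in
>          * foo_arm() leaks.
>          */
>         if (callout_stop(&sc->fs_callout) == 1)
>                 sc->fs_refs--;
> }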
Further, getting the callouts straight in the TCP stack is a matter of doing the locking correctly, which some have called "my magic bullet", and not of the return values. In the following revision I've proposed adding a new callout API that accepts a locking function, so that the callout code can run its cancelled checks at the right place in situations where more than one lock is needed:

https://svnweb.freebsd.org/changeset/base/302768

Consider this case:

> void
> tcp_timer_2msl(void *xtp)
> {
>         struct tcpcb *tp = xtp;
>         struct inpcb *inp;
>         CURVNET_SET(tp->t_vnet);
> #ifdef TCPDEBUG
>         int ostate;
>
>         ostate = tp->t_state;
> #endif
>         INP_INFO_RLOCK(&V_tcbinfo);
>         inp = tp->t_inpcb;
>         KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, tp));
>         INP_WLOCK(inp);
>         tcp_free_sackholes(tp);
>         if (callout_pending(&tp->t_timers->tt_2msl) ||
>             !callout_active(&tp->t_timers->tt_2msl)) {

Here we have a custom in-house race check that affects neither the return value of callout_reset() nor that of callout_stop().

>                 INP_WUNLOCK(tp->t_inpcb);
>                 INP_INFO_RUNLOCK(&V_tcbinfo);
>                 CURVNET_RESTORE();
>                 return;

I propose the following solution:

> static void
> tcp_timer_2msl_lock(void *xtp, int do_lock)
> {
>         struct tcpcb *tp = xtp;
>         struct inpcb *inp;
>
>         inp = tp->t_inpcb;
>
>         if (do_lock) {
>                 CURVNET_SET(tp->t_vnet);
>                 INP_INFO_RLOCK(&V_tcbinfo);
>                 INP_WLOCK(inp);
>         } else {
>                 INP_WUNLOCK(inp);
>                 INP_INFO_RUNLOCK(&V_tcbinfo);
>                 CURVNET_RESTORE();
>         }
> }
>
> callout_init_lock_function(&callout, &tcp_timer_2msl_lock, CALLOUT_RETURNUNLOCKED);

Then in softclock_call_cc() it will look like this:

>         CC_UNLOCK(cc);
>         if (c_lock != NULL) {
>                 if (have locking function)
>                         tcp_timer_2msl_lock(c_arg, 1);
>                 else
>                         class->lc_lock(c_lock, lock_status);
>                 /*
>                  * The callout may have been cancelled
>                  * while we switched locks.
>                  */

Actually, "CC_LOCK(cc)" should be taken in front of the cc_exec_cancel() check, to avoid races when testing, setting and clearing this variable, like it is done in hps_head.

>                 if (cc_exec_cancel(cc, direct)) {
>                         if (have locking function)
>                                 tcp_timer_2msl_lock(c_arg, 0);
>                         else
>                                 class->lc_unlock(c_lock);
>                         goto skip;
>                 }
>                 cc_exec_cancel(cc, direct) = true;
>
> ....
>
> skip:
>         if ((c_iflags & CALLOUT_RETURNUNLOCKED) == 0) {
>                 if (have locking function)
>                         ...
>                 else
>                         class->lc_unlock(c_lock);
>         }

The whole point of this is to make the cancelled check atomic:

1) Lock TCP
2) Lock CC_LOCK()
3) Change the callout state

--HPS
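PS: To make the ordering above more concrete, the cancel test in softclock_call_cc() could then look roughly like this. This is only a sketch: cc, direct, c_arg, class, c_lock, lock_status, CC_LOCK(), CC_UNLOCK() and cc_exec_cancel() are the existing callout internals quoted above, while c_lock_fn is just a placeholder for however the locking function would be stored in the callout:

>         CC_UNLOCK(cc);
>         /* 1) Take the consumer's locks, e.g. the TCP locks. */
>         if (c_lock_fn != NULL)
>                 c_lock_fn(c_arg, 1);
>         else if (c_lock != NULL)
>                 class->lc_lock(c_lock, lock_status);
>         /*
>          * 2) Re-take the per-CPU callout lock, so that the test and
>          * the update of the cancel flag below happen atomically.
>          */
>         CC_LOCK(cc);
>         if (cc_exec_cancel(cc, direct)) {
>                 /* Cancelled while we switched locks: undo and skip. */
>                 CC_UNLOCK(cc);
>                 if (c_lock_fn != NULL)
>                         c_lock_fn(c_arg, 0);
>                 else if (c_lock != NULL)
>                         class->lc_unlock(c_lock);
>                 goto skip;
>         }
>         /* 3) Change the callout state while both locks are held. */
>         cc_exec_cancel(cc, direct) = true;
>         CC_UNLOCK(cc);
>         /* ... invoke the handler with only the consumer's locks held ... */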