Date: Tue, 20 Jan 2015 17:01:29 +0200
From: Konstantin Belousov
To: Hans Petter Selasky
Subject: Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Message-ID: <20150120150129.GF42409@kib.kiev.ua>
In-Reply-To: <54BE21F0.6010602@selasky.org>
Cc: svn-src-head@freebsd.org, Adrian Chadd, src-committers@freebsd.org, svn-src-all@freebsd.org

On Tue, Jan 20, 2015 at 10:37:52AM +0100, Hans Petter Selasky wrote:
> On 01/20/15 10:00, Konstantin Belousov wrote:
> > On Tue, Jan 20, 2015 at 08:58:34AM +0100, Hans Petter Selasky wrote:
> >> On 01/20/15 08:51, Konstantin Belousov wrote:
> >>> On Tue, Jan 20, 2015 at 05:30:25AM +0100, Hans Petter Selasky wrote:
> >>>> On 01/19/15 22:59, Adrian Chadd wrote:
> >>>>> Hi,
> >>>>>
> >>>>> Would you please check what the results of this are with CPU specific
> >>>>> callwheels?
> >>>>>
> >>>>> I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on
> >>>>> ixgbe) and with this setup, the per-CPU TCP callwheel stuff is
> >>>>> enabled. But all the callwheels are now back on clock(0) and so is the
> >>>>> lock contention. :(
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>
> >>>> Hi,
> >>>>
> >>>> Like stated in the manual page, callout_reset_curcpu/on() does not work
> >>>> with MPSAFE callouts any more!
> >>> I.e. you 'fixed' some undeterminate bugs in callout migration by not
> >>> doing migration at all anymore.
> >>>
> >>>>
> >>>> You need to use callout_init_{mtx,rm,rw} and remove the custom locking
> >>>> inside the callback in the TCP stack to get it working like before!
> >>>
> >>> No, you need to do this, if you think that whole callout KPI must be
> >>> rototiled. It is up to the person who modifies the KPI, to ensure that
> >>> existing code is not broken.
>
> Hi,
>
> It is not very hard to update existing callout clients and you can do it
> too, if you need the extra bits of performance.
I want to avoid regressions, and avoid breaking other people's work.

>
> Are there more API's than the TCP stack which you think needs an update
> and are performance critical?
I did not perform any analysis. Moreover, I naturally expect that such
analysis, and the demonstration that there is no regression, is the duty
of the person who proposes the change.

>
> >>>
> >>> As I understand, currently we are back to the one-cpu callouts.
> >>> Do other people consider this situation acceptable ?
>
> For the TCP stack - yes, but not for other clients like cv_timedwait()
> and such.
>
> If you think you have a better way to solve the callout problems, please
> tell me! In order for a callout to change its CPU you need a lock to
> protect which CPU the callout is on. Instead of introducing a third lock
> in the callout path, which will be a congestion point, to protect
> against changing the CPU number, I decided that we will use the client's
> mutex and the MPSAFE implies the client doesn't have any mutex. So it
> won't work with callout clients which use the CALLOUT_MPSAFE flag.
> Honestly CALLOUT_MPSAFE should not be used, because it leads to extra
> complexity in the clients catching the race when tearing down the
> callouts and any pending callbacks.
This is your opinion. I did fix some bugs in the callout migration
code, and I am not sure that requiring rototilling of almost all KPI
consumers (and leaving unconverted consumers to pre-cpu state) is the
only possible solution.

But again, since it is you who brought the change into the tree, it is
your duty to present a valid proof why this is the only possible way to
solve bugs (which bugs ?).

>
> >>
> >> Please read the callout 9 manual page first.
> >
> > Assume I read it. How this changes any of my points above ?
> > """
> > A change in the CPU selection cannot happen if this function is
> > re-scheduled inside a callout function. Else the callback function given
> > by the func argument will be executed on the same CPU like previously
> > done.
> > """
> > You cannot do this without fixing consumers.
> >
>
> The code simply needs an update. It is not broken in any ways - right?
> If it is not broken, fixing it is not that urgent.
Isn't it obvious ? If callouts no longer migrate to non-BSP CPUs, this
is a regression.

I am sorry for your attitude.