Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 21 Jan 2015 17:02:15 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        "K. Macy" <kmacy@freebsd.org>
Cc:        Hans Petter Selasky <hps@selasky.org>, Adrian Chadd <adrian@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, Jason Wolfe <nitroboost@gmail.com>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>, Sean Bruno <sbruno@freebsd.org>, Gleb Smirnoff <glebius@freebsd.org>, Konstantin Belousov <kostikbel@gmail.com>
Subject:   Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Message-ID:  <B4F512FF-5AD8-4885-9ABF-CEAB54561F92@bsdimp.com>
In-Reply-To: <CAHM0Q_NJDq41MtDAne0a-fnp%2BOg38=ypNKRidq5cAeL=nj-YDQ@mail.gmail.com>
References:  <20150120075126.GA42409@kib.kiev.ua> <20150120211137.GY15484@FreeBSD.org> <54BED6FB.8060401@selasky.org> <54BEE62D.2060703@ignoranthack.me> <CAHM0Q_MDJN_8sTvTDXfqA7UtJVO3Y8S8%2BNRCs_=6Nj4dkTzjOA@mail.gmail.com> <54BEE8E6.3080009@ignoranthack.me> <CAHM0Q_N_53BM-6RvXu8UpjfDzQHEn5oXZo1Nn8RO0cuOUhe8tg@mail.gmail.com> <54BEEA7F.1070301@ignoranthack.me> <CAHM0Q_PtJ7JHFTiu9_dmi_Ce=rmu1j72z2OYQ2CD3%2BEbcoEGsA@mail.gmail.com> <54BEF154.3030606@ignoranthack.me> <20150121181512.GE15484@FreeBSD.org> <CAJ-VmokyKX0uCA1%2B3KjziJkokDOVBCpUdwHQcVaY=buwa%2BqfhQ@mail.gmail.com> <CAHM0Q_NJDq41MtDAne0a-fnp%2BOg38=ypNKRidq5cAeL=nj-YDQ@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help

> On Jan 21, 2015, at 4:38 PM, K. Macy <kmacy@freebsd.org> wrote:
>=20
>> They originally found that things were spinning for way too long.
>>=20
>> Hans found something similar and determined/concluded that the
>> migration code in callouts was racy-by-design and dramatically
>> simplified it and also put very hard constraints on what is a valid
>> situation to support migrating from one callwheel to another. Now we
>> have fallout which we can either address or back out until the =
callout
>> stuff is again reviewed/fixed.
>>=20
>> I don't think it's as alchemic as is being promoted.
>=20
>=20
> Let's not get drawn into semantic debates. To me alchemy implies
> voodoo debugging, whereas this is, in part, disabling functionality
> that exposes races - which is more like disabling preemption or fine
> grained locking.
>=20
>=20
> Normally I would advocate backing this out on the basis of precedence.
> That is to say, back it out so that developers will get it right the
> first time. However, it appears that hps@ and some others don't
> understand what was done wrong. So I'm going to state some basic
> premises and then the implied basic guidelines required of a commit
> that were not met. If these guidelines are violated by a change in the
> future I will agitate for a rapid back out.
>=20
> Premises:
> A) A performance regression is a bug every bit as much as a locking
> race. Stability OR performance (where we are talking about _current_
> performance) is a false dichotomy.
>   - This means that a change that fixes a bug by introducing a
> substantial performance regression does not constitute a fix.
> B) Existing behavior and performance characteristics should not be
> adversely changed unless there is widespread consensus.
>  - Sometimes it is necessary to "break" a KPI but that should not be
> discovered by svn or scanning back through the mailing lists.
>=20
> Guidelines:
> A) Performance should not measurably regress.
> B) If a KPI changes behavior e.g. callout_reset_on being crippled, all
> clients need to be updated so that they maintain their current
> behavior by the committer changing the KPI. The client code
> maintainers aren't responsible for reacting to these types of changes.
> That is OK for marginal architectures like sun4v or pc98.
> C) If there is a non-zero possibility that these changes break the
> client code, committers who have touched that code in the last few
> years should be notified.
>=20
>=20
>=20
> HPS: Your change failed to meet these guidelines. Some of us are upset
> because these guidelines are fairly fundamental for the on-going
> viability of FreeBSD. Due to linguistic / time zone / cultural
> differences these expectations have not been adequately communicated
> to you. You are not in the USB sandbox where others need for your
> support outweighs the inconvenience of random breakage.
>=20
> It sounds like you are making progress towards updating the concerns
> that have been voiced. If kib's observations are in fact comprehensive
> then adding a callout_init_cpu function and updating all clients so
> that their callouts continue to be scheduled on a CPU other than the
> BSP will suffice and we can all move on.

Is there some reason that we can=E2=80=99t back things out, break things =
down into
smaller pieces and have everything pass through phabric with a wide
ranging review? Given the fundamental nature of these changes, they
really need better review and doing it after the fact seems to be to be
too risky. I=E2=80=99m not debating that this =E2=80=9Cfixes=E2=80=9D =
some issues, but given the
performance regression, it sure seems like we may need a different
solution to be implemented and hashing that out in a branch might be
the best approach.

Warner=



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?B4F512FF-5AD8-4885-9ABF-CEAB54561F92>