Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 21 Jan 2015 10:16:06 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        "K. Macy" <kmacy@freebsd.org>
Cc:        Hans Petter Selasky <hps@selasky.org>, Adrian Chadd <adrian@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, Jason Wolfe <nitroboost@gmail.com>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, Sean Bruno <sbruno@freebsd.org>, Gleb Smirnoff <glebius@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Message-ID:  <20150121081606.GN42409@kib.kiev.ua>
In-Reply-To: <CAHM0Q_MQPk6Gz%2Bon2fDKSMbzJzi4g-QF23pNP_-v8Y_jV7hhTA@mail.gmail.com>
References:  <20150120211137.GY15484@FreeBSD.org> <54BED6FB.8060401@selasky.org> <54BEE62D.2060703@ignoranthack.me> <CAHM0Q_MDJN_8sTvTDXfqA7UtJVO3Y8S8%2BNRCs_=6Nj4dkTzjOA@mail.gmail.com> <54BEE8E6.3080009@ignoranthack.me> <CAHM0Q_N_53BM-6RvXu8UpjfDzQHEn5oXZo1Nn8RO0cuOUhe8tg@mail.gmail.com> <54BEEA7F.1070301@ignoranthack.me> <CAHM0Q_PtJ7JHFTiu9_dmi_Ce=rmu1j72z2OYQ2CD3%2BEbcoEGsA@mail.gmail.com> <54BEF154.3030606@ignoranthack.me> <CAHM0Q_MQPk6Gz%2Bon2fDKSMbzJzi4g-QF23pNP_-v8Y_jV7hhTA@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Jan 20, 2015 at 04:37:44PM -0800, K. Macy wrote:
> I would pick stability over performance any day. However, it _seems_
> to me, and maybe I simply don't understand some key details, that the
> fix consisted of largely single-threading the callout system. And as I
> say I may simply not understand the specifics, but this sort of large
> scale disabling does not constitute a fix but is a workaround.  It's
> more like disabling preemption because it fixes a panic. Yes, it might
> "fix" a whole array of bugs that crop up but it could not be seen as a
> fix to an otherwise working system.

Well, this is not a complete truth. Let me try to explain my
understanding, I spent some time actually looking at the new code. Would
be nice if corrections to my understanding is posted, if needed.

Now, when callout_reset() is directed to change cpu, the change only
happens when the callout is associated with a lock, and that lock is
owned by the caller of callout_reset(). There are two consequences. One,
which is good, is significant simplification of the migration code,
together with the drain. This is exactly the place where there bugs
which make my head hurt, see e.g. r234952 and r243901.  Note that some
callouts follow this pattern already, e.g. process timers after r243869.

Another consequence, which is very visible and which causes the roar, is
that all lockless callers of callout_reset_on(9) now silently changed
the behaviour to callout_reset(9). There is no audit performed to
identify such callers, and there is no even a plan to fix them.  The
answer to the complains seems to be 'if you think that the fix is needed,
go and fix'.

My impression is that some set of vocal active developers consider
the second consequence unacceptable, myself included. IMO, committing
the change, however good the consequence one is, without fixing the
consequence two, is inappropriate.  And the onus of doing this is on
the person doing the change.

Yes, it is very interesting is the change actually good WRT fixing
migration, since indeed there is serious reduction in the migration
amount due to locked callout_reset() being not universally used.
It is possible that the bugs are only covered.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150121081606.GN42409>