Date:      Wed, 21 Jan 2015 10:51:00 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Hans Petter Selasky <hps@selasky.org>
Cc:        Adrian Chadd <adrian@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, "K. Macy" <kmacy@freebsd.org>, Jason Wolfe <nitroboost@gmail.com>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, sbruno@freebsd.org, Gleb Smirnoff <glebius@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Message-ID:  <20150121085100.GQ42409@kib.kiev.ua>
In-Reply-To: <54BF640B.6000700@selasky.org>
References:  <54BDD9E1.6090505@selasky.org> <20150120075126.GA42409@kib.kiev.ua> <20150120211137.GY15484@FreeBSD.org> <54BED6FB.8060401@selasky.org> <54BEE62D.2060703@ignoranthack.me> <CAHM0Q_MDJN_8sTvTDXfqA7UtJVO3Y8S8%2BNRCs_=6Nj4dkTzjOA@mail.gmail.com> <54BEE8E6.3080009@ignoranthack.me> <CAHM0Q_N_53BM-6RvXu8UpjfDzQHEn5oXZo1Nn8RO0cuOUhe8tg@mail.gmail.com> <54BEEA7F.1070301@ignoranthack.me> <54BF640B.6000700@selasky.org>

On Wed, Jan 21, 2015 at 09:32:11AM +0100, Hans Petter Selasky wrote:
> On 01/21/15 00:53, Sean Bruno wrote:
> > Unknown to me.  Nor am I aware of anyone else who ever hit our panics
> > either.  Our environment, and the failure, was only seen in the Intel
> > 10GE space (ixgbe).  This is an artifact of our use cases, and hasn't
> > been expanded nor tested in our environment with other vendor interfaces.
> >
> > sean
> 
> Hi,
> 
> I've seen this with Mellanox hardware when running some special tests, 
> but not during regular use yet. That was the reason for going into the 
> callout subsystem in the first place. 40GE.
> 
> Also I would like to mention during the heat of this discussion, that 
> during X-mas this year, I had a very heavy discussion with Attilio and a 
> few other FreeBSD developers, who's name was on a patch (r220456) that 
> changed how the return value of "callout_active()" works. 
> "callout_active()" is heavily used inside the TCP stack and what was 
> found is there is a potential race related to migrating the callout from 
> one CPU to the other, which in turn might give other symptoms than a 
> spinlock hang.
> 
> FYI:
> 
> https://svnweb.freebsd.org/base?view=revision&revision=225057
> 
> Cite: "If the newly scheduled thread wants to acquire the old queue it 
> will just spin forever."
> 
> This description reminds me very much of what "Jason Wolfe", others and 
> myself have seen.
> 
> Konstantin, you're responsible for r220456 (Approved by: kib). I would 
I definitely do not see anything related to my freefall login in the
log message for r220456, nor did I participate in any way in the work
which led to that revision.

If you mean r225057, note that approval by re != review.
> like to ask what investigation you did to ensure that you solved the 
> problem as described in the commit message and didn't introduce a new one?
> 
> In r220456 the "callout_reset_on()" function was changed in a way that 
> directly conflicts with how the TCP stack works, by not always ensuring 
> that "callout_active()" returns non-zero after a callout is restarted! 
> See return at line 821:
> 
> > https://svnweb.freebsd.org/base/head/sys/kern/kern_timeout.c?revision=225057&view=markup&pathrev=225057#l821
> 
> Kib: Any comments?

With the re hat on, the explanation for the proposed commit looked
reasonable, and the committer provided enough evidence that the change
got adequate testing. Since the change fixed a bug, and this is exactly
what re wants to see during a release cycle, I saw no reason why the
commit should be denied.


