Date:      Sat, 18 Feb 2017 21:58:25 +0100
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Mark Millard <markmi@dsl-only.net>
Cc:        mjg@freebsd.org, Justin Hibbits <chmeeedalf@gmail.com>, svn-src-head@freebsd.org, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: svn commit: r313268 - head/sys/kern [through -r313271 for atomic_fcmpset use and later: fails on PowerMac G5 "Quad Core"; -r313266 works]
Message-ID:  <20170218205825.GA24384@dft-labs.eu>
In-Reply-To: <AE0F9808-49F5-4345-891B-32ED542958E8@dsl-only.net>
References:  <2FD12B8F-2255-470A-98D4-2DCE9C7495F5@dsl-only.net> <AE0F9808-49F5-4345-891B-32ED542958E8@dsl-only.net>

On Sat, Feb 18, 2017 at 12:49:29PM -0800, Mark Millard wrote:
> On 2017-Feb-18, at 4:18 AM, Mark Millard <markmi at dsl-only.net> wrote:
> 
> > [Note: I experiment with clang based powerpc64 builds,
> > reporting problems that I find. Justin is familiar
> > with this, as is Nathan.]
> > 
> > I tried to update the PowerMac G5 (a so-called "Quad Core")
> > that I have access to from head -r312761 to -r313864 and
> > ended up with random panics and hang ups in fairly short
> > order after booting.
> > 
> > Some approximate bisecting for the kernel lead to:
> > (sometimes getting part way into a buildkernel attempt
> > for a different version before a failure happens)
> > 
> > -r313266: works (just before use of atomic_fcmpset)
> > vs.
> > -r313271: fails (last of the "use atomic_fcmpset" check-ins)
> > 
> > (I did not try -r313268 through -r313270 as the use was
> > gradually added.)
> > 
> > So I'm currently running a -r313864 world with a -r313266
> > kernel.
> > 
> > No kernel that I tried that was from before -r313266 had the
> > problems.
> > 
> > Any kernel that I tried that was from after -r313271 had the
> > problems.
> > 
> > Of course I did not try them all in other direction. :)
> 
> [Of course: "either direction".]
> 
> I'll note that the -r313864 buildworld was without
> MALLOC_PRODUCTION being defined. (Unusual for me but
> I'm testing if a jemalloc assert problem on arm64
> also happens on powerpc64.)
> 
> By contrast the buildkernels were production style
> (as is normal for me unless I'm trying to track
> something down that I think might be exposed by
> the extra checks).
> 

Well, either the primitive itself is buggy, or the locking code does not
correctly handle the (now somewhat unusual) condition where a failed
fcmpset does not provide the value that caused the failure (but possibly
a stale one).
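
To make that second possibility concrete, here is a rough userland sketch
using C11 atomics rather than the kernel's atomic(9) routines. The names
sketch_fcmpset, sketch_trylock and SKETCH_UNLOCKED are made up for
illustration and this is not the actual locking code. It assumes a
weak compare-exchange, which (like an LL/SC based fcmpset) may fail even
though the value it read matched the expected one, so the caller has to
look at the returned value before drawing conclusions:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_UNLOCKED ((uintptr_t)0)	/* hypothetical "no owner" value */

/*
 * Hypothetical stand-in for an fcmpset-style primitive. On failure,
 * *expected is overwritten with a value read from *dst. With a weak
 * compare-exchange that value may equal what the caller passed in.
 */
static bool
sketch_fcmpset(_Atomic uintptr_t *dst, uintptr_t *expected, uintptr_t desired)
{
	return atomic_compare_exchange_weak_explicit(dst, expected, desired,
	    memory_order_acquire, memory_order_relaxed);
}

/*
 * Try-lock that tolerates a failure which did not report a "real"
 * conflicting value: it only gives up when the value handed back
 * shows an owner, and retries otherwise.
 */
static bool
sketch_trylock(_Atomic uintptr_t *lock, uintptr_t tid)
{
	uintptr_t v = SKETCH_UNLOCKED;

	for (;;) {
		if (sketch_fcmpset(lock, &v, tid))
			return true;	/* we own the lock now */
		if (v != SKETCH_UNLOCKED)
			return false;	/* somebody else owns it */
		/* Failed but v still says "unlocked": just retry. */
	}
}
```

Code that instead treats any failure as "the lock is owned, and v is the
owner" would misbehave in exactly the case described above.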

That said, I would start with putting barriers "on both sides" of
powerpc's fcmpset for debugging purposes, and if the problem persists I
can add some debugging output to the locking primitives.
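
For reference, the barrier experiment could look roughly like the
following. This is again only a userland sketch with C11 atomics and a
made-up name (debug_fcmpset_fenced); the real change would go into the
powerpc atomic header and use whatever barrier instructions are
appropriate there. The full fences are deliberately heavier than needed,
since the point is only to see whether the failures go away:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Debug-only wrapper: a full fence before and after a relaxed weak
 * compare-exchange. If the problem disappears with this in place, the
 * ordering provided by the primitive is the prime suspect; if not, the
 * callers deserve a closer look.
 */
static bool
debug_fcmpset_fenced(_Atomic uintptr_t *dst, uintptr_t *expected,
    uintptr_t desired)
{
	bool ok;

	atomic_thread_fence(memory_order_seq_cst);	/* barrier "before" */
	ok = atomic_compare_exchange_weak_explicit(dst, expected, desired,
	    memory_order_relaxed, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* barrier "after" */
	return ok;
}
```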

-- 
Mateusz Guzik <mjguzik gmail.com>
