Date:      Fri, 24 Feb 2017 23:46:05 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        Mateusz Guzik <mjguzik@gmail.com>
Cc:        Justin Hibbits <chmeeedalf@gmail.com>, mjg@freebsd.org, FreeBSD Current <freebsd-current@freebsd.org>, svn-src-head@freebsd.org, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: svn commit: r313268 - head/sys/kern [through -r313271 for atomic_fcmpset use and later: fails on PowerMac G5 "Quad Core"; -r313266 works]
Message-ID:  <EB9DBDFA-BAE9-4BFF-8E8B-BF7698362A11@dsl-only.net>
In-Reply-To: <12339EDD-5663-40E0-8553-821EF9B6CDEB@dsl-only.net>
References:  <2FD12B8F-2255-470A-98D4-2DCE9C7495F5@dsl-only.net> <20170220191044.GA8526@dft-labs.eu> <5D5235E1-6F84-4329-8ED5-35FCDB0A6A71@dsl-only.net> <20170225002300.GC19697@dft-labs.eu> <12339EDD-5663-40E0-8553-821EF9B6CDEB@dsl-only.net>

On 2017-Feb-24, at 8:25 PM, Mark Millard <markmi at dsl-only.net> wrote:

> On 2017-Feb-24, at 4:23 PM, Mateusz Guzik <mjguzik at gmail.com> wrote:
>> 
>> On Tue, Feb 21, 2017 at 01:37:25AM -0800, Mark Millard wrote:
>>> [Back to the powerpc64 context.]
>>> 
>>> On 2017-Feb-20, at 11:10 AM, Mateusz Guzik <mjguzik at gmail.com> wrote:
>>> 
>>>> On Sat, Feb 18, 2017 at 04:18:05AM -0800, Mark Millard wrote:
>>>>> [Note: I experiment with clang based powerpc64 builds,
>>>>> reporting problems that I find. Justin is familiar
>>>>> with this, as is Nathan.]
>>>>> 
>>>>> I tried to update the PowerMac G5 (a so-called "Quad Core")
>>>>> that I have access to from head -r312761 to -r313864 and
>>>>> ended up with random panics and hang ups in fairly short
>>>>> order after booting.
>>>>> 
>>>>> Some approximate bisecting for the kernel led to:
>>>>> (sometimes getting part way into a buildkernel attempt
>>>>> for a different version before a failure happens)
>>>>> 
>>>>> -r313266: works (just before use of atomic_fcmpset)
>>>>> vs.
>>>>> -r313271: fails (last of the "use atomic_fcmpset" check-ins)
>>>>> 
>>>>> (I did not try -r313268 through -r313270 as the use was
>>>>> gradually added.)
>>>>> 
>>>>> So I'm currently running a -r313864 world with a -r313266
>>>>> kernel.
>>>>> 
>>>>> No kernel that I tried that was from before -r313266 had the
>>>>> problems.
>>>>> 
>>>>> Any kernel that I tried that was from after -r313271 had the
>>>>> problems.
>>>>> 
>>>>> Of course I did not try them all in either direction. :)
>>>>> 
>>>> 
>>>> I found that spin mutexes were not properly handling this, fixed in
>>>> r313996.
>>>> 
>>>> Locally I added an if (cpu_tick() % 2) return (0); snippet to the
>>>> amd64 fcmpset to simulate failures. Everything works, while it would
>>>> easily fail without the patch.
>>>> 
>>>> That said, I hope this concludes the 'missing check for not-reread value
>>>> of failed fcmpset' saga.
>>>> 
>>>> -- 
>>>> Mateusz Guzik <mjguzik gmail.com>
>>> 
>>> -r313999 is an improvement for powerpc64: it boots and I can
>>> log in on the old PowerMac G5 so-called "Quad Core".
>>> 
>>> But, e.g., buildworld buildkernel eventually hangs and later
>>> the powerpc64 panics for "spin lock held too long".
>>> 
>> 
>> All right, play time is over.
>> 
>> Can you please:
>> 1. verify r313254 is stable for you
>> 2. apply https://people.freebsd.org/~mjg/patches/complete-locks.diff and
>> https://people.freebsd.org/~mjg/.junk/ppc.diff on top of it and retry
>> the test?
>> 
>> This is a workaround which effectively disables the powerpc-specific
>> primitive and makes it use a cmpset wrapper instead. I don't have the
>> hardware to test right now and my attempts to boot in qemu also failed.
>> 
>> That said, it does not look like there are general fcmpset bugs left
>> and the remaining issue seems powerpc-specific.
>> 
>> If this works, I'll commit the workaround for the time being, as in a
>> few weeks I'd like to start merging the work back to stable/11.
>> 
>> -- 
>> Mateusz Guzik <mjguzik gmail.com>
> 
> I've started a self-hosted powerpc64 -r313254 build
> based on running the -r313266 kernel. (The context I
> sometimes do cross builds in is tied up with other
> things. -r313266 is what my prior bisection came up
> with as the last apparently-working kernel at the
> time.)
> 
> So it will be a while before I have a -r313254 in
> place to try: the self-hosted build takes longer
> and so will not be installed for a while.
> 
> To judge stability I'll probably have -r313254 build
> the patched update that you want me to test, initially
> doing a cleanworld. So that too will take a while.
> 
> (The above wording presumes all goes well.)
> 
> I'll let you know as I go along if I run into anything
> interesting.
> 
> 
> My builds are rebuilding both world and kernel since
> what turns into /usr/include/sys/* has changes in your
> patch.
> 
> The builds are without MALLOC_PRODUCTION but are
> otherwise not debug builds.
> 
> 
> I've not seen anything indicating that anyone has
> been trying TARGET_ARCH=powerpc. I've been trying
> TARGET_ARCH=powerpc64 .
> 
> While I do not have access to a true
> TARGET_ARCH=powerpc machine currently, such a build
> can be used on a PowerMac G5 so-called "Quad Core".
> So I could eventually build and try such on the one
> powerpc family machine that I currently have access
> to.
> 
> clang 3.9.1 has a significant code generation problem
> for TARGET_ARCH=powerpc and so I'd have to use
> a gcc 4.2.1 based build for that sort of experiment.
> (There is no xtoolchain for 32-bit powerpc.)
> 
> I use clang 3.9.1 or xtoolchain for
> TARGET_ARCH=powerpc64 and have been using clang 3.9.1
> in recent times. My primary powerpc family use has
> been to experiment with building based on the
> modern libc++ and reporting issues discovered in the
> attempts. This explains the clang/xtoolchain context.
> 
> clang 3.9.1 has major problems with C++ exception
> handling for both powerpc64 and powerpc, but a
> lot of FreeBSD is independent of throwing C++
> exceptions. By contrast, an xtoolchain-based build
> works for C++ exception handling, but lib32 fails
> to operate when built by an xtoolchain build.

-r313254 had no trouble booting or building
the patched version or anything else involved
in getting there or installing.

But the patched version failed quickly just
attempting cleanworld's recursive remove. (So
it did boot and let me log in.) The panic
description was:

panic: vn_finished_secondary_write: neg cnt
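
For context, the workaround under test is described above as
making fcmpset a wrapper around cmpset. The following is only
a minimal userland sketch of that idea, assuming C11 atomics as
a stand-in for the kernel's atomic(9) primitives. The names
sketch_cmpset and sketch_fcmpset are hypothetical and this is
not the content of ppc.diff.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a cmpset-style primitive: returns
 * nonzero on success and tells the caller nothing about the
 * value it observed on failure.
 */
static int
sketch_cmpset(_Atomic uintptr_t *p, uintptr_t cmpval, uintptr_t newval)
{
	/* cmpval is a local copy, so the caller never sees it change. */
	return (atomic_compare_exchange_strong(p, &cmpval, newval));
}

/*
 * fcmpset-style wrapper built on the cmpset stand-in: on
 * failure, re-read *p into *cmpval so the caller can retry
 * without issuing its own load.
 */
static int
sketch_fcmpset(_Atomic uintptr_t *p, uintptr_t *cmpval, uintptr_t newval)
{
	if (sketch_cmpset(p, *cmpval, newval))
		return (1);
	*cmpval = atomic_load(p);
	return (0);
}
```

With a word holding 5 and an expected value of 0, the first
sketch_fcmpset call returns 0 and leaves 5 in the expected
slot. A second call using that refreshed value succeeds.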


The sources that differ from svn's -r313254 are
listed below. Some are tied to arm64 experiments;
most of the rest are tied to powerpc64 and/or
powerpc. Those not from your patches are long-standing
changes from my investigations or from Justin H.

# svnlite status /usr/src | sort
. . . (ignoring the ? lines) . . .
M       /usr/src/bin/sh/jobs.c
M       /usr/src/bin/sh/miscbltin.c
M       /usr/src/contrib/llvm/lib/Target/PowerPC/PPCInstrInfo.td
M       /usr/src/contrib/llvm/tools/lld/ELF/Target.cpp
M       /usr/src/lib/csu/powerpc64/Makefile
M       /usr/src/libexec/rtld-elf/Makefile
M       /usr/src/sys/arm/arm/gic.c
M       /usr/src/sys/boot/ofw/Makefile.inc
M       /usr/src/sys/boot/powerpc/Makefile.inc
M       /usr/src/sys/boot/powerpc/kboot/Makefile
M       /usr/src/sys/boot/uboot/Makefile.inc
M       /usr/src/sys/conf/kmod.mk
M       /usr/src/sys/ddb/db_main.c
M       /usr/src/sys/ddb/db_script.c
M       /usr/src/sys/kern/init_main.c
M       /usr/src/sys/kern/kern_condvar.c
M       /usr/src/sys/kern/kern_lock.c
M       /usr/src/sys/kern/kern_lockstat.c
M       /usr/src/sys/kern/kern_mutex.c
M       /usr/src/sys/kern/kern_rwlock.c
M       /usr/src/sys/kern/kern_sx.c
M       /usr/src/sys/kern/kern_synch.c
M       /usr/src/sys/kern/kern_thread.c
M       /usr/src/sys/kern/subr_lock.c
M       /usr/src/sys/kern/vfs_default.c
M       /usr/src/sys/kern/vfs_subr.c
M       /usr/src/sys/powerpc/include/atomic.h
M       /usr/src/sys/powerpc/ofw/ofw_machdep.c
M       /usr/src/sys/sys/lock.h
M       /usr/src/sys/sys/lockmgr.h
M       /usr/src/sys/sys/lockstat.h
M       /usr/src/sys/sys/mutex.h
M       /usr/src/sys/sys/rwlock.h
M       /usr/src/sys/sys/sdt.h
M       /usr/src/sys/sys/sx.h
M       /usr/src/sys/sys/systm.h
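
For reference, the caller-side issue quoted earlier in the
thread (a failed fcmpset that has not handed back a different
value) can be sketched as follows. This is a userland
illustration under stated assumptions, not kernel code:
flaky_fcmpset, sketch_add and sketch_calls are hypothetical
names, C11 atomics stand in for atomic(9), and the injected
failure only loosely mimics the quoted
if (cpu_tick() % 2) return (0); experiment.

```c
#include <stdatomic.h>
#include <stdint.h>

static unsigned sketch_calls;	/* drives the injected failures */

/*
 * fcmpset-like operation that fails on every other call
 * without touching *cmpval, so a failure does not imply that
 * the observed value differs from what the caller expected.
 */
static int
flaky_fcmpset(_Atomic uintptr_t *p, uintptr_t *cmpval, uintptr_t newval)
{
	if (sketch_calls++ % 2 == 0)
		return (0);	/* injected failure, nothing stored */
	return (atomic_compare_exchange_strong(p, cmpval, newval));
}

/*
 * Add delta to *p, retrying until the update lands. The loop
 * recomputes the new value from whatever is in old after each
 * failure and assumes nothing about why the attempt failed.
 * Returns the number of attempts.
 */
static unsigned
sketch_add(_Atomic uintptr_t *p, uintptr_t delta)
{
	uintptr_t old = atomic_load(p);
	unsigned attempts = 1;

	while (!flaky_fcmpset(p, &old, old + delta))
		attempts++;
	return (attempts);
}
```

A loop written this way tolerates a failure that leaves old
unchanged. Code that treats a failed attempt as proof that the
word now holds something else would misbehave under the same
injection.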


===
Mark Millard
markmi at dsl-only.net
