Date:      Sun, 15 Jan 2017 06:09:14 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        freebsd-arm <freebsd-arm@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Subject:   Re: qemu-arm-static appears to have problems with signal delivery during (at least) poudriere-devel based cross builds of some ports with ALLOW_MAKE_JOBS=yes
Message-ID:  <7AF92A3C-3563-4B2E-B14A-D6BAF30A16A2@dsl-only.net>
In-Reply-To: <BF74B3CA-9BD6-4F97-B472-FF918FCE737A@dsl-only.net>
References:  <BF74B3CA-9BD6-4F97-B472-FF918FCE737A@dsl-only.net>

On 2017-Jan-14, at 10:53 PM, Mark Millard <markmi at dsl-only.net> wrote:

> [Context: head (12) -r312009 and ports head -r431413.]
>
> I've been experimenting on amd64 with poudriere-devel with -x
> for -a arm.armv6 and I ran into:
>
>> TCG temporary leak before 00021826
>> qemu: uncaught target signal 4 (Illegal instruction) - core dumped
>
> in 3 of the 31 ports for the build, but 4 skipped so 3 of 27
> attempted. The 00021826 is the same number in all the examples
> so far (whatever its base).
>
> These seem to be the only TCG messages and each failure starts with
> one and then reports the qemu message. (Also true for the below.)
> As far as I can tell the TCG notice is the report of an internal
> qemu problem that is then translated into an Illegal instruction.
>
> This was with ALLOW_MAKE_JOBS=yes but -J 1 for poudriere.
>
> For 2 of the problem ports retries worked, still using
> ALLOW_MAKE_JOBS=yes and -J 1 .
>
> But the 3rd port failed each time tried with ALLOW_MAKE_JOBS=yes
> --but in a different step each time.
>
> In all failure cases it was gmake that got the "illegal instruction".
>
> But disabling ALLOW_MAKE_JOBS=yes appears (so far) to avoid the
> issue. For example, that 3rd failing port built fine. (I've
> been doing more ports since, with ALLOW_MAKE_JOBS=yes repeatedly
> failing and lack of it working.)
>
> My guess is SIGCHLD delivery sometimes touches something (or a timing)
> that is not handled well in qemu-arm-static. I've had no problems
> on an rpi2 or bpim3 in the past.
>
> (I have seen some analogous "sometimes" issues on powerpc under
> a version of clang that mishandled the stack part of the ABI
> FreeBSD uses, SIGCHLD sometimes getting on the stack at a bad time
> for the messed up code generation, leading to stack corruption. Code
> not getting signals had no problems.)
>
> Note: The amd64 context is FreeBSD under VirtualBox under macOS
> and it has had no problem for native builds of world, kernel,
> or ports.
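
The retries described in the quoted text could be scripted along these
lines. This is only a sketch: the retry helper is hypothetical, and the
jail, set, and port names in the usage comment are placeholders rather
than anything from the setup above.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to max_tries times and stop at
# the first success. Useful only for failures that are intermittent.
retry() {
    max_tries=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            echo "succeeded on attempt $attempt"
            return 0
        fi
        echo "attempt $attempt failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage (JAILNAME and SETNAME are placeholders):
# retry 3 poudriere bulk -J 1 -j JAILNAME -z SETNAME devel/arm-none-eabi-gcc
```

A retry loop only hides the problem for ports that fail intermittently;
a port that fails on every attempt will still fail here.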

Avoiding ALLOW_MAKE_JOBS=yes is not sufficient to guarantee builds
will work. Here is one that got near the end before failing the
same way:

. . .
install -m 0644 /wrkdirs/usr/ports/devel/arm-none-eabi-gcc/work/gcc-6.3.0/gcc/cp/type-utils.h /wrkdirs/usr/ports/devel/arm-none-eabi-gcc/work/stage/usr/local/lib/gcc/arm-none-eabi/6.3.0/plugin/include/cp/type-utils.h
install: DONTSTRIP set - will not strip installed binaries
TCG temporary leak before 00021826
qemu: uncaught target signal 4 (Illegal instruction) - core dumped
gmake[1]: *** [Makefile:4176: install-gcc] Illegal instruction
gmake[1]: Leaving directory '/wrkdirs/usr/ports/devel/arm-none-eabi-gcc/work/.build'
*** Error code 2

Stop.
make: stopped in /usr/ports/devel/arm-none-eabi-gcc
====>> Cleaning up wrkdir
===>  Cleaning for arm-none-eabi-gcc-6.3.0
build of devel/arm-none-eabi-gcc ended at Sun Jan 15 00:04:02 PST 2017
build time: 02:52:28
!!! build failure encountered !!!
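
Since every failure so far shows the same pair of messages, build logs
can be checked for that signature. The following is a rough sketch; the
function name is made up here and the directory argument is whatever
location holds the logs, which is an assumption about the local setup.

```shell
#!/bin/sh
# List files under a directory that contain both messages seen in the
# failures above: the TCG notice and the qemu uncaught-signal report.
scan_qemu_failures() {
    logdir=$1
    grep -rl 'TCG temporary leak before' "$logdir" 2>/dev/null |
    while IFS= read -r log; do
        if grep -q 'qemu: uncaught target signal' "$log"; then
            echo "$log"
        fi
    done
}

# Usage (the path is a placeholder):
# scan_qemu_failures /path/to/poudriere/logs
```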


Going back to the earlier initial problem (that I happen to have the
material for handy): expanding the .tbz of the failed build and finding
the core showed:

# find . -name "*.core" -exec file {} \;
./work/binutils-2.27/ld/qemu_gmake.core: ELF 32-bit LSB core file ARM, version 1 (FreeBSD), FreeBSD-style, from 'ke'

[I've not figured out what I can do with that --or how.]
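
For collecting such cores across several expanded work directories, the
find above can be wrapped as below. This is a sketch only: the function
name is hypothetical, and it simply reports the file name with the
.core suffix removed next to each path, which says nothing about how to
analyze the core itself.

```shell
#!/bin/sh
# Print "<name>\t<path>" for every *.core file under the given
# directory, where <name> is the file name without the .core suffix.
list_cores() {
    find "$1" -type f -name '*.core' | while IFS= read -r core; do
        prog=$(basename "$core" .core)
        printf '%s\t%s\n' "$prog" "$core"
    done
}

# Usage, from the top of an expanded work directory:
# list_cores .
```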


One thing unusual on my part is that I use -mcpu=cortex-a7 . That
matches how I historically buildworld buildkernel for installation
on the rpi2 and bpim3. I've never had problems like this with
builds on the rpi2 or the bpim3 (buildworld, buildkernel, port
builds). It might be that qemu-arm-static has a problem with
-mcpu=cortex-a7 code that is generated --but not always.

Using the make.conf as an example:

# more /usr/local/etc/poudriere.d/head-cortex-a7-make.conf
WANT_QT_VERBOSE_CONFIGURE=1
#
DEFAULT_VERSIONS+=perl5=5.24
WITH_DEBUG=
WITH_DEBUG_FILES=
MALLOC_PRODUCTION=
#
#system clang 3.8+ (gcc6 rejects -march=armv7a):
#CFLAGS+= -march=armv7-a -mcpu=cortex-a7
#CXXFLAGS+= -march=armv7-a -mcpu=cortex-a7
#CPPFLAGS+= -march=armv7-a -mcpu=cortex-a7
#
#lang/gcc6's xgcc stage considers the above conflicting so use just:
CFLAGS+= -mcpu=cortex-a7
CXXFLAGS+= -mcpu=cortex-a7
CPPFLAGS+= -mcpu=cortex-a7


For my context poudriere with -x for -a arm.armv6 and the use of
qemu-arm-static does not look reliable enough to depend on. It is
not obvious that the -x use contributes to the problem: it may well
not.

===
Mark Millard
markmi at dsl-only.net




