Date:      Mon, 28 Dec 2015 00:01:02 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        Warner Losh <imp@bsdimp.com>
Cc:        freebsd-arm <freebsd-arm@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>, Ian Lepore <ian@FreeBSD.org>, mat@FreeBSD.org, sbruno@FreeBSD.org
Subject:   Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error?
Message-ID:  <118D2970-4799-46B1-81A1-0101B907C1BE@dsl-only.net>
In-Reply-To: <D38C49E3-B622-49EA-9B30-3B1B2FA2E569@bsdimp.com>
References:  <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> <DB75F0D6-86CB-4383-8653-6017C76729F9@dsl-only.net> <A338272B-982F-4E1F-B87D-1E33815EA212@dsl-only.net> <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net> <8B52074F-FDEF-4119-BB04-630F9BE9E6DB@bsdimp.com> <BBAAE33E-BD65-40A3-A0B3-F3346FC08112@dsl-only.net> <DC9EE7C3-2763-44EF-91DA-AFE63C48E537@dsl-only.net> <D38C49E3-B622-49EA-9B30-3B1B2FA2E569@bsdimp.com>


On 2015-Dec-26, at 8:45 AM, Warner Losh <imp@bsdimp.com> wrote:

> Thanks, it sounds like I fixed a bug, but there’s more.
>
> What were the specific ports so I can test it here?
>
> And to be clear, this is a buildworld on the RPi 2 using the cross-built world with CPUTYPE=armv7a or some such, right?
>
> Warner
>
>> On Dec 25, 2015, at 9:32 PM, Mark Millard <markmi@dsl-only.net> wrote:
>>
>> [I am again breaking off another section of older material.]
>>
>> Mixed news I'm afraid.
>>
>> The specific couple of ports that I attempted did build, the same ones that originally got the Bus Error in ar using (indirectly) _fseeko and memset that I reported. So I expect that you fixed one error.
>>
>> But when I tried to buildworld, clang++ 3.7 processing usr/src/lib/clang/libllvmtablegen/ materials quickly got a Bus Error at nearly the same type of instruction (it has a "!" below that the earlier one did not), but with r4 holding the misaligned address this time:
>>
>>> --- _bootstrap-tools-lib/clang/libllvmsupport ---
>>> --- APFloat.o ---
>>> clang++: error: unable to execute command: Bus error (core dumped)
>>> . . .
>>> # gdb clang++ usr/src/lib/clang/libllvmtablegen/clang++.core
>>> . . .
>>> Core was generated by `clang++'.
>>> Program terminated with signal 10, Bus error.
>>> #0  0x00c3bb9c in clang::DependentTemplateSpecializationType::DependentTemplateSpecializationType ()
>>> [New Thread 22a18000 (LWP 100128/<unknown>)]
>>> (gdb) x/40i 0x00c3bb60
>>> . . .
>>> 0xc3bb9c <_ZN5clang35DependentTemplateSpecializationTypeC2ENS_21ElaboratedTypeKeywordEPNS_19NestedNameSpecifierEPKNS_14IdentifierInfoEjPKNS_16TemplateArgumentENS_8QualTypeE+356>:
>>>   vst1.64	{d16-d17}, [r4]!
>>> . . .
>>> (gdb) info all-registers
>>> r0             0xbfbf81a8	-1077968472
>>> r1             0x22f07e14	586186260
>>> r2             0xc416bc	12850876
>>> r3             0x2	2
>>> r4             0x22f07dfc	586186236
>>> . . .
>>
>>
>> Thus it appears that there is more code around that likely generates pointers not aligned so as to allow the code generation that is in use for what is pointed to.
>>
>> At this point I have no clue if the issue is just inside clang itself vs. if it is in something that clang is layered on top of. Nor if there is just one bad thing or many.
>>
>> Note: I had not yet tried buildworld/buildkernel for the context of the "-f" option that I was experimenting with earlier. So I do not have a direct compare and contrast at this point.

Somehow I did not notice your E-mail at the time. Meanwhile I've more evidence. . .

[Initial context for notes: Before updating to 11.0-CURRENT -r292756 and its clang/clang++ 3.7.1.]

Example C++ program that clang++ got an internal Bus Error for:

> # more main.cc
> #include <iostream>
> int
> main ()
> {
> std::ostream *o; return 0;
> }

Of course the include makes the source being processed non-trivial.

Going in a different direction. . . dmesg -a | grep "core dumped" on the rpi2 showed:

> pid 22238 (msgfmt), uid 0: exited on signal 11 (core dumped)
> pid 22250 (xgettext), uid 0: exited on signal 11 (core dumped)
> pid 22259 (msgmerge), uid 0: exited on signal 11 (core dumped)
> pid 26149 (msgfmt), uid 0: exited on signal 11 (core dumped)
> pid 26161 (xgettext), uid 0: exited on signal 11 (core dumped)
> pid 26170 (msgmerge), uid 0: exited on signal 11 (core dumped)
> pid 28826 (c++), uid 0: exited on signal 10 (core dumped)
> pid 29202 (c++), uid 0: exited on signal 10 (core dumped)
> pid 29282 (c++), uid 0: exited on signal 10 (core dumped)
> pid 29292 (clang++), uid 0: exited on signal 10 (core dumped)

Only the c++/clang++ contexts (same but for name) seemed to be leaving .core files behind.

The older log files also showed examples like the following from ports building activity:

> /var/log/dmesg.today:pid 18763 (conftest), uid 0: exited on signal 11 (core dumped)
> /var/log/dmesg.today:pid 18916 (conftest), uid 0: exited on signal 11 (core dumped)

(The original ar that I started with showed as well, the records went back that far at the time.)

[New -r292756 context. . .]

After the above I updated to:

> $ freebsd-version -ku; uname -aKU
> 11.0-CURRENT
> 11.0-CURRENT
> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #4 r292756M: Sun Dec 27 02:55:57 PST 2015     root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG  arm 1100092 1100092

in order to pick up clang 3.7.1. I used -fmax-type-align=4 -mno-unaligned-access in the src.conf file for the buildworld buildkernel amd64->rpi2 cross build before installing both parts on the rpi2 media.

On the rpi2 itself the resulting c++/clang++ still gets Bus Error during buildworld despite the use of -fmax-type-align=4 -mno-unaligned-access in the amd64 hosted cross build (and in the rpi2 attempted rebuild). An example crash report is:

> /usr/bin/clang++ -B/usr/local/arm-gnueabi-freebsd/bin -march=armv7a -fmax-type-align=4 -mno-unaligned-access  -O -pipe -mfloat-abi=softfp -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/tools/clang/include -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support -I. -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/../../lib/clang/include -DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -fno-strict-aliasing -DLLVM_DEFAULT_TARGET_TRIPLE=\"armv6-gnueabi-freebsd11.0\" -DLLVM_HOST_TRIPLE=\"armv6-unknown-freebsd11.0\" -DDEFAULT_SYSROOT=\"\" -MD -MP -MF.depend.APFloat.o -MTAPFloat.o -Qunused-arguments -I/usr/obj/clang/arm.armv6/usr/src/tmp/legacy/usr/include  -std=c++11 -fno-exceptions -fno-rtti -stdlib=libc++ -Wno-c++11-extensions  -c /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/APFloat.cpp -o APFloat.o
> clang++: error: unable to execute command: Bus error (core dumped)
> clang++: error: clang frontend command failed due to signal (use -v to see invocation)
> FreeBSD clang version 3.7.1 (tags/RELEASE_371/final 255217) 20151225
> Target: armv6--freebsd11.0-gnueabi
> Thread model: posix
> clang++: note: diagnostic msg: PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and include the crash backtrace, preprocessed source, and associated run script.
> clang++: note: diagnostic msg:
> ********************
>
> PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
> Preprocessed source(s) and associated run script(s) are located at:
> clang++: note: diagnostic msg: /tmp/APFloat-04544c.cpp
> clang++: note: diagnostic msg: /tmp/APFloat-04544c.sh
> clang++: note: diagnostic msg:
>
> ********************
> *** Error code 254
>
> Stop.
> make[3]: stopped in /usr/src/lib/clang/libllvmsupport
> *** Error code 1

An earlier -j 6 buildworld had failures for ARMBuildAttrs, APSInt, APInt, and Error before stopping, in addition to the APFloat indicated above (no -j makes for easier reading above):

> # ls -lt /tmp
> total 41516
> -rw-r--r--  1 root  wheel     4057 Dec 28 03:05 APFloat-04544c.sh
> -rw-r--r--  1 root  wheel  2155452 Dec 28 03:05 APFloat-04544c.cpp
> -rw-r--r--  1 root  wheel     4081 Dec 28 02:53 ARMBuildAttrs-432569.sh
> -rw-r--r--  1 root  wheel  1276912 Dec 28 02:53 ARMBuildAttrs-432569.cpp
> -rw-r--r--  1 root  wheel     3997 Dec 28 02:53 APSInt-a2927e.sh
> -rw-r--r--  1 root  wheel  1943445 Dec 28 02:53 APSInt-a2927e.cpp
> -rw-r--r--  1 root  wheel     3985 Dec 28 02:53 APInt-d0389a.sh
> -rw-r--r--  1 root  wheel  2115595 Dec 28 02:53 APInt-d0389a.cpp
> -rw-r--r--  1 root  wheel     4009 Dec 28 02:53 APFloat-33be1b.sh
> -rw-r--r--  1 root  wheel  2155452 Dec 28 02:53 APFloat-33be1b.cpp
> -rw-r--r--  1 root  wheel     4001 Dec 28 02:53 Error-777068.sh
> -rw-r--r--  1 root  wheel  1925065 Dec 28 02:53 Error-777068.cpp

The earlier "iostream" program example also still gets its Bus Error during its clang++ based compilation in this new -r292756 context.

The above -r292756 material avoids involving ports software with its own set of additional questions, compilers, tools, etc.: it sticks to buildworld/buildkernel material (and never gets to buildkernel).

When I tried -j 5 buildkernel by itself on the rpi2 there were no Bus Errors, no Segmentation Faults, and no core dumps. The buildkernel took about 51 minutes. (I've not tried installing what it built.)

(I have an SSD on a USB hub in use for world/root on the rpi2. The /etc/fstab on the micro-SD lists / as mounting from the SSD instead. I installkernel and installworld via the amd64 context to both the micro-SD and the SSD so that they track. I can boot from just the micro-SD if I want to but normally involve the SSD.)

Another kind of experiment would be to omit -fmax-type-align=4 but use -mno-unaligned-access (for handling any packed data structures) and see if buildkernel can still finish on the rpi2 (if enough of the amd64->rpi2 buildworld still operates on the rpi2 to allow the test).

A potential experiment for buildworld would be to use -fmax-type-align=1 with -mno-unaligned-access as the amd64->rpi2 cross build context. A misalignment Bus Error from that context might well be a clang++ code generation error of not paying attention to the alignment rules where clang++ should.

A potentially interesting (but independent) set of warnings during the buildkernel was:

> WARNING: hwpmc_mod.c: enum pmc_event has too many values: 2588 > 1023
> WARNING: hwpmc_logging.c: enum pmc_event has too many values: 2588 > 1023
> WARNING: hwpmc_soft.c: enum pmc_event has too many values: 2588 > 1023
> WARNING: hwpmc_arm.c: enum pmc_event has too many values: 2588 > 1023

(I've not investigated.)



Before this -r292756 update the following ports seemed to have built without generating core dumps or Bus Error reports or other such in the process:

devel/gettext-tools
devel/gmake-lite
devel/p5-Locale-gettext
lang/perl5.22
security/sudo

Note that I did not use make.conf to force -f. . . and -m. . . for these. But the test was if they could build, not if they operated correctly when built.

My guess is that they are primarily C instead of C++ and/or happen to avoid the parts of C++ where clang++ is having internal data structure alignment problems vs. SCTLR bit[1]==1.

Generally the pkg installs on the rpi2 seemed to have been operating okay. But they do not test compiling/linking with the system clang/clang++ involved.

In general building ports can have other issues that block completion so I had not tried much in that direction and happened to pick on a few things that worked (see above). Getting through a self-hosting rpi2 buildworld buildkernel first likely is a better path before involving ports.

But my way of working has involved using devel/arm-gnueabi-binutils , which seemed to build and work fine.


One thing of note from all my rpi2 builds: I've learned to avoid doing a "svnlite status /usr/src/" and similar commands. Fairly frequently they do not complete and each existing ssh connection to the rpi2 quits responding once some new program is attempted from the connection. The same for directly at the rpi2 (via USB devices).

Unfortunately /var/log/messages only shows the following boot, no messages from the hang-up context. I'd guess that USB (and other such?) communication stopped operating.



The src.conf on the rpi2 has (the amd64->rpi2 cross compile was very similar but the amd64-host-targets-self clang/clang++ commands do not need the -f. . . and -m. . . ):

> TO_TYPE=armv6
> TOOLS_TO_TYPE=arm-gnueabi
> FROM_TYPE=${TO_TYPE}
> TOOLS_FROM_TYPE=${TOOLS_TO_TYPE}
> VERSION_CONTEXT=11.0
> #
> KERNCONF=RPI2-NODBG
> TARGET=arm
> .if ${.MAKE.LEVEL} == 0
> TARGET_ARCH=${TO_TYPE}
> .export TARGET_ARCH
> .endif
> #
> WITHOUT_CROSS_COMPILER=
> #
> # For WITH_BOOT= . . . (amd64 cross compile context)
> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
> WITHOUT_BOOT=
> #
> WITH_FAST_DEPEND=
> WITH_LIBCPLUSPLUS=
> WITH_CLANG=
> WITH_CLANG_IS_CC=
> WITH_CLANG_FULL=
> WITH_LLDB=
> WITH_CLANG_EXTRAS=
> #
> WITHOUT_LIB32=
> WITHOUT_GCC=
> WITHOUT_GNUCXX=
> #
> NO_WERROR=
> MALLOC_PRODUCTION=
> #CFLAGS+= -DELF_VERBOSE
> #
> WITH_DEBUG=
> WITH_DEBUG_FILES=
> #
> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related bintutils...
> #
> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
> X_COMPILER_TYPE=clang
> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
> .if ${.MAKE.LEVEL} == 0
> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> .export XCC
> .export XCXX
> .export XCPP
> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
> .export XAS
> .export XAR
> .export XLD
> .export XNM
> .export XOBJCOPY
> .export XOBJDUMP
> .export XRANLIB
> .export XSIZE
> .export XSTRINGS
> .endif
> #
> # From clang (via system)...
> #
> .if ${.MAKE.LEVEL} == 0
> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> .export CC
> .export CXX
> .export CPP
> .endif
> #
> #
> # TOOLS_FROM_TYPE binutils from xtoolchain like context...
> #
> .if ${.MAKE.LEVEL} == 0
> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
> OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
> .export AS
> .export AR
> .export LD
> .export NM
> .export OBJCOPY
> .export OBJDUMP
> .export RANLIB
> .export SIZE
> .export STRINGS
> .endif

This technique does require devel/arm-gnueabi-binutils to have been built and operating okay on amd64 and later on the rpi2. So far I've no hints of any problems in that area.



The RPI2-NODBG config is shown below:

> # more /usr/src/sys/arm/conf/RPI2-NODBG
> ident           RPI2-NODBG
>
> include         "RPI2"
>
> makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols
> options         ALT_BREAK_TO_DEBUGGER
> #options        VERBOSE_SYSINIT         # Enable verbose sysinit messages
>
> options         KDB                     # Enable kernel debugger support
>
> # For minimum debugger support (stable branch) use:
> #options        KDB_TRACE               # Print a stack trace for a panic
> options         DDB                     # Enable the kernel debugger
>
> nooptions       INVARIANTS              # Enable calls of extra sanity checking
> nooptions       INVARIANT_SUPPORT       # Extra sanity checks of internal structures, required by INVARIANTS
> nooptions       WITNESS                 # Enable checks to detect deadlocks and cycles
> nooptions       WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed
> nooptions       DIAGNOSTIC


Most of my /usr/src/ tailoring is tied to powerpc and powerpc64 issues:

> # svnlite status /usr/src/
> ?       /usr/src/.snap
> M       /usr/src/contrib/libcxxrt/guard.cc
> M       /usr/src/lib/csu/powerpc64/Makefile
> M       /usr/src/lib/libc/stdio/findfp.c
> ?       /usr/src/lib/libc/stdio/findfp.c.orig
> ?       /usr/src/restoresymtable
> ?       /usr/src/sys/arm/conf/RPI2-NODBG
> M       /usr/src/sys/boot/ofw/Makefile.inc
> M       /usr/src/sys/boot/powerpc/Makefile.inc
> M       /usr/src/sys/boot/uboot/Makefile.inc
> ?       /usr/src/sys/powerpc/conf/GENERIC64vtsc
> ?       /usr/src/sys/powerpc/conf/GENERIC64vtsc-NODEBUG
> ?       /usr/src/sys/powerpc/conf/GENERICvtsc
> ?       /usr/src/sys/powerpc/conf/GENERICvtsc-NODEBUG
> M       /usr/src/sys/powerpc/ofw/ofw_machdep.c

lib/libc/stdio/findfp.c has the patch I was asked to test.

contrib/libcxxrt/guard.cc is to avoid bad C++ source code (use of C11-specific notation in C++ that is reported as syntax errors in powerpc64-xtoolchain-gcc/powerpc64-gcc compilation contexts):

> # svnlite diff /usr/src/contrib/libcxxrt/guard.cc
> Index: /usr/src/contrib/libcxxrt/guard.cc
> ===================================================================
> --- /usr/src/contrib/libcxxrt/guard.cc	(revision 292756)
> +++ /usr/src/contrib/libcxxrt/guard.cc	(working copy)
> @@ -101,7 +101,7 @@
>  	uint32_t init_half;
>  	uint32_t lock_half;
>  } guard_t;
> -_Static_assert(sizeof(guard_t) == sizeof(uint64_t), "");
> +//_Static_assert(sizeof(guard_t) == sizeof(uint64_t), "");
>  static const uint32_t LOCKED = 1;
>  static const uint32_t INITIALISED = static_cast<guard_lock_t>(1) << 24;
>  #	endif

The sys/boot/. . . examples are just use of -Wl, notation in LDFLAGS where the original notation was rejected, such as:

> # svnlite diff /usr/src/sys/boot/uboot/Makefile.inc
> Index: /usr/src/sys/boot/uboot/Makefile.inc
> ===================================================================
> --- /usr/src/sys/boot/uboot/Makefile.inc	(revision 292756)
> +++ /usr/src/sys/boot/uboot/Makefile.inc	(working copy)
> @@ -2,7 +2,7 @@
>
>  .if ${MACHINE_ARCH} == "powerpc64"
>  CFLAGS+=	-m32 -mcpu=powerpc
> -LDFLAGS+=	-m elf32ppc_fbsd
> +LDFLAGS+=	-Wl,-m -Wl,elf32ppc_fbsd
>  .endif
>
>  .include "../Makefile.inc"

All 3 are powerpc64 specific changes.

===
Mark Millard
markmi at dsl-only.net

>
> Older material:
>
> On 2015-Dec-25, at 5:21 PM, Mark Millard <markmi@dsl-only.net> wrote:
>
>> On 2015-Dec-25, at 3:42 PM, Warner Losh <imp@bsdimp.com> wrote:
>>
>>
>>> On Dec 25, 2015, at 3:14 PM, Mark Millard <markmi@dsl-only.net> wrote:
>>>
>>> [I'm going to break much of the earlier "original material" text to tail of the message.]
>>>
>>>> On 2015-Dec-25, at 11:53 AM, Warner Losh <imp@bsdimp.com> wrote:
>>>>
>>>> So what happens if we actually fix the underlying bug?
>>>>=20
>>>> I see two ways of doing this. In findfp.c, we allocate an array of FILE * today like:
>>>>   g = (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * sizeof(FILE));
>>>> but that assumes that FILE just has normal pointer alignment requirements. However,
>>>> due to the mbstate having int64_t alignment requirements, this is wrong. Maybe we
>>>> need to do something like
>>>> 	g = (struct glue *)malloc(sizeof(*g) + max(sizeof(int64_t),ALIGNBYTES) + n * sizeof(FILE));
>>>> which wouldn’t change anything on LP64 systems, but would result in proper alignment
>>>> for ILP32 systems. We’d have to fix the loop that uses ALIGN afterwards to use
>>>> roundup. Instead, we’d need to round up to the nearest 8-byte aligned offset (or technically,
>>>> the max of ALIGNBYTES and 8, but that’s always 8 on today’s systems). If we do this,
>>>> we can make sure that each file is 8-byte aligned or better. We may need to round up
>>>> sizeof(FILE) to a multiple of 8 as well. I believe that since it has the 8-byte alignment
>>>> for a member, its size must be a multiple of 8, but I’ve not chased that belief to ground.
>>>> If not, we may need another decorator (__aligned(8), I think, spelled with the ugly
>>>> max expression above). That way, the contract we’re making with the compiler will
>>>> always be true. ALIGNBYTES is 4 on Arm anyway, so that bit is clearly wrong.
>>>>
>>>> This wouldn’t be an ABI change, since you can only get a valid FILE * from fopen (and
>>>> friends), plus stdin, stdout, and stderr. Those addresses aren’t hard coded into binaries,
>>>> so even if we have to tweak the last three and deal with some ‘fake’ FILE abuse in libc
>>>> (which I don’t think suffers from this issue, btw, given the alignment requirements that would
>>>> naturally follow from something on the stack), we’d still be ahead. At least for all CONFORMING
>>>> implementations[*]...
>>>>
>>>> TL;DR: Why not make FILE * always 8-byte aligned? The compiler options are a band-aid.
>>>>
>>>> Warner
>>>>
>>>> [*] There’s at least one popular package that has a copy of the FILE structure in one of its
>>>> .h files and uses that to do unnatural optimization things, but even that’s cool, I think,
>>>> since it never allocates a new one.
>>>>
>>>
>>> The ARM documentation mentions cases of 16 byte alignment requirements. I've no clue if the clang code generation ever creates such code. There might be wider requirements possible in arm code as well. (I'm not an arm expert.) As an example of an implication: "The malloc() function returns a pointer to a block of at least size bytes suitably aligned for any use." In other words: aligned to some figure that is a multiple of *every* alignment requirement that the code generator can produce, possibly being the least common multiple.
>>>
>>> "-fmax-type-align=. . ." is a means of controlling/limiting the range of potential alignments to no more than a fixed, predefined value. Above that and the code generation has to work in small size accesses and build-up/split-up bigger values. Using "-fmax-type-align=. . ." allows defining a figure as part of an ABI that is then not subject to code generator updates that could increase the maximum alignment figure and break things: It turns off such new capabilities. Other options need not work that way to preserve the ABI.
>>
>> That’s true, as far as it goes… But I’m not sure it goes far enough. The premise here is that the problem is wide-spread, when in fact I think it is quite narrow.
>>
>>> But in the most fundamental terms process wise as far as I can tell. . .
>>>
>>> While the FILE case that occurred is a specific example, every memory-allocation-like operation is at a potential issue for all such "allocated" objects where the related code generation requires alignment to avoid Bus Error (given the SCTLR bit[1] in use).
>>
>> The problem isn’t general. The problem isn’t malloc. Malloc will generally return the right thing on arm (and if it doesn’t,
>> then we need to make sure it does).
>>
>> The problem is we get a boatload of FILEs from the system all at once, and those are misaligned because of a bug in the code. One that’s fixed, I believe, in https://reviews.freebsd.org/D4708.
>>
>>
>>> How many other places in FreeBSD might sometimes return mis-aligned pointers for the existing code generation and ABI combination?
>>
>> It isn’t an ABI thing, just a code bug thing. The only reason it was an issue was due to the optimizing nature of clang.
>>
>> We’ve had to deal with the arm alignment issues for years. I wager there are very few indeed. The only reason this was brought to light was better code-gen from clang.
>>
>>> How many other places are subject to breakage when "internal" structs/unions/fields involved are changed to be of a different size because the code is not fully auto-adjusting to match the code generation yet --even if right now "it works"? How fragile will things be for future work?
>>
>> If there are others, I’ll bet they could be counted on one hand since very few things do the ‘slab’ allocator that FILE does.
>>
>>> What would it take to find out and deal with them all? (I do not have the background knowledge to span much.)
>>>
>>> My experiment avoided potentially changing parts of the ABI and also avoided dealing with such a "lots of code to investigate" issue. It may not be the long term 11.0-RELEASE solution. Even if not, it may be appropriate for various temporary purposes that need to avoid Bus Errors in the process. For example if Ian has a good reason to use clang 3.7 instead of gcc 4.2.1.
>>
>> The review above doesn’t change the ABI either.
>>
>>> Other notes:
>>>
>>>> I believe that since it has the 8-byte alignment
>>>> for a member, its size must be a multiple of 8
>>>
>>> There are some C/C++ language rules about the address of a structure equalling the address of the first field, uniformity of the offsets, and the like. But. . .
>>>
>>> The C and C++ languages specify no specific numerical alignment figures, not even relative to specific sizeof(...) expressions. To use an old example: a 68010 only needs alignment for >= 2 byte things and even alignment is all that is then required. Some other contexts take a lot more to meet the specifications. There are some implications of the modern memory model(s) created to cover concurrency explicitly, such as avoiding interactions that can happen via, for example, separate objects (in part) sharing a cache line. (I've only looked at C++ for this, and only to a degree.)
>>>
>>> The detailed alignment rules are more "implementation defined" than "predefined by the standard". But the definition is trying to meet language criteria. It is not a fully independent choice.
>>
>> Many of them are actually defined by a combination of the standard language definition, as well as the ABI standard. This is why we know that mbstate_t must be 8 byte aligned.
>>
>>> Maybe some other standards that FreeBSD is tied to specify more specifics, such as a N byte integer always aligns to some multiple of N (a waste on the 68010), including the alignment for union or struct that it may be a part of tracking. But such rules force padding that may or may not be required to meet the language's more abstract criteria and such rules may not match the existing/in-use ABI.
>>
>> It is all spelled out in the ARM EABI docs.
>>
>>> So far as I can tell explicitly declared alignments may well be necessary. If that one "popular package", say, formed an array of FILE copies then the resultant alignments need not all match the ones produced by your example code unless the FILE declaration forces the compiler to match, causing sizeof(FILE) to track as well. FILE need not be the only such issue.
>>
>> Arrays of FILEs isn’t an issue (except that it encodes the size of FILE into the app). It’s the specifically quirky way that libc does it that’s the problem.
>>
>>> My background and reference material are mostly tied to the languages --and so my notes tend to be limited to that much context.
>>
>> Understood. While there may be issues with alignment still, tossing a big hammer at the problem because they might exist will likely mean they will persist far longer than fixing them one at a time. When we first ported to arm, there were maybe half a dozen places that needed fixing. I doubt there’s more now.
>>
>> Can you try the patch in the above code review w/o the -f switch and let me know if it works for you?
>>
>> Warner
>
> buildworld/buildkernel has been started on amd64 for a rpi2 target. That and install kernel/world and starting up a port rebuild on the rpi2 and waiting for it means it will be a few hours even if I start the next thing just as each prior thing finishes. I may give up and go to sleep first.
>
> As for presumptions: I'll take your word on expected status of things. I've no clue. But absent even the hear-say status information at the time I did not presume that what was in front of me was all there is to worry about --nor did I try to go figure it all out on my own. I took a path to cover both possibilities for local-only vs. more-wide-spread (so long as that path did not force a split-up of some larger form of atomic action).
>
> In my view "-mno-unaligned-access" is an even bigger hammer than I used. I find no clang statement about what its ABI consequences would be, unlike for what I did: What mix of more padding for alignment vs. more but smaller accesses? But as I remember I've seen "-mno-unaligned-access" in use in ports and the like so its consequences may be familiar material for some folks.
>
> Absent any questions about ABI consequences "-mno-unaligned-access" does well mark the expected SCTLR bit[1] status, far better than what I did. Again: I was covering my ignorance while making any significant investigation/debugging as unlikely as I could.
>
>=20
>> Original material:
>>
>>> On Dec 25, 2015, at 7:24 AM, Mark Millard <markmi@dsl-only.net> wrote:
>>>
>>> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=4 has so far removed the crashes during the toolchain activity: no more misaligned accesses in libc's _fseeko or elsewhere.]
>>>
>>> On 2015-Dec-25, at 12:31 AM, Mark Millard <markmi@dsl-only.net> wrote:
>>>
>>>> On 2015-Dec-24, at 10:39 PM, Mark Millard <markmi@dsl-only.net> wrote:
>>>>
>>>>> [I do not know if this partial crash analysis related to on-arm clang-associated activity is good enough and appropriate to submit or not.]
>>>>>
>>>>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below came from pkg install activity instead of port building. Used as-is.
>>>>>
>>>>> When I just tried my first from-rpi2b builds (ports for a rpi2b), /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the following suggests an alignment error for the type of instructions that a 128-byte memset (sizeof(mbstate_t)) was translated to in the code used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to check SCTLR bit[1] to be directly sure that alignment was being enforced.)
>>>>>
>>>>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar:
>>>>>
>>>>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a  bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
>>>>>> Bus error (core dumped)
>>>>>> *** [libgnuintl.la] Error code 138
>>>>>
>>>>> It failed in _fseeko doing a memset that turned into uses of "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0 that ended in 0xa4, so it was not aligned to an 8-byte boundary. From what I read, such "VSTn (multiple n-element structures)" instructions that have .64 require 8-byte alignment. The evidence of the code and register value follows.
>>>>>
>>>>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>>>>>> . . .
>>>>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, whence=<value optimized out>, ltest=<value optimized out>) at /usr/src/lib/libc/stdio/fseek.c:299
>>>>>> 299		memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>>>>>> . . .
>>>>>> (gdb) x/24i 0x2033adb0
>>>>>> 0x2033adb0 <_fseeko+836>:	vmov.i32	q8, #0	; 0x00000000
>>>>>> 0x2033adb4 <_fseeko+840>:	movw	r1, #65503	; 0xffdf
>>>>>> 0x2033adb8 <_fseeko+844>:	stm	r4, {r0, r7}
>>>>>> 0x2033adbc <_fseeko+848>:	ldrh	r0, [r4, #12]
>>>>>> 0x2033adc0 <_fseeko+852>:	and	r0, r0, r1
>>>>>> 0x2033adc4 <_fseeko+856>:	strh	r0, [r4, #12]
>>>>>> 0x2033adc8 <_fseeko+860>:	add	r0, r4, #216	; 0xd8
>>>>>> 0x2033adcc <_fseeko+864>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033add0 <_fseeko+868>:	add	r0, r4, #200	; 0xc8
>>>>>> 0x2033add4 <_fseeko+872>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033add8 <_fseeko+876>:	add	r0, r4, #184	; 0xb8
>>>>>> 0x2033addc <_fseeko+880>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033ade0 <_fseeko+884>:	add	r0, r4, #168	; 0xa8
>>>>>> 0x2033ade4 <_fseeko+888>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033ade8 <_fseeko+892>:	add	r0, r4, #152	; 0x98
>>>>>> 0x2033adec <_fseeko+896>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033adf0 <_fseeko+900>:	add	r0, r4, #136	; 0x88
>>>>>> 0x2033adf4 <_fseeko+904>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033adf8 <_fseeko+908>:	add	r0, r4, #120	; 0x78
>>>>>> 0x2033adfc <_fseeko+912>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033ae00 <_fseeko+916>:	add	r0, r4, #104	; 0x68
>>>>>> 0x2033ae04 <_fseeko+920>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033ae08 <_fseeko+924>:	b	0x2033b070 <_fseeko+1540>
>>>>>> 0x2033ae0c <_fseeko+928>:	cmp	r5, #0	; 0x0
>>>>>> (gdb) info all-registers
>>>>>> r0             0x20651ea4	543497892
>>>>>> r1             0xffdf	65503
>>>>>> r2             0x0	0
>>>>>> r3             0x0	0
>>>>>> r4             0x20651dcc	543497676
>>>>>> r5             0x0	0
>>>>>> r6             0x0	0
>>>>>> r7             0x0	0
>>>>>> r8             0x20359df4	540384756
>>>>>> r9             0x0	0
>>>>>> r10            0x0	0
>>>>>> r11            0xbfbfb948	-1077954232
>>>>>> r12            0x2037b208	540520968
>>>>>> sp             0xbfbfb898	-1077954408
>>>>>> lr             0x2035a004	540385284
>>>>>> pc             0x2033adcc	540257740
>>>>>> f0             0	(raw 0x000000000000000000000000)
>>>>>> f1             0	(raw 0x000000000000000000000000)
>>>>>> f2             0	(raw 0x000000000000000000000000)
>>>>>> f3             0	(raw 0x000000000000000000000000)
>>>>>> f4             0	(raw 0x000000000000000000000000)
>>>>>> f5             0	(raw 0x000000000000000000000000)
>>>>>> f6             0	(raw 0x000000000000000000000000)
>>>>>> f7             0	(raw 0x000000000000000000000000)
>>>>>> fps            0x0	0
>>>>>> cpsr           0x60000010	1610612752
>>>>>
>>>>> The syntax in use for vst1.64 instructions does not explicitly have the alignment notation. Presuming that the decoding is correct then from what I read the following applies:
>>>>>
>>>>>> Home > NEON and VFP Programming > NEON load and store element and structure instructions > Alignment restrictions in load and store, element and structure instructions
>>>>>>
>>>>>> . . . When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>>>>>   • if the A bit is 0, there are no alignment restrictions (except for strongly ordered or device memory, where accesses must be element aligned or the result is unpredictable)
>>>>>>   • if the A bit is 1, accesses must be element aligned.
>>>>>> If an address is not correctly aligned, an alignment fault occurs.
>>>>>
>>>>> So if the "A bit" (SCTLR bit[1]) is 1 at the time, then the misalignment would be enough to cause the Bus error.
>>>>>
>>>>> The following shows the make.conf context that explains how /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>>>>>
>>>>>> # more /etc/make.conf
>>>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>>>> WITH_DEBUG=
>>>>>> WITH_DEBUG_FILES=
>>>>>> MALLOC_PRODUCTION=
>>>>>> #
>>>>>> TO_TYPE=armv6
>>>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>>>> .if ${.MAKE.LEVEL} == 0
>>>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>>> .export CC
>>>>>> .export CXX
>>>>>> .export CPP
>>>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>>>> .export AS
>>>>>> .export AR
>>>>>> .export LD
>>>>>> .export NM
>>>>>> .export OBJCOPY
>>>>>> .export OBJDUMP
>>>>>> .export RANLIB
>>>>>> .export SIZE
>>>>>> .export STRINGS
>>>>>> .endif
>>>>>
>>>>>
>>>>> Other context:
>>>>>
>>>>>> # freebsd-version -ku; uname -aKU
>>>>>> 11.0-CURRENT
>>>>>> 11.0-CURRENT
>>>>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 22:02:21 PST 2015     root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG  arm 1100091 1100091
>>>>>
>>>>>
>>>>>
>>>>> I will note that world and kernel are my own build of -r292413 (earlier experiment) --a build made from an amd64 host context and put in place via DESTDIR=. My expectation would be that the amd64 context would not be likely to have similar alignment restrictions involved in its ar activity (or other activity). That would explain how I got this far using such a clang 3.7 related toolchain for targeting an rpi2 before finding such a problem.
>>>>
>>>>
>>>> I realized on re-reading all of the above that it seems to suggest that the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but that was not my intent.
>>>>
>>>> libc.so.7 is from my buildworld, including the fseeko implementation:
>>>>
>>>> Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
>>>> done.
>>>> Loaded symbols for /lib/libc.so.7
>>>>
>>>>
>>>> head/sys/sys/_types.h has:
>>>>
>>>> /*
>>>> * mbstate_t is an opaque object to keep conversion state during multibyte
>>>> * stream conversions.
>>>> */
>>>> typedef union {
>>>> char            __mbstate8[128];
>>>> __int64_t       _mbstateL;      /* for alignment */
>>>> } __mbstate_t;
>>>>
>>>> suggesting an implicit alignment of the union to whatever the implementation defines for __int64_t --which need not be 8 byte alignment (in the abstract, general case). But 8 byte alignment is a possibility as well (in the abstract).
>>>>
>>>> But printing *fp in gdb for the fp argument to _fseeko reports the same not-8-byte aligned address for __mbstate8 that was in r0:
>>>>
>>>>> (gdb) bt
>>>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, whence=<value optimized out>, ltest=<value optimized out>) at /usr/src/lib/libc/stdio/fseek.c:299
>>>>> #1  0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
>>>>> #2  0x00016138 in ?? ()
>>>>> (gdb) print fp
>>>>> $2 = (FILE *) 0x20651dcc
>>>>> (gdb) print *fp
>>>>> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, _close = 0x20359dfc <__sclose>,
>>>>> _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>>>>> _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}
>>>>
>>>> The overall FILE struct containing the _mbstate field is also not 8-byte aligned. But the offset from the start of the FILE struct to __mbstate8 is a multiple of 8 bytes.
>>>>
>>>> It is my interpretation that there is nothing here to justify the combination that the memset implementation ends up with:
>>>>
>>>> SCTLR bit[1]==1
>>>>
>>>> mixed with
>>>>
>>>> vst1.64 instructions
>>>>
>>>> I.e.: one or both need to change unless some way of forcing 8-byte alignment is introduced.
>>>>
>>>> I have not managed to track down anything that would indicate FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required by the design to be constant (once initialized).
>>>
>>>
>>> I have (so far) removed the build tool crashes based on adding -fmax-type-align=4 to avoid the misaligned accesses. Details follow.
>>>
>>> src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now looks like:
>>>
>>>> # more ~/src.configs/src.conf.rpi2-clang.amd64-host
>>>> TO_TYPE=armv6
>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>> FROM_TYPE=amd64
>>>> TOOLS_FROM_TYPE=x86_64
>>>> VERSION_CONTEXT=11.0
>>>> #
>>>> KERNCONF=RPI2-NODBG
>>>> TARGET=arm
>>>> .if ${.MAKE.LEVEL} == 0
>>>> TARGET_ARCH=${TO_TYPE}
>>>> .export TARGET_ARCH
>>>> .endif
>>>> #
>>>> WITHOUT_CROSS_COMPILER=
>>>> #
>>>> # For WITH_BOOT= . . .
>>>> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
>>>> WITHOUT_BOOT=
>>>> #
>>>> WITH_FAST_DEPEND=
>>>> WITH_LIBCPLUSPLUS=
>>>> WITH_CLANG=
>>>> WITH_CLANG_IS_CC=
>>>> WITH_CLANG_FULL=
>>>> WITH_LLDB=
>>>> WITH_CLANG_EXTRAS=
>>>> #
>>>> WITHOUT_LIB32=
>>>> WITHOUT_GCC=
>>>> WITHOUT_GNUCXX=
>>>> #
>>>> NO_WERROR=
>>>> MALLOC_PRODUCTION=
>>>> #CFLAGS+= -DELF_VERBOSE
>>>> #
>>>> WITH_DEBUG=
>>>> WITH_DEBUG_FILES=
>>>> #
>>>> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related binutils...
>>>> #
>>>> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
>>>> X_COMPILER_TYPE=clang
>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>> .if ${.MAKE.LEVEL} == 0
>>>> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> .export XCC
>>>> .export XCXX
>>>> .export XCPP
>>>> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>> .export XAS
>>>> .export XAR
>>>> .export XLD
>>>> .export XNM
>>>> .export XOBJCOPY
>>>> .export XOBJDUMP
>>>> .export XRANLIB
>>>> .export XSIZE
>>>> .export XSTRINGS
>>>> .endif
>>>> #
>>>> # Host compiler stuff:
>>>> .if ${.MAKE.LEVEL} == 0
>>>> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> .export CC
>>>> .export CXX
>>>> .export CPP
>>>> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
>>>> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
>>>> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
>>>> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
>>>> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
>>>> OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
>>>> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
>>>> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
>>>> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
>>>> .export AS
>>>> .export AR
>>>> .export LD
>>>> .export NM
>>>> .export OBJCOPY
>>>> .export OBJDUMP
>>>> .export RANLIB
>>>> .export SIZE
>>>> .export STRINGS
>>>> .endif
>>>
>>> make.conf for use during the on-rpi2 port builds now looks like:
>>>
>>>> $ more /etc/make.conf
>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>> WITH_DEBUG=
>>>> WITH_DEBUG_FILES=
>>>> MALLOC_PRODUCTION=
>>>> #
>>>> TO_TYPE=armv6
>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>> .if ${.MAKE.LEVEL} == 0
>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> .export CC
>>>> .export CXX
>>>> .export CPP
>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>> .export AS
>>>> .export AR
>>>> .export LD
>>>> .export NM
>>>> .export OBJCOPY
>>>> .export OBJDUMP
>>>> .export RANLIB
>>>> .export SIZE
>>>> .export STRINGS
>>>> .endif
>>>
>>>
>>>
>>> ===
>>> Mark Millard
>>> markmi at dsl-only.net
>>>
>>>
>>>
>>> _______________________________________________
>>> freebsd-toolchain@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
>>> To unsubscribe, send any mail to "freebsd-toolchain-unsubscribe@freebsd.org"
>
>
>




