From: Warner Losh
Date: Fri, 25 Dec 2015 16:42:38 -0700
Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error?
To: Mark Millard
Cc: freebsd-arm, FreeBSD Toolchain, Ian Lepore, mat@FreeBSD.org, sbruno@FreeBSD.org
Message-Id: <8B52074F-FDEF-4119-BB04-630F9BE9E6DB@bsdimp.com>
In-Reply-To: <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net>

> On Dec 25, 2015, at 3:14 PM, Mark Millard wrote:
>
> [I'm going to break much of the earlier "original material" text to
> tail of the message.]
>
>> On 2015-Dec-25, at 11:53 AM, Warner Losh wrote:
>>
>> So what happens if we actually fix the underlying bug?
>>
>> I see two ways of doing this. In findfp.c, we allocate an array of FILE * today like:
>>     g = (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * sizeof(FILE));
>> but that assumes that FILE just has normal pointer alignment requirements. However,
>> due to the mbstate having int64_t alignment requirements, this is wrong. Maybe we
>> need to do something like
>>     g = (struct glue *)malloc(sizeof(*g) + max(sizeof(int64_t),ALIGNBYTES) + n * sizeof(FILE));
>> which wouldn’t change anything on LP64 systems, but would result in proper alignment
>> for ILP32 systems. We’d have to fix the loop that uses ALIGN afterwards to use
>> roundup. Instead, we’d need to round up to the nearest 8-byte aligned offset (or technically,
>> the max of ALIGNBYTES and 8, but that’s always 8 on today’s systems). If we do this,
>> we can make sure that each file is 8-byte aligned or better. We may need to round up
>> sizeof(FILE) to a multiple of 8 as well. I believe that since it has the 8-byte alignment
>> for a member, its size must be a multiple of 8, but I’ve not chased that belief to ground.
>> If not, we may need another decorator (__aligned(8), I think, spelled with the ugly
>> max expression above). That way, the contract we’re making with the compiler will
>> always be true. ALIGNBYTES is 4 on Arm anyway, so that bit is clearly wrong.
>>
>> This wouldn’t be an ABI change, since you can only get a valid FILE * from fopen (and
>> friends), plus stdin, stdout, and stderr.
>> Those addresses aren’t hard coded into binaries,
>> so even if we have to tweak the last three and deal with some ‘fake’ FILE abuse in libc
>> (which I don’t think suffers from this issue, btw, given the alignment requirements that would
>> naturally follow from something on the stack), we’d still be ahead. At least for all CONFORMING
>> implementations[*]...
>>
>> TL;DR: Why not make FILE * always 8-byte aligned? The compiler options are a band-aid.
>>
>> Warner
>>
>> [*] There’s at least one popular package that has a copy of the FILE structure in one of its
>> .h files and uses that to do unnatural optimization things, but even that’s cool, I think,
>> since it never allocates a new one.
>>
>
> The ARM documentation mentions cases of 16 byte alignment requirements. I've no clue if the clang code generation ever creates such code. There might be wider requirements possible in arm code as well. (I'm not an arm expert.) As an example of an implication: "The malloc() function returns a pointer to a block of at least size bytes suitably aligned for any use." In other words: aligned to some figure that is a multiple of *every* alignment requirement that the code generator can produce, possibly being the least common multiple.
>
> "-fmax-type-align=. . ." is a means of controlling/limiting the range of potential alignments to no more than a fixed, predefined value. Above that and the code generation has to work in small size accesses and build-up/split-up bigger values. Using "-fmax-type-align=. . ." allows defining a figure as part of an ABI that is then not subject to code generator updates that could increase the maximum alignment figure and break things: It turns off such new capabilities. Other options need not work that way to preserve the ABI.

That’s true, as far as it goes… But I’m not sure it goes far enough.
The premise here is that the problem is wide-spread, when in fact I think it is quite narrow.

> But in the most fundamental terms process wise as far as I can tell. . .
>
> While the FILE case that occurred is a specific example, every memory-allocation-like operation is at a potential issue for all such "allocated" objects where the related code generation requires alignment to avoid Bus Error (given the SCTLR bit[1] in use).

The problem isn’t general. The problem isn’t malloc. Malloc will generally return the right thing on arm (and if it doesn’t, then we need to make sure it does).

The problem is we get a boatload of FILEs from the system all at once, and those are misaligned because of a bug in the code. One that’s fixed, I believe, in https://reviews.freebsd.org/D4708.

> How many other places in FreeBSD might sometimes return mis-aligned pointers for the existing code generation and ABI combination?

It isn’t an ABI thing, just a code bug thing. The only reason it was an issue was due to the optimizing nature of clang.

We’ve had to deal with the arm alignment issues for years. I wager there are very few indeed. The only reason this was brought to light was better code-gen from clang.

> How many other places are subject to breakage when "internal" structs/unions/fields involved are changed to be of a different size because the code is not fully auto-adjusting to match the code generation yet --even if right now "it works"? How fragile will things be for future work?

If there are others, I’ll bet they could be counted on one hand since very few things do the ‘slab’ allocator that FILE does.

> What would it take to find out and deal with them all? (I do not have the background knowledge to span much.)
>
> My experiment avoided potentially changing parts of the ABI and also avoided dealing with such a "lots of code to investigate" issue.
> It may not be the long term 11.0-RELEASE solution. Even if not, it may be appropriate for various temporary purposes that need to avoid Bus Errors in the process. For example if Ian has a good reason to use clang 3.7 instead of gcc 4.2.1.

The review above doesn’t change the ABI either.

> Other notes:
>
>> I believe that since it has the 8-byte alignment
>> for a member, its size must be a multiple of 8
>
> There are some C/C++ language rules about the address of a structure equalling the address of the first field, uniformity of the offsets, and the like. But. . .
>
> The C and C++ languages specify no specific numerical alignment figures, not even relative to specific sizeof(...) expressions. To use an old example: a 68010 only needs alignment for >= 2 byte things and even alignment is all that is then required. Some other contexts take a lot more to meet the specifications. There are some implications of the modern memory model(s) created to cover concurrency explicitly, such as avoiding interactions that can happen via, for example, separate objects (in part) sharing a cache line. (I've only looked at C++ for this, and only to a degree.)
>
> The detailed alignment rules are more "implementation defined" than "predefined by the standard". But the definition is trying to meet language criteria. It is not a fully independent choice.

Many of them are actually defined by a combination of the standard language definition, as well as the ABI standard. This is why we know that mbstate_t must be 8 byte aligned.

> May be some other standards that FreeBSD is tied to specify more specifics, such as a N byte integer always aligns to some multiple of N (a waste on the 68010), including the alignment for union or struct that it may be a part of tracking.
> But such rules force padding that may or may not be required to meet the language's more abstract criteria and such rules may not match the existing/in-use ABI.

It is all spelled out in the ARM EABI docs.

> So far as I can tell explicitly declared alignments may well be necessary. If that one "popular package", say, formed an array of FILE copies then the resultant alignments need not all match the ones produced by your example code unless the FILE declaration forces the compiler to match, causing sizeof(FILE) to track as well. FILE need not be the only such issue.

Arrays of FILEs aren’t an issue (except that it encodes the size of FILE into the app). It’s the specifically quirky way that libc does it that’s the problem.

> My background and reference material are mostly tied to the languages --and so my notes tend to be limited to that much context.

Understood. While there may be issues with alignment still, tossing a big hammer at the problem because they might exist will likely mean they will persist far longer than fixing them one at a time. When we first ported to arm, there were maybe half a dozen places that needed fixing. I doubt there’s more now.

Can you try the patch in the above code review w/o the -f switch and let me know if it works for you?

Warner

> Original material:
>
>> On Dec 25, 2015, at 7:24 AM, Mark Millard wrote:
>>
>> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=4 has so far removed the crashes during the toolchain activity: no more misaligned accesses in libc's _fseeko or elsewhere.]
>>
>> On 2015-Dec-25, at 12:31 AM, Mark Millard wrote:
>>
>>> On 2015-Dec-24, at 10:39 PM, Mark Millard wrote:
>>>
>>>> [I do not know if this partial crash analysis related to on-arm clang-associated activity is good enough and appropriate to submit or not.]
>>>>
>>>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below came from pkg install activity instead of port building. Used as-is.
>>>>
>>>> When I just tried my first from-rpi2b builds (ports for a rpi2b), /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the following suggests an alignment error for the type of instructions that memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to check SCTLR bit[1] to be directly sure that alignment was being enforced.)
>>>>
>>>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar :
>>>>
>>>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
>>>>> Bus error (core dumped)
>>>>> *** [libgnuintl.la] Error code 138
>>>>
>>>> It failed in _fseeko doing a memset that turned into uses of "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. From what I read such "VSTn (multiple n-element structures)" that have .64 require 8 byte alignment. The evidence of the code and register value follow.
>>>>
>>>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>>>>> . . .
>>>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
>>>>> 299		memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>>>>> . . .
>>>>> (gdb) x/24i 0x2033adb0
>>>>> 0x2033adb0 <_fseeko+836>:	vmov.i32	q8, #0	; 0x00000000
>>>>> 0x2033adb4 <_fseeko+840>:	movw	r1, #65503	; 0xffdf
>>>>> 0x2033adb8 <_fseeko+844>:	stm	r4, {r0, r7}
>>>>> 0x2033adbc <_fseeko+848>:	ldrh	r0, [r4, #12]
>>>>> 0x2033adc0 <_fseeko+852>:	and	r0, r0, r1
>>>>> 0x2033adc4 <_fseeko+856>:	strh	r0, [r4, #12]
>>>>> 0x2033adc8 <_fseeko+860>:	add	r0, r4, #216	; 0xd8
>>>>> 0x2033adcc <_fseeko+864>:	vst1.64	{d16-d17}, [r0]
>>>>> 0x2033add0 <_fseeko+868>:	add	r0, r4, #200	; 0xc8
>>>>> 0x2033add4 <_fseeko+872>:	vst1.64	{d16-d17}, [r0]
>>>>> 0x2033add8 <_fseeko+876>:	add	r0, r4, #184	; 0xb8
>>>>> 0x2033addc <_fseeko+880>:	vst1.64	{d16-d17}, [r0]
>>>>> 0x2033ade0 <_fseeko+884>:	add	r0, r4, #168	; 0xa8
>>>>> 0x2033ade4 <_fseeko+888>:	vst1.64	{d16-d17}, [r0]
>>>>> 0x2033ade8 <_fseeko+892>:	add	r0, r4, #152	; 0x98
>>>>> 0x2033adec <_fseeko+896>:	vst1.64	{d16-d17}, [r0]
>>>>> 0x2033adf0 <_fseeko+900>:	add	r0, r4, #136	; 0x88
>>>>> 0x2033adf4 <_fseeko+904>:	vst1.64	{d16-d17}, [r0]
>>>>> 0x2033adf8 <_fseeko+908>:	add	r0, r4, #120	; 0x78
>>>>> 0x2033adfc <_fseeko+912>:	vst1.64	{d16-d17}, [r0]
>>>>> 0x2033ae00 <_fseeko+916>:	add	r0, r4, #104	; 0x68
>>>>> 0x2033ae04 <_fseeko+920>:	vst1.64	{d16-d17}, [r0]
>>>>> 0x2033ae08 <_fseeko+924>:	b	0x2033b070 <_fseeko+1540>
>>>>> 0x2033ae0c <_fseeko+928>:	cmp	r5, #0	; 0x0
>>>>> (gdb) info all-registers
>>>>> r0             0x20651ea4	543497892
>>>>> r1             0xffdf	65503
>>>>> r2             0x0	0
>>>>> r3             0x0	0
>>>>> r4             0x20651dcc	543497676
>>>>> r5             0x0	0
>>>>> r6             0x0	0
>>>>> r7             0x0	0
>>>>> r8             0x20359df4	540384756
>>>>> r9             0x0	0
>>>>> r10            0x0	0
>>>>> r11            0xbfbfb948	-1077954232
>>>>> r12            0x2037b208	540520968
>>>>> sp             0xbfbfb898	-1077954408
>>>>> lr             0x2035a004	540385284
>>>>> pc             0x2033adcc	540257740
>>>>> f0             0	(raw 0x000000000000000000000000)
>>>>> f1             0	(raw 0x000000000000000000000000)
>>>>> f2             0	(raw 0x000000000000000000000000)
>>>>> f3             0	(raw 0x000000000000000000000000)
>>>>> f4             0	(raw 0x000000000000000000000000)
>>>>> f5             0	(raw 0x000000000000000000000000)
>>>>> f6             0	(raw 0x000000000000000000000000)
>>>>> f7             0	(raw 0x000000000000000000000000)
>>>>> fps            0x0	0
>>>>> cpsr           0x60000010	1610612752
>>>>
>>>> The syntax in use for vst1.64 instructions does not explicitly have the alignment notation. Presuming that the decoding is correct then from what I read the following applies:
>>>>
>>>>> Home > NEON and VFP Programming > NEON load and store element and structure instructions > Alignment restrictions in load and store, element and structure instructions
>>>>>
>>>>> . . . When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>>>> 	• if the A bit is 0, there are no alignment restrictions (except for strongly ordered or device memory, where accesses must be element aligned or the result is unpredictable)
>>>>> 	• if the A bit is 1, accesses must be element aligned.
>>>>> If an address is not correctly aligned, an alignment fault occurs.
>>>>
>>>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would have the context to happen because of the mis-alignment.
>>>>
>>>> The following shows the make.conf context that explains how /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>>>>
>>>>> # more /etc/make.conf
>>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>>> WITH_DEBUG=
>>>>> WITH_DEBUG_FILES=
>>>>> MALLOC_PRODUCTION=
>>>>> #
>>>>> TO_TYPE=armv6
>>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>>> .if ${.MAKE.LEVEL} == 0
>>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>> .export CC
>>>>> .export CXX
>>>>> .export CPP
>>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>>> .export AS
>>>>> .export AR
>>>>> .export LD
>>>>> .export NM
>>>>> .export OBJCOPY
>>>>> .export OBJDUMP
>>>>> .export RANLIB
>>>>> .export SIZE
>>>>> .export STRINGS
>>>>> .endif
>>>>
>>>>
>>>> Other context:
>>>>
>>>>> # freebsd-version -ku; uname -aKU
>>>>> 11.0-CURRENT
>>>>> 11.0-CURRENT
>>>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 22:02:21 PST 2015 root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm 1100091 1100091
>>>>
>>>>
>>>>
>>>> I will note that world and kernel are my own build of -r292413 (earlier experiment) --a build made from an
>>>> amd64 host context and put in place via DESTDIR=. My expectation would be that the amd64 context would not be likely to have similar alignment restrictions involved in its ar activity (or other activity). That would explain how I got this far using such a clang 3.7 related toolchain for targeting an rpi2 before finding such a problem.
>>>
>>>
>>> I realized re-reading the all above that it seems to suggest that the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but that was not my intent.
>>>
>>> libc.so.7 is from my buildworld, including the fseeko implementation:
>>>
>>> Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
>>> done.
>>> Loaded symbols for /lib/libc.so.7
>>>
>>>
>>> head/sys/sys/_types.h has:
>>>
>>> /*
>>>  * mbstate_t is an opaque object to keep conversion state during multibyte
>>>  * stream conversions.
>>>  */
>>> typedef union {
>>> 	char		__mbstate8[128];
>>> 	__int64_t	_mbstateL;	/* for alignment */
>>> } __mbstate_t;
>>>
>>> suggesting an implicit alignment of the union to whatever the implementation defines for __int64_t --which need not be 8 byte alignment (in the abstract, general case). But 8 byte alignment is a possibility as well (in the abstract).
>>>
>>> But printing *fp in gdb for the fp argument to _fseeko reports the same not-8-byte aligned address for __mbstate8 that was in r0:
>>>
>>>> (gdb) bt
>>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
>>>> #1  0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
>>>> #2  0x00016138 in ?? ()
>>>> (gdb) print fp
>>>> $2 = (FILE *) 0x20651dcc
>>>> (gdb) print *fp
>>>> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, _close = 0x20359dfc <__sclose>,
>>>> _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>>>> _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}
>>>
>>> The overall FILE struct containing the _mbstate field is also not 8-byte aligned. But the offset from the start of the FILE struct to __mbstate8 is a multiple of 8 bytes.
>>>
>>> It is my interpretation that there is nothing here to justify the memset implementation combination:
>>>
>>> SCTLR bit[1]==1
>>>
>>> mixed with
>>>
>>> vst1.64 instructions
>>>
>>> I.e.: one or both needs to change unless some way for forcing 8-byte alignment is introduced.
>>>
>>> I have not managed to track down anything that would indicate FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required by the design to be constant (once initialized).
>>
>>
>> I have (so far) removed the build tool crashes based on adding -fmax-type-align=4 to avoid the misaligned accesses. Details follow.
>>
>> src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now looks like:
>>
>>> # more ~/src.configs/src.conf.rpi2-clang.amd64-host
>>> TO_TYPE=armv6
>>> TOOLS_TO_TYPE=arm-gnueabi
>>> FROM_TYPE=amd64
>>> TOOLS_FROM_TYPE=x86_64
>>> VERSION_CONTEXT=11.0
>>> #
>>> KERNCONF=RPI2-NODBG
>>> TARGET=arm
>>> .if ${.MAKE.LEVEL} == 0
>>> TARGET_ARCH=${TO_TYPE}
>>> .export TARGET_ARCH
>>> .endif
>>> #
>>> WITHOUT_CROSS_COMPILER=
>>> #
>>> # For WITH_BOOT= . . .
>>> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
>>> WITHOUT_BOOT=
>>> #
>>> WITH_FAST_DEPEND=
>>> WITH_LIBCPLUSPLUS=
>>> WITH_CLANG=
>>> WITH_CLANG_IS_CC=
>>> WITH_CLANG_FULL=
>>> WITH_LLDB=
>>> WITH_CLANG_EXTRAS=
>>> #
>>> WITHOUT_LIB32=
>>> WITHOUT_GCC=
>>> WITHOUT_GNUCXX=
>>> #
>>> NO_WERROR=
>>> MALLOC_PRODUCTION=
>>> #CFLAGS+= -DELF_VERBOSE
>>> #
>>> WITH_DEBUG=
>>> WITH_DEBUG_FILES=
>>> #
>>> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related bintutils...
>>> #
>>> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
>>> X_COMPILER_TYPE=clang
>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>> .if ${.MAKE.LEVEL} == 0
>>> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> .export XCC
>>> .export XCXX
>>> .export XCPP
>>> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>> .export XAS
>>> .export XAR
>>> .export XLD
>>> .export XNM
>>> .export XOBJCOPY
>>> .export XOBJDUMP
>>> .export XRANLIB
>>> .export XSIZE
>>> .export XSTRINGS
>>> .endif
>>> #
>>> # Host compiler stuff:
>>> .if ${.MAKE.LEVEL} == 0
>>> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> .export CC
>>> .export CXX
>>> .export CPP
>>> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
>>> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
>>> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
>>> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
>>> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
>>> OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
>>> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
>>> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
>>> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
>>> .export AS
>>> .export AR
>>> .export LD
>>> .export NM
>>> .export OBJCOPY
>>> .export OBJDUMP
>>> .export RANLIB
>>> .export SIZE
>>> .export STRINGS
>>> .endif
>>
>> make.conf for during the on-rpi2 port builds now looks like:
>>
>>> $ more /etc/make.conf
>>> WRKDIRPREFIX=/usr/obj/portswork
>>> WITH_DEBUG=
>>> WITH_DEBUG_FILES=
>>> MALLOC_PRODUCTION=
>>> #
>>> TO_TYPE=armv6
>>> TOOLS_TO_TYPE=arm-gnueabi
>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>> .if ${.MAKE.LEVEL} == 0
>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> .export CC
>>> .export CXX
>>> .export CPP
>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>> .export AS
>>> .export AR
>>> .export LD
>>> .export NM
>>> .export OBJCOPY
>>> .export OBJDUMP
>>> .export RANLIB
>>> .export SIZE
>>> .export STRINGS
>>> .endif
>>
>>
>>
>> ===
>> Mark Millard
>> markmi at dsl-only.net
>>
>>