From owner-freebsd-toolchain@freebsd.org Sun Dec 20 14:09:56 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7B08BA14E99 for ; Sun, 20 Dec 2015 14:09:56 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6AEB01D25 for ; Sun, 20 Dec 2015 14:09:56 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tBKE9uCb011285 for ; Sun, 20 Dec 2015 14:09:56 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-toolchain@FreeBSD.org Subject: [Bug 205453] 11.0-CURRENT libcxxrt/guard.cc uses C11's _Static_assert in conditionally-compiled C++ code and when it is used buildworld fails for syntax errors in g++ compilers Date: Sun, 20 Dec 2015 14:09:56 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: bin X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: linimon@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-toolchain@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's 
integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 20 Dec 2015 14:09:56 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205453

Mark Linimon changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-toolchain@FreeBSD.org
                 CC|                            |freebsd-ppc@FreeBSD.org

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-toolchain@freebsd.org Sun Dec 20 16:29:47 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7D677A4D60C for ; Sun, 20 Dec 2015 16:29:47 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 534521755 for ; Sun, 20 Dec 2015 16:29:47 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tBKGTlFa085918 for ; Sun, 20 Dec 2015 16:29:47 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-toolchain@FreeBSD.org Subject: [Bug 205453] 11.0-CURRENT libcxxrt/guard.cc uses C11's _Static_assert in conditionally-compiled C++ code and when it is used buildworld fails for syntax errors in g++ compilers Date: Sun, 20 Dec 2015 16:29:47 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: bin X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: dim@FreeBSD.org X-Bugzilla-Status: New
X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-toolchain@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 20 Dec 2015 16:29:47 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205453

Dimitry Andric changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bapt@FreeBSD.org,
                   |                            |dim@FreeBSD.org,
                   |                            |theraven@FreeBSD.org

--- Comment #2 from Dimitry Andric ---
Hm, this _Static_assert has an interesting history. The original review from Baptiste, https://reviews.freebsd.org/D1390, used static_assert(), but this required -std=c++11 to compile, otherwise you would get:

contrib/libcxxrt/guard.cc:104:1: error: C++ requires a type specifier for all declarations
static_assert(sizeof(guard_t) == sizeof(uint64_t), "");
^~~~~~~~~~~~~

This is the version upstream eventually also used, since they apparently assume C++11 there. David suggested changing it to _Static_assert():

"This should work if you change it to _Static_assert, which I think we support for all C/C++ versions."

Now that I look at the code again, I am not entirely sure why the static assertion is only for the big endian #ifdef block. It would seem more useful to put it a few lines lower, for the !_LP64 case.

That said, even when moving the _Static_assert() like that, it compiles fine for me, both with base gcc, and several versions of ports gcc (I tried gcc 4.8, 4.9 and 5.2).
On the other hand, your sample program indeed does not compile with the ports versions of gcc. I'm not sure where those versions are getting their version of _Static_assert() from, though... -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-toolchain@freebsd.org Sun Dec 20 16:56:40 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 16650A4E86B for ; Sun, 20 Dec 2015 16:56:40 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 065A81A19 for ; Sun, 20 Dec 2015 16:56:40 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tBKGudE7037102 for ; Sun, 20 Dec 2015 16:56:39 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-toolchain@FreeBSD.org Subject: [Bug 205453] 11.0-CURRENT libcxxrt/guard.cc uses C11's _Static_assert in conditionally-compiled C++ code and when it is used buildworld fails for syntax errors in g++ compilers Date: Sun, 20 Dec 2015 16:56:40 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: bin X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: markmi@dsl-only.net X-Bugzilla-Status: New X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-toolchain@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" 
Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 20 Dec 2015 16:56:40 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205453 --- Comment #3 from Mark Millard --- As for where _Static_assert and static_assert are gotten from: _Static_assert in C11 and static_assert in C++11 exist even for free-standing implementations. As I understand it is even stronger: no explicit headers should be required unless the notation from one language is being used in the other (ignoring simulating for older versions of languages). The handling should be automatic/built-in even for source with no #includes involved. -- You are receiving this mail because: You are the assignee for the bug. 
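A minimal sketch of the point made in comments #2 and #3, not the libcxxrt code itself: a compile-time size check can select whichever assertion form the active C++ language level provides, rather than relying on C11's _Static_assert being accepted in C++ mode. The macro GUARD_STATIC_ASSERT, the stand-in guard_t typedef, and the helper guard_size_in_bytes() are hypothetical names introduced only for this illustration.

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical macro, not part of libcxxrt: use the C++11 keyword when the
// compiler is in C++11 mode or later, otherwise fall back to a pre-C++11
// idiom that needs no language support.
#if __cplusplus >= 201103L
#define GUARD_STATIC_ASSERT(cond, msg) static_assert(cond, msg)
#else
#define GUARD_SA_CAT2(a, b) a##b
#define GUARD_SA_CAT(a, b) GUARD_SA_CAT2(a, b)
// A false condition yields an array of size -1, which is a compile error.
#define GUARD_STATIC_ASSERT(cond, msg) \
    typedef char GUARD_SA_CAT(guard_static_assert_line_, __LINE__)[(cond) ? 1 : -1]
#endif

// Stand-in for the guard type (assumption); the real definition is in guard.cc.
typedef uint64_t guard_t;

GUARD_STATIC_ASSERT(sizeof(guard_t) == sizeof(uint64_t),
                    "guard_t must be 64 bits wide");

// Small runtime helper so the checked size can also be inspected from a test.
inline size_t guard_size_in_bytes() { return sizeof(guard_t); }
```

The fallback branch trades a readable diagnostic for portability: with an older -std setting a failing check reports a negative array size instead of the message string.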
From owner-freebsd-toolchain@freebsd.org Sun Dec 20 18:03:12 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A68B8A4E573 for ; Sun, 20 Dec 2015 18:03:12 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 95F0F140F for ; Sun, 20 Dec 2015 18:03:12 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tBKI3CtO005753 for ; Sun, 20 Dec 2015 18:03:12 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-toolchain@FreeBSD.org Subject: [Bug 205453] 11.0-CURRENT libcxxrt/guard.cc uses C11's _Static_assert in conditionally-compiled C++ code and when it is used buildworld fails for syntax errors in g++ compilers Date: Sun, 20 Dec 2015 18:03:12 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: bin X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: markmi@dsl-only.net X-Bugzilla-Status: New X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-toolchain@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated 
toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 20 Dec 2015 18:03:12 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205453

--- Comment #4 from Mark Millard ---
FYI: My src.conf in use for the failing powerpc64-gcc "X" toolchain based buildworld and buildkernel (with gcc49 acting as the host toolchain) was:

# more ~/src.configs/src.conf.powerpc64-xtoolchain.powerpc64-host
KERNCONF=GENERIC64vtsc-NODEBUG
TARGET=powerpc
.if ${.MAKE.LEVEL} == 0
TARGET_ARCH=powerpc64
.export TARGET_ARCH
.endif
WITHOUT_CROSS_COMPILER=
WITHOUT_CLANG_EXTRAS=
WITH_FAST_DEPEND=
WITH_LIBCPLUSPLUS=
WITH_LIB32=
WITH_BOOT=
WITH_CLANG=
WITH_CLANG_IS_CC=
WITH_CLANG_FULL=
WITH_LLDB=
WITHOUT_GCC=
WITHOUT_GNUCXX=
NO_WERROR=
MALLOC_PRODUCTION=
WITH_DEBUG=
WITH_DEBUG_FILES=
CROSS_TOOLCHAIN=powerpc64-gcc
.if ${.MAKE.LEVEL} == 0
CC=/usr/local/bin/gcc49
CXX=/usr/local/bin/g++49
CPP=/usr/local/bin/cpp49
.export CC
.export CXX
.export CPP
.endif

So WITH_LIBCPLUSPLUS. make.conf was empty. gcc49/g++49 had been built without 32 bit (lib32) support. gcc49 built powerpc64-gcc's update; the older powerpc64-gcc built gcc49. gcc 4.2.1 is/was not present. The kernel configuration turns on both vt and sc and turns off ps3.

-- 
You are receiving this mail because:
You are the assignee for the bug.
From owner-freebsd-toolchain@freebsd.org Wed Dec 23 11:50:17 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 41128A4F7AD; Wed, 23 Dec 2015 11:50:17 +0000 (UTC) (envelope-from gerald@pfeifer.com) Received: from ainaz.pair.com (ainaz.pair.com [209.68.2.66]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2578118D7; Wed, 23 Dec 2015 11:50:16 +0000 (UTC) (envelope-from gerald@pfeifer.com) Received: from [10.10.10.65] (unknown [110.136.210.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ainaz.pair.com (Postfix) with ESMTPSA id 919E63F427; Wed, 23 Dec 2015 06:50:13 -0500 (EST) Date: Wed, 23 Dec 2015 19:50:08 +0800 (WITA) From: Gerald Pfeifer To: "William A. Mahaffey III" cc: freebsd-questions@freebsd.org, freebsd-toolchain@freebsd.org Subject: Re: [toolchain] gcc5-devel question In-Reply-To: <5669A8BB.3030602@hiwaay.net> Message-ID: References: <5669A8BB.3030602@hiwaay.net> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Dec 2015 11:50:17 -0000 Hi William, On Thu, 10 Dec 2015, William A. Mahaffey III wrote: > However, pkg did reinstall gcc5-devel-5.2.1.s20151124 due to changed > options. I cd'ed to /usr/ports/lang/gcc5-devel & poked around a bit. I > saw no files or directories dated today, the latest was dated Dec 02, > the last time I upgraded & did a 'make install'. I did a find in > /usr/ports & /usr/ports/lang/gcc5-devel was all there was. Did the > compiler indeed get reinstalled ? 
If so, where :-) ? the location should not have changed, nor should have any packaging (additional files, say). > Also, when I did a 'make showconfig', it showed graphite support ready > to go (*yippeeee*, kudos), but no executable that I could locate on > short notice. Do I still need to compile it up, or is there an > executable ready to go somewhere not-so-obvious to me :-) ? Graphite support means additional optimizations GCC can perform (if you specify the respective options). You need to build the lang/gcc* ports that support Graphite with the respective option (GRAPHITE) enabled. It is off by default, and thus not part of packages. > When I last compiled it up last week, I did a 'make install > FORCE_PKG_REGISTER=1' which overwrote /usr/local/bin/gcc5 (not a huge > issue, but still ....), how do I tell it to install the executable under > a different name ? BTW, I am *NOT* particularly familiar w/ the 'GNU > way', so pardon me if this is a bit noobish :-/ .... TIA & have a good > one. I am not aware of the FreeBSD packaging system supporting the renaming of individual files within a package. You could try to install into a different location and then tweak things there, I guess. 
Gerald From owner-freebsd-toolchain@freebsd.org Wed Dec 23 15:58:52 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8BD52A4FF11; Wed, 23 Dec 2015 15:58:52 +0000 (UTC) (envelope-from wam@hiwaay.net) Received: from fly.hiwaay.net (fly.hiwaay.net [216.180.54.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5CFDD13E4; Wed, 23 Dec 2015 15:58:51 +0000 (UTC) (envelope-from wam@hiwaay.net) Received: from kabini1.local (dynamic-216-186-213-32.knology.net [216.186.213.32] (may be forged)) (authenticated bits=0) by fly.hiwaay.net (8.13.8/8.13.8/fly) with ESMTP id tBNFwiDP019052 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Wed, 23 Dec 2015 09:58:45 -0600 Subject: Re: [toolchain] gcc5-devel question References: <5669A8BB.3030602@hiwaay.net> Cc: freebsd-questions@freebsd.org, freebsd-toolchain@freebsd.org From: "William A. Mahaffey III" Message-ID: <567AC4B3.8090108@hiwaay.net> Date: Wed, 23 Dec 2015 10:04:13 -0553.75 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Dec 2015 15:58:52 -0000 On 12/23/15 05:56, Gerald Pfeifer wrote: > Hi William, > > On Thu, 10 Dec 2015, William A. Mahaffey III wrote: >> However, pkg did reinstall gcc5-devel-5.2.1.s20151124 due to changed >> options. I cd'ed to /usr/ports/lang/gcc5-devel & poked around a bit. 
I
>> saw no files or directories dated today, the latest was dated Dec 02,
>> the last time I upgraded & did a 'make install'. I did a find in
>> /usr/ports & /usr/ports/lang/gcc5-devel was all there was. Did the
>> compiler indeed get reinstalled ? If so, where :-) ?
> the location should not have changed, nor should have any packaging
> (additional files, say).

1st, thanks for your reply. This question was/is stupidly worded :-). What I was/am asking is whether the pkg-installed version of this compiler is compiled for Graphite support OOTB. The answer appears to be 'no'. I just pkg-upgraded to the newest version & it is now called 5.3.1:

[29/33] Upgrading gcc5-devel from 5.2.1.s20151124 to 5.3.1.s20151208...
[29/33] Extracting gcc5-devel-5.3.1.s20151208: .......... done

However, when queried from the command line, it doesn't mention 'libisl' (req'd for Graphite support) & when I try to compile code using Graphite optimization (-floop-parallelize-all for me) it fails w/ an error message saying Graphite support not available:

f951: sorry, unimplemented: Graphite loop optimizations cannot be used (ISL is not available)(-fgraphite, -fgraphite-identity, -floop-block, -floop-interchange, -floop-strip-mine, -floop-parallelize-all, -floop-unroll-and-jam, and -ftree-loop-linear)

libisl is indeed available, pkg installed & ready to go. When I upgraded ports (portsnap fetch update) this A.M., nothing got upgraded, everything still dated 12/09 or earlier (when I last rebuilt it to turn on Graphite support, which apparently worked AOK) except for the log files from my compile. The port showed/shows Graphite enabled by default (I think). So, another question is whether the pkg & port have different build configurations ?

>
>> Also, when I did a 'make showconfig', it showed graphite support ready
>> to go (*yippeeee*, kudos), but no executable that I could locate on
>> short notice.
Do I still need to compile it up, or is there an >> executable ready to go somewhere not-so-obvious to me :-) ? > Graphite support means additional optimizations GCC can perform (if > you specify the respective options). > > You need to build the lang/gcc* ports that support Graphite with the > respective option (GRAPHITE) enabled. It is off by default, and thus > not part of packages. > >> When I last compiled it up last week, I did a 'make install >> FORCE_PKG_REGISTER=1' which overwrote /usr/local/bin/gcc5 (not a huge >> issue, but still ....), how do I tell it to install the executable under >> a different name ? BTW, I am *NOT* particularly familiar w/ the 'GNU >> way', so pardon me if this is a bit noobish :-/ .... TIA & have a good >> one. > I am not aware of the FreeBSD packaging system supporting the renaming > of individual files within a package. You could try to install into a > different location and then tweak things there, I guess. > > Gerald > -- William A. Mahaffey III ---------------------------------------------------------------------- "The M1 Garand is without doubt the finest implement of war ever devised by man." -- Gen. George S. Patton Jr. 
From owner-freebsd-toolchain@freebsd.org Fri Dec 25 07:06:12 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1D5DAA514F5 for ; Fri, 25 Dec 2015 07:06:12 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-211-151.reflexion.net [208.70.211.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D704919C8 for ; Fri, 25 Dec 2015 07:06:11 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 6397 invoked from network); 25 Dec 2015 06:39:36 -0000 Received: from unknown (HELO rtc-sm-01.app.dca.reflexion.local) (10.81.150.1) by 0 (rfx-qmail) with SMTP; 25 Dec 2015 06:39:36 -0000 Received: by rtc-sm-01.app.dca.reflexion.local (Reflexion email security v7.80.0) with SMTP; Fri, 25 Dec 2015 01:39:31 -0500 (EST) Received: (qmail 21550 invoked from network); 25 Dec 2015 06:39:31 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with SMTP; 25 Dec 2015 06:39:31 -0000 X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-76-115-7-162.hsd1.or.comcast.net [76.115.7.162]) by iron2.pdx.net (Postfix) with ESMTPSA id 3AA1BB1E001; Thu, 24 Dec 2015 22:39:23 -0800 (PST) From: Mark Millard Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error? 
Message-Id: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> Date: Thu, 24 Dec 2015 22:39:29 -0800 To: freebsd-arm@freebsd.org, FreeBSD Toolchain Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Dec 2015 07:06:12 -0000

[I do not know if this partial crash analysis related to on-arm clang-associated activity is good enough and appropriate to submit or not.]

The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below came from pkg install activity instead of port building. Used as-is.

When I just tried my first from-rpi2b builds (ports for a rpi2b), /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the following suggests an alignment error for the type of instructions that memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to check SCTLR bit[1] to be directly sure that alignment was being enforced.)

The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar :

> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
> Bus error (core dumped)
> *** [libgnuintl.la] Error code 138

It failed in _fseeko doing a memset that turned into uses of "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. From what I read such "VSTn (multiple n-element structures)" that have .64 require 8 byte alignment. The evidence of the code and register value follow.

> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
> . . .
> #0 0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
> 299 memset(&fp->_mbstate, 0, sizeof(mbstate_t));
> . . .
> (gdb) x/24i 0x2033adb0
> 0x2033adb0 <_fseeko+836>: vmov.i32 q8, #0 ; 0x00000000
> 0x2033adb4 <_fseeko+840>: movw r1, #65503 ; 0xffdf
> 0x2033adb8 <_fseeko+844>: stm r4, {r0, r7}
> 0x2033adbc <_fseeko+848>: ldrh r0, [r4, #12]
> 0x2033adc0 <_fseeko+852>: and r0, r0, r1
> 0x2033adc4 <_fseeko+856>: strh r0, [r4, #12]
> 0x2033adc8 <_fseeko+860>: add r0, r4, #216 ; 0xd8
> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0]
> 0x2033add0 <_fseeko+868>: add r0, r4, #200 ; 0xc8
> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0]
> 0x2033add8 <_fseeko+876>: add r0, r4, #184 ; 0xb8
> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0]
> 0x2033ade0 <_fseeko+884>: add r0, r4, #168 ; 0xa8
> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0]
> 0x2033ade8 <_fseeko+892>: add r0, r4, #152 ; 0x98
> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0]
> 0x2033adf0 <_fseeko+900>: add r0, r4, #136 ; 0x88
> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0]
> 0x2033adf8 <_fseeko+908>: add r0, r4, #120 ; 0x78
> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0]
> 0x2033ae00 <_fseeko+916>: add r0, r4, #104 ; 0x68
> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0]
> 0x2033ae08 <_fseeko+924>: b 0x2033b070 <_fseeko+1540>
> 0x2033ae0c <_fseeko+928>: cmp r5, #0 ; 0x0
> (gdb) info all-registers
> r0 0x20651ea4 543497892
> r1 0xffdf 65503
> r2 0x0 0
> r3 0x0 0
> r4 0x20651dcc 543497676
> r5 0x0 0
> r6 0x0 0
> r7 0x0 0
> r8 0x20359df4 540384756
> r9 0x0 0
> r10 0x0 0
> r11 0xbfbfb948 -1077954232
> r12 0x2037b208 540520968
> sp 0xbfbfb898 -1077954408
> lr 0x2035a004 540385284
> pc 0x2033adcc 540257740
> f0 0 (raw 0x000000000000000000000000)
> f1 0 (raw 0x000000000000000000000000)
> f2 0 (raw 0x000000000000000000000000)
> f3 0 (raw 0x000000000000000000000000)
> f4 0 (raw 0x000000000000000000000000)
> f5 0 (raw 0x000000000000000000000000)
> f6 0 (raw 0x000000000000000000000000)
> f7 0 (raw 0x000000000000000000000000)
> fps 0x0 0
> cpsr 0x60000010 1610612752

The syntax in use for vst1.64 instructions does not explicitly have the alignment notation. Presuming that the decoding is correct then from what I read the following applies:

> Home > NEON and VFP Programming > NEON load and store element and structure instructions > Alignment restrictions in load and store, element and structure instructions
>
> . . . When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit (SCTLR bit[1]):
> • if the A bit is 0, there are no alignment restrictions (except for strongly ordered or device memory, where accesses must be element aligned or the result is unpredictable)
> • if the A bit is 1, accesses must be element aligned.
> If an address is not correctly aligned, an alignment fault occurs.

So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would have the context to happen because of the mis-alignment.
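The address arithmetic behind that conclusion can be checked directly. The following is a minimal sketch that uses only the r4 value and the "add r0, r4, #N" offsets shown in the gdb output; the function names are illustrative and do not come from libc or binutils.

```cpp
#include <stdint.h>

// True when addr is a multiple of alignment (alignment must be a power of two).
inline bool is_aligned(uint32_t addr, uint32_t alignment) {
    return (addr & (alignment - 1u)) == 0u;
}

// r4 in the register dump holds fp; each store address is r4 plus an offset.
inline uint32_t store_address(uint32_t fp, uint32_t offset) {
    return fp + offset;
}

// Count how many of the eight vst1.64 target addresses are 8-byte aligned.
inline int count_aligned_stores(uint32_t fp) {
    // Offsets copied from the "add r0, r4, #N" instructions in the disassembly.
    const uint32_t offsets[8] = {216, 200, 184, 168, 152, 136, 120, 104};
    int aligned = 0;
    for (int i = 0; i < 8; ++i) {
        if (is_aligned(store_address(fp, offsets[i]), 8u)) {
            ++aligned;
        }
    }
    return aligned;
}
```

With fp = 0x20651dcc the first target is 0x20651dcc + 0xd8 = 0x20651ea4, matching r0 at the faulting pc; it is 4-byte aligned but not 8-byte aligned. Because every offset is a multiple of 8, all eight 16-byte stores (128 bytes in total) share that misalignment.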
The following shows the make.conf context that explains how /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:

> # more /etc/make.conf
> WRKDIRPREFIX=/usr/obj/portswork
> WITH_DEBUG=
> WITH_DEBUG_FILES=
> MALLOC_PRODUCTION=
> #
> TO_TYPE=armv6
> TOOLS_TO_TYPE=arm-gnueabi
> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
> .if ${.MAKE.LEVEL} == 0
> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
> .export CC
> .export CXX
> .export CPP
> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
> .export AS
> .export AR
> .export LD
> .export NM
> .export OBJCOPY
> .export OBJDUMP
> .export RANLIB
> .export SIZE
> .export STRINGS
> .endif

Other context:

> # freebsd-version -ku; uname -aKU
> 11.0-CURRENT
> 11.0-CURRENT
> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 22:02:21 PST 2015 root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm 1100091 1100091

I will note that world and kernel are my own build of -r292413 (earlier experiment) --a build made from an amd64 host context and put in place via DESTDIR=. My expectation would be that the amd64 context would not be likely to have similar alignment restrictions involved in its ar activity (or other activity). That would explain how I got this far using such a clang 3.7 related toolchain for targeting an rpi2 before finding such a problem.

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-toolchain@freebsd.org Fri Dec 25 08:31:44 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E7F6DA5041C for ; Fri, 25 Dec 2015 08:31:44 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-211-151.reflexion.net [208.70.211.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id AC4B31E0F for ; Fri, 25 Dec 2015 08:31:44 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 23568 invoked from network); 25 Dec 2015 08:31:48 -0000 Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2) by 0 (rfx-qmail) with SMTP; 25 Dec 2015 08:31:48 -0000 Received: by mail-cs-02.app.dca.reflexion.local (Reflexion email security v7.80.0) with SMTP; Fri, 25 Dec 2015 03:31:41 -0500 (EST) Received: (qmail 25540 invoked from network); 25 Dec 2015 08:31:41 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with SMTP; 25 Dec 2015 08:31:41 -0000 X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-76-115-7-162.hsd1.or.comcast.net [76.115.7.162]) by iron2.pdx.net (Postfix) with ESMTPSA id DF8A2B1E001; Fri, 25 Dec 2015 00:31:41 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error?
From: Mark Millard
In-Reply-To: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net>
Date: Fri, 25 Dec 2015 00:31:41 -0800
Content-Transfer-Encoding: quoted-printable
Message-Id:
References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net>
To: freebsd-arm@freebsd.org, FreeBSD Toolchain
X-Mailer: Apple Mail (2.2104)
X-BeenThere: freebsd-toolchain@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Maintenance of FreeBSD's integrated toolchain
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 25 Dec 2015 08:31:45 -0000

On 2015-Dec-24, at 10:39 PM, Mark Millard wrote:

> [I do not know if this partial crash analysis related to on-arm clang-associated activity is good enough and appropriate to submit or not.]
>
> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below came from pkg install activity instead of port building. Used as-is.
>
> When I just tried my first from-rpi2b builds (ports for a rpi2b), /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the following suggests an alignment error for the type of instructions that memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to check SCTLR bit[1] to be directly sure that alignment was being enforced.)
>
> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar :
>
>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
>> Bus error (core dumped)
>> *** [libgnuintl.la] Error code 138
>
> It failed in _fseeko doing a memset that turned into uses of "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. From what I read such "VSTn (multiple n-element structures)" that have .64 require 8 byte alignment. The evidence of the code and register value follow.
>
>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>> . . .
>> #0 0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
>> 299 memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>> . . .
>> (gdb) x/24i 0x2033adb0
>> 0x2033adb0 <_fseeko+836>: vmov.i32 q8, #0 ; 0x00000000
>> 0x2033adb4 <_fseeko+840>: movw r1, #65503 ; 0xffdf
>> 0x2033adb8 <_fseeko+844>: stm r4, {r0, r7}
>> 0x2033adbc <_fseeko+848>: ldrh r0, [r4, #12]
>> 0x2033adc0 <_fseeko+852>: and r0, r0, r1
>> 0x2033adc4 <_fseeko+856>: strh r0, [r4, #12]
>> 0x2033adc8 <_fseeko+860>: add r0, r4, #216 ; 0xd8
>> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0]
>> 0x2033add0 <_fseeko+868>: add r0, r4, #200 ; 0xc8
>> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0]
>> 0x2033add8 <_fseeko+876>: add r0, r4, #184 ; 0xb8
>> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0]
>> 0x2033ade0 <_fseeko+884>: add r0, r4, #168 ; 0xa8
>> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0]
>> 0x2033ade8 <_fseeko+892>: add r0, r4, #152 ; 0x98
>> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0]
>> 0x2033adf0 <_fseeko+900>: add r0, r4, #136 ; 0x88
>> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0]
>> 0x2033adf8 <_fseeko+908>: add r0, r4, #120 ; 0x78
>> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0]
>> 0x2033ae00 <_fseeko+916>: add r0, r4, #104 ; 0x68
>> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0]
>> 0x2033ae08 <_fseeko+924>: b 0x2033b070 <_fseeko+1540>
>> 0x2033ae0c <_fseeko+928>: cmp r5, #0 ; 0x0
>> (gdb) info all-registers
>> r0 0x20651ea4 543497892
>> r1 0xffdf 65503
>> r2 0x0 0
>> r3 0x0 0
>> r4 0x20651dcc 543497676
>> r5 0x0 0
>> r6 0x0 0
>> r7 0x0 0
>> r8 0x20359df4 540384756
>> r9 0x0 0
>> r10 0x0 0
>> r11 0xbfbfb948 -1077954232
>> r12 0x2037b208 540520968
>> sp 0xbfbfb898 -1077954408
>> lr 0x2035a004 540385284
>> pc 0x2033adcc 540257740
>> f0 0 (raw 0x000000000000000000000000)
>> f1 0 (raw 0x000000000000000000000000)
>> f2 0 (raw 0x000000000000000000000000)
>> f3 0 (raw 0x000000000000000000000000)
>> f4 0 (raw 0x000000000000000000000000)
>> f5 0 (raw 0x000000000000000000000000)
>> f6 0 (raw 0x000000000000000000000000)
>> f7 0 (raw 0x000000000000000000000000)
>> fps 0x0 0
>> cpsr 0x60000010 1610612752
>
> The syntax in use for vst1.64 instructions does not explicitly have the alignment notation. Presuming that the decoding is correct then from what I read the following applies:
>
>> Home > NEON and VFP Programming > NEON load and store element and structure instructions > Alignment restrictions in load and store, element and structure instructions
>>
>> . . . When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit (SCTLR bit[1]):
>> • if the A bit is 0, there are no alignment restrictions (except for strongly ordered or device memory, where accesses must be element aligned or the result is unpredictable)
>> • if the A bit is 1, accesses must be element aligned.
>> If an address is not correctly aligned, an alignment fault occurs.
>
> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would have the context to happen because of the mis-alignment.
>
> The following shows the make.conf context that explains how /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>
>> # more /etc/make.conf
>> WRKDIRPREFIX=/usr/obj/portswork
>> WITH_DEBUG=
>> WITH_DEBUG_FILES=
>> MALLOC_PRODUCTION=
>> #
>> TO_TYPE=armv6
>> TOOLS_TO_TYPE=arm-gnueabi
>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>> .if ${.MAKE.LEVEL} == 0
>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>> .export CC
>> .export CXX
>> .export CPP
>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>> .export AS
>> .export AR
>> .export LD
>> .export NM
>> .export OBJCOPY
>> .export OBJDUMP
>> .export RANLIB
>> .export SIZE
>> .export STRINGS
>> .endif
>
>
> Other context:
>
>> # freebsd-version -ku; uname -aKU
>> 11.0-CURRENT
>> 11.0-CURRENT
>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 22:02:21 PST 2015 root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm 1100091 1100091
>
>
>
> I will note that world and kernel are my own build of -r292413 (earlier experiment) --a build made from an amd64 host context and put in place via DESTDIR=. My expectation would be that the amd64 context would not be likely to have similar alignment restrictions involved in its ar activity (or other activity). That would explain how I got this far using such a clang 3.7 related toolchain for targeting an rpi2 before finding such a problem.

I realized re-reading the all above that it seems to suggest that the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but that was not my intent.

libc.so.7 is from my buildworld, including the fseeko implementation:

Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
done.
Loaded symbols for /lib/libc.so.7


head/sys/sys/_types.h has:

/*
 * mbstate_t is an opaque object to keep conversion state during multibyte
 * stream conversions.
 */
typedef union {
        char            __mbstate8[128];
        __int64_t       _mbstateL;      /* for alignment */
} __mbstate_t;

suggesting an implicit alignment of the union to whatever the implementation defines for __int64_t --which need not be 8 byte alignment (in the abstract, general case). But 8 byte alignment is a possibility as well (in the abstract).

But printing *fp in gdb for the fp argument to _fseeko reports an address for __mbstate8 with the same not-8-byte alignment as the address that was in r0:

> (gdb) bt
> #0 0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
> #1 0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
> #2 0x00016138 in ?? ()
> (gdb) print fp
> $2 = (FILE *) 0x20651dcc
> (gdb) print *fp
> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, _close = 0x20359dfc <__sclose>,
> _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
> _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}

The overall FILE struct containing the _mbstate field is also not 8-byte aligned. But the offset from the start of the FILE struct to __mbstate8 is a multiple of 8 bytes.

It is my interpretation that there is nothing here to justify the memset implementation combination:

SCTLR bit[1]==1

mixed with

vst1.64 instructions

I.e.: one or both need to change unless some way of forcing 8-byte alignment is introduced.

I have not managed to track down anything that would indicate FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required by the design to be constant (once initialized).

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-toolchain@freebsd.org Fri Dec 25 14:25:00 2015
Return-Path:
Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F2E89A510C9 for ; Fri, 25 Dec 2015 14:25:00 +0000 (UTC) (envelope-from markmi@dsl-only.net)
Received: from asp.reflexion.net (outbound-mail-211-152.reflexion.net [208.70.211.152]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B811417E3 for ; Fri, 25 Dec 2015 14:25:00 +0000 (UTC) (envelope-from markmi@dsl-only.net)
Received: (qmail 31886 invoked from network); 25 Dec 2015 14:25:05 -0000
Received: from unknown (HELO mail-cs-01.app.dca.reflexion.local) (10.81.19.1) by 0 (rfx-qmail) with SMTP; 25 Dec 2015 14:25:05 -0000
Received: by mail-cs-01.app.dca.reflexion.local (Reflexion email security v7.80.0) with SMTP; Fri, 25 Dec 2015 09:25:04 -0500 (EST)
Received: (qmail 9553 invoked from network); 25 Dec 2015 14:25:03 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with SMTP; 25 Dec 2015 14:25:03 -0000
X-No-Relay: not in my network
X-No-Relay: not in my network
Received: from [192.168.1.8] (c-76-115-7-162.hsd1.or.comcast.net [76.115.7.162]) by iron2.pdx.net (Postfix) with ESMTPSA id 0EA50B1E001; Fri, 25 Dec 2015 06:24:56 -0800 (PST)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\))
Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error?
From: Mark Millard
In-Reply-To:
Date: Fri, 25 Dec 2015 06:24:57 -0800
Content-Transfer-Encoding: quoted-printable
Message-Id:
References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net>
To: freebsd-arm@freebsd.org, FreeBSD Toolchain
X-Mailer: Apple Mail (2.2104)
X-BeenThere: freebsd-toolchain@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Maintenance of FreeBSD's integrated toolchain
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 25 Dec 2015 14:25:01 -0000

[Good News Summary: Rebuilding buildworld/buildkernel for rpi2 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=4 has so far removed the crashes during the toolchain activity: no more misaligned accesses in libc's _fseeko or elsewhere.]

On 2015-Dec-25, at 12:31 AM, Mark Millard wrote:

> On 2015-Dec-24, at 10:39 PM, Mark Millard wrote:
>
>> [I do not know if this partial crash analysis related to on-arm clang-associated activity is good enough and appropriate to submit or not.]
>>
>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below came from pkg install activity instead of port building. Used as-is.
>>
>> When I just tried my first from-rpi2b builds (ports for a rpi2b), /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the following suggests an alignment error for the type of instructions that memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to check SCTLR bit[1] to be directly sure that alignment was being enforced.)
>>
>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar :
>>
>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
>>> Bus error (core dumped)
>>> *** [libgnuintl.la] Error code 138
>>
>> It failed in _fseeko doing a memset that turned into uses of "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. From what I read such "VSTn (multiple n-element structures)" that have .64 require 8 byte alignment. The evidence of the code and register value follow.
>>
>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>>> . . .
>>> #0 0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
>>> 299 memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>>> . . .
>>> (gdb) x/24i 0x2033adb0
>>> 0x2033adb0 <_fseeko+836>: vmov.i32 q8, #0 ; 0x00000000
>>> 0x2033adb4 <_fseeko+840>: movw r1, #65503 ; 0xffdf
>>> 0x2033adb8 <_fseeko+844>: stm r4, {r0, r7}
>>> 0x2033adbc <_fseeko+848>: ldrh r0, [r4, #12]
>>> 0x2033adc0 <_fseeko+852>: and r0, r0, r1
>>> 0x2033adc4 <_fseeko+856>: strh r0, [r4, #12]
>>> 0x2033adc8 <_fseeko+860>: add r0, r4, #216 ; 0xd8
>>> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0]
>>> 0x2033add0 <_fseeko+868>: add r0, r4, #200 ; 0xc8
>>> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0]
>>> 0x2033add8 <_fseeko+876>: add r0, r4, #184 ; 0xb8
>>> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0]
>>> 0x2033ade0 <_fseeko+884>: add r0, r4, #168 ; 0xa8
>>> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0]
>>> 0x2033ade8 <_fseeko+892>: add r0, r4, #152 ; 0x98
>>> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0]
>>> 0x2033adf0 <_fseeko+900>: add r0, r4, #136 ; 0x88
>>> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0]
>>> 0x2033adf8 <_fseeko+908>: add r0, r4, #120 ; 0x78
>>> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0]
>>> 0x2033ae00 <_fseeko+916>: add r0, r4, #104 ; 0x68
>>> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0]
>>> 0x2033ae08 <_fseeko+924>: b 0x2033b070 <_fseeko+1540>
>>> 0x2033ae0c <_fseeko+928>: cmp r5, #0 ; 0x0
>>> (gdb) info all-registers
>>> r0 0x20651ea4 543497892
>>> r1 0xffdf 65503
>>> r2 0x0 0
>>> r3 0x0 0
>>> r4 0x20651dcc 543497676
>>> r5 0x0 0
>>> r6 0x0 0
>>> r7 0x0 0
>>> r8 0x20359df4 540384756
>>> r9 0x0 0
>>> r10 0x0 0
>>> r11 0xbfbfb948 -1077954232
>>> r12 0x2037b208 540520968
>>> sp 0xbfbfb898 -1077954408
>>> lr 0x2035a004 540385284
>>> pc 0x2033adcc 540257740
>>> f0 0 (raw 0x000000000000000000000000)
>>> f1 0 (raw 0x000000000000000000000000)
>>> f2 0 (raw 0x000000000000000000000000)
>>> f3 0 (raw 0x000000000000000000000000)
>>> f4 0 (raw 0x000000000000000000000000)
>>> f5 0 (raw 0x000000000000000000000000)
>>> f6 0 (raw 0x000000000000000000000000)
>>> f7 0 (raw 0x000000000000000000000000)
>>> fps 0x0 0
>>> cpsr 0x60000010 1610612752
>>
>> The syntax in use for vst1.64 instructions does not explicitly have the alignment notation. Presuming that the decoding is correct then from what I read the following applies:
>>
>>> Home > NEON and VFP Programming > NEON load and store element and structure instructions > Alignment restrictions in load and store, element and structure instructions
>>>
>>> . . . When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>> • if the A bit is 0, there are no alignment restrictions (except for strongly ordered or device memory, where accesses must be element aligned or the result is unpredictable)
>>> • if the A bit is 1, accesses must be element aligned.
>>> If an address is not correctly aligned, an alignment fault occurs.
>>
>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would have the context to happen because of the mis-alignment.
>>
>> The following shows the make.conf context that explains how /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>>
>>> # more /etc/make.conf
>>> WRKDIRPREFIX=/usr/obj/portswork
>>> WITH_DEBUG=
>>> WITH_DEBUG_FILES=
>>> MALLOC_PRODUCTION=
>>> #
>>> TO_TYPE=armv6
>>> TOOLS_TO_TYPE=arm-gnueabi
>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>> .if ${.MAKE.LEVEL} == 0
>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>> .export CC
>>> .export CXX
>>> .export CPP
>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>> .export AS
>>> .export AR
>>> .export LD
>>> .export NM
>>> .export OBJCOPY
>>> .export OBJDUMP
>>> .export RANLIB
>>> .export SIZE
>>> .export STRINGS
>>> .endif
>>
>>
>> Other context:
>>
>>> # freebsd-version -ku; uname -aKU
>>> 11.0-CURRENT
>>> 11.0-CURRENT
>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 22:02:21 PST 2015 root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm 1100091 1100091
>>
>>
>>
>> I will note that world and kernel are my own build of -r292413 (earlier experiment) --a build made from an amd64 host context and put in place via DESTDIR=. My expectation would be that the amd64 context would not be likely to have similar alignment restrictions involved in its ar activity (or other activity). That would explain how I got this far using such a clang 3.7 related toolchain for targeting an rpi2 before finding such a problem.
>
>
> I realized re-reading the all above that it seems to suggest that the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but that was not my intent.
>
> libc.so.7 is from my buildworld, including the fseeko implementation:
>
> Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
> done.
> Loaded symbols for /lib/libc.so.7
>
>
> head/sys/sys/_types.h has:
>
> /*
> * mbstate_t is an opaque object to keep conversion state during multibyte
> * stream conversions.
> */
> typedef union {
> char __mbstate8[128];
> __int64_t _mbstateL; /* for alignment */
> } __mbstate_t;
>
> suggesting an implicit alignment of the union to whatever the implementation defines for __int64_t --which need not be 8 byte alignment (in the abstract, general case). But 8 byte alignment is a possibility as well (in the abstract).
>
> But printing *fp in gdb for the fp argument to _fseeko reports the same not-8-byte aligned address for __mbstate8 that was in r0:
>
>> (gdb) bt
>> #0 0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
>> #1 0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
>> #2 0x00016138 in ?? ()
>> (gdb) print fp
>> $2 = (FILE *) 0x20651dcc
>> (gdb) print *fp
>> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, _close = 0x20359dfc <__sclose>,
>> _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>> _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}
>
> The overall FILE struct containing the _mbstate field is also not 8-byte aligned. But the offset from the start of the FILE struct to __mbstate8 is a multiple of 8 bytes.
>
> It is my interpretation that there is nothing here to justify the memset implementation combination:
>
> SCTLR bit[1]==1
>
> mixed with
>
> vst1.64 instructions
>
> I.e.: one or both needs to change unless some way for forcing 8-byte alignment is introduced.
>
> I have not managed to track down anything that would indicate FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required by the design to be constant (once initialized).

I have (so far) removed the build tool crashes based on adding -fmax-type-align=4 to avoid the misaligned accesses.

Details follow.

src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now looks like:

> # more ~/src.configs/src.conf.rpi2-clang.amd64-host
> TO_TYPE=armv6
> TOOLS_TO_TYPE=arm-gnueabi
> FROM_TYPE=amd64
> TOOLS_FROM_TYPE=x86_64
> VERSION_CONTEXT=11.0
> #
> KERNCONF=RPI2-NODBG
> TARGET=arm
> .if ${.MAKE.LEVEL} == 0
> TARGET_ARCH=${TO_TYPE}
> .export TARGET_ARCH
> .endif
> #
> WITHOUT_CROSS_COMPILER=
> #
> # For WITH_BOOT= . . .
> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
> WITHOUT_BOOT=
> #
> WITH_FAST_DEPEND=
> WITH_LIBCPLUSPLUS=
> WITH_CLANG=
> WITH_CLANG_IS_CC=
> WITH_CLANG_FULL=
> WITH_LLDB=
> WITH_CLANG_EXTRAS=
> #
> WITHOUT_LIB32=
> WITHOUT_GCC=
> WITHOUT_GNUCXX=
> #
> NO_WERROR=
> MALLOC_PRODUCTION=
> #CFLAGS+= -DELF_VERBOSE
> #
> WITH_DEBUG=
> WITH_DEBUG_FILES=
> #
> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related bintutils...
> #
> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
> X_COMPILER_TYPE=clang
> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
> .if ${.MAKE.LEVEL} == 0
> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> .export XCC
> .export XCXX
> .export XCPP
> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
> .export XAS
> .export XAR
> .export XLD
> .export XNM
> .export XOBJCOPY
> .export XOBJDUMP
> .export XRANLIB
> .export XSIZE
> .export XSTRINGS
> .endif
> #
> # Host compiler stuff:
> .if ${.MAKE.LEVEL} == 0
> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
> .export CC
> .export CXX
> .export CPP
> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
> OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
> .export AS
> .export AR
> .export LD
> .export NM
> .export OBJCOPY
> .export OBJDUMP
> .export RANLIB
> .export SIZE
> .export STRINGS
> .endif

make.conf during the on-rpi2 port builds now looks like:

> $ more /etc/make.conf
> WRKDIRPREFIX=/usr/obj/portswork
> WITH_DEBUG=
> WITH_DEBUG_FILES=
> MALLOC_PRODUCTION=
> #
> TO_TYPE=armv6
> TOOLS_TO_TYPE=arm-gnueabi
> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
> .if ${.MAKE.LEVEL} == 0
> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> .export CC
> .export CXX
> .export CPP
> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
> .export AS
> .export AR
> .export LD
> .export NM
> .export OBJCOPY
> .export OBJDUMP
> .export RANLIB
> .export SIZE
> .export STRINGS
> .endif

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-toolchain@freebsd.org Fri Dec 25 14:25:01 2015
Return-Path:
Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F404CA510CB for ; Fri, 25 Dec 2015 14:25:00 +0000 (UTC) (envelope-from markmi@dsl-only.net)
Received: from asp.reflexion.net (outbound-mail-211-152.reflexion.net [208.70.211.152]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B816F17E5 for ; Fri, 25 Dec 2015 14:25:00 +0000 (UTC) (envelope-from markmi@dsl-only.net)
Received: (qmail 4734 invoked from network); 25 Dec 2015 14:24:58 -0000
Received: from unknown (HELO rtc-sm-01.app.dca.reflexion.local) (10.81.150.1) by 0 (rfx-qmail) with SMTP; 25 Dec 2015 14:24:58 -0000
Received: by rtc-sm-01.app.dca.reflexion.local (Reflexion email security v7.80.0) with SMTP; Fri, 25 Dec 2015 09:25:00 -0500 (EST)
Received: (qmail 14342 invoked from network); 25 Dec 2015 14:25:00 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with SMTP; 25 Dec 2015 14:25:00 -0000
X-No-Relay: not in my network
X-No-Relay: not in my network
Received: from [192.168.1.8] (c-76-115-7-162.hsd1.or.comcast.net [76.115.7.162]) by iron2.pdx.net (Postfix) with ESMTPSA id 44428B1E002; Fri, 25 Dec 2015 06:24:56 -0800 (PST)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\))
Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko,
and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error? From: Mark Millard In-Reply-To: Date: Fri, 25 Dec 2015 06:24:57 -0800 Content-Transfer-Encoding: quoted-printable Message-Id: References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> To: freebsd-arm@freebsd.org, FreeBSD Toolchain X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Dec 2015 14:25:01 -0000 [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 = 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=3D4 has = so far removed the crashes during the toolchain activity: no more = misaligned accesses in libc's _fseeko or elsewhere.] On 2015-Dec-25, at 12:31 AM, Mark Millard wrote: > On 2015-Dec-24, at 10:39 PM, Mark Millard wrote: >=20 >> [I do not know if this partial crash analysis related to on-arm = clang-associated activity is good enough and appropriate to submit or = not.] >>=20 >> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below = came from pkg install activity instead of port building. Used as-is. >>=20 >> When I just tried my first from-rpi2b builds (ports for a rpi2b), = /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the = following suggests an alignment error for the type of instructions that = memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code = used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to = check SCTLR bit[1] to be directly sure that alignment was being = enforced.) 
>>=20 >> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar : >>=20 >>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru = .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o = finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o = l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o = ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o = relocatable.o langprefs.o localename.o log.o printf.o setlocale.o = version.o xsize.o osdep.o intl-compat.o >>> Bus error (core dumped) >>> *** [libgnuintl.la] Error code 138 >>=20 >> It failed in _fseeko doing a memset that turned into uses of "vst1.64 = {d16-d17}, [r0]" instructions, for an address in register r0 that ended = in 0xa4, so was not aligned to 8 byte boundaries. =46rom what I read = such "VSTn (multiple n-element structures)" that have .64 require 8 byte = alignment. The evidence of the code and register value follow. >>=20 >>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar = /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gette= xt-tools/intl/ar.core >>> . . . >>> #0 0x2033adcc in _fseeko (fp=3D0x20651dcc, offset=3D, whence=3D, ltest=3D) at /usr/src/lib/libc/stdio/fseek.c:299 >>> 299 memset(&fp->_mbstate, 0, sizeof(mbstate_t)); >>> . . . 
>>> (gdb) x/24i 0x2033adb0 >>> 0x2033adb0 <_fseeko+836>: vmov.i32 q8, #0 ; 0x00000000 >>> 0x2033adb4 <_fseeko+840>: movw r1, #65503 ; 0xffdf >>> 0x2033adb8 <_fseeko+844>: stm r4, {r0, r7} >>> 0x2033adbc <_fseeko+848>: ldrh r0, [r4, #12] >>> 0x2033adc0 <_fseeko+852>: and r0, r0, r1 >>> 0x2033adc4 <_fseeko+856>: strh r0, [r4, #12] >>> 0x2033adc8 <_fseeko+860>: add r0, r4, #216 ; 0xd8 >>> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0] >>> 0x2033add0 <_fseeko+868>: add r0, r4, #200 ; 0xc8 >>> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0] >>> 0x2033add8 <_fseeko+876>: add r0, r4, #184 ; 0xb8 >>> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0] >>> 0x2033ade0 <_fseeko+884>: add r0, r4, #168 ; 0xa8 >>> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0] >>> 0x2033ade8 <_fseeko+892>: add r0, r4, #152 ; 0x98 >>> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0] >>> 0x2033adf0 <_fseeko+900>: add r0, r4, #136 ; 0x88 >>> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0] >>> 0x2033adf8 <_fseeko+908>: add r0, r4, #120 ; 0x78 >>> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0] >>> 0x2033ae00 <_fseeko+916>: add r0, r4, #104 ; 0x68 >>> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0] >>> 0x2033ae08 <_fseeko+924>: b 0x2033b070 <_fseeko+1540> >>> 0x2033ae0c <_fseeko+928>: cmp r5, #0 ; 0x0 >>> (gdb) info all-registers >>> r0 0x20651ea4 543497892 >>> r1 0xffdf 65503 >>> r2 0x0 0 >>> r3 0x0 0 >>> r4 0x20651dcc 543497676 >>> r5 0x0 0 >>> r6 0x0 0 >>> r7 0x0 0 >>> r8 0x20359df4 540384756 >>> r9 0x0 0 >>> r10 0x0 0 >>> r11 0xbfbfb948 -1077954232 >>> r12 0x2037b208 540520968 >>> sp 0xbfbfb898 -1077954408 >>> lr 0x2035a004 540385284 >>> pc 0x2033adcc 540257740 >>> f0 0 (raw 0x000000000000000000000000) >>> f1 0 (raw 0x000000000000000000000000) >>> f2 0 (raw 0x000000000000000000000000) >>> f3 0 (raw 0x000000000000000000000000) >>> f4 0 (raw 0x000000000000000000000000) >>> f5 0 (raw 0x000000000000000000000000) >>> f6 0 (raw 0x000000000000000000000000) >>> f7 0 (raw 
0x000000000000000000000000) >>> fps 0x0 0 >>> cpsr 0x60000010 1610612752 >>=20 >> The syntax in use for vst1.64 instructions does not explicitly have = the alignment notation. Presuming that the decoding is correct then from = what I read the following applies: >>=20 >>> Home > NEON and VFP Programming > NEON load and store element and = structure instructions > Alignment restrictions in load and store, = element and structure instructions >>>=20 >>> . . . When the alignment is not specified in the instruction, the = alignment restriction is controlled by the A bit (SCTLR bit[1]): >>> =E2=80=A2 if the A bit is 0, there are no alignment = restrictions (except for strongly ordered or device memory, where = accesses must be element aligned or the result is unpredictable) >>> =E2=80=A2 if the A bit is 1, accesses must be element = aligned. >>> If an address is not correctly aligned, an alignment fault occurs. >>=20 >> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error = would have the context to happen because of the mis-alignment. 
>>=20 >> The following shows the make.conf context that explains how = /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked: >>=20 >>> # more /etc/make.conf=20 >>> WRKDIRPREFIX=3D/usr/obj/portswork >>> WITH_DEBUG=3D >>> WITH_DEBUG_FILES=3D >>> MALLOC_PRODUCTION=3D >>> # >>> TO_TYPE=3Darmv6 >>> TOOLS_TO_TYPE=3Darm-gnueabi >>> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >>> .if ${.MAKE.LEVEL} =3D=3D 0 >>> CC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>> CXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>> CPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>> .export CC >>> .export CXX >>> .export CPP >>> AS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >>> AR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >>> LD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >>> NM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >>> OBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >>> OBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >>> RANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >>> SIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >>> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >>> STRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >>> .export AS >>> .export AR >>> .export LD >>> .export NM >>> .export OBJCOPY >>> .export OBJDUMP >>> .export RANLIB >>> .export SIZE >>> .export STRINGS >>> .endif >>=20 >>=20 >> Other context: >>=20 >>> # freebsd-version -ku; uname -aKU >>> 11.0-CURRENT >>> 11.0-CURRENT >>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec = 22 22:02:21 PST 2015 = root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm = 1100091 1100091 >>=20 >>=20 >>=20 >> I will note that world and kernel are my own build of -r292413 = (earlier experiment) --a build made from an amd64 host context and put = in place via DESTDIR=3D. 
My expectation would be that the amd64 context would not be likely to have similar alignment restrictions involved in its ar activity (or other activity). That would explain how I got this far using such a clang 3.7 related toolchain for targeting an rpi2 before finding such a problem.
>
>
> I realized re-reading all of the above that it seems to suggest that the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but that was not my intent.
>
> libc.so.7 is from my buildworld, including the fseeko implementation:
>
> Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
> done.
> Loaded symbols for /lib/libc.so.7
>
>
> head/sys/sys/_types.h has:
>
> /*
>  * mbstate_t is an opaque object to keep conversion state during multibyte
>  * stream conversions.
>  */
> typedef union {
>         char            __mbstate8[128];
>         __int64_t       _mbstateL;      /* for alignment */
> } __mbstate_t;
>
> suggesting an implicit alignment of the union to whatever the implementation defines for __int64_t --which need not be 8 byte alignment (in the abstract, general case). But 8 byte alignment is a possibility as well (in the abstract).
>
> But printing *fp in gdb for the fp argument to _fseeko reports the same not-8-byte aligned address for __mbstate8 that was in r0:
>
>> (gdb) bt
>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
>> #1  0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
>> #2  0x00016138 in ??
() >> (gdb) print fp >> $2 =3D (FILE *) 0x20651dcc >> (gdb) print *fp >> $3 =3D {_p =3D 0x2069a240 "", _r =3D 0, _w =3D 0, _flags =3D 5264, = _file =3D 36, _bf =3D {_base =3D 0x2069a240 "", _size =3D 32768}, = _lbfsize =3D 0, _cookie =3D 0x20651dcc, _close =3D 0x20359dfc = <__sclose>,=20 >> _read =3D 0x20359de4 <__sread>, _seek =3D 0x20359df4 <__sseek>, = _write =3D 0x20359dec <__swrite>, _ub =3D {_base =3D 0x0, _size =3D 0}, = _up =3D 0x0, _ur =3D 0, _ubuf =3D 0x20651e0c "", _nbuf =3D 0x20651e0f = "", _lb =3D { >> _base =3D 0x0, _size =3D 0}, _blksize =3D 32768, _offset =3D 0, = _fl_mutex =3D 0x0, _fl_owner =3D 0x0, _fl_count =3D 0, _orientation =3D = 0, _mbstate =3D {__mbstate8 =3D 0x20651e34 "", _mbstateL =3D 0}, _flags2 = =3D 0} >=20 > The overall FILE struct containing the _mbstate field is also not = 8-byte aligned. But the offset from the start of the FILE struct to = __mbstate8 is a multiple of 8 bytes. >=20 > It is my interpretation that there is nothing here to justify the = memset implementation combination: >=20 > SCTLR bit[1]=3D=3D1 >=20 > mixed with >=20 > vst1.64 instructions >=20 > I.e.: one or both needs to change unless some way for forcing 8-byte = alignment is introduced. >=20 > I have not managed to track down anything that would indicate = FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required = by the design to be constant (once initialized). I have (so far) removed the build tool crashes based on adding = -fmax-type-align=3D4 to avoid the misaligned accesses. Details follow. src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now = looks like: > # more ~/src.configs/src.conf.rpi2-clang.amd64-host > TO_TYPE=3Darmv6 > TOOLS_TO_TYPE=3Darm-gnueabi > FROM_TYPE=3Damd64 > TOOLS_FROM_TYPE=3Dx86_64 > VERSION_CONTEXT=3D11.0 > # > KERNCONF=3DRPI2-NODBG > TARGET=3Darm > .if ${.MAKE.LEVEL} =3D=3D 0 > TARGET_ARCH=3D${TO_TYPE} > .export TARGET_ARCH > .endif > # > WITHOUT_CROSS_COMPILER=3D > # > # For WITH_BOOT=3D . . . 
> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation = R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a = shared object; recompile with -fPIC=20 > WITHOUT_BOOT=3D > # > WITH_FAST_DEPEND=3D > WITH_LIBCPLUSPLUS=3D > WITH_CLANG=3D > WITH_CLANG_IS_CC=3D > WITH_CLANG_FULL=3D > WITH_LLDB=3D > WITH_CLANG_EXTRAS=3D > # > WITHOUT_LIB32=3D > WITHOUT_GCC=3D > WITHOUT_GNUCXX=3D > # > NO_WERROR=3D > MALLOC_PRODUCTION=3D > #CFLAGS+=3D -DELF_VERBOSE > # > WITH_DEBUG=3D > WITH_DEBUG_FILES=3D > # > # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related = bintutils... > # > #CROSS_TOOLCHAIN=3D${TO_TYPE}-gcc > X_COMPILER_TYPE=3Dclang > CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ > .if ${.MAKE.LEVEL} =3D=3D 0 > XCC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 > XCXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 > XCPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 > .export XCC > .export XCXX > .export XCPP > XAS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as > XAR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar > XLD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld > XNM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm > XOBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy > XOBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump > XRANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib > XSIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size > #NO-SUCH: XSTRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings > XSTRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings > .export XAS > .export XAR > .export XLD > .export XNM > .export XOBJCOPY > .export XOBJDUMP > .export XRANLIB > .export XSIZE > .export XSTRINGS > .endif > # > # Host compiler stuff: > .if ${.MAKE.LEVEL} =3D=3D 0 > CC=3D/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin > CXX=3D/usr/bin/clang++ 
-B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin > CPP=3D/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin > .export CC > .export CXX > .export CPP > AS=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as > AR=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar > LD=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld > NM=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm > OBJCOPY=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy > OBJDUMP=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump > RANLIB=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib > SIZE=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size > #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings > STRINGS=3D/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings > .export AS > .export AR > .export LD > .export NM > .export OBJCOPY > .export OBJDUMP > .export RANLIB > .export SIZE > .export STRINGS > .endif make.conf for during the on-rpi2 port builds now looks like: > $ more /etc/make.conf=20 > WRKDIRPREFIX=3D/usr/obj/portswork > WITH_DEBUG=3D > WITH_DEBUG_FILES=3D > MALLOC_PRODUCTION=3D > # > TO_TYPE=3Darmv6 > TOOLS_TO_TYPE=3Darm-gnueabi > CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ > .if ${.MAKE.LEVEL} =3D=3D 0 > CC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 > CXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 > CPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 > .export CC > .export CXX > .export CPP > AS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as > AR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar > LD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld > NM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm > OBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy > OBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump > RANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib > 
SIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size > #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings > STRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings > .export AS > .export AR > .export LD > .export NM > .export OBJCOPY > .export OBJDUMP > .export RANLIB > .export SIZE > .export STRINGS > .endif =3D=3D=3D Mark Millard markmi at dsl-only.net From owner-freebsd-toolchain@freebsd.org Fri Dec 25 19:53:54 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 54CF4A51A9F for ; Fri, 25 Dec 2015 19:53:54 +0000 (UTC) (envelope-from imp@bsdimp.com) Received: from mail-oi0-x22c.google.com (mail-oi0-x22c.google.com [IPv6:2607:f8b0:4003:c06::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1C7521BB7 for ; Fri, 25 Dec 2015 19:53:53 +0000 (UTC) (envelope-from imp@bsdimp.com) Received: by mail-oi0-x22c.google.com with SMTP id y66so151037058oig.0 for ; Fri, 25 Dec 2015 11:53:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20150623.gappssmtp.com; s=20150623; h=sender:subject:mime-version:content-type:from:in-reply-to:date:cc :message-id:references:to; bh=2LW2LlAOFfr/9Bk4M2K8Ib+ZIOSAU2vcto1tHmfq70g=; b=COzKV8bBoC2EiAQQZF5GKe+hBixZgfil3c1crw8WQK5klXM3/nYqLGhg2EGZMt8e3w yPfILQhlMQs5oWo1lOaUXX71/td25QK1aSGlIcBgc4zXT6/grYLcxF2SbSoiSkgir4TM 9d7o3XHtZMQcnXVnNESXnMYmEg4lpeZJx6A9IVfkEf9ax7GOapgeKGyi36iwpwlWc70H VRrqF/8fbzJ+Lu26Eupv7veYBPj5XeJwPmgMEexF/0ZjURlzakMO2/rR2dBKeXVukafb ndLjW/DBoUyJ4KJ6jVJSAZm1b5DcxIxUEIe7Ytde4J9GCze4TjHYv1sBN+2p44q3zJ1Z Z4uw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:subject:mime-version:content-type:from 
:in-reply-to:date:cc:message-id:references:to; bh=2LW2LlAOFfr/9Bk4M2K8Ib+ZIOSAU2vcto1tHmfq70g=; b=Dz4TPEp9iad2eRTNIr7aCVKeOC7n+It7nVou9RnR562nsmxdPzrWHZflT8OxSXSl1h JRasQXXJU2Oo8wQnN5T4GJvhM91OwrdrX6HUGStUaifFZWb/2AhtThLwXpBHv59RKCQ1 AeUeaUbDoyGRJ3rqvJAHzm6r+CuhSzgo0Z3K+cp6PtY1m1OV5RO50S8s6A55F9QZa9ch xYHm/K33PBv1AYMQplZvZPPX0Kreogs0ZJ98rRNmPgQweKCn5merydZYTZA2RaQZSyXr THZe/A2Bb5s14rcs4xOsM8y68FyxQiVHNwchOTbDFwmeXw/rzBdNhEcGEcCLH/MNhEuT 96QQ== X-Gm-Message-State: ALoCoQlCbps9TWycGCyJiWFtZRmd0NgY664G7oDcKiK6WAMk16q8j7bLchP5OGuEGMx5JenHh0zHXSHrL50pUgcKgj2KWINEWA== X-Received: by 10.202.85.195 with SMTP id j186mr23975786oib.74.1451073232546; Fri, 25 Dec 2015 11:53:52 -0800 (PST) Received: from ?IPv6:2601:280:4900:3700:e997:dd6f:8139:f929? ([2601:280:4900:3700:e997:dd6f:8139:f929]) by smtp.gmail.com with ESMTPSA id mm4sm13622866obb.1.2015.12.25.11.53.51 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Fri, 25 Dec 2015 11:53:51 -0800 (PST) Sender: Warner Losh Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error? 
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Content-Type: multipart/signed; boundary="Apple-Mail=_B83D07DB-888A-4D4F-96E0-1636BFAA4EB5"; protocol="application/pgp-signature"; micalg=pgp-sha512 X-Pgp-Agent: GPGMail 2.5.2 From: Warner Losh In-Reply-To: Date: Fri, 25 Dec 2015 12:53:49 -0700 Cc: freebsd-arm@freebsd.org, FreeBSD Toolchain Message-Id: <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> To: Mark Millard X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Dec 2015 19:53:54 -0000 --Apple-Mail=_B83D07DB-888A-4D4F-96E0-1636BFAA4EB5 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8

So what happens if we actually fix the underlying bug?

I see two ways of doing this. In findfp.c, we allocate an array of FILE structures today like:

    g = (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * sizeof(FILE));

but that assumes that FILE just has normal pointer alignment requirements. However, due to the mbstate having int64_t alignment requirements, this is wrong. Maybe we need to do something like

    g = (struct glue *)malloc(sizeof(*g) + max(sizeof(int64_t),ALIGNBYTES) + n * sizeof(FILE));

which wouldn’t change anything on LP64 systems, but would result in proper alignment for ILP32 systems. We’d have to fix the loop that uses ALIGN afterwards to use roundup instead, rounding up to the nearest 8-byte aligned offset (or technically, the max of ALIGNBYTES and 8, but that’s always 8 on today’s systems). If we do this, we can make sure that each FILE is 8-byte aligned or better. We may need to round up sizeof(FILE) to a multiple of 8 as well.
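To make the rounding concrete, here is a minimal sketch of the idea. It is not the actual findfp.c code, and it assumes a C11 compiler. The names struct demo_file, struct demo_glue, demo_roundup() and demo_moreglue() are made-up stand-ins for illustration, and the real FILE layout differs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_FILE_ALIGN 8u	/* assumed: alignment wanted for every FILE */

/* Hypothetical stand-in for FILE; only the mbstate-like member matters. */
struct demo_file {
	int flags;
	union {
		char bytes[128];
		/* Forces 8-byte alignment even where int64_t alone would not. */
		_Alignas(DEMO_FILE_ALIGN) int64_t align;
	} mbstate;
};

/* Hypothetical stand-in for struct glue: a header followed by n files. */
struct demo_glue {
	struct demo_glue *next;
	int niobs;
	struct demo_file *iobs;
};

/* Round addr up to the next multiple of align (align is a power of two). */
static uintptr_t
demo_roundup(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* Allocate a header plus n files, padded so the array starts aligned. */
static struct demo_glue *
demo_moreglue(int n)
{
	struct demo_glue *g;

	if (n <= 0)
		return NULL;
	/* DEMO_FILE_ALIGN - 1 bytes of slack always cover the round-up below. */
	g = malloc(sizeof(*g) + (DEMO_FILE_ALIGN - 1) +
	    (size_t)n * sizeof(struct demo_file));
	if (g == NULL)
		return NULL;
	g->next = NULL;
	g->niobs = n;
	g->iobs = (struct demo_file *)demo_roundup((uintptr_t)(g + 1),
	    DEMO_FILE_ALIGN);
	return g;
}
```

Because the alignment specifier also makes sizeof(struct demo_file) a multiple of 8, every element of the array, and therefore every mbstate member, ends up 8-byte aligned once the first one is. The whole block is released with a single free(g).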
I believe that since FILE has an 8-byte-aligned member, its size must be a multiple of 8, but I’ve not chased that belief to ground. If not, we may need another decorator (__aligned(8), I think, spelled with the ugly max expression above). That way, the contract we’re making with the compiler will always be true. ALIGN only guarantees 4-byte alignment on ARM anyway, so that bit is clearly wrong.

This wouldn’t be an ABI change, since you can only get a valid FILE * from fopen (and friends), plus stdin, stdout, and stderr. Those addresses aren’t hard coded into binaries, so even if we have to tweak the last three and deal with some ‘fake’ FILE abuse in libc (which I don’t think suffers from this issue, btw, given the alignment requirements that would naturally follow from something on the stack), we’d still be ahead. At least for all CONFORMING implementations[*]...

TL;DR: Why not make FILE * always 8-byte aligned? The compiler options are a band-aid.

Warner

[*] There’s at least one popular package that has a copy of the FILE structure in one of its .h files and uses that to do unnatural optimization things, but even that’s cool, I think, since it never allocates a new one.

> On Dec 25, 2015, at 7:24 AM, Mark Millard wrote:
>
> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=4 has so far removed the crashes during the toolchain activity: no more misaligned accesses in libc's _fseeko or elsewhere.]
>
> On 2015-Dec-25, at 12:31 AM, Mark Millard wrote:
>
>> On 2015-Dec-24, at 10:39 PM, Mark Millard wrote:
>>
>>> [I do not know if this partial crash analysis related to on-arm clang-associated activity is good enough and appropriate to submit or not.]
>>>=20 >>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved = below came from pkg install activity instead of port building. Used = as-is. >>>=20 >>> When I just tried my first from-rpi2b builds (ports for a rpi2b), = /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the = following suggests an alignment error for the type of instructions that = memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code = used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to = check SCTLR bit[1] to be directly sure that alignment was being = enforced.) >>>=20 >>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar : >>>=20 >>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru = .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o = finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o = l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o = ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o = relocatable.o langprefs.o localename.o log.o printf.o setlocale.o = version.o xsize.o osdep.o intl-compat.o >>>> Bus error (core dumped) >>>> *** [libgnuintl.la] Error code 138 >>>=20 >>> It failed in _fseeko doing a memset that turned into uses of = "vst1.64 {d16-d17}, [r0]" instructions, for an address in = register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. = =46rom what I read such "VSTn (multiple n-element structures)" that have = .64 require 8 byte alignment. The evidence of the code and register = value follow. >>>=20 >>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar = /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gette= xt-tools/intl/ar.core >>>> . . . >>>> #0 0x2033adcc in _fseeko (fp=3D0x20651dcc, offset=3D, whence=3D, ltest=3D) at /usr/src/lib/libc/stdio/fseek.c:299 >>>> 299 memset(&fp->_mbstate, 0, sizeof(mbstate_t)); >>>> . . . 
>>>> (gdb) x/24i 0x2033adb0 >>>> 0x2033adb0 <_fseeko+836>: vmov.i32 q8, #0 ; 0x00000000 >>>> 0x2033adb4 <_fseeko+840>: movw r1, #65503 ; 0xffdf >>>> 0x2033adb8 <_fseeko+844>: stm r4, {r0, r7} >>>> 0x2033adbc <_fseeko+848>: ldrh r0, [r4, #12] >>>> 0x2033adc0 <_fseeko+852>: and r0, r0, r1 >>>> 0x2033adc4 <_fseeko+856>: strh r0, [r4, #12] >>>> 0x2033adc8 <_fseeko+860>: add r0, r4, #216 ; 0xd8 >>>> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0] >>>> 0x2033add0 <_fseeko+868>: add r0, r4, #200 ; 0xc8 >>>> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0] >>>> 0x2033add8 <_fseeko+876>: add r0, r4, #184 ; 0xb8 >>>> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0] >>>> 0x2033ade0 <_fseeko+884>: add r0, r4, #168 ; 0xa8 >>>> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0] >>>> 0x2033ade8 <_fseeko+892>: add r0, r4, #152 ; 0x98 >>>> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0] >>>> 0x2033adf0 <_fseeko+900>: add r0, r4, #136 ; 0x88 >>>> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0] >>>> 0x2033adf8 <_fseeko+908>: add r0, r4, #120 ; 0x78 >>>> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0] >>>> 0x2033ae00 <_fseeko+916>: add r0, r4, #104 ; 0x68 >>>> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0] >>>> 0x2033ae08 <_fseeko+924>: b 0x2033b070 <_fseeko+1540> >>>> 0x2033ae0c <_fseeko+928>: cmp r5, #0 ; 0x0 >>>> (gdb) info all-registers >>>> r0 0x20651ea4 543497892 >>>> r1 0xffdf 65503 >>>> r2 0x0 0 >>>> r3 0x0 0 >>>> r4 0x20651dcc 543497676 >>>> r5 0x0 0 >>>> r6 0x0 0 >>>> r7 0x0 0 >>>> r8 0x20359df4 540384756 >>>> r9 0x0 0 >>>> r10 0x0 0 >>>> r11 0xbfbfb948 -1077954232 >>>> r12 0x2037b208 540520968 >>>> sp 0xbfbfb898 -1077954408 >>>> lr 0x2035a004 540385284 >>>> pc 0x2033adcc 540257740 >>>> f0 0 (raw 0x000000000000000000000000) >>>> f1 0 (raw 0x000000000000000000000000) >>>> f2 0 (raw 0x000000000000000000000000) >>>> f3 0 (raw 0x000000000000000000000000) >>>> f4 0 (raw 0x000000000000000000000000) >>>> f5 0 (raw 0x000000000000000000000000) >>>> f6 0 (raw 
0x000000000000000000000000) >>>> f7 0 (raw 0x000000000000000000000000) >>>> fps 0x0 0 >>>> cpsr 0x60000010 1610612752 >>>=20 >>> The syntax in use for vst1.64 instructions does not explicitly have = the alignment notation. Presuming that the decoding is correct then from = what I read the following applies: >>>=20 >>>> Home > NEON and VFP Programming > NEON load and store element and = structure instructions > Alignment restrictions in load and store, = element and structure instructions >>>>=20 >>>> . . . When the alignment is not specified in the instruction, the = alignment restriction is controlled by the A bit (SCTLR bit[1]): >>>> =E2=80=A2 if the A bit is 0, there are no alignment = restrictions (except for strongly ordered or device memory, where = accesses must be element aligned or the result is unpredictable) >>>> =E2=80=A2 if the A bit is 1, accesses must be element = aligned. >>>> If an address is not correctly aligned, an alignment fault occurs. >>>=20 >>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error = would have the context to happen because of the mis-alignment. 
>>>=20 >>> The following shows the make.conf context that explains how = /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked: >>>=20 >>>> # more /etc/make.conf >>>> WRKDIRPREFIX=3D/usr/obj/portswork >>>> WITH_DEBUG=3D >>>> WITH_DEBUG_FILES=3D >>>> MALLOC_PRODUCTION=3D >>>> # >>>> TO_TYPE=3Darmv6 >>>> TOOLS_TO_TYPE=3Darm-gnueabi >>>> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >>>> .if ${.MAKE.LEVEL} =3D=3D 0 >>>> CC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>> CXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>> CPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>> .export CC >>>> .export CXX >>>> .export CPP >>>> AS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >>>> AR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >>>> LD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >>>> NM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >>>> OBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >>>> OBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >>>> RANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >>>> SIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >>>> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >>>> STRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >>>> .export AS >>>> .export AR >>>> .export LD >>>> .export NM >>>> .export OBJCOPY >>>> .export OBJDUMP >>>> .export RANLIB >>>> .export SIZE >>>> .export STRINGS >>>> .endif >>>=20 >>>=20 >>> Other context: >>>=20 >>>> # freebsd-version -ku; uname -aKU >>>> 11.0-CURRENT >>>> 11.0-CURRENT >>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec = 22 22:02:21 PST 2015 = root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm = 1100091 1100091 >>>=20 >>>=20 >>>=20 >>> I will note that world and kernel are my own build of -r292413 = (earlier experiment) --a build made from an amd64 host context and put = in place via 
DESTDIR=. My expectation would be that the amd64 context would not be likely to have similar alignment restrictions involved in its ar activity (or other activity). That would explain how I got this far using such a clang 3.7 related toolchain for targeting an rpi2 before finding such a problem.
>>
>>
>> I realized re-reading all of the above that it seems to suggest that the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but that was not my intent.
>>
>> libc.so.7 is from my buildworld, including the fseeko implementation:
>>
>> Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
>> done.
>> Loaded symbols for /lib/libc.so.7
>>
>>
>> head/sys/sys/_types.h has:
>>
>> /*
>>  * mbstate_t is an opaque object to keep conversion state during multibyte
>>  * stream conversions.
>>  */
>> typedef union {
>>         char            __mbstate8[128];
>>         __int64_t       _mbstateL;      /* for alignment */
>> } __mbstate_t;
>>
>> suggesting an implicit alignment of the union to whatever the implementation defines for __int64_t --which need not be 8 byte alignment (in the abstract, general case). But 8 byte alignment is a possibility as well (in the abstract).
>>
>> But printing *fp in gdb for the fp argument to _fseeko reports the same not-8-byte aligned address for __mbstate8 that was in r0:
>>
>>> (gdb) bt
>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
>>> #1  0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
>>> #2  0x00016138 in ??
() >>> (gdb) print fp >>> $2 =3D (FILE *) 0x20651dcc >>> (gdb) print *fp >>> $3 =3D {_p =3D 0x2069a240 "", _r =3D 0, _w =3D 0, _flags =3D 5264, = _file =3D 36, _bf =3D {_base =3D 0x2069a240 "", _size =3D 32768}, = _lbfsize =3D 0, _cookie =3D 0x20651dcc, _close =3D 0x20359dfc = <__sclose>, >>> _read =3D 0x20359de4 <__sread>, _seek =3D 0x20359df4 <__sseek>, = _write =3D 0x20359dec <__swrite>, _ub =3D {_base =3D 0x0, _size =3D 0}, = _up =3D 0x0, _ur =3D 0, _ubuf =3D 0x20651e0c "", _nbuf =3D 0x20651e0f = "", _lb =3D { >>> _base =3D 0x0, _size =3D 0}, _blksize =3D 32768, _offset =3D 0, = _fl_mutex =3D 0x0, _fl_owner =3D 0x0, _fl_count =3D 0, _orientation =3D = 0, _mbstate =3D {__mbstate8 =3D 0x20651e34 "", _mbstateL =3D 0}, _flags2 = =3D 0} >>=20 >> The overall FILE struct containing the _mbstate field is also not = 8-byte aligned. But the offset from the start of the FILE struct to = __mbstate8 is a multiple of 8 bytes. >>=20 >> It is my interpretation that there is nothing here to justify the = memset implementation combination: >>=20 >> SCTLR bit[1]=3D=3D1 >>=20 >> mixed with >>=20 >> vst1.64 instructions >>=20 >> I.e.: one or both needs to change unless some way for forcing 8-byte = alignment is introduced. >>=20 >> I have not managed to track down anything that would indicate = FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required = by the design to be constant (once initialized). >=20 >=20 > I have (so far) removed the build tool crashes based on adding = -fmax-type-align=3D4 to avoid the misaligned accesses. Details follow. 
>=20 > src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now = looks like: >=20 >> # more ~/src.configs/src.conf.rpi2-clang.amd64-host >> TO_TYPE=3Darmv6 >> TOOLS_TO_TYPE=3Darm-gnueabi >> FROM_TYPE=3Damd64 >> TOOLS_FROM_TYPE=3Dx86_64 >> VERSION_CONTEXT=3D11.0 >> # >> KERNCONF=3DRPI2-NODBG >> TARGET=3Darm >> .if ${.MAKE.LEVEL} =3D=3D 0 >> TARGET_ARCH=3D${TO_TYPE} >> .export TARGET_ARCH >> .endif >> # >> WITHOUT_CROSS_COMPILER=3D >> # >> # For WITH_BOOT=3D . . . >> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation = R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a = shared object; recompile with -fPIC >> WITHOUT_BOOT=3D >> # >> WITH_FAST_DEPEND=3D >> WITH_LIBCPLUSPLUS=3D >> WITH_CLANG=3D >> WITH_CLANG_IS_CC=3D >> WITH_CLANG_FULL=3D >> WITH_LLDB=3D >> WITH_CLANG_EXTRAS=3D >> # >> WITHOUT_LIB32=3D >> WITHOUT_GCC=3D >> WITHOUT_GNUCXX=3D >> # >> NO_WERROR=3D >> MALLOC_PRODUCTION=3D >> #CFLAGS+=3D -DELF_VERBOSE >> # >> WITH_DEBUG=3D >> WITH_DEBUG_FILES=3D >> # >> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related = bintutils... 
>> # >> #CROSS_TOOLCHAIN=3D${TO_TYPE}-gcc >> X_COMPILER_TYPE=3Dclang >> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >> .if ${.MAKE.LEVEL} =3D=3D 0 >> XCC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> XCXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> XCPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> .export XCC >> .export XCXX >> .export XCPP >> XAS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >> XAR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >> XLD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >> XNM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >> XOBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >> XOBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >> XRANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >> XSIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >> #NO-SUCH: XSTRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >> XSTRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >> .export XAS >> .export XAR >> .export XLD >> .export XNM >> .export XOBJCOPY >> .export XOBJDUMP >> .export XRANLIB >> .export XSIZE >> .export XSTRINGS >> .endif >> # >> # Host compiler stuff: >> .if ${.MAKE.LEVEL} =3D=3D 0 >> CC=3D/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >> CXX=3D/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >> CPP=3D/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >> .export CC >> .export CXX >> .export CPP >> AS=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as >> AR=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar >> LD=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld >> NM=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm >> OBJCOPY=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy >> OBJDUMP=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump >> RANLIB=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib >> 
SIZE=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size >> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings >> STRINGS=3D/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings >> .export AS >> .export AR >> .export LD >> .export NM >> .export OBJCOPY >> .export OBJDUMP >> .export RANLIB >> .export SIZE >> .export STRINGS >> .endif >=20 > make.conf for during the on-rpi2 port builds now looks like: >=20 >> $ more /etc/make.conf >> WRKDIRPREFIX=3D/usr/obj/portswork >> WITH_DEBUG=3D >> WITH_DEBUG_FILES=3D >> MALLOC_PRODUCTION=3D >> # >> TO_TYPE=3Darmv6 >> TOOLS_TO_TYPE=3Darm-gnueabi >> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >> .if ${.MAKE.LEVEL} =3D=3D 0 >> CC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> CXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> CPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> .export CC >> .export CXX >> .export CPP >> AS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >> AR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >> LD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >> NM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >> OBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >> OBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >> RANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >> SIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >> STRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >> .export AS >> .export AR >> .export LD >> .export NM >> .export OBJCOPY >> .export OBJDUMP >> .export RANLIB >> .export SIZE >> .export STRINGS >> .endif >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 >=20 >=20 > _______________________________________________ > freebsd-toolchain@freebsd.org mailing list > 
https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain > To unsubscribe, send any mail to = "freebsd-toolchain-unsubscribe@freebsd.org" --Apple-Mail=_B83D07DB-888A-4D4F-96E0-1636BFAA4EB5 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- Comment: GPGTools - https://gpgtools.org iQIcBAEBCgAGBQJWfZ7OAAoJEGwc0Sh9sBEABzwP/1I+blTKkKLGQXoESMEjWeio 5aSCqiKQrqx2thpeAsTDWbWBnXXfcjPY5Kgs+t5gjLMaj/oI6Lq360ySSksdGvlp 1wf7x2LTG1NiJ2oAdRBWCYs+sYW7JCjicjw1SI28rqo6o+Wno5nvrnTTLm2gnyhx GLZ2prxOXo6vCVDhU60R6vc5jj98UbtLTS6TdWlV+urLTlvRaP1jXLz2aSmVchLn hjSZVlvMgszPQR90Fh5Fa873flpbzCHcg6usy8gOU/IqMOnPC4pcEstaCTbtwbkP OMOaYZ3w8uLCESqzhM5qMriVbToKOs/ZZ9EQwHMkdlJ2qACq4umtWItyqhyhgdh5 4NsVDVJdmIOtXeGBY5uAi0fVurModb/qjhDJWGoxd4up+BmA+znPZ3GPcdLhcNUm 2W4YYu6Lx0emrA7WhHU19ZIBSe0Nb6F1LC4CCPDNUdSvCLEX9g7yjsOsmeSijbQS O/AACTVX0QdUfVHDZaPhgy9C/GVlV8zCovsFh9/1owZEngXA9BoqtSYbgMIEQxVl owF+elTanSHf+b25B7s9oBs2Vivqhg5XLWKcLU0JN69wnMg1MnEZk9e99g0qSui3 BqendD3Mm3iA2/5eZU1fIMPNpCzqv2GHyUwyGS3Vy1s8PmaEUy/OHfifHtnAx6vN k+rxdgljeSUtGkjrb4ZP =M8/v -----END PGP SIGNATURE----- --Apple-Mail=_B83D07DB-888A-4D4F-96E0-1636BFAA4EB5-- From owner-freebsd-toolchain@freebsd.org Fri Dec 25 22:14:11 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 93D01A5003F for ; Fri, 25 Dec 2015 22:14:11 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-211-151.reflexion.net [208.70.211.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 56F54145E for ; Fri, 25 Dec 2015 22:14:10 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: 
(qmail 15612 invoked from network); 25 Dec 2015 22:14:15 -0000 Received: from unknown (HELO rtc-sm-01.app.dca.reflexion.local) (10.81.150.1) by 0 (rfx-qmail) with SMTP; 25 Dec 2015 22:14:15 -0000 Received: by rtc-sm-01.app.dca.reflexion.local (Reflexion email security v7.80.0) with SMTP; Fri, 25 Dec 2015 17:14:10 -0500 (EST) Received: (qmail 513 invoked from network); 25 Dec 2015 22:14:10 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with SMTP; 25 Dec 2015 22:14:10 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-76-115-7-162.hsd1.or.comcast.net [76.115.7.162]) by iron2.pdx.net (Postfix) with ESMTPSA id 375C11C43E4; Fri, 25 Dec 2015 14:14:04 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error? From: Mark Millard In-Reply-To: <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> Date: Fri, 25 Dec 2015 14:14:07 -0800 Cc: freebsd-arm@freebsd.org, FreeBSD Toolchain , Ian Lepore , mat@FreeBSD.org, sbruno@FreeBSD.org Content-Transfer-Encoding: quoted-printable Message-Id: <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net> References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> To: Warner Losh X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Dec 2015 22:14:11 -0000 [I'm going to break much of the earlier "original material" text to tail = of the message.] 
> On 2015-Dec-25, at 11:53 AM, Warner Losh wrote:
>
> So what happens if we actually fix the underlying bug?
>
> I see two ways of doing this. In findfp.c, we allocate an array of FILE * today like:
> g = (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * sizeof(FILE));
> but that assumes that FILE just has normal pointer alignment requirements. However,
> due to the mbstate having int64_t alignment requirements, this is wrong. Maybe we
> need to do something like
> g = (struct glue *)malloc(sizeof(*g) + max(sizeof(int64_t),ALIGNBYTES) + n * sizeof(FILE));
> which wouldn’t change anything on LP64 systems, but would result in proper alignment
> for ILP32 systems. We’d have to fix the loop that uses ALIGN afterwards to use
> roundup. Instead, we’d need to round up to the nearest 8-byte aligned offset (or technically,
> the max of ALIGNBYTES and 8, but that’s always 8 on today’s systems). If we do this,
> we can make sure that each file is 8-byte aligned or better. We may need to round up
> sizeof(FILE) to a multiple of 8 as well. I believe that since it has the 8-byte alignment
> for a member, its size must be a multiple of 8, but I’ve not chased that belief to ground.
> If not, we may need another decorator (__aligned(8), I think, spelled with the ugly
> max expression above). That way, the contract we’re making with the compiler will
> always be true. ALIGNBYTES is 4 on Arm anyway, so that bit is clearly wrong.
>
> This wouldn’t be an ABI change, since you can only get a valid FILE * from fopen (and
> friends), plus stdin, stdout, and stderr. Those addresses aren’t hard coded into binaries,
> so even if we have to tweak the last three and deal with some ‘fake’ FILE abuse in libc
> (which I don’t think suffers from this issue, btw, given the alignment requirements that would
> naturally follow from something on the stack), we’d still be ahead. At least for all CONFORMING
> implementations[*]...
>
> TL;DR: Why not make FILE * always 8-byte aligned? The compiler options are a band-aid.
>
> Warner
>
> [*] There’s at least one popular package that has a copy of the FILE structure in one of its
> .h files and uses that to do unnatural optimization things, but even that’s cool, I think,
> since it never allocates a new one.
>

The ARM documentation mentions cases of 16 byte alignment requirements. I've no clue if the clang code generation ever creates such code. There might be wider requirements possible in arm code as well. (I'm not an arm expert.) As an example of an implication: "The malloc() function returns a pointer to a block of at least size bytes suitably aligned for any use." In other words: aligned to some figure that is a multiple of *every* alignment requirement that the code generator can produce, possibly being the least common multiple.

"-fmax-type-align=. . ." is a means of controlling/limiting the range of potential alignments to no more than a fixed, predefined value. Above that value the code generation has to work in smaller accesses and build up/split up bigger values. Using "-fmax-type-align=. . ." allows defining a figure as part of an ABI that is then not subject to code generator updates that could increase the maximum alignment figure and break things: It turns off such new capabilities. Other options need not work that way to preserve the ABI.

But in the most fundamental terms, process-wise, as far as I can tell. . .

While the FILE case that occurred is a specific example, every memory-allocation-like operation is a potential issue for all such "allocated" objects where the related code generation requires alignment to avoid Bus Error (given the SCTLR bit[1] in use).

How many other places in FreeBSD might sometimes return mis-aligned pointers for the existing code generation and ABI combination?

How many other places are subject to breakage when "internal" structs/unions/fields involved are changed to be of a different size because the code is not fully auto-adjusting to match the code generation yet --even if right now "it works"? How fragile will things be for future work?

What would it take to find out and deal with them all? (I do not have the background knowledge to span much.)

My experiment avoided potentially changing parts of the ABI and also avoided dealing with such a "lots of code to investigate" issue. It may not be the long term 11.0-RELEASE solution. Even if not, it may be appropriate for various temporary purposes that need to avoid Bus Errors in the process. For example if Ian has a good reason to use clang 3.7 instead of gcc 4.2.1.

Other notes:

> I believe that since it has the 8-byte alignment
> for a member, its size must be a multiple of 8

There are some C/C++ language rules about the address of a structure equalling the address of the first field, uniformity of the offsets, and the like. But. . .

The C and C++ languages specify no specific numerical alignment figures, not even relative to specific sizeof(...) expressions. To use an old example: a 68010 only needs alignment for >= 2 byte things and even alignment is all that is then required. Some other contexts take a lot more to meet the specifications.
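To make the size-versus-alignment point concrete, here is a small C11 sketch. It is not code from the FreeBSD tree: demo_mbstate and demo_file are invented stand-ins, and demo_file is deliberately NOT the real FILE layout. The one thing the language does promise is that sizeof(T) is a whole multiple of _Alignof(T), because array elements must be contiguous; the numeric alignment itself is whatever the ABI and compiler pick, which is why the checks compare against alignof rather than against a literal 8.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for __mbstate_t: 128 bytes plus an int64_t "for alignment". */
typedef union {
    char    bytes[128];
    int64_t for_alignment;
} demo_mbstate;

/* Hypothetical FILE-like struct; the field list is illustrative only. */
typedef struct {
    unsigned char *p;
    int            flags;
    demo_mbstate   mbs;
    int            flags2;
} demo_file;

/* Compile-time form of the language guarantee: size is a multiple of alignment. */
static_assert(sizeof(demo_file) % alignof(demo_file) == 0,
              "sizeof must be a multiple of alignof");

/* 1 if sizeof(demo_file) is a multiple of its own alignment. */
static int demo_size_tracks_alignment(void)
{
    return sizeof(demo_file) % alignof(demo_file) == 0;
}

/* 1 if the mbs member sits at an offset suitable for its type. */
static int demo_member_offset_ok(void)
{
    return offsetof(demo_file, mbs) % alignof(demo_mbstate) == 0;
}

/* 1 if the union is at least as strictly aligned as its int64_t member. */
static int demo_union_inherits_alignment(void)
{
    return alignof(demo_mbstate) >= alignof(int64_t);
}
```

Printing alignof(int64_t) on the target in question would show the figure the compiler actually assumes. None of these checks say anything about whether an allocator hands back storage that honors that figure, which is the separate issue in the FILE case.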
There are some implications of the modern memory model(s) created to cover concurrency explicitly, such as avoiding interactions that can happen via, for example, separate objects (in part) sharing a cache line. (I've only looked at C++ for this, and only to a degree.)

The detailed alignment rules are more "implementation defined" than "predefined by the standard". But the definition is trying to meet language criteria. It is not a fully independent choice.

Maybe some other standards that FreeBSD is tied to specify more specifics, such as an N-byte integer always aligning to some multiple of N (a waste on the 68010), including the alignment for a union or struct that it may be a part of tracking. But such rules force padding that may or may not be required to meet the language's more abstract criteria and such rules may not match the existing/in-use ABI.

So far as I can tell explicitly declared alignments may well be necessary. If that one "popular package", say, formed an array of FILE copies then the resultant alignments need not all match the ones produced by your example code unless the FILE declaration forces the compiler to match, causing sizeof(FILE) to track as well. FILE need not be the only such issue.

My background and reference material are mostly tied to the languages --and so my notes tend to be limited to that much context.

Original material:

> On Dec 25, 2015, at 7:24 AM, Mark Millard wrote:
>
> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=4 has so far removed the crashes during the toolchain activity: no more misaligned accesses in libc's _fseeko or elsewhere.]
>
> On 2015-Dec-25, at 12:31 AM, Mark Millard wrote:
>
>> On 2015-Dec-24, at 10:39 PM, Mark Millard wrote:
>>
>>> [I do not know if this partial crash analysis related to on-arm clang-associated activity is good enough and appropriate to submit or not.]
>>>=20 >>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved = below came from pkg install activity instead of port building. Used = as-is. >>>=20 >>> When I just tried my first from-rpi2b builds (ports for a rpi2b), = /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the = following suggests an alignment error for the type of instructions that = memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code = used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to = check SCTLR bit[1] to be directly sure that alignment was being = enforced.) >>>=20 >>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar : >>>=20 >>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru = .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o = finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o = l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o = ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o = relocatable.o langprefs.o localename.o log.o printf.o setlocale.o = version.o xsize.o osdep.o intl-compat.o >>>> Bus error (core dumped) >>>> *** [libgnuintl.la] Error code 138 >>>=20 >>> It failed in _fseeko doing a memset that turned into uses of = "vst1.64 {d16-d17}, [r0]" instructions, for an address in = register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. = =46rom what I read such "VSTn (multiple n-element structures)" that have = .64 require 8 byte alignment. The evidence of the code and register = value follow. >>>=20 >>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar = /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gette= xt-tools/intl/ar.core >>>> . . . >>>> #0 0x2033adcc in _fseeko (fp=3D0x20651dcc, offset=3D, whence=3D, ltest=3D) at /usr/src/lib/libc/stdio/fseek.c:299 >>>> 299 memset(&fp->_mbstate, 0, sizeof(mbstate_t)); >>>> . . . 
>>>> (gdb) x/24i 0x2033adb0 >>>> 0x2033adb0 <_fseeko+836>: vmov.i32 q8, #0 ; 0x00000000 >>>> 0x2033adb4 <_fseeko+840>: movw r1, #65503 ; 0xffdf >>>> 0x2033adb8 <_fseeko+844>: stm r4, {r0, r7} >>>> 0x2033adbc <_fseeko+848>: ldrh r0, [r4, #12] >>>> 0x2033adc0 <_fseeko+852>: and r0, r0, r1 >>>> 0x2033adc4 <_fseeko+856>: strh r0, [r4, #12] >>>> 0x2033adc8 <_fseeko+860>: add r0, r4, #216 ; 0xd8 >>>> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0] >>>> 0x2033add0 <_fseeko+868>: add r0, r4, #200 ; 0xc8 >>>> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0] >>>> 0x2033add8 <_fseeko+876>: add r0, r4, #184 ; 0xb8 >>>> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0] >>>> 0x2033ade0 <_fseeko+884>: add r0, r4, #168 ; 0xa8 >>>> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0] >>>> 0x2033ade8 <_fseeko+892>: add r0, r4, #152 ; 0x98 >>>> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0] >>>> 0x2033adf0 <_fseeko+900>: add r0, r4, #136 ; 0x88 >>>> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0] >>>> 0x2033adf8 <_fseeko+908>: add r0, r4, #120 ; 0x78 >>>> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0] >>>> 0x2033ae00 <_fseeko+916>: add r0, r4, #104 ; 0x68 >>>> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0] >>>> 0x2033ae08 <_fseeko+924>: b 0x2033b070 <_fseeko+1540> >>>> 0x2033ae0c <_fseeko+928>: cmp r5, #0 ; 0x0 >>>> (gdb) info all-registers >>>> r0 0x20651ea4 543497892 >>>> r1 0xffdf 65503 >>>> r2 0x0 0 >>>> r3 0x0 0 >>>> r4 0x20651dcc 543497676 >>>> r5 0x0 0 >>>> r6 0x0 0 >>>> r7 0x0 0 >>>> r8 0x20359df4 540384756 >>>> r9 0x0 0 >>>> r10 0x0 0 >>>> r11 0xbfbfb948 -1077954232 >>>> r12 0x2037b208 540520968 >>>> sp 0xbfbfb898 -1077954408 >>>> lr 0x2035a004 540385284 >>>> pc 0x2033adcc 540257740 >>>> f0 0 (raw 0x000000000000000000000000) >>>> f1 0 (raw 0x000000000000000000000000) >>>> f2 0 (raw 0x000000000000000000000000) >>>> f3 0 (raw 0x000000000000000000000000) >>>> f4 0 (raw 0x000000000000000000000000) >>>> f5 0 (raw 0x000000000000000000000000) >>>> f6 0 (raw 
0x000000000000000000000000)
>>>> f7 0 (raw 0x000000000000000000000000)
>>>> fps 0x0 0
>>>> cpsr 0x60000010 1610612752
>>>
>>> The syntax in use for vst1.64 instructions does not explicitly have the alignment notation. Presuming that the decoding is correct then from what I read the following applies:
>>>
>>>> Home > NEON and VFP Programming > NEON load and store element and structure instructions > Alignment restrictions in load and store, element and structure instructions
>>>>
>>>> . . . When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>>> • if the A bit is 0, there are no alignment restrictions (except for strongly ordered or device memory, where accesses must be element aligned or the result is unpredictable)
>>>> • if the A bit is 1, accesses must be element aligned.
>>>> If an address is not correctly aligned, an alignment fault occurs.
>>>
>>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would have the context to happen because of the mis-alignment.
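The address arithmetic behind that conclusion can be double-checked with a few lines of C. This is only an illustrative sketch: the helper names effective_address() and misalignment() are made up here, and the constants used in the note afterwards are the r4 value and the #216 offset from the gdb output quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Address produced by "add r0, r4, #offset" for a given base register value. */
static uint32_t effective_address(uint32_t base, uint32_t offset)
{
    return base + offset;
}

/* How far addr lies past the previous multiple of align (align: power of two). */
static uint32_t misalignment(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1u)) == 0);
    return addr & (align - 1u);
}
```

With the values from the dump, effective_address(0x20651dcc, 216) gives 0x20651ea4, matching r0, and misalignment(0x20651ea4, 8) gives 4. So the store target is 4-byte aligned but not 8-byte aligned, and the same holds for r4 (the FILE pointer) itself.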
>>>=20 >>> The following shows the make.conf context that explains how = /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked: >>>=20 >>>> # more /etc/make.conf >>>> WRKDIRPREFIX=3D/usr/obj/portswork >>>> WITH_DEBUG=3D >>>> WITH_DEBUG_FILES=3D >>>> MALLOC_PRODUCTION=3D >>>> # >>>> TO_TYPE=3Darmv6 >>>> TOOLS_TO_TYPE=3Darm-gnueabi >>>> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >>>> .if ${.MAKE.LEVEL} =3D=3D 0 >>>> CC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>> CXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>> CPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>> .export CC >>>> .export CXX >>>> .export CPP >>>> AS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >>>> AR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >>>> LD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >>>> NM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >>>> OBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >>>> OBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >>>> RANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >>>> SIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >>>> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >>>> STRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >>>> .export AS >>>> .export AR >>>> .export LD >>>> .export NM >>>> .export OBJCOPY >>>> .export OBJDUMP >>>> .export RANLIB >>>> .export SIZE >>>> .export STRINGS >>>> .endif >>>=20 >>>=20 >>> Other context: >>>=20 >>>> # freebsd-version -ku; uname -aKU >>>> 11.0-CURRENT >>>> 11.0-CURRENT >>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec = 22 22:02:21 PST 2015 = root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm = 1100091 1100091 >>>=20 >>>=20 >>>=20 >>> I will note that world and kernel are my own build of -r292413 = (earlier experiment) --a build made from an amd64 host context and put = in place via 
DESTDIR=3D. My expectation would be that the amd64 context = would not be likely to have similar alignment restrictions involved in = its ar activity (or other activity). That would explain how I got this = far using such a clang 3.7 related toolchain for targeting an rpi2 = before finding such a problem. >>=20 >>=20 >> I realized re-reading the all above that it seems to suggest that the = _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but = that was not my intent. >>=20 >> libc.so.7 is from my buildworld, including the fseeko implementation: >>=20 >> Reading symbols from /lib/libc.so.7...Reading symbols from = /usr/lib/debug//lib/libc.so.7.debug...done. >> done. >> Loaded symbols for /lib/libc.so.7 >>=20 >>=20 >> head/sys/sys/_types.h has: >>=20 >> /* >> * mbstate_t is an opaque object to keep conversion state during = multibyte >> * stream conversions. >> */ >> typedef union { >> char __mbstate8[128]; >> __int64_t _mbstateL; /* for alignment */ >> } __mbstate_t; >>=20 >> suggesting an implicit alignment of the union to whatever the = implementation defines for __int64_t --which need not be 8 byte = alignment (in the abstract, general case). But 8 byte alignment is a = possibility as well (in the abstract). >>=20 >> But printing *fp in gdb for the fp argument to _fseeko reports the = same not-8-byte aligned address for __mbstate8 that was in r0: >>=20 >>> (gdb) bt >>> #0 0x2033adcc in _fseeko (fp=3D0x20651dcc, offset=3D, whence=3D, ltest=3D) at /usr/src/lib/libc/stdio/fseek.c:299 >>> #1 0x2033b108 in fseeko (fp=3D0x20651dcc, offset=3D18571438587904, = whence=3D0) at /usr/src/lib/libc/stdio/fseek.c:82 >>> #2 0x00016138 in ?? 
()
>>> (gdb) print fp
>>> $2 = (FILE *) 0x20651dcc
>>> (gdb) print *fp
>>> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, _close = 0x20359dfc <__sclose>,
>>> _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>>> _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}
>>
>> The overall FILE struct containing the _mbstate field is also not 8-byte aligned. But the offset from the start of the FILE struct to __mbstate8 is a multiple of 8 bytes.
>>
>> It is my interpretation that there is nothing here to justify the memset implementation combination:
>>
>> SCTLR bit[1]==1
>>
>> mixed with
>>
>> vst1.64 instructions
>>
>> I.e.: one or both needs to change unless some way for forcing 8-byte alignment is introduced.
>>
>> I have not managed to track down anything that would indicate FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required by the design to be constant (once initialized).
>
>
> I have (so far) removed the build tool crashes based on adding -fmax-type-align=4 to avoid the misaligned accesses. Details follow.
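Before those details, a note on the other alternative mentioned above, "some way for forcing 8-byte alignment". A rough C sketch of the general technique follows. It is not the findfp.c code: struct slab, slab_alloc(), round_up() and SLAB_ALIGN are names invented here, and the value 8 is an assumption taken from the vst1.64 discussion. The idea is only that a header and an array of items share one malloc() block, with the start of the array rounded up explicitly instead of inheriting whatever offset the header happens to leave it at.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SLAB_ALIGN 8u   /* assumed requirement of the 64-bit stores */

/* Hypothetical header, playing the role a "glue" record might play. */
struct slab {
    struct slab *next;
    size_t       count;
    void        *items;   /* first element, SLAB_ALIGN-aligned */
};

/* Round an address up to the next multiple of align (align: power of two). */
static uintptr_t round_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1u)) == 0);
    return (addr + align - 1u) & ~(align - 1u);
}

/*
 * Allocate one header plus room for n items of item_size bytes each.
 * item_size should itself be a multiple of SLAB_ALIGN so that every
 * element, not just the first, stays aligned. Overflow checks on the
 * size computation are omitted for brevity.
 */
static struct slab *slab_alloc(size_t n, size_t item_size)
{
    struct slab *s;

    /* SLAB_ALIGN - 1 spare bytes cover the worst-case round-up. */
    s = malloc(sizeof(*s) + (SLAB_ALIGN - 1u) + n * item_size);
    if (s == NULL)
        return NULL;
    s->next = NULL;
    s->count = n;
    s->items = (void *)round_up((uintptr_t)(s + 1), SLAB_ALIGN);
    return s;
}
```

Whether this, a declared alignment on the item type, or a compiler option is the better route is a separate question; the sketch only shows that the rounding has to use the item's requirement rather than the header's.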
>=20 > src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now = looks like: >=20 >> # more ~/src.configs/src.conf.rpi2-clang.amd64-host >> TO_TYPE=3Darmv6 >> TOOLS_TO_TYPE=3Darm-gnueabi >> FROM_TYPE=3Damd64 >> TOOLS_FROM_TYPE=3Dx86_64 >> VERSION_CONTEXT=3D11.0 >> # >> KERNCONF=3DRPI2-NODBG >> TARGET=3Darm >> .if ${.MAKE.LEVEL} =3D=3D 0 >> TARGET_ARCH=3D${TO_TYPE} >> .export TARGET_ARCH >> .endif >> # >> WITHOUT_CROSS_COMPILER=3D >> # >> # For WITH_BOOT=3D . . . >> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation = R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a = shared object; recompile with -fPIC >> WITHOUT_BOOT=3D >> # >> WITH_FAST_DEPEND=3D >> WITH_LIBCPLUSPLUS=3D >> WITH_CLANG=3D >> WITH_CLANG_IS_CC=3D >> WITH_CLANG_FULL=3D >> WITH_LLDB=3D >> WITH_CLANG_EXTRAS=3D >> # >> WITHOUT_LIB32=3D >> WITHOUT_GCC=3D >> WITHOUT_GNUCXX=3D >> # >> NO_WERROR=3D >> MALLOC_PRODUCTION=3D >> #CFLAGS+=3D -DELF_VERBOSE >> # >> WITH_DEBUG=3D >> WITH_DEBUG_FILES=3D >> # >> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related = bintutils... 
>> # >> #CROSS_TOOLCHAIN=3D${TO_TYPE}-gcc >> X_COMPILER_TYPE=3Dclang >> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >> .if ${.MAKE.LEVEL} =3D=3D 0 >> XCC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> XCXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> XCPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> .export XCC >> .export XCXX >> .export XCPP >> XAS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >> XAR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >> XLD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >> XNM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >> XOBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >> XOBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >> XRANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >> XSIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >> #NO-SUCH: XSTRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >> XSTRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >> .export XAS >> .export XAR >> .export XLD >> .export XNM >> .export XOBJCOPY >> .export XOBJDUMP >> .export XRANLIB >> .export XSIZE >> .export XSTRINGS >> .endif >> # >> # Host compiler stuff: >> .if ${.MAKE.LEVEL} =3D=3D 0 >> CC=3D/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >> CXX=3D/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >> CPP=3D/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >> .export CC >> .export CXX >> .export CPP >> AS=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as >> AR=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar >> LD=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld >> NM=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm >> OBJCOPY=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy >> OBJDUMP=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump >> RANLIB=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib >> 
SIZE=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size >> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings >> STRINGS=3D/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings >> .export AS >> .export AR >> .export LD >> .export NM >> .export OBJCOPY >> .export OBJDUMP >> .export RANLIB >> .export SIZE >> .export STRINGS >> .endif >=20 > make.conf for during the on-rpi2 port builds now looks like: >=20 >> $ more /etc/make.conf >> WRKDIRPREFIX=3D/usr/obj/portswork >> WITH_DEBUG=3D >> WITH_DEBUG_FILES=3D >> MALLOC_PRODUCTION=3D >> # >> TO_TYPE=3Darmv6 >> TOOLS_TO_TYPE=3Darm-gnueabi >> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >> .if ${.MAKE.LEVEL} =3D=3D 0 >> CC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> CXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> CPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >> .export CC >> .export CXX >> .export CPP >> AS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >> AR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >> LD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >> NM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >> OBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >> OBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >> RANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >> SIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >> STRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >> .export AS >> .export AR >> .export LD >> .export NM >> .export OBJCOPY >> .export OBJDUMP >> .export RANLIB >> .export SIZE >> .export STRINGS >> .endif >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 >=20 >=20 > _______________________________________________ > freebsd-toolchain@freebsd.org mailing list > 
https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain > To unsubscribe, send any mail to = "freebsd-toolchain-unsubscribe@freebsd.org" From owner-freebsd-toolchain@freebsd.org Fri Dec 25 23:42:43 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 15433A51AB1 for ; Fri, 25 Dec 2015 23:42:43 +0000 (UTC) (envelope-from imp@bsdimp.com) Received: from mail-oi0-x22b.google.com (mail-oi0-x22b.google.com [IPv6:2607:f8b0:4003:c06::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CF20019AD for ; Fri, 25 Dec 2015 23:42:42 +0000 (UTC) (envelope-from imp@bsdimp.com) Received: by mail-oi0-x22b.google.com with SMTP id o62so146890852oif.3 for ; Fri, 25 Dec 2015 15:42:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20150623.gappssmtp.com; s=20150623; h=sender:subject:mime-version:content-type:from:in-reply-to:date:cc :message-id:references:to; bh=4r33SfTRwWXipcaagNiw6yQTSTgqBm90O8/kDIAGj6o=; b=nBVh+F8zWGO24NnBMAKqZiM066Tp0v8wdK9rzewJkLz+9cGmKFtkQS+ED2MsDTYD+/ uj4+s6k2JzmPl7HxC7ek4ZBmhLJA3SMZyl2MIUmljULtVkzGK2jiN44jdOLKxcYoEj3d JAlKG96sRon5KCkw1AQq9SlT1KyEgf6Zri69meVwKfbr0rbFNn2gtwZ8a0LIJ+URNRY4 kP7wxvg3TyTQ7zDel/pBQWYBMA2Fqb30NBW7Foc6g6xiArYRxm4pmMJqblxtUsp+QugT o9EGpROgAXNzlvZNvqlRG6tAcege5hF/YW6Lc9RPLAsfa+V8pTNJNiB2wvW/QCWOsTJ3 a7YQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:subject:mime-version:content-type:from :in-reply-to:date:cc:message-id:references:to; bh=4r33SfTRwWXipcaagNiw6yQTSTgqBm90O8/kDIAGj6o=; b=Om2J360Tq6C+Cja9fvTbJqhWcUwW5q1Ya4V/5knq7y5jr5IfEFHx2Q5NpFsxKKT9n3 wFevvpUXLwb3/LXi71EJabo32vLqxfsJJD+4mbw4WRVuCEiU9CE2EnuYuDVkmtFCBro1 
Rwwr6wmmdfIwIMJ6HGlIACq3IB7a0HmY13ylhzBZpXSXaTezES1V/oQ4df2wolPKjM0Q zt4gkYfWnFjG4M0Yj7ILND0IabjPA/o6hwkxN7gkdbjhekx4WBQPWJMbElx0eqTpPwX8 VHsRAqAtvXfSM6WA5JNx13jyMZIkhtsqIpSEZzxjLOWZdUzabtNXYHR1+46Z034AcXxl I85A== X-Gm-Message-State: ALoCoQnTjeJFo0+QMs5C6WCHRHO470HFiCp8gT5rVj8yJUYuu5KrRhlLX7bAGu63Gb72771V0rPFw0m6hE9J9ZzyCeGTs0mDGg== X-Received: by 10.202.73.67 with SMTP id w64mr23002113oia.84.1451086961655; Fri, 25 Dec 2015 15:42:41 -0800 (PST) Received: from ?IPv6:2601:280:4900:3700:7ce5:ac5c:f359:9182? ([2601:280:4900:3700:7ce5:ac5c:f359:9182]) by smtp.gmail.com with ESMTPSA id kp2sm10045777obb.12.2015.12.25.15.42.40 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Fri, 25 Dec 2015 15:42:40 -0800 (PST) Sender: Warner Losh Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error? Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Content-Type: multipart/signed; boundary="Apple-Mail=_918263EA-3FFB-4ABA-8809-80F83519528B"; protocol="application/pgp-signature"; micalg=pgp-sha512 X-Pgp-Agent: GPGMail 2.5.2 From: Warner Losh In-Reply-To: <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net> Date: Fri, 25 Dec 2015 16:42:38 -0700 Cc: freebsd-arm , FreeBSD Toolchain , Ian Lepore , mat@FreeBSD.org, sbruno@FreeBSD.org Message-Id: <8B52074F-FDEF-4119-BB04-630F9BE9E6DB@bsdimp.com> References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net> To: Mark Millard X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Dec 2015 23:42:43 -0000 --Apple-Mail=_918263EA-3FFB-4ABA-8809-80F83519528B Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; 
charset=utf-8 > On Dec 25, 2015, at 3:14 PM, Mark Millard wrote: >=20 > [I'm going to break much of the earlier "original material" text to = tail of the message.] >=20 >> On 2015-Dec-25, at 11:53 AM, Warner Losh wrote: >>=20 >> So what happens if we actually fix the underlying bug? >>=20 >> I see two ways of doing this. In findfp.c, we allocate an array of = FILE * today like: >> g =3D (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * = sizeof(FILE)); >> but that assumes that FILE just has normal pointer alignment = requirements. However, >> due to the mbstate having int64_t alignment requirements, this is = wrong. Maybe we >> need to do something like >> g =3D (struct glue *)malloc(sizeof(*g) + = max(sizeof(int64_t),ALIGNBYTES) + n * sizeof(FILE)); >> which wouldn=E2=80=99t change anything on LP64 systems, but would = result in proper alignment >> for ILP32 systems. We=E2=80=99d have to fix the loop that uses ALIGN = afterwards to use >> roundup. Instead, we=E2=80=99d need to round up to the neared 8-byte = aligned offset (or technically, >> the max of ALIGNBYTES and 8, but that=E2=80=99s always 8 on today=E2=80= =99s systems. If we do this, >> we can make sure that each file is 8-byte aligned or better. We may = need to round up >> sizeof(FILE) to a multiple of 8 as well. I believe that since it has = the 8-byte alignment >> for a member, its size must be a multiple of 8, but I=E2=80=99ve not = chased that belief to ground. >> If not, we may need another decorator (__aligned(8), I think, spelled = with the ugly >> max expression above). That way, the contract we=E2=80=99re making = with the compiler will >> always be true. ALIGN BYTES is 4 on Arm anyway, so that bit is = clearly wrong. >>=20 >> This wouldn=E2=80=99t be an ABI change, since you can only get a = valid FILE * from fopen (and >> friends), plus stdin, stdout, and stderr. 
Those addresses aren=E2=80=99= t hard coded into binaries, >> so even if we have to tweak the last three and deal with some = =E2=80=98fake=E2=80=99 FILE abuse in libc >> (which I don=E2=80=99t think suffers from this issue, btw, given the = alignment requirements that would >> naturally follow from something on the stack), we=E2=80=99d still be = ahead. At least for all CONFORMING >> implementations[*]... >>=20 >> TL;DR: Why not make FILE * always 8-byte aligned? The compiler = options are a band-aid. >>=20 >> Warner >>=20 >> [*] There=E2=80=99s at least one popular package that has a copy of = the FILE structure in one of its >> .h files and uses that to do unnatural optimization things, but even = that=E2=80=99s cool, I think, >> since it never allocates a new one. >>=20 >=20 > The ARM documentation mentions cases of 16 byte alignment = requirements. I've no clue if the clang code generation ever creates = such code. There might be wider requirements possible in arm code as = well. (I'm not an arm expert.) As an example of an implication: "The = malloc() function returns a pointer to a block of at least size bytes = suitably aligned for any use." In other words: aligned to some figure = that is a multiple of *every* alignment requirement that the code = generator can produce, possibly being the least common multiple. >=20 > "-fmax-type-align=3D. . ." is a means of controlling/limiting the = range of potential alignments to no more than a fixed, predefined value. = Above that and the code generation has to work in small size accesses = and build-up/split-up bigger values. Using "-fmax-type-align=3D. . ." = allows defining a figure as part of an ABI that is then not subject to = code generator updates that could increase the maximum alignment figure = and break things: It turns off such new capabilities. Other options need = not work that way to preserve the ABI. That=E2=80=99s true, as far as it goes=E2=80=A6 But I=E2=80=99m not sure = it goes far enough. 
The premise here is that the problem is wide-spread, = when in fact I think it is quite narrow. > But in the most fundamental terms process wise as far as I can tell. . = . >=20 > While the FILE case that occurred is a specific example, every = memory-allocation-like operation is at a potential issue for all such = "allocated" objects where the related code generation requires alignment = to avoid Bus Error (given the SCTLR bit[1] in use). The problem isn=E2=80=99t general. The problem isn=E2=80=99t malloc. = Malloc will generally return the right thing on arm (and if it = doesn=E2=80=99t, then we need to make sure it does). The problem is we get a boatload of FILEs from the system all at once, = and those are misaligned because of a bug in the code. One that=E2=80=99s = fixed, I believe, in https://reviews.freebsd.org/D4708. > How many other places in FreeBSD might sometimes return mis-aligned = pointers for the existing code generation and ABI combination? It isn=E2=80=99t an ABI thing, just a code bug thing. The only reason it = was an issue was due to the optimizing nature of clang. We=E2=80=99ve had to deal with the arm alignment issues for years. I = wager there are very few indeed. The only reason this was brought to = light was better code-gen from clang. > How many other places are subject to breakage when "internal" = structs/unions/fields involved are changed to be of a different size = because the code is not fully auto-adjusting to match the code = generation yet --even if right now "it works"? How fragile will things = be for future work? If there are others, I=E2=80=99ll bet they could be counted on one hand = since very few things do the =E2=80=98slab=E2=80=99 allocator that FILE = does. > What would it take to find out and deal with them all? (I do not have = the background knowledge to span much.) >=20 > My experiment avoided potentially changing parts of the ABI and also = avoided dealing with such a "lots of code to investigate" issue. 
It may = not be the long term 11.0-RELEASE solution. Even if not, it may be = appropriate for various temporary purposes that need to avoid Bus Errors = in the process. For example if Ian has a good reason to use clang 3.7 = instead of gcc 4.2.1. The review above doesn=E2=80=99t change the ABI either. > Other notes: >=20 >> I believe that since it has the 8-byte alignment >> for a member, its size must be a multiple of 8 >=20 > There are some C/C++ language rules about the address of a structure = equalling the address of the first field, uniformity of the offsets, and = the like. But. . . >=20 > The C and C++ languages specify no specific numerical alignment = figures, not even relative to specific sizeof(...) expressions. To use = an old example: a 68010 only needs alignment for >=3D 2 byte things and = even alignment is all that is then required. Some other contexts take a = lot more to meet the specifications. There are some implications of the = modern memory model(s) created to cover concurrency explicitly, such as = avoiding interactions that can happen via, for example, separate objects = (in part) sharing a cache line. (I've only looked at C++ for this, and = only to a degree.) >=20 > The detailed alignment rules are more "implementation defined" than = "predefined by the standard". But the definition is trying to meet = language criteria. It is not a fully independent choice. Many of them are actually defined by a combination of the standard = language definition, as well as the ABI standard. This is why we know = that mbstate_t must be 8 byte aligned. > May be some other standards that FreeBSD is tied to specify more = specifics, such as a N byte integer always aligns to some multiple of N = (a waste on the 68010), including the alignment for union or struct that = it may be a part of tracking. 
But such rules force padding that may or = may not be required to meet the language's more abstract criteria and = such rules may not match the existing/in-use ABI. It is all spelled out in the ARM EABI docs. > So far as I can tell explicitly declared alignments may well be = necessary. If that one "popular package", say, formed an array of FILE = copies then the resultant alignments need not all match the ones = produced by your example code unless the FILE declaration forces the = compiler to match, causing sizeof(FILE) to track as well. FILE need not = be the only such issue. Arrays of FILEs isn=E2=80=99t an issue (except that it encodes the size = of FILE into the app). It=E2=80=99s the specifically quirky way that = libc does it that=E2=80=99s the problem. > My background and reference material are mostly tied the languages = --and so my notes tend to be limited to that much context. Understood. While there may be issues with alignment still, tossing a = big hammer at the problem because they might exist will likely mean they = will persist far longer than fixing them one at a time. When we first = ported to arm, there were maybe half a dozen places that needed fixing. = I doubt there=E2=80=99s more now. Can you try the patch in the above code review w/o the -f switch and let = me know if it works for you? Warner > Original material: >=20 >> On Dec 25, 2015, at 7:24 AM, Mark Millard = wrote: >>=20 >> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 = 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=3D4 has = so far removed the crashes during the toolchain activity: no more = misaligned accesses in libc's _fseeko or elsewhere.] >>=20 >> On 2015-Dec-25, at 12:31 AM, Mark Millard = wrote: >>=20 >>> On 2015-Dec-24, at 10:39 PM, Mark Millard = wrote: >>>=20 >>>> [I do not know if this partial crash analysis related to on-arm = clang-associated activity is good enough and appropriate to submit or = not.] 
>>>>=20 >>>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved = below came from pkg install activity instead of port building. Used = as-is. >>>>=20 >>>> When I just tried my first from-rpi2b builds (ports for a rpi2b), = /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the = following suggests an alignment error for the type of instructions that = memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code = used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to = check SCTLR bit[1] to be directly sure that alignment was being = enforced.) >>>>=20 >>>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar = : >>>>=20 >>>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru = .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o = finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o = l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o = ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o = relocatable.o langprefs.o localename.o log.o printf.o setlocale.o = version.o xsize.o osdep.o intl-compat.o >>>>> Bus error (core dumped) >>>>> *** [libgnuintl.la] Error code 138 >>>>=20 >>>> It failed in _fseeko doing a memset that turned into uses of = "vst1.64 {d16-d17}, [r0]" instructions, for an address in = register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. = =46rom what I read such "VSTn (multiple n-element structures)" that have = .64 require 8 byte alignment. The evidence of the code and register = value follow. >>>>=20 >>>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar = /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gette= xt-tools/intl/ar.core >>>>> . . . >>>>> #0 0x2033adcc in _fseeko (fp=3D0x20651dcc, offset=3D, whence=3D, ltest=3D) at /usr/src/lib/libc/stdio/fseek.c:299 >>>>> 299 memset(&fp->_mbstate, 0, sizeof(mbstate_t)); >>>>> . . . 
>>>>> (gdb) x/24i 0x2033adb0 >>>>> 0x2033adb0 <_fseeko+836>: vmov.i32 q8, #0 ; 0x00000000 >>>>> 0x2033adb4 <_fseeko+840>: movw r1, #65503 ; 0xffdf >>>>> 0x2033adb8 <_fseeko+844>: stm r4, {r0, r7} >>>>> 0x2033adbc <_fseeko+848>: ldrh r0, [r4, #12] >>>>> 0x2033adc0 <_fseeko+852>: and r0, r0, r1 >>>>> 0x2033adc4 <_fseeko+856>: strh r0, [r4, #12] >>>>> 0x2033adc8 <_fseeko+860>: add r0, r4, #216 ; 0xd8 >>>>> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0] >>>>> 0x2033add0 <_fseeko+868>: add r0, r4, #200 ; 0xc8 >>>>> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0] >>>>> 0x2033add8 <_fseeko+876>: add r0, r4, #184 ; 0xb8 >>>>> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0] >>>>> 0x2033ade0 <_fseeko+884>: add r0, r4, #168 ; 0xa8 >>>>> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0] >>>>> 0x2033ade8 <_fseeko+892>: add r0, r4, #152 ; 0x98 >>>>> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0] >>>>> 0x2033adf0 <_fseeko+900>: add r0, r4, #136 ; 0x88 >>>>> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0] >>>>> 0x2033adf8 <_fseeko+908>: add r0, r4, #120 ; 0x78 >>>>> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0] >>>>> 0x2033ae00 <_fseeko+916>: add r0, r4, #104 ; 0x68 >>>>> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0] >>>>> 0x2033ae08 <_fseeko+924>: b 0x2033b070 <_fseeko+1540> >>>>> 0x2033ae0c <_fseeko+928>: cmp r5, #0 ; 0x0 >>>>> (gdb) info all-registers >>>>> r0 0x20651ea4 543497892 >>>>> r1 0xffdf 65503 >>>>> r2 0x0 0 >>>>> r3 0x0 0 >>>>> r4 0x20651dcc 543497676 >>>>> r5 0x0 0 >>>>> r6 0x0 0 >>>>> r7 0x0 0 >>>>> r8 0x20359df4 540384756 >>>>> r9 0x0 0 >>>>> r10 0x0 0 >>>>> r11 0xbfbfb948 -1077954232 >>>>> r12 0x2037b208 540520968 >>>>> sp 0xbfbfb898 -1077954408 >>>>> lr 0x2035a004 540385284 >>>>> pc 0x2033adcc 540257740 >>>>> f0 0 (raw 0x000000000000000000000000) >>>>> f1 0 (raw 0x000000000000000000000000) >>>>> f2 0 (raw 0x000000000000000000000000) >>>>> f3 0 (raw 0x000000000000000000000000) >>>>> f4 0 (raw 0x000000000000000000000000) >>>>> f5 0 (raw 
0x000000000000000000000000) >>>>> f6 0 (raw 0x000000000000000000000000) >>>>> f7 0 (raw 0x000000000000000000000000) >>>>> fps 0x0 0 >>>>> cpsr 0x60000010 1610612752 >>>>=20 >>>> The syntax in use for vst1.64 instructions does not explicitly have = the alignment notation. Presuming that the decoding is correct then from = what I read the following applies: >>>>=20 >>>>> Home > NEON and VFP Programming > NEON load and store element and = structure instructions > Alignment restrictions in load and store, = element and structure instructions >>>>>=20 >>>>> . . . When the alignment is not specified in the instruction, the = alignment restriction is controlled by the A bit (SCTLR bit[1]): >>>>> =E2=80=A2 if the A bit is 0, there are no alignment = restrictions (except for strongly ordered or device memory, where = accesses must be element aligned or the result is unpredictable) >>>>> =E2=80=A2 if the A bit is 1, accesses must be element = aligned. >>>>> If an address is not correctly aligned, an alignment fault occurs. >>>>=20 >>>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus = error would have the context to happen because of the mis-alignment. 
>>>>=20 >>>> The following shows the make.conf context that explains how = /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked: >>>>=20 >>>>> # more /etc/make.conf >>>>> WRKDIRPREFIX=3D/usr/obj/portswork >>>>> WITH_DEBUG=3D >>>>> WITH_DEBUG_FILES=3D >>>>> MALLOC_PRODUCTION=3D >>>>> # >>>>> TO_TYPE=3Darmv6 >>>>> TOOLS_TO_TYPE=3Darm-gnueabi >>>>> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >>>>> .if ${.MAKE.LEVEL} =3D=3D 0 >>>>> CC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>>> CXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>>> CPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>>> .export CC >>>>> .export CXX >>>>> .export CPP >>>>> AS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >>>>> AR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >>>>> LD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >>>>> NM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >>>>> OBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >>>>> OBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >>>>> RANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >>>>> SIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >>>>> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >>>>> STRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >>>>> .export AS >>>>> .export AR >>>>> .export LD >>>>> .export NM >>>>> .export OBJCOPY >>>>> .export OBJDUMP >>>>> .export RANLIB >>>>> .export SIZE >>>>> .export STRINGS >>>>> .endif >>>>=20 >>>>=20 >>>> Other context: >>>>=20 >>>>> # freebsd-version -ku; uname -aKU >>>>> 11.0-CURRENT >>>>> 11.0-CURRENT >>>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue = Dec 22 22:02:21 PST 2015 = root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm = 1100091 1100091 >>>>=20 >>>>=20 >>>>=20 >>>> I will note that world and kernel are my own build of -r292413 = (earlier experiment) --a build made from an 
amd64 host context and put = in place via DESTDIR=3D. My expectation would be that the amd64 context = would not be likely to have similar alignment restrictions involved in = its ar activity (or other activity). That would explain how I got this = far using such a clang 3.7 related toolchain for targeting an rpi2 = before finding such a problem. >>>=20 >>>=20 >>> I realized re-reading the all above that it seems to suggest that = the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar = but that was not my intent. >>>=20 >>> libc.so.7 is from my buildworld, including the fseeko = implementation: >>>=20 >>> Reading symbols from /lib/libc.so.7...Reading symbols from = /usr/lib/debug//lib/libc.so.7.debug...done. >>> done. >>> Loaded symbols for /lib/libc.so.7 >>>=20 >>>=20 >>> head/sys/sys/_types.h has: >>>=20 >>> /* >>> * mbstate_t is an opaque object to keep conversion state during = multibyte >>> * stream conversions. >>> */ >>> typedef union { >>> char __mbstate8[128]; >>> __int64_t _mbstateL; /* for alignment */ >>> } __mbstate_t; >>>=20 >>> suggesting an implicit alignment of the union to whatever the = implementation defines for __int64_t --which need not be 8 byte = alignment (in the abstract, general case). But 8 byte alignment is a = possibility as well (in the abstract). >>>=20 >>> But printing *fp in gdb for the fp argument to _fseeko reports the = same not-8-byte aligned address for __mbstate8 that was in r0: >>>=20 >>>> (gdb) bt >>>> #0 0x2033adcc in _fseeko (fp=3D0x20651dcc, offset=3D, whence=3D, ltest=3D) at /usr/src/lib/libc/stdio/fseek.c:299 >>>> #1 0x2033b108 in fseeko (fp=3D0x20651dcc, offset=3D18571438587904, = whence=3D0) at /usr/src/lib/libc/stdio/fseek.c:82 >>>> #2 0x00016138 in ?? 
() >>>> (gdb) print fp >>>> $2 =3D (FILE *) 0x20651dcc >>>> (gdb) print *fp >>>> $3 =3D {_p =3D 0x2069a240 "", _r =3D 0, _w =3D 0, _flags =3D 5264, = _file =3D 36, _bf =3D {_base =3D 0x2069a240 "", _size =3D 32768}, = _lbfsize =3D 0, _cookie =3D 0x20651dcc, _close =3D 0x20359dfc = <__sclose>, >>>> _read =3D 0x20359de4 <__sread>, _seek =3D 0x20359df4 <__sseek>, = _write =3D 0x20359dec <__swrite>, _ub =3D {_base =3D 0x0, _size =3D 0}, = _up =3D 0x0, _ur =3D 0, _ubuf =3D 0x20651e0c "", _nbuf =3D 0x20651e0f = "", _lb =3D { >>>> _base =3D 0x0, _size =3D 0}, _blksize =3D 32768, _offset =3D 0, = _fl_mutex =3D 0x0, _fl_owner =3D 0x0, _fl_count =3D 0, _orientation =3D = 0, _mbstate =3D {__mbstate8 =3D 0x20651e34 "", _mbstateL =3D 0}, _flags2 = =3D 0} >>>=20 >>> The overall FILE struct containing the _mbstate field is also not = 8-byte aligned. But the offset from the start of the FILE struct to = __mbstate8 is a multiple of 8 bytes. >>>=20 >>> It is my interpretation that there is nothing here to justify the = memset implementation combination: >>>=20 >>> SCTLR bit[1]=3D=3D1 >>>=20 >>> mixed with >>>=20 >>> vst1.64 instructions >>>=20 >>> I.e.: one or both needs to change unless some way for forcing 8-byte = alignment is introduced. >>>=20 >>> I have not managed to track down anything that would indicate = FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required = by the design to be constant (once initialized). >>=20 >>=20 >> I have (so far) removed the build tool crashes based on adding = -fmax-type-align=3D4 to avoid the misaligned accesses. Details follow. 
>>=20 >> src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now = looks like: >>=20 >>> # more ~/src.configs/src.conf.rpi2-clang.amd64-host >>> TO_TYPE=3Darmv6 >>> TOOLS_TO_TYPE=3Darm-gnueabi >>> FROM_TYPE=3Damd64 >>> TOOLS_FROM_TYPE=3Dx86_64 >>> VERSION_CONTEXT=3D11.0 >>> # >>> KERNCONF=3DRPI2-NODBG >>> TARGET=3Darm >>> .if ${.MAKE.LEVEL} =3D=3D 0 >>> TARGET_ARCH=3D${TO_TYPE} >>> .export TARGET_ARCH >>> .endif >>> # >>> WITHOUT_CROSS_COMPILER=3D >>> # >>> # For WITH_BOOT=3D . . . >>> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation = R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a = shared object; recompile with -fPIC >>> WITHOUT_BOOT=3D >>> # >>> WITH_FAST_DEPEND=3D >>> WITH_LIBCPLUSPLUS=3D >>> WITH_CLANG=3D >>> WITH_CLANG_IS_CC=3D >>> WITH_CLANG_FULL=3D >>> WITH_LLDB=3D >>> WITH_CLANG_EXTRAS=3D >>> # >>> WITHOUT_LIB32=3D >>> WITHOUT_GCC=3D >>> WITHOUT_GNUCXX=3D >>> # >>> NO_WERROR=3D >>> MALLOC_PRODUCTION=3D >>> #CFLAGS+=3D -DELF_VERBOSE >>> # >>> WITH_DEBUG=3D >>> WITH_DEBUG_FILES=3D >>> # >>> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related = bintutils... 
>>> # >>> #CROSS_TOOLCHAIN=3D${TO_TYPE}-gcc >>> X_COMPILER_TYPE=3Dclang >>> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >>> .if ${.MAKE.LEVEL} =3D=3D 0 >>> XCC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>> XCXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>> XCPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>> .export XCC >>> .export XCXX >>> .export XCPP >>> XAS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >>> XAR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >>> XLD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >>> XNM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >>> XOBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >>> XOBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >>> XRANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >>> XSIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >>> #NO-SUCH: XSTRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >>> XSTRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >>> .export XAS >>> .export XAR >>> .export XLD >>> .export XNM >>> .export XOBJCOPY >>> .export XOBJDUMP >>> .export XRANLIB >>> .export XSIZE >>> .export XSTRINGS >>> .endif >>> # >>> # Host compiler stuff: >>> .if ${.MAKE.LEVEL} =3D=3D 0 >>> CC=3D/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >>> CXX=3D/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >>> CPP=3D/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >>> .export CC >>> .export CXX >>> .export CPP >>> AS=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as >>> AR=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar >>> LD=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld >>> NM=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm >>> OBJCOPY=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy >>> OBJDUMP=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump >>> 
RANLIB=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib >>> SIZE=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size >>> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings >>> STRINGS=3D/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings >>> .export AS >>> .export AR >>> .export LD >>> .export NM >>> .export OBJCOPY >>> .export OBJDUMP >>> .export RANLIB >>> .export SIZE >>> .export STRINGS >>> .endif >>=20 >> make.conf for during the on-rpi2 port builds now looks like: >>=20 >>> $ more /etc/make.conf >>> WRKDIRPREFIX=3D/usr/obj/portswork >>> WITH_DEBUG=3D >>> WITH_DEBUG_FILES=3D >>> MALLOC_PRODUCTION=3D >>> # >>> TO_TYPE=3Darmv6 >>> TOOLS_TO_TYPE=3Darm-gnueabi >>> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >>> .if ${.MAKE.LEVEL} =3D=3D 0 >>> CC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>> CXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>> CPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>> .export CC >>> .export CXX >>> .export CPP >>> AS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >>> AR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >>> LD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >>> NM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >>> OBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >>> OBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >>> RANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >>> SIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >>> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >>> STRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >>> .export AS >>> .export AR >>> .export LD >>> .export NM >>> .export OBJCOPY >>> .export OBJDUMP >>> .export RANLIB >>> .export SIZE >>> .export STRINGS >>> .endif >>=20 >>=20 >>=20 >> =3D=3D=3D >> Mark Millard >> markmi at dsl-only.net >>=20 >>=20 >>=20 >> 
_______________________________________________ >> freebsd-toolchain@freebsd.org mailing list >> https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain >> To unsubscribe, send any mail to = "freebsd-toolchain-unsubscribe@freebsd.org" --Apple-Mail=_918263EA-3FFB-4ABA-8809-80F83519528B Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- Comment: GPGTools - https://gpgtools.org iQIcBAEBCgAGBQJWfdRuAAoJEGwc0Sh9sBEAg6EP/11K319mkZa0LbiV0g4Zbo5k RV44oXg/ucQsROpqDqp0DVzcMkJgGp9TjR6B0J9spviCviWJN5s6Ut5AKF9niV7g IGodpS2yaFRa7sOrv9o3ZffOOVajzOaXpkoeyeesv8+wS78B1wrpVGoKT35CC/mc SVbktqz5HpFAuPKXzCeV7ywAEpzH/NPNZFWrfT0Hi7P2UTS4KuRUemdw8adF5EDr pgARcaxIRpmUDoyU7TaRRxvrMknoqvo5vUcU5w5rLEiMrbH6pQAqdyJuDcUfa0aC 1cP/v+hjbqxFMNxTEFcQFqUgnUjrECKmOsijHJ4OCanNWK0Odiu8h4ORiyD59EyX ayTvbXMDqjiSVG449j775TkHARm8/lVJ2G4BfC4ig8AjflBDQTJpTLPxmKWS81xe Yz2uQfc6sAUKTdvG63/PwIAp1dK9ZaG2hiuSZDeGPKCVbRjFPfux3OTVj/+fUF7u IdrpturezDOhn2kfRea3IpTGeSxOTgKC1VtAx4HSUZ4GIgAiRj4xZEDl6Ww9qzaW 2GOEThC4k1pK+WL6NCoZb38pRiWONojGXErYtgs+wg+PdlXEY5heDX/6RduIs2UW 8lq+YeRGwkSHtvKt/4FnmcCSFWsEjlZ1+0uQu3VgoRZKHjnbxpASxGs7pgS7nGrU sw2TR+Xv0rOMjhVdj8v5 =qpY9 -----END PGP SIGNATURE----- --Apple-Mail=_918263EA-3FFB-4ABA-8809-80F83519528B-- From owner-freebsd-toolchain@freebsd.org Sat Dec 26 01:21:15 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 17D82A51A0F for ; Sat, 26 Dec 2015 01:21:15 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-211-151.reflexion.net [208.70.211.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 
C5E561F9B for ; Sat, 26 Dec 2015 01:21:14 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 18606 invoked from network); 26 Dec 2015 01:21:12 -0000 Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2) by 0 (rfx-qmail) with SMTP; 26 Dec 2015 01:21:12 -0000 Received: by mail-cs-02.app.dca.reflexion.local (Reflexion email security v7.80.0) with SMTP; Fri, 25 Dec 2015 20:21:11 -0500 (EST) Received: (qmail 6825 invoked from network); 26 Dec 2015 01:21:11 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with SMTP; 26 Dec 2015 01:21:11 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-76-115-7-162.hsd1.or.comcast.net [76.115.7.162]) by iron2.pdx.net (Postfix) with ESMTPSA id 3BDCE1C43E5; Fri, 25 Dec 2015 17:21:07 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error? 
From: Mark Millard In-Reply-To: <8B52074F-FDEF-4119-BB04-630F9BE9E6DB@bsdimp.com> Date: Fri, 25 Dec 2015 17:21:11 -0800 Cc: freebsd-arm , FreeBSD Toolchain , Ian Lepore , mat@FreeBSD.org, sbruno@FreeBSD.org Content-Transfer-Encoding: quoted-printable Message-Id: References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net> <8B52074F-FDEF-4119-BB04-630F9BE9E6DB@bsdimp.com> To: Warner Losh X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Dec 2015 01:21:15 -0000 > On 2015-Dec-25, at 3:42 PM, Warner Losh wrote: >=20 >=20 >> On Dec 25, 2015, at 3:14 PM, Mark Millard = wrote: >>=20 >> [I'm going to break much of the earlier "original material" text to = tail of the message.] >>=20 >>> On 2015-Dec-25, at 11:53 AM, Warner Losh wrote: >>>=20 >>> So what happens if we actually fix the underlying bug? >>>=20 >>> I see two ways of doing this. In findfp.c, we allocate an array of = FILE * today like: >>> g =3D (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * = sizeof(FILE)); >>> but that assumes that FILE just has normal pointer alignment = requirements. However, >>> due to the mbstate having int64_t alignment requirements, this is = wrong. Maybe we >>> need to do something like >>> g =3D (struct glue *)malloc(sizeof(*g) + = max(sizeof(int64_t),ALIGNBYTES) + n * sizeof(FILE)); >>> which wouldn=E2=80=99t change anything on LP64 systems, but would = result in proper alignment >>> for ILP32 systems. We=E2=80=99d have to fix the loop that uses ALIGN = afterwards to use >>> roundup. 
Instead, we=E2=80=99d need to round up to the nearest 8-byte = aligned offset (or technically, >>> the max of ALIGNBYTES and 8, but that=E2=80=99s always 8 on = today=E2=80=99s systems). If we do this, >>> we can make sure that each file is 8-byte aligned or better. We may = need to round up >>> sizeof(FILE) to a multiple of 8 as well. I believe that since it has = the 8-byte alignment >>> for a member, its size must be a multiple of 8, but I=E2=80=99ve not = chased that belief to ground. >>> If not, we may need another decorator (__aligned(8), I think, = spelled with the ugly >>> max expression above). That way, the contract we=E2=80=99re making = with the compiler will >>> always be true. ALIGNBYTES is 4 on Arm anyway, so that bit is = clearly wrong. >>>=20 >>> This wouldn=E2=80=99t be an ABI change, since you can only get a = valid FILE * from fopen (and >>> friends), plus stdin, stdout, and stderr. Those addresses aren=E2=80=99= t hard coded into binaries, >>> so even if we have to tweak the last three and deal with some = =E2=80=98fake=E2=80=99 FILE abuse in libc >>> (which I don=E2=80=99t think suffers from this issue, btw, given the = alignment requirements that would >>> naturally follow from something on the stack), we=E2=80=99d still be = ahead. At least for all CONFORMING >>> implementations[*]... >>>=20 >>> TL;DR: Why not make FILE * always 8-byte aligned? The compiler = options are a band-aid. >>>=20 >>> Warner >>>=20 >>> [*] There=E2=80=99s at least one popular package that has a copy of = the FILE structure in one of its >>> .h files and uses that to do unnatural optimization things, but even = that=E2=80=99s cool, I think, >>> since it never allocates a new one. >>>=20 >>=20 >> The ARM documentation mentions cases of 16 byte alignment = requirements. I've no clue if the clang code generation ever creates = such code. There might be wider requirements possible in arm code as = well. (I'm not an arm expert.) 
As an example of an implication: "The = malloc() function returns a pointer to a block of at least size bytes = suitably aligned for any use." In other words: aligned to some figure = that is a multiple of *every* alignment requirement that the code = generator can produce, possibly being the least common multiple. >>=20 >> "-fmax-type-align=3D. . ." is a means of controlling/limiting the = range of potential alignments to no more than a fixed, predefined value. = Above that and the code generation has to work in small size accesses = and build-up/split-up bigger values. Using "-fmax-type-align=3D. . ." = allows defining a figure as part of an ABI that is then not subject to = code generator updates that could increase the maximum alignment figure = and break things: It turns off such new capabilities. Other options need = not work that way to preserve the ABI. >=20 > That=E2=80=99s true, as far as it goes=E2=80=A6 But I=E2=80=99m not = sure it goes far enough. The premise here is that the problem is = wide-spread, when in fact I think it is quite narrow. >=20 >> But in the most fundamental terms process wise as far as I can tell. = . . >>=20 >> While the FILE case that occurred is a specific example, every = memory-allocation-like operation is at a potential issue for all such = "allocated" objects where the related code generation requires alignment = to avoid Bus Error (given the SCTLR bit[1] in use). >=20 > The problem isn=E2=80=99t general. The problem isn=E2=80=99t malloc. = Malloc will generally return the right thing on arm (and if it = doesn=E2=80=99t, > then we need to make sure it does). >=20 > The problem is we get a boatload of FILEs from the system all at once, = and those are misaligned because of a bug in the code. One that=E2=80=99s = fixed, I believe, in https://reviews.freebsd.org/D4708. >=20 >=20 >> How many other places in FreeBSD might sometimes return mis-aligned = pointers for the existing code generation and ABI combination? 
>
> It isn’t an ABI thing, just a code bug thing. The only reason it was an issue was due to the optimizing nature of clang.
>
> We’ve had to deal with the arm alignment issues for years. I wager there are very few indeed. The only reason this was brought to light was better code-gen from clang.
>
>> How many other places are subject to breakage when "internal" structs/unions/fields involved are changed to be of a different size because the code is not fully auto-adjusting to match the code generation yet --even if right now "it works"? How fragile will things be for future work?
>
> If there are others, I’ll bet they could be counted on one hand since very few things do the ‘slab’ allocator that FILE does.
>
>> What would it take to find out and deal with them all? (I do not have the background knowledge to span much.)
>>
>> My experiment avoided potentially changing parts of the ABI and also avoided dealing with such a "lots of code to investigate" issue. It may not be the long term 11.0-RELEASE solution. Even if not, it may be appropriate for various temporary purposes that need to avoid Bus Errors in the process. For example if Ian has a good reason to use clang 3.7 instead of gcc 4.2.1.
>
> The review above doesn’t change the ABI either.
>
>> Other notes:
>>
>>> I believe that since it has the 8-byte alignment
>>> for a member, its size must be a multiple of 8
>>
>> There are some C/C++ language rules about the address of a structure equalling the address of the first field, uniformity of the offsets, and the like. But. . .
>>
>> The C and C++ languages specify no specific numerical alignment figures, not even relative to specific sizeof(...) expressions. To use an old example: a 68010 only needs alignment for >= 2 byte things and even alignment is all that is then required. Some other contexts take a lot more to meet the specifications. There are some implications of the modern memory model(s) created to cover concurrency explicitly, such as avoiding interactions that can happen via, for example, separate objects (in part) sharing a cache line. (I've only looked at C++ for this, and only to a degree.)
>>
>> The detailed alignment rules are more "implementation defined" than "predefined by the standard". But the definition is trying to meet language criteria. It is not a fully independent choice.
>
> Many of them are actually defined by a combination of the standard language definition, as well as the ABI standard. This is why we know that mbstate_t must be 8 byte aligned.
>
>> Maybe some other standards that FreeBSD is tied to specify more specifics, such as a N byte integer always aligns to some multiple of N (a waste on the 68010), including the alignment for union or struct that it may be a part of tracking. But such rules force padding that may or may not be required to meet the language's more abstract criteria and such rules may not match the existing/in-use ABI.
>
> It is all spelled out in the ARM EABI docs.
>
>> So far as I can tell explicitly declared alignments may well be necessary. If that one "popular package", say, formed an array of FILE copies then the resultant alignments need not all match the ones produced by your example code unless the FILE declaration forces the compiler to match, causing sizeof(FILE) to track as well. FILE need not be the only such issue.
>
> Arrays of FILEs aren’t an issue (except that it encodes the size of FILE into the app). It’s the specifically quirky way that libc does it that’s the problem.
>
>> My background and reference material are mostly tied to the languages --and so my notes tend to be limited to that much context.
>
> Understood. While there may be issues with alignment still, tossing a big hammer at the problem because they might exist will likely mean they will persist far longer than fixing them one at a time. When we first ported to arm, there were maybe half a dozen places that needed fixing. I doubt there’s more now.
>
> Can you try the patch in the above code review w/o the -f switch and let me know if it works for you?
>
> Warner

buildworld/buildkernel has been started on amd64 for a rpi2 target. That and install kernel/world and starting up a port rebuild on the rpi2 and waiting for it means it will be a few hours even if I start the next thing just as each prior thing finishes. I may give up and go to sleep first.

As for presumptions: I'll take your word on expected status of things. I've no clue. But absent even the hearsay status information at the time I did not presume that what was in front of me was all there is to worry about --nor did I try to go figure it all out on my own. I took a path to cover both possibilities for local-only vs. more-wide-spread (so long as that path did not force a split-up of some larger form of atomic action).

In my view "-mno-unaligned-access" is an even bigger hammer than I used. I find no clang statement about what its ABI consequences would be, unlike for what I did: What mix of more padding for alignment vs. more but smaller accesses? But as I remember I've seen "-mno-unaligned-access" in use in ports and the like so its consequences may be familiar material for some folks.

Absent any questions about ABI consequences "-mno-unaligned-access" does well mark the expected SCTLR bit[1] status, far better than what I did. Again: I was covering my ignorance while making any significant investigation/debugging as unlikely as I could.
> Original material:
>
>> On Dec 25, 2015, at 7:24 AM, Mark Millard wrote:
>>
>> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=4 has so far removed the crashes during the toolchain activity: no more misaligned accesses in libc's _fseeko or elsewhere.]
>>
>> On 2015-Dec-25, at 12:31 AM, Mark Millard wrote:
>>
>>> On 2015-Dec-24, at 10:39 PM, Mark Millard wrote:
>>>
>>>> [I do not know if this partial crash analysis related to on-arm clang-associated activity is good enough and appropriate to submit or not.]
>>>>
>>>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below came from pkg install activity instead of port building. Used as-is.
>>>>
>>>> When I just tried my first from-rpi2b builds (ports for a rpi2b), /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the following suggests an alignment error for the type of instructions that memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to check SCTLR bit[1] to be directly sure that alignment was being enforced.)
>>>>
>>>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar :
>>>>
>>>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
>>>>> Bus error (core dumped)
>>>>> *** [libgnuintl.la] Error code 138
>>>>
>>>> It failed in _fseeko doing a memset that turned into uses of "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. From what I read such "VSTn (multiple n-element structures)" that have .64 require 8 byte alignment. The evidence of the code and register value follow.
>>>>
>>>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>>>>> . . .
>>>>> #0 0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
>>>>> 299 memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>>>>> . . .
>>>>> (gdb) x/24i 0x2033adb0
>>>>> 0x2033adb0 <_fseeko+836>: vmov.i32 q8, #0 ; 0x00000000
>>>>> 0x2033adb4 <_fseeko+840>: movw r1, #65503 ; 0xffdf
>>>>> 0x2033adb8 <_fseeko+844>: stm r4, {r0, r7}
>>>>> 0x2033adbc <_fseeko+848>: ldrh r0, [r4, #12]
>>>>> 0x2033adc0 <_fseeko+852>: and r0, r0, r1
>>>>> 0x2033adc4 <_fseeko+856>: strh r0, [r4, #12]
>>>>> 0x2033adc8 <_fseeko+860>: add r0, r4, #216 ; 0xd8
>>>>> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033add0 <_fseeko+868>: add r0, r4, #200 ; 0xc8
>>>>> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033add8 <_fseeko+876>: add r0, r4, #184 ; 0xb8
>>>>> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ade0 <_fseeko+884>: add r0, r4, #168 ; 0xa8
>>>>> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ade8 <_fseeko+892>: add r0, r4, #152 ; 0x98
>>>>> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033adf0 <_fseeko+900>: add r0, r4, #136 ; 0x88
>>>>> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033adf8 <_fseeko+908>: add r0, r4, #120 ; 0x78
>>>>> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ae00 <_fseeko+916>: add r0, r4, #104 ; 0x68
>>>>> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ae08 <_fseeko+924>: b 0x2033b070 <_fseeko+1540>
>>>>> 0x2033ae0c <_fseeko+928>: cmp r5, #0 ; 0x0
>>>>> (gdb) info all-registers
>>>>> r0 0x20651ea4 543497892
>>>>> r1 0xffdf 65503
>>>>> r2 0x0 0
>>>>> r3 0x0 0
>>>>> r4 0x20651dcc 543497676
>>>>> r5 0x0 0
>>>>> r6 0x0 0
>>>>> r7 0x0 0
>>>>> r8 0x20359df4 540384756
>>>>> r9 0x0 0
>>>>> r10 0x0 0
>>>>> r11 0xbfbfb948 -1077954232
>>>>> r12 0x2037b208 540520968
>>>>> sp 0xbfbfb898 -1077954408
>>>>> lr 0x2035a004 540385284
>>>>> pc 0x2033adcc 540257740
>>>>> f0 0 (raw 0x000000000000000000000000)
>>>>> f1 0 (raw 0x000000000000000000000000)
>>>>> f2 0 (raw 0x000000000000000000000000)
>>>>> f3 0 (raw 0x000000000000000000000000)
>>>>> f4 0 (raw 0x000000000000000000000000)
>>>>> f5 0 (raw 0x000000000000000000000000)
>>>>> f6 0 (raw 0x000000000000000000000000)
>>>>> f7 0 (raw 0x000000000000000000000000)
>>>>> fps 0x0 0
>>>>> cpsr 0x60000010 1610612752
>>>>
>>>> The syntax in use for vst1.64 instructions does not explicitly have the alignment notation. Presuming that the decoding is correct then from what I read the following applies:
>>>>
>>>>> Home > NEON and VFP Programming > NEON load and store element and structure instructions > Alignment restrictions in load and store, element and structure instructions
>>>>>
>>>>> . . . When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>>>> • if the A bit is 0, there are no alignment restrictions (except for strongly ordered or device memory, where accesses must be element aligned or the result is unpredictable)
>>>>> • if the A bit is 1, accesses must be element aligned.
>>>>> If an address is not correctly aligned, an alignment fault occurs.
>>>>
>>>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would have the context to happen because of the mis-alignment.
>>>>
>>>> The following shows the make.conf context that explains how /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>>>>
>>>>> # more /etc/make.conf
>>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>>> WITH_DEBUG=
>>>>> WITH_DEBUG_FILES=
>>>>> MALLOC_PRODUCTION=
>>>>> #
>>>>> TO_TYPE=armv6
>>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>>> .if ${.MAKE.LEVEL} == 0
>>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>> .export CC
>>>>> .export CXX
>>>>> .export CPP
>>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>>> .export AS
>>>>> .export AR
>>>>> .export LD
>>>>> .export NM
>>>>> .export OBJCOPY
>>>>> .export OBJDUMP
>>>>> .export RANLIB
>>>>> .export SIZE
>>>>> .export STRINGS
>>>>> .endif
>>>>
>>>>
>>>> Other context:
>>>>
>>>>> # freebsd-version -ku; uname -aKU
>>>>> 11.0-CURRENT
>>>>> 11.0-CURRENT
>>>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 22:02:21 PST 2015 root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm 1100091 1100091
>>>>
>>>>
>>>>
>>>> I will note that world and kernel are my own build of -r292413 (earlier experiment) --a build made from an amd64 host context and put in place via DESTDIR=. My expectation would be that the amd64 context would not be likely to have similar alignment restrictions involved in its ar activity (or other activity). That would explain how I got this far using such a clang 3.7 related toolchain for targeting an rpi2 before finding such a problem.
>>>
>>>
>>> I realized re-reading all of the above that it seems to suggest that the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but that was not my intent.
>>>
>>> libc.so.7 is from my buildworld, including the fseeko implementation:
>>>
>>> Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
>>> done.
>>> Loaded symbols for /lib/libc.so.7
>>>
>>>
>>> head/sys/sys/_types.h has:
>>>
>>> /*
>>>  * mbstate_t is an opaque object to keep conversion state during multibyte
>>>  * stream conversions.
>>>  */
>>> typedef union {
>>>         char __mbstate8[128];
>>>         __int64_t _mbstateL; /* for alignment */
>>> } __mbstate_t;
>>>
>>> suggesting an implicit alignment of the union to whatever the implementation defines for __int64_t --which need not be 8 byte alignment (in the abstract, general case). But 8 byte alignment is a possibility as well (in the abstract).
>>>
>>> But printing *fp in gdb for the fp argument to _fseeko reports the same not-8-byte aligned address for __mbstate8 that was in r0:
>>>
>>>> (gdb) bt
>>>> #0 0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
>>>> #1 0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
>>>> #2 0x00016138 in ?? ()
>>>> (gdb) print fp
>>>> $2 = (FILE *) 0x20651dcc
>>>> (gdb) print *fp
>>>> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, _close = 0x20359dfc <__sclose>,
>>>> _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>>>> _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}
>>>
>>> The overall FILE struct containing the _mbstate field is also not 8-byte aligned. But the offset from the start of the FILE struct to __mbstate8 is a multiple of 8 bytes.
>>>
>>> It is my interpretation that there is nothing here to justify the memset implementation combination:
>>>
>>> SCTLR bit[1]==1
>>>
>>> mixed with
>>>
>>> vst1.64 instructions
>>>
>>> I.e.: one or both needs to change unless some way for forcing 8-byte alignment is introduced.
>>>
>>> I have not managed to track down anything that would indicate FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required by the design to be constant (once initialized).
>>
>>
>> I have (so far) removed the build tool crashes based on adding -fmax-type-align=4 to avoid the misaligned accesses. Details follow.
>>
>> src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now looks like:
>>
>>> # more ~/src.configs/src.conf.rpi2-clang.amd64-host
>>> TO_TYPE=armv6
>>> TOOLS_TO_TYPE=arm-gnueabi
>>> FROM_TYPE=amd64
>>> TOOLS_FROM_TYPE=x86_64
>>> VERSION_CONTEXT=11.0
>>> #
>>> KERNCONF=RPI2-NODBG
>>> TARGET=arm
>>> .if ${.MAKE.LEVEL} == 0
>>> TARGET_ARCH=${TO_TYPE}
>>> .export TARGET_ARCH
>>> .endif
>>> #
>>> WITHOUT_CROSS_COMPILER=
>>> #
>>> # For WITH_BOOT= . . .
>>> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
>>> WITHOUT_BOOT=
>>> #
>>> WITH_FAST_DEPEND=
>>> WITH_LIBCPLUSPLUS=
>>> WITH_CLANG=
>>> WITH_CLANG_IS_CC=
>>> WITH_CLANG_FULL=
>>> WITH_LLDB=
>>> WITH_CLANG_EXTRAS=
>>> #
>>> WITHOUT_LIB32=
>>> WITHOUT_GCC=
>>> WITHOUT_GNUCXX=
>>> #
>>> NO_WERROR=
>>> MALLOC_PRODUCTION=
>>> #CFLAGS+= -DELF_VERBOSE
>>> #
>>> WITH_DEBUG=
>>> WITH_DEBUG_FILES=
>>> #
>>> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related bintutils...
>>> #
>>> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
>>> X_COMPILER_TYPE=clang
>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>> .if ${.MAKE.LEVEL} == 0
>>> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> .export XCC
>>> .export XCXX
>>> .export XCPP
>>> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>> .export XAS
>>> .export XAR
>>> .export XLD
>>> .export XNM
>>> .export XOBJCOPY
>>> .export XOBJDUMP
>>> .export XRANLIB
>>> .export XSIZE
>>> .export XSTRINGS
>>> .endif
>>> #
>>> # Host compiler stuff:
>>> .if ${.MAKE.LEVEL} == 0
>>> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> .export CC
>>> .export CXX
>>> .export CPP
>>> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
>>> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
>>> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
>>> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
>>> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
>>> OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
>>> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
>>> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
>>> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
>>> .export AS
>>> .export AR
>>> .export LD
>>> .export NM
>>> .export OBJCOPY
>>> .export OBJDUMP
>>> .export RANLIB
>>> .export SIZE
>>> .export STRINGS
>>> .endif
>>
>> make.conf for during the on-rpi2 port builds now looks like:
>>
>>> $ more /etc/make.conf
>>> WRKDIRPREFIX=/usr/obj/portswork
>>> WITH_DEBUG=
>>> WITH_DEBUG_FILES=
>>> MALLOC_PRODUCTION=
>>> #
>>> TO_TYPE=armv6
>>> TOOLS_TO_TYPE=arm-gnueabi
>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>> .if ${.MAKE.LEVEL} == 0
>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> .export CC
>>> .export CXX
>>> .export CPP
>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>> .export AS
>>> .export AR
>>> .export LD
>>> .export NM
>>> .export OBJCOPY
>>> .export OBJDUMP
>>> .export RANLIB
>>> .export SIZE
>>> .export STRINGS
>>> .endif
>>
>>
>>
>> ===
>> Mark Millard
>> markmi at dsl-only.net

From owner-freebsd-toolchain@freebsd.org Sat Dec 26 04:32:16 2015
Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error?
From: Mark Millard
Date: Fri, 25 Dec 2015 20:32:12 -0800
To: Warner Losh
Cc: freebsd-arm, FreeBSD Toolchain, Ian Lepore, mat@FreeBSD.org, sbruno@FreeBSD.org
References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net> <8B52074F-FDEF-4119-BB04-630F9BE9E6DB@bsdimp.com>

[I am again breaking off another section of older material.]

Mixed news I'm afraid.

The specific couple of ports that I attempted did build, the same ones that originally got the Bus Error in ar using (indirectly) _fseeko and memset that I reported. So I expect that you fixed one error.

But when I tried to buildworld, clang++ 3.7 processing usr/src/lib/clang/libllvmtablegen/ materials quickly got a Bus Error at nearly the same type of instruction (it has a "!" below that the earlier one did not), but with r4 holding the misaligned address this time:

> --- _bootstrap-tools-lib/clang/libllvmsupport ---
> --- APFloat.o ---
> clang++: error: unable to execute command: Bus error (core dumped)
> . . .
> # gdb clang++ usr/src/lib/clang/libllvmtablegen/clang++.core
> . . .
> Core was generated by `clang++'.
> Program terminated with signal 10, Bus error.
> #0 0x00c3bb9c in clang::DependentTemplateSpecializationType::DependentTemplateSpecializationType ()
> [New Thread 22a18000 (LWP 100128/)]
> (gdb) x/40i 0x00c3bb60
> . . .
> 0xc3bb9c <_ZN5clang35DependentTemplateSpecializationTypeC2ENS_21ElaboratedTypeKeywordEPNS_19NestedNameSpecifierEPKNS_14IdentifierInfoEjPKNS_16TemplateArgumentENS_8QualTypeE+356>:
> vst1.64 {d16-d17}, [r4]!
> . . .
> (gdb) info all-registers
> r0 0xbfbf81a8 -1077968472
> r1 0x22f07e14 586186260
> r2 0xc416bc 12850876
> r3 0x2 2
> r4 0x22f07dfc 586186236
> . . .

Thus it appears that there is more code around that likely generates pointers not aligned so as to allow the code generation that is in use for what is pointed to.

At this point I have no clue if the issue is just inside clang itself vs. if it is in something that clang is layered on top of. Nor if there is just one bad thing or many.

Note: I had not yet tried buildworld/buildkernel for the context of the "-f" option that I was experimenting with earlier. So I do not have a direct compare and contrast at this point.

Older material:

On 2015-Dec-25, at 5:21 PM, Mark Millard wrote:

> On 2015-Dec-25, at 3:42 PM, Warner Losh wrote:
>
>
>> On Dec 25, 2015, at 3:14 PM, Mark Millard wrote:
>>
>> [I'm going to break much of the earlier "original material" text to tail of the message.]
>>
>>> On 2015-Dec-25, at 11:53 AM, Warner Losh wrote:
>>>
>>> So what happens if we actually fix the underlying bug?
>>>
>>> I see two ways of doing this. In findfp.c, we allocate an array of FILE * today like:
>>> g = (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * sizeof(FILE));
>>> but that assumes that FILE just has normal pointer alignment requirements. However,
>>> due to the mbstate having int64_t alignment requirements, this is wrong. Maybe we
>>> need to do something like
>>> g = (struct glue *)malloc(sizeof(*g) + max(sizeof(int64_t),ALIGNBYTES) + n * sizeof(FILE));
>>> which wouldn’t change anything on LP64 systems, but would result in proper alignment
>>> for ILP32 systems. We’d have to fix the loop that uses ALIGN afterwards to use
>>> roundup.
Instead, we=E2=80=99d need to round up to the neared 8-byte = aligned offset (or technically, >>> the max of ALIGNBYTES and 8, but that=E2=80=99s always 8 on = today=E2=80=99s systems. If we do this, >>> we can make sure that each file is 8-byte aligned or better. We may = need to round up >>> sizeof(FILE) to a multiple of 8 as well. I believe that since it has = the 8-byte alignment >>> for a member, its size must be a multiple of 8, but I=E2=80=99ve not = chased that belief to ground. >>> If not, we may need another decorator (__aligned(8), I think, = spelled with the ugly >>> max expression above). That way, the contract we=E2=80=99re making = with the compiler will >>> always be true. ALIGN BYTES is 4 on Arm anyway, so that bit is = clearly wrong. >>>=20 >>> This wouldn=E2=80=99t be an ABI change, since you can only get a = valid FILE * from fopen (and >>> friends), plus stdin, stdout, and stderr. Those addresses aren=E2=80=99= t hard coded into binaries, >>> so even if we have to tweak the last three and deal with some = =E2=80=98fake=E2=80=99 FILE abuse in libc >>> (which I don=E2=80=99t think suffers from this issue, btw, given the = alignment requirements that would >>> naturally follow from something on the stack), we=E2=80=99d still be = ahead. At least for all CONFORMING >>> implementations[*]... >>>=20 >>> TL;DR: Why not make FILE * always 8-byte aligned? The compiler = options are a band-aide. >>>=20 >>> Warner >>>=20 >>> [*] There=E2=80=99s at least on popular package that has a copy of = the FILE structure in one of its >>> .h files and uses that to do unnatural optimization things, but even = that=E2=80=99s cool, I think, >>> since it never allocates a new one. >>>=20 >>=20 >> The ARM documentation mentions cases of 16 byte alignment = requirements. I've no clue if the clang code generation ever creates = such code. There might be wider requirements possible in arm code as = well. (I'm not an arm expert.) 
As an example of an implication: "The = malloc() function returns a pointer to a block of at least size bytes = suitably aligned for any use." In other words: aligned to some figure = that is a multiple of *every* alignment requirement that the code = generator can produce, possibly being the least common multiple. >>=20 >> "-fmax-type-align=3D. . ." is a means of controlling/limiting the = range of potential alignments to no more than a fixed, predefined value. = Above that and the code generation has to work in small size accesses = and build-up/split-up bigger values. Using "-fmax-type-align=3D. . ." = allows defining a figure as part of an ABI that is then not subject to = code generator updates that could increase the maximum alignment figure = and break things: It turns off such new capabilities. Other options need = not work that way to preserve the ABI. >=20 > That=E2=80=99s true, as far as it goes=E2=80=A6 But I=E2=80=99m not = sure it goes far enough. The premise here is that the problem is = wide-spread, when in fact I think it is quite narrow. >=20 >> But in the most fundamental terms process wise as far as I can tell. = . . >>=20 >> While the FILE case that occurred is a specific example, every = memory-allocation-like operation is at a potential issue for all such = "allocated" objects where the related code generation requires alignment = to avoid Bus Error (given the SCTLR bit[1] in use). >=20 > The problem isn=E2=80=99t general. The problem isn=E2=80=99t malloc. = Malloc will generally return the right thing on arm (and if it = doesn=E2=80=99t, > then we need to make sure it does). >=20 > The problem is we get a boatload of FILEs from the system all at once, = and those are misaligned because of a bug in the code. One that=E2=80=99s = fixed, I believe, in https://reviews.freebsd.org/D4708. >=20 >=20 >> How many other places in FreeBSD might sometimes return mis-aligned = pointers for the existing code generation and ABI combination? 
>=20 > It isn=E2=80=99t an ABI thing, just a code bug thing. The only reason = it was an issue was due to the optimizing nature of clang. >=20 > We=E2=80=99ve had to deal with the arm alignment issues for years. I = wager there are very few indeed. The only reason this was was brought to = light was better code-gen from clang. >=20 >> How many other places are subject to breakage when "internal" = structs/unions/fields involved are changed to be of a different size = because the code is not fully auto-adjusting to match the code = generation yet --even if right now "it works"? How fragile will things = be for future work? >=20 > If there are others, I=E2=80=99ll bet they could be counted on one = hand since very few things do the =E2=80=98slab=E2=80=99 allocator that = FILE does. >=20 >> What would it take to find out and deal with them all? (I do not have = the background knowledge to span much.) >>=20 >> My experiment avoided potentially changing parts of the ABI and also = avoided dealing with such a "lots of code to investigate" issue. It may = not be the long term 11.0-RELEASE solution. Even if not, it may be = appropriate for various temporary purposes that need to avoid Bus Errors = in the process. For example if Ian has a good reason to use clang 3.7 = instead of gcc 4.2.1. >=20 > The review above doesn=E2=80=99t change the ABI either. >=20 >> Other notes: >>=20 >>> I believe that since it has the 8-byte alignment >>> for a member, its size must be a multiple of 8 >>=20 >> There are some C/C++ language rules about the address of a structure = equalling the address of the first field, uniformity of the offsets, and = the like. But. . . >>=20 >> The C and C++ languages specify no specific numerical alignment = figures, not even relative to specific sizeof(...) expressions. To use = an old example: a 68010 only needs alignment for >=3D 2 byte things and = even alignment is all that is then required. 
>> Some other contexts take a lot more to meet the specifications. There are some implications of the modern memory model(s) created to cover concurrency explicitly, such as avoiding interactions that can happen via, for example, separate objects (in part) sharing a cache line. (I've only looked at C++ for this, and only to a degree.)
>>
>> The detailed alignment rules are more "implementation defined" than "predefined by the standard". But the definition is trying to meet language criteria. It is not a fully independent choice.
>
> Many of them are actually defined by a combination of the standard language definition, as well as the ABI standard. This is why we know that mbstate_t must be 8 byte aligned.
>
>> May be some other standards that FreeBSD is tied to specify more specifics, such as a N byte integer always aligns to some multiple of N (a waste on the 68010), including the alignment for union or struct that it may be a part of tracking. But such rules force padding that may or may not be required to meet the language's more abstract criteria and such rules may not match the existing/in-use ABI.
>
> It is all spelled out in the ARM EABI docs.
>
>> So far as I can tell explicitly declared alignments may well be necessary. If that one "popular package", say, formed an array of FILE copies then the resultant alignments need not all match the ones produced by your example code unless the FILE declaration forces the compiler to match, causing sizeof(FILE) to track as well. FILE need not be the only such issue.
>
> Arrays of FILEs aren’t an issue (except that it encodes the size of FILE into the app). It’s the specifically quirky way that libc does it that’s the problem.
>
>> My background and reference material are mostly tied to the languages --and so my notes tend to be limited to that much context.
>
> Understood.
> While there may be issues with alignment still, tossing a big hammer at the problem because they might exist will likely mean they will persist far longer than fixing them one at a time. When we first ported to arm, there were maybe half a dozen places that needed fixing. I doubt there’s more now.
>
> Can you try the patch in the above code review w/o the -f switch and let me know if it works for you?
>
> Warner

buildworld/buildkernel has been started on amd64 for a rpi2 target. That and install kernel/world and starting up a port rebuild on the rpi2 and waiting for it means it will be a few hours even if I start the next thing just as each prior thing finishes. I may give up and go to sleep first.

As for presumptions: I'll take your word on expected status of things. I've no clue. But absent even the hear-say status information at the time I did not presume that what was in front of me was all there is to worry about --nor did I try to go figure it all out on my own. I took a path to cover both possibilities for local-only vs. more-wide-spread (so long as that path did not force a split-up of some larger form of atomic action).

In my view "-mno-unaligned-access" is an even bigger hammer than I used. I find no clang statement about what its ABI consequences would be, unlike for what I did: What mix of more padding for alignment vs. more but smaller accesses? But as I remember I've seen "-mno-unaligned-access" in use in ports and the like so its consequences may be familiar material for some folks.

Absent any questions about ABI consequences "-mno-unaligned-access" does well mark the expected SCTLR bit[1] status, far better than what I did. Again: I was covering my ignorance while making any significant investigation/debugging as unlikely as I could.
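
The narrow fix argued for above (the 'slab' style allocation of FILEs being the one quirky place) can be pictured with a small sketch. This is not the code from the review: the names slab_align_up, slab_alloc and SLAB_ALIGN are hypothetical, and the 8-byte figure is simply the mbstate_t alignment discussed in the thread. The point is only that a block holding a header followed by an array of structs has to start the array on an 8-byte boundary, and that the element size must be a multiple of 8 so that every later element stays aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment for the 64-bit member; hypothetical name. */
#define SLAB_ALIGN ((uintptr_t)8)

/* Round p up to the next multiple of SLAB_ALIGN (a power of two). */
void *slab_align_up(void *p)
{
	uintptr_t a = (uintptr_t)p;

	return (void *)((a + (SLAB_ALIGN - 1)) & ~(SLAB_ALIGN - 1));
}

/*
 * Allocate one block holding a header of hdr_size bytes followed by
 * n elements of elem_size bytes. On success, *elems points at the
 * first element, aligned to SLAB_ALIGN, and the return value is the
 * block to pass to free(). Returns NULL if malloc() fails.
 */
void *slab_alloc(size_t hdr_size, size_t n, size_t elem_size, void **elems)
{
	unsigned char *block;

	/* Later elements stay aligned only if the stride is a multiple. */
	assert(elem_size % SLAB_ALIGN == 0);

	/* Reserve SLAB_ALIGN - 1 spare bytes for the round-up. */
	block = malloc(hdr_size + (SLAB_ALIGN - 1) + n * elem_size);
	if (block == NULL)
		return NULL;
	*elems = slab_align_up(block + hdr_size);
	return block;
}
```

With a 12-byte header, the element array would otherwise start 12 bytes past whatever malloc() returned, which is 4-byte but not 8-byte aligned whenever malloc() itself returns an 8-byte aligned block.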
> Original material:
>
>> On Dec 25, 2015, at 7:24 AM, Mark Millard wrote:
>>
>> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=4 has so far removed the crashes during the toolchain activity: no more misaligned accesses in libc's _fseeko or elsewhere.]
>>
>> On 2015-Dec-25, at 12:31 AM, Mark Millard wrote:
>>
>>> On 2015-Dec-24, at 10:39 PM, Mark Millard wrote:
>>>
>>>> [I do not know if this partial crash analysis related to on-arm clang-associated activity is good enough and appropriate to submit or not.]
>>>>
>>>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below came from pkg install activity instead of port building. Used as-is.
>>>>
>>>> When I just tried my first from-rpi2b builds (ports for a rpi2b), /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the following suggests an alignment error for the type of instructions that memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to check SCTLR bit[1] to be directly sure that alignment was being enforced.)
>>>>
>>>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar :
>>>>
>>>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
>>>>> Bus error (core dumped)
>>>>> *** [libgnuintl.la] Error code 138
>>>>
>>>> It failed in _fseeko doing a memset that turned into uses of "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. From what I read such "VSTn (multiple n-element structures)" that have .64 require 8 byte alignment. The evidence of the code and register value follow.
>>>>
>>>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>>>>> . . .
>>>>> #0 0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
>>>>> 299 memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>>>>> . . .
>>>>> (gdb) x/24i 0x2033adb0
>>>>> 0x2033adb0 <_fseeko+836>: vmov.i32 q8, #0 ; 0x00000000
>>>>> 0x2033adb4 <_fseeko+840>: movw r1, #65503 ; 0xffdf
>>>>> 0x2033adb8 <_fseeko+844>: stm r4, {r0, r7}
>>>>> 0x2033adbc <_fseeko+848>: ldrh r0, [r4, #12]
>>>>> 0x2033adc0 <_fseeko+852>: and r0, r0, r1
>>>>> 0x2033adc4 <_fseeko+856>: strh r0, [r4, #12]
>>>>> 0x2033adc8 <_fseeko+860>: add r0, r4, #216 ; 0xd8
>>>>> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033add0 <_fseeko+868>: add r0, r4, #200 ; 0xc8
>>>>> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033add8 <_fseeko+876>: add r0, r4, #184 ; 0xb8
>>>>> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ade0 <_fseeko+884>: add r0, r4, #168 ; 0xa8
>>>>> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ade8 <_fseeko+892>: add r0, r4, #152 ; 0x98
>>>>> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033adf0 <_fseeko+900>: add r0, r4, #136 ; 0x88
>>>>> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033adf8 <_fseeko+908>: add r0, r4, #120 ; 0x78
>>>>> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ae00 <_fseeko+916>: add r0, r4, #104 ; 0x68
>>>>> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ae08 <_fseeko+924>: b 0x2033b070 <_fseeko+1540>
>>>>> 0x2033ae0c <_fseeko+928>: cmp r5, #0 ; 0x0
>>>>> (gdb) info all-registers
>>>>> r0 0x20651ea4 543497892
>>>>> r1 0xffdf 65503
>>>>> r2 0x0 0
>>>>> r3 0x0 0
>>>>> r4 0x20651dcc 543497676
>>>>> r5 0x0 0
>>>>> r6 0x0 0
>>>>> r7 0x0 0
>>>>> r8 0x20359df4 540384756
>>>>> r9 0x0 0
>>>>> r10 0x0 0
>>>>> r11 0xbfbfb948 -1077954232
>>>>> r12 0x2037b208 540520968
>>>>> sp 0xbfbfb898 -1077954408
>>>>> lr 0x2035a004 540385284
>>>>> pc 0x2033adcc 540257740
>>>>> f0 0 (raw 0x000000000000000000000000)
>>>>> f1 0 (raw 0x000000000000000000000000)
>>>>> f2 0 (raw 0x000000000000000000000000)
>>>>> f3 0 (raw 0x000000000000000000000000)
>>>>> f4 0 (raw 0x000000000000000000000000)
>>>>> f5 0 (raw 0x000000000000000000000000)
>>>>> f6 0 (raw 0x000000000000000000000000)
>>>>> f7 0 (raw 0x000000000000000000000000)
>>>>> fps 0x0 0
>>>>> cpsr 0x60000010 1610612752
>>>>
>>>> The syntax in use for vst1.64 instructions does not explicitly have the alignment notation. Presuming that the decoding is correct then from what I read the following applies:
>>>>
>>>>> Home > NEON and VFP Programming > NEON load and store element and structure instructions > Alignment restrictions in load and store, element and structure instructions
>>>>>
>>>>> . . . When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>>>> • if the A bit is 0, there are no alignment restrictions (except for strongly ordered or device memory, where accesses must be element aligned or the result is unpredictable)
>>>>> • if the A bit is 1, accesses must be element aligned.
>>>>> If an address is not correctly aligned, an alignment fault occurs.
>>>>
>>>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would have the context to happen because of the mis-alignment.
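
The arithmetic behind the register dump above can be checked with a few lines of C. This is only an illustration: the helper name misalignment and the two macros are hypothetical, and the constants are copied from the gdb session (r4 holds the FILE pointer, and the faulting store uses r4 + 216).

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the gdb output; macro names are hypothetical. */
#define FP_ADDR   ((uintptr_t)0x20651dcc)	/* r4: the FILE pointer */
#define STORE_OFF ((uintptr_t)216)		/* add r0, r4, #216 */

/*
 * Number of bytes by which addr lies past the previous multiple of
 * align. align must be a power of two. 0 means addr is aligned.
 */
unsigned misalignment(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (unsigned)(addr & (align - 1));
}
```

Here misalignment(FP_ADDR + STORE_OFF, 8) evaluates to 4 for the address 0x20651ea4, while misalignment(STORE_OFF, 8) is 0. In other words the offset into the struct is fine for 8-byte accesses and the base address of the FILE is what is off by 4.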
>>>>
>>>> The following shows the make.conf context that explains how /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>>>>
>>>>> # more /etc/make.conf
>>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>>> WITH_DEBUG=
>>>>> WITH_DEBUG_FILES=
>>>>> MALLOC_PRODUCTION=
>>>>> #
>>>>> TO_TYPE=armv6
>>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>>> .if ${.MAKE.LEVEL} == 0
>>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>> .export CC
>>>>> .export CXX
>>>>> .export CPP
>>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>>> .export AS
>>>>> .export AR
>>>>> .export LD
>>>>> .export NM
>>>>> .export OBJCOPY
>>>>> .export OBJDUMP
>>>>> .export RANLIB
>>>>> .export SIZE
>>>>> .export STRINGS
>>>>> .endif
>>>>
>>>>
>>>> Other context:
>>>>
>>>>> # freebsd-version -ku; uname -aKU
>>>>> 11.0-CURRENT
>>>>> 11.0-CURRENT
>>>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 22:02:21 PST 2015 root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm 1100091 1100091
>>>>
>>>>
>>>>
>>>> I will note that world and kernel are my own build of -r292413 (earlier experiment) --a build made from an amd64 host context and put in place via DESTDIR=. My expectation would be that the amd64 context would not be likely to have similar alignment restrictions involved in its ar activity (or other activity). That would explain how I got this far using such a clang 3.7 related toolchain for targeting an rpi2 before finding such a problem.
>>>
>>>
>>> I realized re-reading all of the above that it seems to suggest that the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but that was not my intent.
>>>
>>> libc.so.7 is from my buildworld, including the fseeko implementation:
>>>
>>> Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
>>> done.
>>> Loaded symbols for /lib/libc.so.7
>>>
>>>
>>> head/sys/sys/_types.h has:
>>>
>>> /*
>>>  * mbstate_t is an opaque object to keep conversion state during multibyte
>>>  * stream conversions.
>>>  */
>>> typedef union {
>>> 	char __mbstate8[128];
>>> 	__int64_t _mbstateL; /* for alignment */
>>> } __mbstate_t;
>>>
>>> suggesting an implicit alignment of the union to whatever the implementation defines for __int64_t --which need not be 8 byte alignment (in the abstract, general case). But 8 byte alignment is a possibility as well (in the abstract).
>>>
>>> But printing *fp in gdb for the fp argument to _fseeko reports the same not-8-byte aligned address for __mbstate8 that was in r0:
>>>
>>>> (gdb) bt
>>>> #0 0x2033adcc in _fseeko (fp=0x20651dcc, offset=, whence=, ltest=) at /usr/src/lib/libc/stdio/fseek.c:299
>>>> #1 0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
>>>> #2 0x00016138 in ?? ()
>>>> (gdb) print fp
>>>> $2 = (FILE *) 0x20651dcc
>>>> (gdb) print *fp
>>>> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, _close = 0x20359dfc <__sclose>,
>>>> _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>>>> _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}
>>>
>>> The overall FILE struct containing the _mbstate field is also not 8-byte aligned. But the offset from the start of the FILE struct to __mbstate8 is a multiple of 8 bytes.
>>>
>>> It is my interpretation that there is nothing here to justify the memset implementation combination:
>>>
>>> SCTLR bit[1]==1
>>>
>>> mixed with
>>>
>>> vst1.64 instructions
>>>
>>> I.e.: one or both needs to change unless some way for forcing 8-byte alignment is introduced.
>>>
>>> I have not managed to track down anything that would indicate FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required by the design to be constant (once initialized).
>>
>>
>> I have (so far) removed the build tool crashes based on adding -fmax-type-align=4 to avoid the misaligned accesses. Details follow.
>>
>> src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now looks like:
>>
>>> # more ~/src.configs/src.conf.rpi2-clang.amd64-host
>>> TO_TYPE=armv6
>>> TOOLS_TO_TYPE=arm-gnueabi
>>> FROM_TYPE=amd64
>>> TOOLS_FROM_TYPE=x86_64
>>> VERSION_CONTEXT=11.0
>>> #
>>> KERNCONF=RPI2-NODBG
>>> TARGET=arm
>>> .if ${.MAKE.LEVEL} == 0
>>> TARGET_ARCH=${TO_TYPE}
>>> .export TARGET_ARCH
>>> .endif
>>> #
>>> WITHOUT_CROSS_COMPILER=
>>> #
>>> # For WITH_BOOT= . . .
>>> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
>>> WITHOUT_BOOT=
>>> #
>>> WITH_FAST_DEPEND=
>>> WITH_LIBCPLUSPLUS=
>>> WITH_CLANG=
>>> WITH_CLANG_IS_CC=
>>> WITH_CLANG_FULL=
>>> WITH_LLDB=
>>> WITH_CLANG_EXTRAS=
>>> #
>>> WITHOUT_LIB32=
>>> WITHOUT_GCC=
>>> WITHOUT_GNUCXX=
>>> #
>>> NO_WERROR=
>>> MALLOC_PRODUCTION=
>>> #CFLAGS+= -DELF_VERBOSE
>>> #
>>> WITH_DEBUG=
>>> WITH_DEBUG_FILES=
>>> #
>>> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related binutils...
>>> #
>>> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
>>> X_COMPILER_TYPE=clang
>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>> .if ${.MAKE.LEVEL} == 0
>>> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> .export XCC
>>> .export XCXX
>>> .export XCPP
>>> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>> .export XAS
>>> .export XAR
>>> .export XLD
>>> .export XNM
>>> .export XOBJCOPY
>>> .export XOBJDUMP
>>> .export XRANLIB
>>> .export XSIZE
>>> .export XSTRINGS
>>> .endif
>>> #
>>> # Host compiler stuff:
>>> .if ${.MAKE.LEVEL} == 0
>>> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> .export CC
>>> .export CXX
>>> .export CPP
>>> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
>>> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
>>> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
>>> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
>>> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
>>> OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
>>> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
>>> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
>>> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
>>> .export AS
>>> .export AR
>>> .export LD
>>> .export NM
>>> .export OBJCOPY
>>> .export OBJDUMP
>>> .export RANLIB
>>> .export SIZE
>>> .export STRINGS
>>> .endif
>>
>> make.conf for during the on-rpi2 port builds now looks like:
>>
>>> $ more /etc/make.conf
>>> WRKDIRPREFIX=/usr/obj/portswork
>>> WITH_DEBUG=
>>> WITH_DEBUG_FILES=
>>> MALLOC_PRODUCTION=
>>> #
>>> TO_TYPE=armv6
>>> TOOLS_TO_TYPE=arm-gnueabi
>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>> .if ${.MAKE.LEVEL} == 0
>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>> .export CC
>>> .export CXX
>>> .export CPP
>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>> .export AS
>>> .export AR
>>> .export LD
>>> .export NM
>>> .export OBJCOPY
>>> .export OBJDUMP
>>> .export RANLIB
>>> .export SIZE
>>> .export STRINGS
>>> .endif
>>
>>
>>
>> ===
>> Mark Millard
>> markmi at dsl-only.net
>>
>>
>>
>>
>> _______________________________________________
>> freebsd-toolchain@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
>> To unsubscribe, send any mail to "freebsd-toolchain-unsubscribe@freebsd.org"

From owner-freebsd-toolchain@freebsd.org Sat Dec 26 16:45:39 2015
Sender: Warner Losh
Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error?
From: Warner Losh
Date: Sat, 26 Dec 2015 09:45:31 -0700
Cc: freebsd-arm, FreeBSD Toolchain, Ian Lepore, mat@FreeBSD.org, sbruno@FreeBSD.org
References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net> <8B52074F-FDEF-4119-BB04-630F9BE9E6DB@bsdimp.com>
To: Mark Millard

Thanks, it sounds like I fixed a bug, but there’s more. What were the specific ports so I can test them here?

And to be clear, this is a buildworld on the RPi 2 using the cross-built world with CPUTYPE=armv7a or some such, right?

Warner

> On Dec 25, 2015, at 9:32 PM, Mark Millard wrote:
>
> [I am again breaking off another section of older material.]
>
> Mixed news I'm afraid.
>
> The specific couple of ports that I attempted did build, the same ones that originally got the Bus Error in ar using (indirectly) _fseeko and memset that I reported. So I expect that you fixed one error.
>
> But when I tried to buildworld, clang++ 3.7 processing usr/src/lib/clang/libllvmtablegen/ materials quickly got a Bus Error at nearly the same type of instruction (it has a "!" below that the earlier one did not), but with r4 holding the misaligned address this time:
>
>> --- _bootstrap-tools-lib/clang/libllvmsupport ---
>> --- APFloat.o ---
>> clang++: error: unable to execute command: Bus error (core dumped)
>> . . .
>> # gdb clang++ usr/src/lib/clang/libllvmtablegen/clang++.core
>> . . .
>> Core was generated by `clang++'.
>> Program terminated with signal 10, Bus error.
>> #0 0x00c3bb9c in clang::DependentTemplateSpecializationType::DependentTemplateSpecializationType ()
>> [New Thread 22a18000 (LWP 100128/)]
>> (gdb) x/40i 0x00c3bb60
>> . . .
>> 0xc3bb9c <_ZN5clang35DependentTemplateSpecializationTypeC2ENS_21ElaboratedTypeKeywordEPNS_19NestedNameSpecifierEPKNS_14IdentifierInfoEjPKNS_16TemplateArgumentENS_8QualTypeE+356>:
>> vst1.64 {d16-d17}, [r4]!
>> . . .
>> (gdb) info all-registers
>> r0 0xbfbf81a8 -1077968472
>> r1 0x22f07e14 586186260
>> r2 0xc416bc 12850876
>> r3 0x2 2
>> r4 0x22f07dfc 586186236
>> . . .
>
>
> Thus it appears that there is more code around that likely generates pointers not aligned so to allow the code generation that is in use for what is pointed to.
>
> At this point I have no clue if the issue is just inside clang itself vs. if it is in something that clang is layered on top of. Nor if there is just one bad thing or many.
>
> Note: I had not yet tried buildworld/buildkernel for the context of the "-f" option that I was experimenting with earlier. So I do not have a direct compare and contrast at this point.
>
>
>
> Older material:
>
> On 2015-Dec-25, at 5:21 PM, Mark Millard wrote:
>
>> On 2015-Dec-25, at 3:42 PM, Warner Losh wrote:
>>
>>
>>> On Dec 25, 2015, at 3:14 PM, Mark Millard wrote:
>>>
>>> [I'm going to break much of the earlier "original material" text to tail of the message.]
>>>
>>>> On 2015-Dec-25, at 11:53 AM, Warner Losh wrote:
>>>>
>>>> So what happens if we actually fix the underlying bug?
>>>>
>>>> I see two ways of doing this. In findfp.c, we allocate an array of FILE * today like:
>>>> g = (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * sizeof(FILE));
>>>> but that assumes that FILE just has normal pointer alignment requirements. However, due to the mbstate having int64_t alignment requirements, this is wrong. Maybe we need to do something like
>>>> g = (struct glue *)malloc(sizeof(*g) + max(sizeof(int64_t),ALIGNBYTES) + n * sizeof(FILE));
>>>> which wouldn’t change anything on LP64 systems, but would result in proper alignment for ILP32 systems. We’d have to fix the loop that uses ALIGN afterwards to use roundup. Instead, we’d need to round up to the nearest 8-byte aligned offset (or technically, the max of ALIGNBYTES and 8, but that’s always 8 on today’s systems). If we do this, we can make sure that each file is 8-byte aligned or better. We may need to round up sizeof(FILE) to a multiple of 8 as well. I believe that since it has the 8-byte alignment for a member, its size must be a multiple of 8, but I’ve not chased that belief to ground. If not, we may need another decorator (__aligned(8), I think, spelled with the ugly max expression above). That way, the contract we’re making with the compiler will always be true. ALIGNBYTES is 4 on Arm anyway, so that bit is clearly wrong.
>>>>
>>>> This wouldn’t be an ABI change, since you can only get a valid FILE * from fopen (and friends), plus stdin, stdout, and stderr. Those addresses aren’t hard coded into binaries, so even if we have to tweak the last three and deal with some ‘fake’ FILE abuse in libc (which I don’t think suffers from this issue, btw, given the alignment requirements that would naturally follow from something on the stack), we’d still be ahead.
>>>> At least for all CONFORMING implementations[*]...
>>>>
>>>> TL;DR: Why not make FILE * always 8-byte aligned? The compiler options are a band-aid.
>>>>
>>>> Warner
>>>>
>>>> [*] There’s at least one popular package that has a copy of the FILE structure in one of its .h files and uses that to do unnatural optimization things, but even that’s cool, I think, since it never allocates a new one.
>>>>
>>>
>>> The ARM documentation mentions cases of 16 byte alignment requirements. I've no clue if the clang code generation ever creates such code. There might be wider requirements possible in arm code as well. (I'm not an arm expert.) As an example of an implication: "The malloc() function returns a pointer to a block of at least size bytes suitably aligned for any use." In other words: aligned to some figure that is a multiple of *every* alignment requirement that the code generator can produce, possibly being the least common multiple.
>>>
>>> "-fmax-type-align=. . ." is a means of controlling/limiting the range of potential alignments to no more than a fixed, predefined value. Above that and the code generation has to work in small size accesses and build-up/split-up bigger values. Using "-fmax-type-align=. . ." allows defining a figure as part of an ABI that is then not subject to code generator updates that could increase the maximum alignment figure and break things: It turns off such new capabilities. Other options need not work that way to preserve the ABI.
>>
>> That’s true, as far as it goes… But I’m not sure it goes far enough. The premise here is that the problem is wide-spread, when in fact I think it is quite narrow.
>>
>>> But in the most fundamental terms process wise as far as I can tell. . .
>>>
>>> While the FILE case that occurred is a specific example, every memory-allocation-like operation is at a potential issue for all such "allocated" objects where the related code generation requires alignment to avoid Bus Error (given the SCTLR bit[1] in use).
>>
>> The problem isn’t general. The problem isn’t malloc. Malloc will generally return the right thing on arm (and if it doesn’t, then we need to make sure it does).
>>
>> The problem is we get a boatload of FILEs from the system all at once, and those are misaligned because of a bug in the code. One that’s fixed, I believe, in https://reviews.freebsd.org/D4708.
>>
>>
>>> How many other places in FreeBSD might sometimes return mis-aligned pointers for the existing code generation and ABI combination?
>>
>> It isn’t an ABI thing, just a code bug thing. The only reason it was an issue was due to the optimizing nature of clang.
>>
>> We’ve had to deal with the arm alignment issues for years. I wager there are very few indeed. The only reason this was brought to light was better code-gen from clang.
>>
>>> How many other places are subject to breakage when "internal" structs/unions/fields involved are changed to be of a different size because the code is not fully auto-adjusting to match the code generation yet --even if right now "it works"? How fragile will things be for future work?
>>
>> If there are others, I’ll bet they could be counted on one hand since very few things do the ‘slab’ allocator that FILE does.
>>
>>> What would it take to find out and deal with them all? (I do not have the background knowledge to span much.)
>>>
>>> My experiment avoided potentially changing parts of the ABI and also avoided dealing with such a "lots of code to investigate" issue. It may not be the long term 11.0-RELEASE solution.
Even if not, it may be = appropriate for various temporary purposes that need to avoid Bus Errors = in the process. For example if Ian has a good reason to use clang 3.7 = instead of gcc 4.2.1. >>=20 >> The review above doesn=E2=80=99t change the ABI either. >>=20 >>> Other notes: >>>=20 >>>> I believe that since it has the 8-byte alignment >>>> for a member, its size must be a multiple of 8 >>>=20 >>> There are some C/C++ language rules about the address of a structure = equalling the address of the first field, uniformity of the offsets, and = the like. But. . . >>>=20 >>> The C and C++ languages specify no specific numerical alignment = figures, not even relative to specific sizeof(...) expressions. To use = an old example: a 68010 only needs alignment for >=3D 2 byte things and = even alignment is all that is then required. Some other contexts take a = lot more to meet the specifications. There are some implications of the = modern memory model(s) created to cover concurrency explicitly, such as = avoiding interactions that can happen via, for example, separate objects = (in part) sharing a cache line. (I've only looked at C++ for this, and = only to a degree.) >>>=20 >>> The detailed alignment rules are more "implementation defined" than = "predefined by the standard". But the definition is trying to meet = language criteria. It is not a fully independent choice. >>=20 >> Many of them are actually defined by a combination of the standard = language definition, as well as the ABI standard. This is why we know = that mbstate_t must be 8 byte aligned. >>=20 >>> May be some other standards that FreeBSD is tied to specify more = specifics, such as a N byte integer always aligns to some multiple of N = (a waste on the 68010), including the alignment for union or struct that = it may be a part of tracking. 
But such rules force padding that may or = may not be required to meet the language's more abstract criteria and = such rules may not match the existing/in-use ABI. >>=20 >> It is all spelled out in the ARM EABI docs. >>=20 >>> So far as I can tell explicitly declared alignments may well be = necessary. If that one "popular package", say, formed an array of FILE = copies then the resultant alignments need not all match the ones = produced by your example code unless the FILE declaration forces the = compiler to match, causing sizeof(FILE) to track as well. FILE need not = be the only such issue. >>=20 >> Arrays of FILEs isn=E2=80=99t an issue (except that it encodes the = size of FILE into the app). It=E2=80=99s the specifically quirky way = that libc does it that=E2=80=99s the problem. >>=20 >>> My background and reference material are mostly tied the languages = --and so my notes tend to be limited to that much context. >>=20 >> Understood. While there may be issues with alignment still, tossing a = big hammer at the problem because they might exist will likely mean they = will persist far longer than fixing them one at a time. When we first = ported to arm, there were maybe half a dozen places that needed fixing. = I doubt there=E2=80=99s more now. >>=20 >> Can you try the patch in the above code review w/o the -f switch and = let me know if it works for you? >>=20 >> Warner >=20 > buildworld/buildkernel has been started on amd64 for a rpi2 target. = That and install kernel/world and starting up a port rebuild on the rpi2 = and waiting for it means it will be a few hours even if I start the next = thing just as each prior thing finishes. I may give up and go to sleep = first. >=20 > As for presumptions: I'll take your word on expected status of things. = I've no clue. But absent even the hear-say status information at the = time I did not presume that what was in front of me was all there is to = worry about --nor did I try to go figure it all out on my own. 
I took a = path to cover both possibilities for local-only vs. more-wide-spread (so = long as that path did not force a split-up of some larger form of atomic = action). >=20 > In my view "-mno-unaligned-access" is an even bigger hammer than I = used. I find no clang statement about what its ABI consequences would = be, unlike for what I did: What mix of more padding for alignment vs. = more but smaller accesses? But as I remember I've seen = "-mno-unaligned-access" in use in ports and the like so its consequences = may be familiar material for some folks. >=20 > Absent any questions about ABI consequences "-mno-unaligned-access" = does well mark the expected SCTLR bit[1] status, far better than what I = did. Again: I was covering my ignorance while making any significant = investigation/debugging as unlikely as I could. >=20 >=20 >> Original material: >>=20 >>> On Dec 25, 2015, at 7:24 AM, Mark Millard = wrote: >>>=20 >>> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 = 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=3D4 has = so far removed the crashes during the toolchain activity: no more = misaligned accesses in libc's _fseeko or elsewhere.] >>>=20 >>> On 2015-Dec-25, at 12:31 AM, Mark Millard = wrote: >>>=20 >>>> On 2015-Dec-24, at 10:39 PM, Mark Millard = wrote: >>>>=20 >>>>> [I do not know if this partial crash analysis related to on-arm = clang-associated activity is good enough and appropriate to submit or = not.] >>>>>=20 >>>>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved = below came from pkg install activity instead of port building. Used = as-is. >>>>>=20 >>>>> When I just tried my first from-rpi2b builds (ports for a rpi2b), = /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the = following suggests an alignment error for the type of instructions that = memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code = used by /usr/local/arm-gnueabi-freebsd/bin/ar. 
(But I do not know how to = check SCTLR bit[1] to be directly sure that alignment was being = enforced.) >>>>>=20 >>>>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar = : >>>>>=20 >>>>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru = .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o = finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o = l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o = ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o = relocatable.o langprefs.o localename.o log.o printf.o setlocale.o = version.o xsize.o osdep.o intl-compat.o >>>>>> Bus error (core dumped) >>>>>> *** [libgnuintl.la] Error code 138 >>>>>=20 >>>>> It failed in _fseeko doing a memset that turned into uses of = "vst1.64 {d16-d17}, [r0]" instructions, for an address in = register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. = =46rom what I read such "VSTn (multiple n-element structures)" that have = .64 require 8 byte alignment. The evidence of the code and register = value follow. >>>>>=20 >>>>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar = /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gette= xt-tools/intl/ar.core >>>>>> . . . >>>>>> #0 0x2033adcc in _fseeko (fp=3D0x20651dcc, offset=3D, whence=3D, ltest=3D) at /usr/src/lib/libc/stdio/fseek.c:299 >>>>>> 299 memset(&fp->_mbstate, 0, sizeof(mbstate_t)); >>>>>> . . . 
>>>>>> (gdb) x/24i 0x2033adb0 >>>>>> 0x2033adb0 <_fseeko+836>: vmov.i32 q8, #0 ; = 0x00000000 >>>>>> 0x2033adb4 <_fseeko+840>: movw r1, #65503 ; 0xffdf >>>>>> 0x2033adb8 <_fseeko+844>: stm r4, {r0, r7} >>>>>> 0x2033adbc <_fseeko+848>: ldrh r0, [r4, #12] >>>>>> 0x2033adc0 <_fseeko+852>: and r0, r0, r1 >>>>>> 0x2033adc4 <_fseeko+856>: strh r0, [r4, #12] >>>>>> 0x2033adc8 <_fseeko+860>: add r0, r4, #216 ; 0xd8 >>>>>> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0] >>>>>> 0x2033add0 <_fseeko+868>: add r0, r4, #200 ; 0xc8 >>>>>> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0] >>>>>> 0x2033add8 <_fseeko+876>: add r0, r4, #184 ; 0xb8 >>>>>> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0] >>>>>> 0x2033ade0 <_fseeko+884>: add r0, r4, #168 ; 0xa8 >>>>>> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0] >>>>>> 0x2033ade8 <_fseeko+892>: add r0, r4, #152 ; 0x98 >>>>>> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0] >>>>>> 0x2033adf0 <_fseeko+900>: add r0, r4, #136 ; 0x88 >>>>>> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0] >>>>>> 0x2033adf8 <_fseeko+908>: add r0, r4, #120 ; 0x78 >>>>>> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0] >>>>>> 0x2033ae00 <_fseeko+916>: add r0, r4, #104 ; 0x68 >>>>>> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0] >>>>>> 0x2033ae08 <_fseeko+924>: b 0x2033b070 = <_fseeko+1540> >>>>>> 0x2033ae0c <_fseeko+928>: cmp r5, #0 ; 0x0 >>>>>> (gdb) info all-registers >>>>>> r0 0x20651ea4 543497892 >>>>>> r1 0xffdf 65503 >>>>>> r2 0x0 0 >>>>>> r3 0x0 0 >>>>>> r4 0x20651dcc 543497676 >>>>>> r5 0x0 0 >>>>>> r6 0x0 0 >>>>>> r7 0x0 0 >>>>>> r8 0x20359df4 540384756 >>>>>> r9 0x0 0 >>>>>> r10 0x0 0 >>>>>> r11 0xbfbfb948 -1077954232 >>>>>> r12 0x2037b208 540520968 >>>>>> sp 0xbfbfb898 -1077954408 >>>>>> lr 0x2035a004 540385284 >>>>>> pc 0x2033adcc 540257740 >>>>>> f0 0 (raw 0x000000000000000000000000) >>>>>> f1 0 (raw 0x000000000000000000000000) >>>>>> f2 0 (raw 0x000000000000000000000000) >>>>>> f3 0 (raw 0x000000000000000000000000) >>>>>> f4 0 (raw 
0x000000000000000000000000)
>>>>>> f5 0 (raw 0x000000000000000000000000)
>>>>>> f6 0 (raw 0x000000000000000000000000)
>>>>>> f7 0 (raw 0x000000000000000000000000)
>>>>>> fps 0x0 0
>>>>>> cpsr 0x60000010 1610612752
>>>>>
>>>>> The syntax in use for vst1.64 instructions does not explicitly have the alignment notation. Presuming that the decoding is correct then from what I read the following applies:
>>>>>
>>>>>> Home > NEON and VFP Programming > NEON load and store element and structure instructions > Alignment restrictions in load and store, element and structure instructions
>>>>>>
>>>>>> . . . When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>>>>> • if the A bit is 0, there are no alignment restrictions (except for strongly ordered or device memory, where accesses must be element aligned or the result is unpredictable)
>>>>>> • if the A bit is 1, accesses must be element aligned.
>>>>>> If an address is not correctly aligned, an alignment fault occurs.
>>>>>
>>>>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would have the context to happen because of the mis-alignment.
>>>>>=20 >>>>> The following shows the make.conf context that explains how = /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked: >>>>>=20 >>>>>> # more /etc/make.conf >>>>>> WRKDIRPREFIX=3D/usr/obj/portswork >>>>>> WITH_DEBUG=3D >>>>>> WITH_DEBUG_FILES=3D >>>>>> MALLOC_PRODUCTION=3D >>>>>> # >>>>>> TO_TYPE=3Darmv6 >>>>>> TOOLS_TO_TYPE=3Darm-gnueabi >>>>>> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >>>>>> .if ${.MAKE.LEVEL} =3D=3D 0 >>>>>> CC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>>>> CXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>>>> CPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a >>>>>> .export CC >>>>>> .export CXX >>>>>> .export CPP >>>>>> AS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >>>>>> AR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >>>>>> LD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >>>>>> NM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >>>>>> OBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >>>>>> OBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >>>>>> RANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >>>>>> SIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >>>>>> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings= >>>>>> STRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >>>>>> .export AS >>>>>> .export AR >>>>>> .export LD >>>>>> .export NM >>>>>> .export OBJCOPY >>>>>> .export OBJDUMP >>>>>> .export RANLIB >>>>>> .export SIZE >>>>>> .export STRINGS >>>>>> .endif >>>>>=20 >>>>>=20 >>>>> Other context: >>>>>=20 >>>>>> # freebsd-version -ku; uname -aKU >>>>>> 11.0-CURRENT >>>>>> 11.0-CURRENT >>>>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue = Dec 22 22:02:21 PST 2015 = root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm = 1100091 1100091 >>>>>=20 >>>>>=20 >>>>>=20 >>>>> I will note that world and kernel are my own build of 
-r292413 = (earlier experiment) --a build made from an amd64 host context and put = in place via DESTDIR=3D. My expectation would be that the amd64 context = would not be likely to have similar alignment restrictions involved in = its ar activity (or other activity). That would explain how I got this = far using such a clang 3.7 related toolchain for targeting an rpi2 = before finding such a problem. >>>>=20 >>>>=20 >>>> I realized re-reading the all above that it seems to suggest that = the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar = but that was not my intent. >>>>=20 >>>> libc.so.7 is from my buildworld, including the fseeko = implementation: >>>>=20 >>>> Reading symbols from /lib/libc.so.7...Reading symbols from = /usr/lib/debug//lib/libc.so.7.debug...done. >>>> done. >>>> Loaded symbols for /lib/libc.so.7 >>>>=20 >>>>=20 >>>> head/sys/sys/_types.h has: >>>>=20 >>>> /* >>>> * mbstate_t is an opaque object to keep conversion state during = multibyte >>>> * stream conversions. >>>> */ >>>> typedef union { >>>> char __mbstate8[128]; >>>> __int64_t _mbstateL; /* for alignment */ >>>> } __mbstate_t; >>>>=20 >>>> suggesting an implicit alignment of the union to whatever the = implementation defines for __int64_t --which need not be 8 byte = alignment (in the abstract, general case). But 8 byte alignment is a = possibility as well (in the abstract). >>>>=20 >>>> But printing *fp in gdb for the fp argument to _fseeko reports the = same not-8-byte aligned address for __mbstate8 that was in r0: >>>>=20 >>>>> (gdb) bt >>>>> #0 0x2033adcc in _fseeko (fp=3D0x20651dcc, offset=3D, whence=3D, ltest=3D) at /usr/src/lib/libc/stdio/fseek.c:299 >>>>> #1 0x2033b108 in fseeko (fp=3D0x20651dcc, offset=3D18571438587904, = whence=3D0) at /usr/src/lib/libc/stdio/fseek.c:82 >>>>> #2 0x00016138 in ?? 
() >>>>> (gdb) print fp >>>>> $2 =3D (FILE *) 0x20651dcc >>>>> (gdb) print *fp >>>>> $3 =3D {_p =3D 0x2069a240 "", _r =3D 0, _w =3D 0, _flags =3D 5264, = _file =3D 36, _bf =3D {_base =3D 0x2069a240 "", _size =3D 32768}, = _lbfsize =3D 0, _cookie =3D 0x20651dcc, _close =3D 0x20359dfc = <__sclose>, >>>>> _read =3D 0x20359de4 <__sread>, _seek =3D 0x20359df4 <__sseek>, = _write =3D 0x20359dec <__swrite>, _ub =3D {_base =3D 0x0, _size =3D 0}, = _up =3D 0x0, _ur =3D 0, _ubuf =3D 0x20651e0c "", _nbuf =3D 0x20651e0f = "", _lb =3D { >>>>> _base =3D 0x0, _size =3D 0}, _blksize =3D 32768, _offset =3D 0, = _fl_mutex =3D 0x0, _fl_owner =3D 0x0, _fl_count =3D 0, _orientation =3D = 0, _mbstate =3D {__mbstate8 =3D 0x20651e34 "", _mbstateL =3D 0}, _flags2 = =3D 0} >>>>=20 >>>> The overall FILE struct containing the _mbstate field is also not = 8-byte aligned. But the offset from the start of the FILE struct to = __mbstate8 is a multiple of 8 bytes. >>>>=20 >>>> It is my interpretation that there is nothing here to justify the = memset implementation combination: >>>>=20 >>>> SCTLR bit[1]=3D=3D1 >>>>=20 >>>> mixed with >>>>=20 >>>> vst1.64 instructions >>>>=20 >>>> I.e.: one or both needs to change unless some way for forcing = 8-byte alignment is introduced. >>>>=20 >>>> I have not managed to track down anything that would indicate = FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required = by the design to be constant (once initialized). >>>=20 >>>=20 >>> I have (so far) removed the build tool crashes based on adding = -fmax-type-align=3D4 to avoid the misaligned accesses. Details follow. 
>>>=20 >>> src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now = looks like: >>>=20 >>>> # more ~/src.configs/src.conf.rpi2-clang.amd64-host >>>> TO_TYPE=3Darmv6 >>>> TOOLS_TO_TYPE=3Darm-gnueabi >>>> FROM_TYPE=3Damd64 >>>> TOOLS_FROM_TYPE=3Dx86_64 >>>> VERSION_CONTEXT=3D11.0 >>>> # >>>> KERNCONF=3DRPI2-NODBG >>>> TARGET=3Darm >>>> .if ${.MAKE.LEVEL} =3D=3D 0 >>>> TARGET_ARCH=3D${TO_TYPE} >>>> .export TARGET_ARCH >>>> .endif >>>> # >>>> WITHOUT_CROSS_COMPILER=3D >>>> # >>>> # For WITH_BOOT=3D . . . >>>> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation = R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a = shared object; recompile with -fPIC >>>> WITHOUT_BOOT=3D >>>> # >>>> WITH_FAST_DEPEND=3D >>>> WITH_LIBCPLUSPLUS=3D >>>> WITH_CLANG=3D >>>> WITH_CLANG_IS_CC=3D >>>> WITH_CLANG_FULL=3D >>>> WITH_LLDB=3D >>>> WITH_CLANG_EXTRAS=3D >>>> # >>>> WITHOUT_LIB32=3D >>>> WITHOUT_GCC=3D >>>> WITHOUT_GNUCXX=3D >>>> # >>>> NO_WERROR=3D >>>> MALLOC_PRODUCTION=3D >>>> #CFLAGS+=3D -DELF_VERBOSE >>>> # >>>> WITH_DEBUG=3D >>>> WITH_DEBUG_FILES=3D >>>> # >>>> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related = bintutils... 
>>>> # >>>> #CROSS_TOOLCHAIN=3D${TO_TYPE}-gcc >>>> X_COMPILER_TYPE=3Dclang >>>> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >>>> .if ${.MAKE.LEVEL} =3D=3D 0 >>>> XCC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>>> XCXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>>> XCPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>>> .export XCC >>>> .export XCXX >>>> .export XCPP >>>> XAS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >>>> XAR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >>>> XLD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >>>> XNM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >>>> XOBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >>>> XOBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >>>> XRANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >>>> XSIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >>>> #NO-SUCH: XSTRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >>>> XSTRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >>>> .export XAS >>>> .export XAR >>>> .export XLD >>>> .export XNM >>>> .export XOBJCOPY >>>> .export XOBJDUMP >>>> .export XRANLIB >>>> .export XSIZE >>>> .export XSTRINGS >>>> .endif >>>> # >>>> # Host compiler stuff: >>>> .if ${.MAKE.LEVEL} =3D=3D 0 >>>> CC=3D/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >>>> CXX=3D/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >>>> CPP=3D/usr/bin/clang-cpp = -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin >>>> .export CC >>>> .export CXX >>>> .export CPP >>>> AS=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as >>>> AR=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar >>>> LD=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld >>>> NM=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm >>>> OBJCOPY=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy >>>> 
OBJDUMP=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump >>>> RANLIB=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib >>>> SIZE=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size >>>> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings= >>>> STRINGS=3D/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings >>>> .export AS >>>> .export AR >>>> .export LD >>>> .export NM >>>> .export OBJCOPY >>>> .export OBJDUMP >>>> .export RANLIB >>>> .export SIZE >>>> .export STRINGS >>>> .endif >>>=20 >>> make.conf for during the on-rpi2 port builds now looks like: >>>=20 >>>> $ more /etc/make.conf >>>> WRKDIRPREFIX=3D/usr/obj/portswork >>>> WITH_DEBUG=3D >>>> WITH_DEBUG_FILES=3D >>>> MALLOC_PRODUCTION=3D >>>> # >>>> TO_TYPE=3Darmv6 >>>> TOOLS_TO_TYPE=3Darm-gnueabi >>>> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ >>>> .if ${.MAKE.LEVEL} =3D=3D 0 >>>> CC=3D/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>>> CXX=3D/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>>> CPP=3D/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi = -march=3Darmv7a -fmax-type-align=3D4 >>>> .export CC >>>> .export CXX >>>> .export CPP >>>> AS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as >>>> AR=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar >>>> LD=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld >>>> NM=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm >>>> OBJCOPY=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy >>>> OBJDUMP=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump >>>> RANLIB=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib >>>> SIZE=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size >>>> #NO-SUCH: STRINGS=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings >>>> STRINGS=3D/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings >>>> .export AS >>>> .export AR >>>> .export LD >>>> .export NM >>>> .export OBJCOPY >>>> .export OBJDUMP >>>> .export RANLIB >>>> .export SIZE >>>> .export 
STRINGS >>>> .endif >>>=20 >>>=20 >>>=20 >>> =3D=3D=3D >>> Mark Millard >>> markmi at dsl-only.net >>>=20 >>>=20 >>>=20 >>> _______________________________________________ >>> freebsd-toolchain@freebsd.org mailing list >>> https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain >>> To unsubscribe, send any mail to = "freebsd-toolchain-unsubscribe@freebsd.org" >=20 >=20 >=20 --Apple-Mail=_3A881A22-5ED1-4B30-8C08-758FA0B9A7D1 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- Comment: GPGTools - https://gpgtools.org iQIcBAEBCgAGBQJWfsQsAAoJEGwc0Sh9sBEAXM0QAKRH78oT70ZQFQDPar9s9qIc LJBzzu5FaK4R+Ztv+t1ypx1dx8CLUUOkji/GXrKGnEblY2AwAoGv2deAC5Y6Q5AE N/vd5p4V8Z11iKrH/YqFakBoFabdtbtl+gjHLyEBZMH6jjqI9s+8oA71LOwyPsX6 5rutJPHFAP4HYGTNMv9Jn9vF5mr4CyCtgHw6VNyB8PW9rbNX+a9Ox3dlwkgWqLes RDpVri0Sc0QYCaagfvnZHqsuww8W+MYL9TnT2ioArQZVhECZCrYh5NcCP2JiiyX7 Tq/4+lXroggDDY45BTZq+M1dhlJf57DCJf84oqFx0f/+ygoEODC83FRfnQPFAA2M y+5HwlyDaThfY7387/IgyBPIH2T6zC/xj1JiXwRNPVjtJPsSRlYfomCyNeEQTZwu gM3LJXfJCvxCyYh2f7yYbPAdZu9AbLDJctVsqgRhy22jy/l/xfcOmdi7UzaLMdDq gaBbekcp+VwAojeciJU0baOu1uwWBT4FNx3ErhzmjND+2oDCWWt7JH2jpEiUVpt/ Wbo0avYU4QZW2K5qJjgcIwDRWLZRFnR+fWQJ81accWMOAYANHcWZDrYK97lAPaUV QfQf7fXOtZ3lfxdLqoS/oBEzmmpnj/qR9Uni8faFzrQJQvt9CBbJDXo2vZCjmcT7 CjUy0akjLstzZ12O7RN8 =l7Th -----END PGP SIGNATURE----- --Apple-Mail=_3A881A22-5ED1-4B30-8C08-758FA0B9A7D1-- From owner-freebsd-toolchain@freebsd.org Sat Dec 26 17:00:22 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DED86A5251B for ; Sat, 26 Dec 2015 17:00:22 +0000 (UTC) (envelope-from ian@freebsd.org) Received: from erouter6.ore.mailhop.org (erouter6.ore.mailhop.org [54.187.213.119]) (using TLSv1.2 with 
cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BAC7C1396 for ; Sat, 26 Dec 2015 17:00:22 +0000 (UTC) (envelope-from ian@freebsd.org) Received: from ilsoft.org (unknown [73.34.117.227]) by outbound3.ore.mailhop.org (Halon Mail Gateway) with ESMTPSA; Sat, 26 Dec 2015 16:59:41 +0000 (UTC) Received: from rev (rev [172.22.42.240]) by ilsoft.org (8.14.9/8.14.9) with ESMTP id tBQH0DWq010253; Sat, 26 Dec 2015 10:00:14 -0700 (MST) (envelope-from ian@freebsd.org) Message-ID: <1451149213.25138.271.camel@freebsd.org> Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error? From: Ian Lepore To: Mark Millard , Warner Losh Cc: mat@FreeBSD.org, freebsd-arm , FreeBSD Toolchain Date: Sat, 26 Dec 2015 10:00:13 -0700 In-Reply-To: References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net> <8B52074F-FDEF-4119-BB04-630F9BE9E6DB@bsdimp.com> Content-Type: text/plain; charset="us-ascii" X-Mailer: Evolution 3.16.5 FreeBSD GNOME Team Port Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Dec 2015 17:00:23 -0000 On Fri, 2015-12-25 at 17:21 -0800, Mark Millard wrote: > In my view "-mno-unaligned-access" is an even bigger hammer than I > used. I find no clang statement about what its ABI consequences would > be, unlike for what I did: What mix of more padding for alignment vs. > more but smaller accesses? But as I remember I've seen "-mno > -unaligned-access" in use in ports and the like so its consequences > may be familiar material for some folks. 
> > Absent any questions about ABI consequences "-mno-unaligned-access" > does well mark the expected SCTLR bit[1] status, far better than what > I did. Again: I was covering my ignorance while making any > significant investigation/debugging as unlikely as I could. After reading the docs more carefully, I think -mno-unaligned-access isn't a bigger hammer, it's just a different tool that addresses a different problem than the one you ran into, and it's one we need. In particular, it prevents alignment-required accesses to potentially unaligned fields in a struct marked as 'packed', which is something we rely on (it's why we mark some structs as packed). -- Ian From owner-freebsd-toolchain@freebsd.org Sat Dec 26 19:07:57 2015 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 50A75A526B7 for ; Sat, 26 Dec 2015 19:07:57 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-211-151.reflexion.net [208.70.211.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 136E115E0 for ; Sat, 26 Dec 2015 19:07:56 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 15959 invoked from network); 26 Dec 2015 19:07:49 -0000 Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2) by 0 (rfx-qmail) with SMTP; 26 Dec 2015 19:07:49 -0000 Received: by mail-cs-02.app.dca.reflexion.local (Reflexion email security v7.80.0) with SMTP; Sat, 26 Dec 2015 14:07:48 -0500 (EST) Received: (qmail 6369 invoked from network); 26 Dec 2015 19:07:47 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with SMTP; 26 Dec 2015 19:07:47 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my 
network X-No-Relay: not in my network Received: from [192.168.1.8] (c-76-115-7-162.hsd1.or.comcast.net [76.115.7.162]) by iron2.pdx.net (Postfix) with ESMTPSA id 6B8A1B1E001; Sat, 26 Dec 2015 11:07:45 -0800 (PST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error? From: Mark Millard In-Reply-To: <1451149213.25138.271.camel@freebsd.org> Date: Sat, 26 Dec 2015 11:07:48 -0800 Cc: Warner Losh , mat@FreeBSD.org, freebsd-arm , FreeBSD Toolchain Content-Transfer-Encoding: quoted-printable Message-Id: <6EC8F626-D8C4-435E-B67A-CA841C9C1322@dsl-only.net> References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net> <8B52074F-FDEF-4119-BB04-630F9BE9E6DB@bsdimp.com> <1451149213.25138.271.camel@freebsd.org> To: Ian Lepore X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Dec 2015 19:07:57 -0000 On 2015-Dec-26, at 9:00 AM, Ian Lepore wrote: > On Fri, 2015-12-25 at 17:21 -0800, Mark Millard wrote: >> In my view "-mno-unaligned-access" is an even bigger hammer than I >> used. I find no clang statement about what its ABI consequences would >> be, unlike for what I did: What mix of more padding for alignment vs. >> more but smaller accesses? But as I remember I've seen "-mno >> -unaligned-access" in use in ports and the like so its consequences >> may be familiar material for some folks. >>=20 >> Absent any questions about ABI consequences "-mno-unaligned-access" >> does well mark the expected SCTLR bit[1] status, far better than what >> I did. 
Again: I was covering my ignorance while making any
>> significant investigation/debugging as unlikely as I could.
>
> After reading the docs more carefully, I think -mno-unaligned-access
> isn't a bigger hammer, it's just a different tool that addresses a
> different problem than the one you ran into, and it's one we need. In
> particular, it prevents alignment-required accesses to potentially
> unaligned fields in a struct marked as 'packed', which is something we
> rely on (it's why we mark some structs as packed).
>
> -- Ian
>
>

If clang uses the same interpretation as gcc for arm then I agree:

> -munaligned-access
> -mno-unaligned-access
> Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6 and all ARMv6-M architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures are accessed a byte at a time.

I see that Linux went with SCTLR bit[1] being cleared for >= armv6 for the kernel:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8428e84d42179c2a00f5f6450866e70d802d1d05

Interestingly clang -cc1 -help only mentions -mno-unaligned-access as a note to -mstrict-align:

> # clang++ -cc1 -help | grep align
> -fmax-type-align=
> Specify the maximum alignment to enforce on pointers lacking an explicit alignment
> -fno-bitfield-type-align
> Ignore bit-field types when aligning structures
> -fpack-struct= Specify the default maximum struct packing alignment
> -mstack-alignment=
> Set the stack alignment
> -mstackrealign Force realign the stack at entry to every function
> -mstrict-align Force all memory accesses to be aligned (same as mno-unaligned-access)

Also -munaligned-access is not mentioned at all. Apparently "clang -cc1 -help" does not generally document gcc compatibility syntax.

gcc's AArch64 page https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#AArch64-Options only mentions -mstrict-align: "Do not assume that unaligned memory references are handled by the system". (Not as explicit for interpretation as the earlier-quoted arm wording.)

gcc's arm page https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html#ARM-Options only mentions -munaligned-access and -mno-unaligned-access (as quoted earlier), not -mstrict-align.

powerpc's page at https://gcc.gnu.org/onlinedocs/gcc/RS_002f6000-and-PowerPC-Options.html#RS_002f6000-and-PowerPC-Options only mentions -mstrict-align and -mno-strict-align: "On System V.4 and embedded PowerPC systems do not (do) assume that unaligned memory references are handled by the system".

It looks like staying compatible at the command-line level requires separate cases per architecture, especially when spanning both clang and gcc.