Date:      Tue, 24 May 2016 16:38:21 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        freebsd-arm <freebsd-arm@freebsd.org>
Subject:   Re: xorg broken on Beaglebone black revision 300438 [FYI: notes about the exceptions that still will happen]
Message-ID:  <B78CAE44-80AE-49E8-97D6-4F336AD6E15E@dsl-only.net>
In-Reply-To: <1464130548.1204.25.camel@freebsd.org>
References:  <AE62E2F1-1D9F-418F-97E8-6D7F0E6B4B87@dsl-only.net> <1464127156.1204.10.camel@freebsd.org> <E268D55F-7E4D-4FF7-B38E-0912F275BF0C@dsl-only.net> <1464130548.1204.25.camel@freebsd.org>

The following is just for reference for those interested. I was not
familiar with this, so I took a quick look in the documentation about
memory alignment for armv6 and armv7. (I'm not an ARM expert.) The span
of examples in the notes below may not be complete.

Looking in the "ARM Architecture Reference Manual ARMv7-A and ARMv7-R
edition", it appears that with SCTLR.A=0 armv7 still gets data access
alignment faults on alignment check failures for:

Halfword: LDREXH, STREXH

Word: LDREX, STREX, all forms of LDM, STM, LDRD, RFE, SRS, STRD, SWP,
PUSH (except for T3 and A2 encodings), POP (except for T3 and A2
encodings), LDC, LDC2, STC, STC2, VLDM, VLDR, VPOP, VPUSH, VSTM, VSTR

Doubleword: LDREXD, STREXD

:<align> specified: VLD1, VLD2, VLD3, VLD4, VST1, VST2, VST3, VST4
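
As a rough illustration of what an "alignment check failure" means for the halfword, word, and doubleword groups above, here is a minimal C sketch. The names is_aligned, access_class, and would_fault are hypothetical and only model the address test; they do not reproduce the pseudocode in the manual, and the :<align> cases are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when addr is a multiple of size. size must be a power of two. */
static bool is_aligned(uintptr_t addr, uintptr_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}

/* Hypothetical grouping of the instruction list by required alignment in bytes. */
enum access_class {
    ACCESS_HALFWORD   = 2,  /* e.g. LDREXH, STREXH */
    ACCESS_WORD       = 4,  /* e.g. LDREX, LDM, STM, VLDR */
    ACCESS_DOUBLEWORD = 8   /* e.g. LDREXD, STREXD */
};

/* Models only "the alignment check fails, so an alignment fault is taken". */
static bool would_fault(uintptr_t addr, enum access_class cls)
{
    return !is_aligned(addr, (uintptr_t)cls);
}
```

For example, would_fault(0x1002, ACCESS_WORD) is true in this model, while would_fault(0x1002, ACCESS_HALFWORD) is false.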


armv6, by contrast, depends on SCTLR.U's status from what I can tell...

There are words in places such as (example from STREX description):

> If SCTLR.A and SCTLR.U are both 0, a non word-aligned memory address
> causes UNPREDICTABLE behavior.

There are words for "otherwise" tied to ExclusiveMonitorsPass(...):
true means the exception is generated but false leaves the exception's
status as IMPLEMENTATION DEFINED.

Also (from "Introduction to ARMv6"):

> From the introduction of ARMv6T2, ARM deprecated use of SCTLR.U == 0.

> In ARMv7, SCTLR.U is always set to 1. ARMv7 alignment support is the
> same as ARMv6K in this configuration. From ARMv7, use of a value of 0
> for SCTLR.U is obsolete.

So it appears that the "SCTLR.A and SCTLR.U are both 0" and
"UNPREDICTABLE behavior" together are specific to some armv6 variants.
Problems may be harder to detect on these. (Later below there is an
armv7 unpredictable behavior reference.)
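
To make the A/U combinations easier to follow, here is a small C sketch that classifies an SCTLR value into three rough models. The bit positions (A is bit 1, U is bit 22) come from the ARM manual rather than from the text above, and the enum and function names are hypothetical. It is a simplification: as listed earlier, some instructions still fault on misaligned addresses even with A=0.

```c
#include <assert.h>
#include <stdint.h>

#define SCTLR_A (UINT32_C(1) << 1)   /* alignment fault checking enable */
#define SCTLR_U (UINT32_C(1) << 22)  /* armv6 unaligned model; always 1 on armv7 */

static_assert(SCTLR_A == 0x00000002, "A is bit 1");
static_assert(SCTLR_U == 0x00400000, "U is bit 22");

/* Hypothetical labels for the three cases discussed in the text. */
enum align_model {
    ALIGN_LEGACY,        /* A=0, U=0: pre-armv7 legacy behavior */
    ALIGN_UNALIGNED_OK,  /* A=0, U=1: unaligned access supported for most loads/stores */
    ALIGN_STRICT         /* A=1: alignment checking enforced */
};

static enum align_model classify_sctlr(uint32_t sctlr)
{
    if (sctlr & SCTLR_A)
        return ALIGN_STRICT;
    if (sctlr & SCTLR_U)
        return ALIGN_UNALIGNED_OK;
    return ALIGN_LEGACY;
}
```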

There are words for some specific contexts such as:

> In versions of the architecture before ARMv7, if the SCTLR.A and
> SCTLR.U bits are both 0, an unaligned access is forced to be aligned by
> replacing the low-order address bits with zeros.
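
The address masking described in that quote can be sketched in C as follows. The function name force_align is hypothetical, and the sketch models only the "replace the low-order address bits with zeros" step, not any other legacy behavior of specific instructions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Clears the low-order bits of addr so the result is a multiple of size.
 * size is the access size in bytes and must be a power of two.
 */
static uint32_t force_align(uint32_t addr, uint32_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return addr & ~(size - 1);
}
```

Under this model a word access to 0x1003 would actually touch 0x1000, which is why such problems can go unnoticed: no fault is raised, but the wrong bytes are accessed.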


There are words for Virtualization extensions that indicate a case of
unpredictable behavior for armv7, such as:

> In versions of the ARMv7 architecture before the introduction of the
> Virtualization extensions, the behavior of an unaligned access to Device
> or Strongly-ordered memory is architecturally UNPREDICTABLE. Most
> implementations generate an abort on such an access.

It is also noted that HCR.TGE=1 leads to Hyp Trap exceptions instead
of Data Abort exceptions when an unsupported unaligned access is
attempted.

===
Mark Millard
markmi at dsl-only.net

On 2016-May-24, at 3:55 PM, Ian Lepore <ian at freebsd.org> wrote:

On Tue, 2016-05-24 at 15:19 -0700, Mark Millard wrote:
> On 2016-May-24, at 2:59 PM, Ian Lepore <ian at freebsd.org> wrote:
>
>> On Tue, 2016-05-24 at 14:35 -0700, Mark Millard wrote:
>>> Quoting from Otacílio Tue May 24 00:06:10 UTC 2016 and its
>>> locore-v6.S changes:
>>>
>>>> -    orr    r7, #CPU_CONTROL_UNAL_ENABLE
>>>> -    orr    r7, #CPU_CONTROL_AFLT_ENABLE
>>>> +    bic    r7, #CPU_CONTROL_UNAL_ENABLE
>>>> +    bic    r7, #CPU_CONTROL_AFLT_ENABLE
>>>
>>> -r295256 (2016-Feb-14) changed from:
>>>
>>> bic     r7, #CPU_CONTROL_UNAL_ENABLE
>>>
>>> to:
>>>
>>> orr     r7, #CPU_CONTROL_UNAL_ENABLE
>>>
>>> in two places (moving it a few lines down for each example as well).
>>> So this much of the proposed changes would be reverting the -r295256
>>> change. The check-in comment indicates the bit is RAO/SBOP for armv7.
>>> For armv6 the check-in comment claims it controls armv5 compatible
>>> alignment support.
>>>
>>> But:
>>>
>>> orr     r7, #CPU_CONTROL_AFLT_ENABLE
>>>
>>> has been in locore-v6.S since the file's first check-in. So this
>>> change to bic here would be new.
>>>
>>> What is the FreeBSD intent for each of the two new settings for
>>> armv7? armv6?
>>>
>>
>> It was always wrong to clear CPU_CONTROL_UNAL_ENABLE on armv7 (it's
>> documented as RAO/SBOP).  Setting it on armv6 makes the v6 (which is
>> only the RPi in our world) behave the same as v7.  So that change was
>> just a bugfix.
>>
>> I think FreeBSD is the only major OS left that is enforcing strict
>> alignment on armv6/v7 and it causes a lot of trouble for ports and
>> other 3rd party software, and prevents us from enabling certain
>> compiler options and optimizations.  I'm very close to a commit to
>> stop enforcing strict alignment (clear rather than set
>> CPU_CONTROL_AFLT_ENABLE).
>> I've been testing it yesterday and today, and haven't run into any
>> trouble at all.
>>
>> -- Ian
>
> Good to know. I had submitted at least one port bug report that will
> likely need to be canceled if this goes through. Effectively it's an
> ABI change allowing a wider variety of code to be compliant.
>

It was partly all that testing you did a few months ago, and the PRs
and discussions coming out of that, which are driving these changes.
If I could get away with procrastinating a bit more, I probably would
(always too busy), but with the big hardfloat abi change and with a
code freeze coming up later this week, this seems like the last chance
to do some disruptive changes that are long overdue.

> Is the kernel involved in emulating access/instructions via some
> technique for misaligned access for armv6/armv7 for some types of
> instructions? Are there performance issues/tradeoffs that might
> contribute to sometimes choosing to be careful about alignment?
>

Nope, no emulation, the hardware is able to do this, we've just always
run with alignment faults enabled, partly because base freebsd already
has to work on other strict-alignment hardware anyway.  The driver of
this change is ports more than anything -- increasingly you run into
code that assumes #ifdef __arm__ is sufficient to mean "unaligned
access will work".

There are a few arm instructions that still require alignment, but (at
least in theory) the compiler knows about that and only emits those
instructions when it knows they're safe (such as it knowing that the
stack stays aligned to 8-byte boundaries in non-leaf functions).  We'll
see; everything seems okay in testing I've done so far.

Performance-wise, there is a cost for unaligned access.  The hardware
has to do more work so unaligned accesses take extra cycles.  On the
other hand, if the data is unaligned, you also have to use extra
cycles, potentially a lot of them, to copy-align the data or access it
a byte at a time and reassemble it in a register.  In theory, letting
the hardware handle the unaligned access should be faster than doing
copy-align stuff.
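
For reference, the two software alternatives mentioned above (byte-at-a-time reassembly and copy-align) might look roughly like this in portable C. The function names are hypothetical, and this is only a sketch of the general technique, not code from FreeBSD or any port.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time: read four bytes and reassemble them as a little-endian word. */
static uint32_t load_le32_bytes(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Copy-align: copy into an aligned local, then use it (host byte order). */
static uint32_t load_u32_copy(const void *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Both forms are safe at any address regardless of the alignment fault setting. A compiler that knows unaligned access is permitted may turn the memcpy form into a single load, though that depends on the compiler and its options.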

-- Ian

> In one way I liked the strict alignment environment being around: It
> allowed easily testing if software was more portable for such issues
> vs. not. (Not that FreeBSD should use such criteria for its choices.)
>




