Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 10 Feb 2017 18:11:23 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        Alexandre Martins <alexandre.martins@stormshield.eu>
Cc:        freebsd-arm <freebsd-arm@freebsd.org>, Ian Lepore <ian@FreeBSD.org>
Subject:   Re: bcopy/memmove optimization broken ? [looks like you are correct to me, I give supporting detail]
Message-ID:  <25360EAB-3079-4037-9FB5-B7781ED40FA6@dsl-only.net>
In-Reply-To: <5335118.oK1KXXDaG5@pc-alex>
References:  <5335118.oK1KXXDaG5@pc-alex>

next in thread | previous in thread | raw e-mail | index | archive | help
On 2017-Feb-10, at 7:51 AM, Alexandre Martins <alexandre.martins at =
stormshield.eu> wrote:

> Hi all
>=20
> I see in the kernel code of bcopy/memmove an optimization if the =
copied region=20
> don't overlap:
>=20
> =
https://svnweb.freebsd.org/base/head/sys/arm/arm/support.S?view=3Dannotate=
#l403
>=20
> I'm a newbie in ARM assembly, so, sorry if i'm wrong, but
> - R0 is the dst (1st parameter)
> - R1 is the src (2nd parameter)
>=20
> So :
> subcc r3, r0, r1 /* if (dst > src) r3 =3D dst - src */
> subcs r3, r1, r0 /* if (src > dsr) r3 =3D src - dst */
> cmp r3, r2 /* if (r3 < len) we have an overlap */
> bcc PIC_SYM(_C_LABEL(memcpy), PLT)
>=20
> Seems to be inverted. Should it be that ? :
> subcs r3, r0, r1 /* if (dst > src) r3 =3D dst - src */
> subcc r3, r1, r0 /* if (src > dsr) r3 =3D src - dst */
> cmp r3, r2 /* if (r3 < len) we have an overlap */
> bcs PIC_SYM(_C_LABEL(memcpy), PLT)
>=20
>=20
> Best regards
>=20
> --=20
> Alexandre Martins
> STORMSHIELD

I did not expect something so central that has been
around so long to look wrong but. . .

Going through the supporting details of the original
code based on my looking around here is what I found:

#include	<string.h>
void *memmove(void *dest, const void *src, size_t n);

So I'd expect (c'ish notation):
r0=3D=3Ddest
r1=3D=3Dsrc
r2=3D=3Dn

Then for (the comments vs. code is being challenged):
(The comments do seem to give the intent correctly.)

cmp     r0, r1
RETeq           /* Bail now if src/dst are the same */
subcc   r3, r0, r1      /* if (dst > src) r3 =3D dst - src */
subcs   r3, r1, r0      /* if (src > dsr) r3 =3D src - dst */
cmp     r3, r2          /* if (r3 < len) we have an overlap */
bcc     PIC_SYM(_C_LABEL(memcpy), PLT)
. . .

cmp r0,r1 should result in condition code (c'ish notation):

eq=3D(r0=3D=3Dr1)
cc=3D(r0<r1)
cs=3D(r0>=3Dr1)

(Only the r0 position has no immediate-value alternative.)


subcc r3,r0,r1 is: if (cc) r3=3Dr0-r1 // no condition code change
In other words: if (dst<src) r3=3Ddst-src

So it does not match the test listed in the comment as
far as I can see. And in (mathematical) integer arithmetic
the r3 result would be negative for dst-src. For
non-negative arithmetic (natural or whole): undefined.


subcs r3,r1,r0 is: if (cs) r3=3Dr1-r0 // no condition code change
In other words: if (dst>=3Dsrc) r3=3Dsrc-dst

So it does not match the test listed in the comment as
far as I can see. And in (mathematical) integer arithmetic
the r3 result would be nonpositive for src-dst. But the
earlier RETeq prevents the zero case, so: negative. For
non-negative arithmetic (natural or whole): undefined.


If it was only a normal mathemetical context r3=3D-abs(dst-src)
would be a summary of the two sub instruction sequence as it
is from what I can tell.

For the purpose the summary should be: r3=3Dabs(dst-src), given
dst!=3Dsrc . There is no need to wonder outside normal
mathematical non-negative arithmetic in the process either.

Your code change would have the right summary and use only
normal mathematical rules from what I can tell:

cmp     r0, r1
RETeq           /* Bail now if src/dst are the same */
subcs r3, r0, r1 /* if (dst > src) r3 =3D dst - src */
subcc r3, r1, r0 /* if (src > dsr) r3 =3D src - dst */
cmp r3, r2 /* if (r3 < len) we have an overlap */
bcs PIC_SYM(_C_LABEL(memcpy), PLT)
. . .

subcs r3,r0,r1 is: if (cs) r3=3Dr0-r1 // no condition code change
In other words: if (dst>=3Dsrc) r3=3Ddst-src.
Given the prior RETeq, that is effectively: if (dst>src) r3=3Ddst-src.
And that matches the comments and would produce a positive result
for r3, matching the normal mathematical result.

subcc r3,r1,r0 is: if (cc) r3=3Dr1-r0 // no condition code change
In other words: if (dst<src) r3=3Dsrc-dst
And that matches the comments and would produce a positive result
for r3, matching the normal mathematical result.

Overall summary of the two updated sub?? instructions:
r3=3Dabs(dst-src), given dst!=3Dsrc.

And that would make for an appropriate comparison to n (i.e., to r2).

It appears to have been as it is now since -r143175 when memmove was
added to sys/arm/arm/support.S (2005-Apr-12).


=3D=3D=3D
Mark Millard
markmi at dsl-only.net




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?25360EAB-3079-4037-9FB5-B7781ED40FA6>