Date: Fri, 10 Feb 2017 18:11:23 -0800 From: Mark Millard <markmi@dsl-only.net> To: Alexandre Martins <alexandre.martins@stormshield.eu> Cc: freebsd-arm <freebsd-arm@freebsd.org>, Ian Lepore <ian@FreeBSD.org> Subject: Re: bcopy/memmove optimization broken ? [looks like you are correct to me, I give supporting detail] Message-ID: <25360EAB-3079-4037-9FB5-B7781ED40FA6@dsl-only.net> In-Reply-To: <5335118.oK1KXXDaG5@pc-alex> References: <5335118.oK1KXXDaG5@pc-alex>
next in thread | previous in thread | raw e-mail | index | archive | help
On 2017-Feb-10, at 7:51 AM, Alexandre Martins <alexandre.martins at = stormshield.eu> wrote: > Hi all >=20 > I see in the kernel code of bcopy/memmove an optimization if the = copied region=20 > don't overlap: >=20 > = https://svnweb.freebsd.org/base/head/sys/arm/arm/support.S?view=3Dannotate= #l403 >=20 > I'm a newbie in ARM assembly, so, sorry if i'm wrong, but > - R0 is the dst (1st parameter) > - R1 is the src (2nd parameter) >=20 > So : > subcc r3, r0, r1 /* if (dst > src) r3 =3D dst - src */ > subcs r3, r1, r0 /* if (src > dsr) r3 =3D src - dst */ > cmp r3, r2 /* if (r3 < len) we have an overlap */ > bcc PIC_SYM(_C_LABEL(memcpy), PLT) >=20 > Seems to be inverted. Should it be that ? : > subcs r3, r0, r1 /* if (dst > src) r3 =3D dst - src */ > subcc r3, r1, r0 /* if (src > dsr) r3 =3D src - dst */ > cmp r3, r2 /* if (r3 < len) we have an overlap */ > bcs PIC_SYM(_C_LABEL(memcpy), PLT) >=20 >=20 > Best regards >=20 > --=20 > Alexandre Martins > STORMSHIELD I did not expect something so central that has been around so long to look wrong but. . . Going through the supporting details of the original code based on my looking around here is what I found: #include <string.h> void *memmove(void *dest, const void *src, size_t n); So I'd expect (c'ish notation): r0=3D=3Ddest r1=3D=3Dsrc r2=3D=3Dn Then for (the comments vs. code is being challenged): (The comments do seem to give the intent correctly.) cmp r0, r1 RETeq /* Bail now if src/dst are the same */ subcc r3, r0, r1 /* if (dst > src) r3 =3D dst - src */ subcs r3, r1, r0 /* if (src > dsr) r3 =3D src - dst */ cmp r3, r2 /* if (r3 < len) we have an overlap */ bcc PIC_SYM(_C_LABEL(memcpy), PLT) . . . cmp r0,r1 should result in condition code (c'ish notation): eq=3D(r0=3D=3Dr1) cc=3D(r0<r1) cs=3D(r0>=3Dr1) (Only the r0 position has no immediate-value alternative.) subcc r3,r0,r1 is: if (cc) r3=3Dr0-r1 // no condition code change In other words: if (dst<src) r3=3Ddst-src So it does not match the test listed in the comment as far as I can see. And in (mathematical) integer arithmetic the r3 result would be negative for dst-src. For non-negative arithmetic (natural or whole): undefined. subcs r3,r1,r0 is: if (cs) r3=3Dr1-r0 // no condition code change In other words: if (dst>=3Dsrc) r3=3Dsrc-dst So it does not match the test listed in the comment as far as I can see. And in (mathematical) integer arithmetic the r3 result would be nonpositive for src-dst. But the earlier RETeq prevents the zero case, so: negative. For non-negative arithmetic (natural or whole): undefined. If it was only a normal mathemetical context r3=3D-abs(dst-src) would be a summary of the two sub instruction sequence as it is from what I can tell. For the purpose the summary should be: r3=3Dabs(dst-src), given dst!=3Dsrc . There is no need to wonder outside normal mathematical non-negative arithmetic in the process either. Your code change would have the right summary and use only normal mathematical rules from what I can tell: cmp r0, r1 RETeq /* Bail now if src/dst are the same */ subcs r3, r0, r1 /* if (dst > src) r3 =3D dst - src */ subcc r3, r1, r0 /* if (src > dsr) r3 =3D src - dst */ cmp r3, r2 /* if (r3 < len) we have an overlap */ bcs PIC_SYM(_C_LABEL(memcpy), PLT) . . . subcs r3,r0,r1 is: if (cs) r3=3Dr0-r1 // no condition code change In other words: if (dst>=3Dsrc) r3=3Ddst-src. Given the prior RETeq, that is effectively: if (dst>src) r3=3Ddst-src. And that matches the comments and would produce a positive result for r3, matching the normal mathematical result. subcc r3,r1,r0 is: if (cc) r3=3Dr1-r0 // no condition code change In other words: if (dst<src) r3=3Dsrc-dst And that matches the comments and would produce a positive result for r3, matching the normal mathematical result. Overall summary of the two updated sub?? instructions: r3=3Dabs(dst-src), given dst!=3Dsrc. And that would make for an appropriate comparison to n (i.e., to r2). It appears to have been as it is now since -r143175 when memmove was added to sys/arm/arm/support.S (2005-Apr-12). =3D=3D=3D Mark Millard markmi at dsl-only.net
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?25360EAB-3079-4037-9FB5-B7781ED40FA6>