Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 16 Nov 2017 18:15:36 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Mark Millard <markmi@dsl-only.net>
Cc:        svn-src-head@freebsd.org
Subject:   Re: svn commit: r325765 - head/lib/libc/string
Message-ID:  <20171116171111.D1034@besplex.bde.org>
In-Reply-To: <37F2AD18-1B69-4559-8BD3-B92E01A067BF@dsl-only.net>
References:  <37F2AD18-1B69-4559-8BD3-B92E01A067BF@dsl-only.net>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 15 Nov 2017, Mark Millard wrote:

> Bruce Evans brde at optusnet.com.au wrote on
> Tue Nov 14 12:41:50 UTC 2017 :
>
>> - memcpy.3.  It was already documented in the BUGS section that memcpy() is
>>    implemented as bcopy() and therefore the "strings" "may overlap" [then
>>    portability considerations due to this].  This is buggier than before:
>>    - old bug: only the MI implementation of memcpy() clearly implements
>>      memcpy() as bcopy().  The amd64, i386 and arm MD implementations
>>      do the same.

I was confused between arm and mips and only fixed this in some places.
It is mips that does the same as amd64 and i386.  arm has different MD code,
and the others have no md code.

> head/sys/arm/arm/support.S ( -r283366 ) has
>
> 394	ENTRY(bcopy)
> 395	        /* switch the source and destination registers */
> 396	        eor     r0, r1, r0
> 397	        eor     r1, r0, r1
> 398	        eor     r0, r1, r0
> 399	EENTRY(memmove)
> 400	        /* Do the buffers overlap? */
> 401	        cmp     r0, r1
> 402	        RETeq           /* Bail now if src/dst are the same */
> 403	        subcc   r3, r0, r1      /* if (dst > src) r3 = dst - src */
> 404	        subcs   r3, r1, r0      /* if (src > dsr) r3 = src - dst */
> 405	        cmp     r3, r2          /* if (r3 < len) we have an overlap */
> 406	        bcc     PIC_SYM(_C_LABEL(memcpy), PLT)
> . . .

I didn't look closely at the arm code, and just noticed more bugs in it.

Falling through like that breaks at least profiling.  Only ENTRY() does
_PROF_PROLOGUE, so profiling of memmove() seems to be broken.

A less worse way to use EENTRY() is ENTRY() then 1 more more EENTRY()'s
with no code in between.  Then the names are just aliases.  Profiling
can't distinguish the aliases, but at least all entries get counted.
amd64 and i386 use ALTENTRY() which handles this as well as possible
(not very well -- like a tail call -- too much gets charged to the
tail function.  sparc64 also has ALTENTRY() but not the complications
needed for it to work only as badly as a tail call.

It is a style bug for memmove() to exist in the kernel.  Especially an MD
version of it.  It is bad enough that libkern has an MI memmove().  This
is not the same as the libc/string one (another style bug), but just
implements memmove() by calling bcopy().  Profiling works right for this
unless it the function is over-optimized to a tail call when it works
like ALTENTRY().

> head/lib/libc/arm/string/memmove.S ( -r288373 ) has:
>
> 37	#ifndef _BCOPY
> 38	/* LINTSTUB: Func: void *memmove(void *, const void *, size_t) */
> 39	ENTRY(memmove)
> 40	#else
> 41	/* bcopy = memcpy/memmove with arguments reversed. */
> 42	/* LINTSTUB: Func: void bcopy(void *, void *, size_t) */
> 43	ENTRY(bcopy)
> 44	        /* switch the source and destination registers */
> 45	        eor     r0, r1, r0
> 46	        eor     r1, r0, r1
> 47	        eor     r0, r1, r0
> 48	#endif

This is the correct way to do it -- separate object files with duplication
in source files avoided using ifdefs.

> 49	        /* Do the buffers overlap? */
> 50	        cmp     r0, r1
> 51	        it      eq
> 52	        RETeq           /* Bail now if src/dst are the same */
> 53	        ite     cc
> 54	        subcc   r3, r0, r1      /* if (dst > src) r3 = dst - src */
> 55	        subcs   r3, r1, r0      /* if (src > dsr) r3 = src - dst */
> 56	        cmp     r3, r2          /* if (r3 < len) we have an overlap */
> 57	        bcc     PIC_SYM(_C_LABEL(memcpy), PLT)
> . . .
>
> It looks to me like bcopy and memmove call memcpy under
> some conditions, not the other way around. I did not
> notice any code in memcpy going back to bcopy (or
> memmove) in either area. (That might have lead to
> recursion as things are.)

The above only shows memmove() calling bcopy(), only in the kernel.  This
would be invalid in userland since memmove() is Standard but bcopy() is
supposed to be usable by applications (unless __BSD_VISIBLE).  The
separate object files in userland make this not a problem.  Otherwise,
bcopy and memmove in the same object file would cause namespace problems
even if bcopy is implemented as a [tail] call or fall-through to memmove.

> ...
> Another issue is if compilers (clang, gcc) are well
> controlled for code substitutions to preserve
> supposed FreeBSD rules about where overlaps
> are well defined. The library code might not be all
> there is to it as far as behavior goes for
> compiled C/C++ source code.

Do you mean the namespace stuff?

> head/sys/arm64/arm64/ is similar for bcopy/memmove
> calling memcpy (but head/lib/libc/amd64/string/ is
> not):

It is reasonable, and possibly best, to implement the usual
non-overlapping case by a call to memcpy() and not try hard to
optimize the overlapping case (or jump far away to it to keep
caches cleaner for the usual case).

> head/sys/arm64/arm64/memmove.S ( -r307909 ) has:
> ...
> head/lib/libc/amd64/string/bcopy.S has:
> ...

Same organisation as for plain arm, so good for libc and not so good fot
the kernel.

>> mips is backwards for bcopy() and implements it as
>>      memmove().  mips has 2 separate memcpy*.S files, one for plain arm
>>      and one for xscale.  The plain arm one is smaller than memmove.S,
>>      so presumably doesn't support overlapped copies.  The xscale one
>>      is much larger, so has space for overlapped copies and many optimizations
>>      that no longer work.  I don't know what it does.
>
> The mix of "mips" and "arm"/"xscale" above confused me. Looking
> around this seems to be referencing head/lib/libc/arm/string/
> materials instead of head/lib/libc/mips/string/ or
> head/sys/mips/mips/ materials.

I meant arm here.

mips does have some MI string instructions in libc which I missed or
got confused before.  Also, aarch64 has some which I missed because the
they are obfuscated by putting them in contrib instead of under libc.
mips implements all of bcopy, memcpy and memmove in bcopy.S, and seems
to be indeed like amd64 and i386 -- none of these arches bothers to
optimize memcpy by not supporting overlapping copies in it, exactly
as documented.

However, the documentation is still incorrect.  Without -ffreestanding,
compilers can and do often implement memcpy inline.  Then overlapping
copies are not supported.  The BUGS section in memcpy.3 should just be
deleted.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20171116171111.D1034>