Date:      Wed, 15 Feb 2017 12:06:09 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        Warner Losh <imp@bsdimp.com>
Cc:        Alexandre Martins <alexandre.martins@stormshield.eu>, freebsd-arm <freebsd-arm@freebsd.org>, Ian Lepore <ian@freebsd.org>
Subject:   Re: bcopy/memmove optimization broken ? [looks like you are correct to me, I give supporting detail]
Message-ID:  <C1377018-81A3-4F20-B3F9-99859C46CC4E@dsl-only.net>
In-Reply-To: <CANCZdfrbih-1FtTAy5P=W=tRU2ztwjt36hazHrEaEs_ygMRMKw@mail.gmail.com>
References:  <5335118.oK1KXXDaG5@pc-alex> <25360EAB-3079-4037-9FB5-B7781ED40FA6@dsl-only.net> <7424243.zp5tqGREgJ@pc-alex> <8E5F8A15-2F79-4015-B93B-975D27308782@dsl-only.net> <674C2DA0-808D-4968-B86D-7CADEC3A7EEE@dsl-only.net> <CANCZdfrbih-1FtTAy5P=W=tRU2ztwjt36hazHrEaEs_ygMRMKw@mail.gmail.com>

On 2017-Feb-15, at 8:25 AM, Warner Losh <imp at bsdimp.com> wrote:

> On Mon, Feb 13, 2017 at 9:04 PM, Mark Millard <markmi at dsl-only.net> wrote:
>> On 2017-Feb-13, at 1:27 AM, Mark Millard <markmi at dsl-only.net> wrote:
>>
>>> As the decision about when to call the code that can
>>> deal with overlapping memory regions is wrong, the code
>>> that should only be used for non-overlapping regions likely
>>> would handle some overlapping regions and so would operate
>>> incorrectly in at least some cases.
>>>
>>> In other words, I think the bug is worse than just an
>>> example of being sub-optimal: the code is wrong from what
>>> I can tell. (I've no clue if the code is ever put to use
>>> for any bad cases.)
>>
>> I was wrong about the error status, possibly for multiple reasons,
>> but the following is sufficient:
>>
>> https://www.freebsd.org/cgi/man.cgi?query=memcpy&sektion=3
>>
>> says:
>>
>>     In this implementation memcpy() is implemented using bcopy(3), and there-
>>     fore the strings may overlap.  On other systems, copying overlapping
>>     strings may produce surprises.  Programs intended to be portable should
>>     use memmove(3) when src and dst may overlap.
>>
>> so the branch-taken case for:
>>
>> bcc PIC_SYM(_C_LABEL(memcpy), PLT)
>>
>> also deals with overlaps, since FreeBSD's criterion is
>> that memcpy does so. (I had been thinking that it
>> did not deal with such.)
>>
>>
>> Side note:
>>
>> Notably the arm implementation of FreeBSD memcpy does not call
>> bcopy (that would be recursive in the arm implementation).
>> memcpy just needs to have some properties that bcopy also has.
>>
>> This suggests that memcpy vs. bcopy may have a performance
>> Principle of Least Astonishment violation since memcpy may well
>> perform differently than bcopy for some types of contexts but
>> memcpy is supposed to use bcopy.
>>
>>
>> [A variant of these notes is in the comments for bugzilla
>> 217065.]
>
> Seems like the memcpy man page should be softened to reflect the !x86
> reality. If we provide different semantics between different arches,
> we should consider carefully why and document in the code or change.
>
> Warner

If FreeBSD's memcpy ever stops handling overlapping memory
regions, how much FreeBSD-specific source code that is not
intended to be TARGET_ARCH-specific would be invalidated and
need to be updated to work on all TARGET_ARCHs? I would not
expect any "softened" criteria for memcpy to go this far.
There is a lot of history tied to the existing definition that
would have to be reviewed, and various updates would likely
have to be made.
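
To make concrete the kind of code that would be affected, here is
a minimal sketch of an in-place shift whose source and destination
overlap. The helper name drop_prefix() is hypothetical and not
taken from the FreeBSD tree. Written with memcpy, such code would
depend on the overlap guarantee quoted from the man page; the
sketch uses memmove, which is the portable way to express it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * HYPOTHETICAL helper (illustration only): remove the first n bytes
 * of a NUL-terminated string in place.
 */
static void
drop_prefix(char *buf, size_t n)
{
	size_t len;

	assert(buf != NULL);
	len = strlen(buf);
	if (n > len)
		n = len;
	/*
	 * Source (buf + n) and destination (buf) overlap whenever
	 * 0 < n < len.  memmove() is defined for that case on every
	 * platform; memcpy() here would rely on implementation-specific
	 * behavior.  The "+ 1" also moves the terminating NUL.
	 */
	memmove(buf, buf + n, len - n + 1);
}
```

Any routine of this shape that calls memcpy instead would be among
the code needing review if the documented criteria changed.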

If the "softened" aspect were only the removal of the direct
claim that bcopy is used, then I'd hope for some documented
guidance about tradeoffs for choosing bcopy vs. memcpy. It
would be nice if the guidance was not TARGET_ARCH specific in
any usually-important way; otherwise there would be a
systematic need for conditional selection to pick the better
one in source code.
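
As a rough sketch of what such conditional selection could end up
looking like, consider the wrapper below. Everything in it is an
assumption made for illustration: copy_bytes() is a hypothetical
name, and which branch would actually be preferable on which
architecture is not something established in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* bcopy() */

/*
 * HYPOTHETICAL wrapper (not a FreeBSD interface).  The choice of
 * routine per architecture is assumed purely for illustration.
 * Callers outside FreeBSD should pass non-overlapping regions,
 * because standard memcpy() does not promise to handle overlap.
 */
static void
copy_bytes(void *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
#if defined(__arm__)
	memcpy(dst, src, len);	/* argument order: dst, src, len */
#else
	bcopy(src, dst, len);	/* argument order: src, dst, len */
#endif
}
```

Every call site would then go through the wrapper, which is the
systematic burden I would hope the guidance could avoid.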

If an architecture needs an unusual distinction, I'd expect
that an architecture-specific name for an architecture-specific
routine would be better: it makes clear the need for selection.
If a few architectures share an unusual need, then a name based
on the need could be used instead.


===
Mark Millard
markmi at dsl-only.net



