Date:      Mon, 2 Dec 2019 15:07:00 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        freebsd-arm@freebsd.org
Subject:   Re: Comparing the OverDrive 1000 (A57) vs. MACCHIATObin Double Shot (A72) for buildworld and via a CPU/cache/RAM tradeoff-exploring benchmark (links corrected, again)
Message-ID:  <8E3A0E01-F22D-4635-A8CF-CDB98CFF9794@yahoo.com>
In-Reply-To: <63787F5A-A3B7-434A-B594-999D95559BEE@yahoo.com>
References:  <92E7B63A-E790-4815-9D91-2161A4F66B71.ref@yahoo.com> <92E7B63A-E790-4815-9D91-2161A4F66B71@yahoo.com> <5F7E7618-A503-4D16-B83C-0379F4B6327F@yahoo.com> <63787F5A-A3B7-434A-B594-999D95559BEE@yahoo.com>

[Maybe this time I'll get working links in place . . .]

On 2019-Dec-2, at 14:56, Mark Millard <marklmi@yahoo.com> wrote:

> [Just correcting the links to point to .png files
> and correcting some PowerMac11,2 related wording.]
>
> On 2019-Dec-2, at 14:15, Mark Millard <marklmi at yahoo.com> wrote:
>
>> It looks like the OverDrive 1000 vs. MACCHIATObin Double
>> Shot comparison ends up being an example of memory
>> access making the difference for the specific workload:
>> -j4 buildworld for head -r355027 (building itself
>> from scratch).
>>
>> buildworld times (not needing an LLVM bootstrap build):
>>
>> OverDrive 1000:           13895 sec (about 3.86 hrs)
>> MACCHIATObin Double Shot: 16561 sec (about 4.60 hrs)
>>
>> So a little under 45 min difference when the mean
>> and geometric mean are both a little over 4.2 hrs.
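
As a cross-check of the arithmetic quoted above, here is a minimal sketch that recomputes the difference, the mean, and the geometric mean from the two buildworld times. Only the two second counts come from the measurements; the names times_sec and summarize are just illustrative choices, not part of any tool used for the runs.

```python
import math

# buildworld wall-clock times in seconds, copied from the figures quoted above
times_sec = {
    "OverDrive 1000": 13895,
    "MACCHIATObin Double Shot": 16561,
}

def summarize(times):
    """Return (difference in minutes, arithmetic mean in hours, geometric mean in hours)."""
    fast, slow = sorted(times.values())
    diff_min = (slow - fast) / 60
    mean_hr = (fast + slow) / 2 / 3600
    geo_mean_hr = math.sqrt(fast * slow) / 3600
    return diff_min, mean_hr, geo_mean_hr

diff_min, mean_hr, geo_mean_hr = summarize(times_sec)
print(f"difference:     {diff_min:.1f} min")
print(f"mean:           {mean_hr:.2f} hrs")
print(f"geometric mean: {geo_mean_hr:.2f} hrs")
```
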
>>
>> SSD ufs file systems: One with Samsung 860 Pro, the
>> other with Samsung 850 Pro. I do not expect that I/O
>> made much of a difference, but I did nothing to measure
>> such for the buildworld activity.
>>
>> OverDrive RAM:     8 GiBytes, half in each of the 2 slots
>> MACCHIATObin RAM: 16 GiBytes, all in its 1 slot.
>>
>> MACCHIATObin: jumpers set for the fastest CPU/RAM
>> speed for the Double Shot.
>>
>> A comparison graph from exploring single threaded
>> and multi-threaded CPU/cache and RAM limited
>> performance (a variation on the old HINT serial
>> and pthread benchmarks) is shown at:

Corrected link (2nd try):

https://github.com/markmi/acpphint/blob/master/acpphint_example_data/acpphint-OverDrive_1000_MacchDblShot-threads_4-LP64-g%2B%2B_9_O3-libc%2B%2B-DSIZE_large_fast_types-RAM.png

>> There are curves for various involved types:
>> double (d), unsigned long long (ull), unsigned
>> long (ul), unsigned int (ui). The match for
>> ull and ul for the context provides some
>> evidence of the variability observed.
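
One reading of why the ull and ul curves can serve as a variability check: the graph's file name says LP64, and under an LP64 ABI unsigned long and unsigned long long are both 64 bits wide, so the two curves should be measuring the same work. The sketch below only reports the type widths on whatever machine runs it; it is not part of the benchmark, and the function name benchmark_type_sizes is made up for illustration.

```python
import ctypes

def benchmark_type_sizes():
    """Return the size in bytes of each C type named in the graph legend."""
    return {
        "d": ctypes.sizeof(ctypes.c_double),
        "ull": ctypes.sizeof(ctypes.c_ulonglong),
        "ul": ctypes.sizeof(ctypes.c_ulong),
        "ui": ctypes.sizeof(ctypes.c_uint),
    }

sizes = benchmark_type_sizes()
for name, nbytes in sizes.items():
    print(f"{name:>3}: {nbytes * 8} bits")

# True on LP64 systems (such as FreeBSD aarch64); not guaranteed on other ABIs
print("ul and ull have the same width:", sizes["ul"] == sizes["ull"])
```

If the last line prints False, the platform is not LP64 and the ul vs. ull comparison would not be a like-for-like check there.
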
>>
>> (The OverDrive and MACCHIATObin were not benchmarked
>> for the graph at the same version of head: -r352341
>> based vs. -r355027 based.)
>>
>> (I did not set things such that the benchmark run
>> would explore paging getting involved. Thus there
>> is basically no I/O considered in the comparison
>> graph.)
>>
>> The MACCHIATObin clearly wins single threaded and
>> its memory subsystem was well matched to the single
>> threaded use when the same involved types are
>> compared. (Single threaded are the blueish curves,
>> MACCHIATObin having the lighter colors.)
>>
>> For multi-threaded in the range where RAM access
>> limits things, the two systems are a close match.
>> (Greenish colors, right side of plot, upper
>> curves.)
>>
>> The range where the OverDrive 1000 is clearly faster
>> is part of the middle of the multi-threaded curves.
>> (This might be tied to whatever is done with the
>> dual RAM slot structure or to the amount of caching,
>> or some such; I do not know the details.)
>>
>> I would expect "-j1 buildworld" would take less time
>> on the MACCHIATObin than on the OverDrive, but I'm
>> not planning on measuring that.
>>
>>
>>
>> A more historical comparison, old PowerMac11,2
>> (2 sockets, 2 cores each) vs. the MACCHIATObin,
>> both having 16 GiBytes of RAM:
>>
>> For analogous benchmark graphs (matching types),
>> the MACCHIATObin single threaded is faster than
>> the old PowerMac11,2 single threaded and also is
>> usually faster than that 11,2's multi-threaded
>> benchmark data as well.
>
> I should have pointed out that the MACCHIATObin
> single threaded and PowerMac11,2 multi-threaded
> results are similar where memory access limits
> things, with use of double (d) being a little
> slower on the MACCHIATObin in this region.
>
>> Multi-threaded, the
>> MACCHIATObin is faster for the exploration by
>> the benchmark.
>

Corrected link (2nd try):

https://github.com/markmi/acpphint/blob/master/acpphint_example_data/acpphint-MacchDblShot_PowerMac11%2C2-threads_4-LP64-g%2B%2B_9_O3-libc%2B%2B-DSIZE_large_fast_types-RAM.png

>> I expect that this is interesting for the likely
>> difference in power usage during the benchmarking.
>> (Not that I've measured the power usage.)
>>
>> (The FreeBSD head vintages are not the same in
>> the graph: -r355027 based vs. -r352341 based.)
>>



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



