Date:      Sun, 27 Mar 2005 12:33:36 +0200
From:      Anthony Atkielski <atkielski.anthony@wanadoo.fr>
To:        freebsd-questions@freebsd.org
Subject:   Re: hyper threading.
Message-ID:  <14510304120.20050327123336@wanadoo.fr>
In-Reply-To: <8C7007D5D4D30D2-A38-3B313@mblk-r33.sysops.aol.com>
References:  <c6ef380c050326061976f164b@mail.gmail.com> <1641928994.20050326192811@wanadoo.fr> <8C700529A2DFD74-A44-3A157@mblk-d34.sysops.aol.com> <439876144.20050326220638@wanadoo.fr> <8C7006AE7E80573-FAC-3B652@mblk-r28.sysops.aol.com> <49251524.20050326234521@wanadoo.fr> <8C7007D5D4D30D2-A38-3B313@mblk-r33.sysops.aol.com>

em1897@aol.com writes:

> You can argue the technical theory all you want, but the
> measurements say otherwise.

You have to ensure that you're doing the right measurements.

> FreeBSD 4.9 -> Load: 38% (I put this in for fun :-)
>
> FreeBSD 5.4-Pre UP (no HT) -> Load: high 55-60% range
>
> FreeBSD 5.4-Pre SMP/HT -> Load: 70-80% (much more jumping around)

You'll find that the total CPU time required from start to finish for a
single thread is ALWAYS higher for SMP than for a UP environment, even
if you have separate physical processors.

Several things happen when you move from a uniprocessor environment to
an environment with two or more processors:

- The total CPU time for each thread increases.

- The total system load on a per process basis increases.

- The total throughput of the system improves if there is more than one
independent process running in the system.

- Each of the processors runs more slowly than it would if it were the
only processor running in a UP environment.

If you run a single-threaded benchmark on an MP system, you'll find that
it runs more slowly than it does on a UP system. If you run multiple
independent single-threaded benchmarks on an MP system, you'll find that
the total CPU time for each benchmark increases over that required in a
UP system, but the elapsed time required to complete all benchmarks
substantially diminishes.
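
To make that arithmetic concrete, here is a minimal sketch in Python. The
function names and the 25% per-job MP overhead are invented for
illustration; they are assumptions, not measurements of any real system.

```python
import math

def up_times(jobs, cpu_per_job):
    """Uniprocessor: independent jobs run one after another."""
    total_cpu = jobs * cpu_per_job
    return total_cpu, total_cpu  # (total CPU time, elapsed time)

def mp_times(jobs, cpu_per_job, processors, overhead=1.25):
    """Multiprocessor: each job costs more CPU (assumed overhead factor),
    but up to `processors` jobs run at the same time."""
    per_job = cpu_per_job * overhead
    total_cpu = jobs * per_job
    elapsed = math.ceil(jobs / processors) * per_job
    return total_cpu, elapsed

if __name__ == "__main__":
    print("UP:", up_times(4, 10.0))
    print("MP:", mp_times(4, 10.0, processors=2))
```

Under these assumed numbers the MP run burns more total CPU time than the
UP run, yet finishes all four jobs sooner.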

To properly gauge the performance of a multiprocessor system, you must
run a realistic mix of tasks on the system and measure overall
throughput.  If you do this, you'll find that you always come out ahead
with multiple processors, even HT processors.

Hyperthreading is just a special case of multiprocessing that imposes
some additional restrictions.  HT is much more sensitive to similarities
in instruction mix across processes, because the actual processor
hardware is being shared.  With a sufficiently heterogeneous instruction
mix across multiple execution threads, this isn't a problem; but if you
are running a single-threaded benchmark, or a series of identical
single-threaded benchmarks, it can seriously distort your measurements.

Although adding physical processors diminishes the performance of each
processor, it still adds overall processing power, up to a certain
point. The gain is never proportional to the number of processors
added, though; that is, if you go from one to two processors, you never
get a doubling of effective processor power. The second processor adds
more like 70-80% of one processor's worth. The percentage increment gets
worse with each additional processor, until you reach a point at which
performance actually starts to decline (the point at which this happens
is extremely hardware dependent, but it's always well beyond two
processors).
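
One way to picture this curve is with a small model. The sketch below
uses Gunther's Universal Scalability Law, which is a stand-in chosen for
illustration and not something measured on the systems discussed in this
thread. The two coefficients (contention and coherency) are assumed
values, picked only so that the second processor lands in the 70-80%
range.

```python
def relative_capacity(n, contention=0.10, coherency=0.02):
    """Throughput of n processors relative to one processor, using the
    Universal Scalability Law with assumed (not measured) coefficients."""
    return n / (1 + contention * (n - 1) + coherency * n * (n - 1))

def peak_processor_count(max_n=64, **coeffs):
    """Processor count at which modelled throughput is highest."""
    return max(range(1, max_n + 1),
               key=lambda n: relative_capacity(n, **coeffs))

if __name__ == "__main__":
    previous = 0.0
    for n in range(1, 10):
        capacity = relative_capacity(n)
        print(f"{n} CPUs: {capacity:.2f}x (gain {capacity - previous:+.2f})")
        previous = capacity
```

With these assumed coefficients, each added processor contributes less
than the one before it, and past a certain count the modelled throughput
falls. Real hardware will have different coefficients and a different
turning point.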

Hyperthreaded processors should not diminish in performance just because
HT is turned on, because the hardware contention that diminishes
performance in conventional MP systems is largely absent in an HT
microprocessor.  However, since you are really still only sharing a
single processor with HT, the overall increment is much lower than it
would be with two physical processors, and it is very sensitive to the
instruction mix.

> this shows that you really are a bit foggy. Did you miss the part
> where with 2 processors you actually do have 2 processors?

I actually read what Intel had to say on how the architecture works, and
I spent years measuring systems the hard way (with hardware monitors and
probes), so I know somewhat whereof I speak.  Multiprocessing was always
a significant hot-button issue with customers, as they always wanted to
know how much they really gained with multiple processors (as opposed to
what they had been promised).

> I can make an argument that networking with 1 processor on 5.4 is
> better than with 2. For example, with a test similar to the above, with
> 2 physical processors FreeBSD 5.4 will start dropping packets way before
> it hits 500Kpps unless you increase the interrupts/second, which of
> course increases the system load. And even with the dropped packets
> (which should reduce the load because it doesn't have to receive
> and transmit the packet), the load is still higher than for 4.x with
> a single processor.

Load is not a problem, as long as it's below 100%.  Since individual
processors slow down in MP configurations, anything that depends on raw
processor speed will suffer in an MP configuration.  However, overall
system throughput is greatly enhanced by running with several
processors.  At the same time, the total processor time required to
complete all tasks is greater in an MP environment than it would be in a
UP environment--it's the fact that things can run in parallel that
improves the throughput.

Moral: if you want to avoid dropping packets in the situation you
describe, increase the interrupt rate.  The additional processing power
of the system will make this practical.

> You and many others regularly say things like "SMP is obviously faster",
> or "Opterons are noticeably faster", but those statements are only true
> for certain applications.

True, but those "certain applications" are the kind normally executed in
real-world desktop and server systems.  If this were not the case,
multiprocessing systems would have been abandoned long ago.

It's almost always better to have a single processor at 2 GFLOPS than it
is to have two processors at 1 GFLOPS, but if you can't get 2 GFLOPS
processors, having two 1 GFLOPS processors is the next best thing.
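
A rough sketch of that comparison, in Python. The function name, the job
sizes, and the 25% MP penalty are hypothetical values chosen for
illustration; the scheduling is a simple greedy assignment and not a
model of any real scheduler.

```python
import heapq

def elapsed_seconds(jobs_gflop, processors, gflops_each, mp_penalty=1.0):
    """Elapsed time to finish independent single-threaded jobs.
    Each job goes to whichever processor frees up first; mp_penalty is
    an assumed slowdown factor for running in an MP configuration."""
    finish = [0.0] * processors
    heapq.heapify(finish)
    for work in sorted(jobs_gflop, reverse=True):
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + work * mp_penalty / gflops_each)
    return max(finish)

if __name__ == "__main__":
    for jobs in ([10.0], [10.0, 10.0]):
        fast = elapsed_seconds(jobs, processors=1, gflops_each=2.0)
        dual = elapsed_seconds(jobs, processors=2, gflops_each=1.0,
                               mp_penalty=1.25)
        print(f"{len(jobs)} job(s): one 2 GFLOPS CPU {fast}s, "
              f"two 1 GFLOPS CPUs {dual}s")
```

In this sketch the single fast processor finishes first whether there is
one job or two, because a lone job cannot use the second processor and
the paired jobs each pay the assumed MP penalty.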

-- 
Anthony



