Date:      Tue, 23 Nov 2004 15:31:12 -0500 (EST)
From:      "Brian Szymanski" <ski@indymedia.org>
To:        "Rob" <spamrefuse@yahoo.com>
Cc:        freebsd-stable@lists.freebsd.org
Subject:   Re: make -j$n buildworld : use of -j investigated
Message-ID:  <2566.10.0.0.26.1101241872.squirrel@10.0.0.26>
In-Reply-To: <41A2C5C0.3080908@yahoo.com>
References:  <41A2C5C0.3080908@yahoo.com>

Did you try any machines that used Hyperthreading? I'd be interested to
see how those machines fare based on the number of logical and real CPUs.

> Although people suggest "-j4" as optimal in general
> case, I have come to a very different conclusion:
>
> 1) single CPU with enough RAM (2 GHz, 512 MB)
>     there's no significant speed up in the range
>     "-j1" to "-j9".
>     So "-j1" is as good as "-j9".

If you went to all that trouble, you might as well post the numbers :-)

> 2) single CPU with little RAM (333 MHz, 64 MB)
>     speed slows down rapidly from "-j1" to "-j9",
>     because of intensive swapping.
>     So "-j1" performs best in this case.

This is expected. A note should probably be added to the handbook giving
rough approximations of how much memory per simultaneous process is
necessary for optimal performance. I'd guess 48 MB * p + c, where c = the
machine's memory load while idle and p = the number of compile processes
(most don't take nearly that much memory, but C++ can gobble it).
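
A minimal Python sketch of that estimate, turned around to answer "how many
jobs fit in RAM?". The 48 MB per job is the guess above; the function name
and the idle-load figures (24 MB and 64 MB) are made up for illustration,
not measured.

```python
def max_jobs_for_memory(total_mb, idle_mb, per_job_mb=48):
    """Largest p with per_job_mb * p + idle_mb <= total_mb, but never below 1."""
    free_mb = total_mb - idle_mb
    # Integer division gives the number of whole jobs that fit; clamp to 1
    # because the build needs at least one job even on a tiny machine.
    return max(1, free_mb // per_job_mb)


if __name__ == "__main__":
    # Idle loads here are assumptions, not measurements.
    print("64 MB box: ", max_jobs_for_memory(64, 24))
    print("512 MB box:", max_jobs_for_memory(512, 64))
```

With those assumed idle loads, the 64 MB machine is limited to a single job,
which matches the swapping reported in case 2.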

> 3) dual CPU with enough RAM (2 x 800 MHz, 1GB)
>     speed up by almost two from "-j1" to "-j2",
>     but after that no noticeable speed up anymore.
>     So "-j2" is as good as "-j9".

Again, since you went to the trouble, could you post the numbers?

> With these simple tests, I come to the conclusion that
> "make -j$n buildworld" is best with n = number of CPUs.
> Does that make sense?

Sort of. It depends on more than just the number of CPUs. IO speed is also
very important. If you're using NFS over non-gigabit ethernet or to a slow
NFS server, it's worth ratcheting the number of jobs up. The same goes for
old, slow disks, or if you have /usr/src union-mounted from a CD-ROM
drive, etc. Disk layout matters too: having /usr/src on a different drive
from /usr/obj can speed up the IO-bound portions of the process a great
deal by eliminating contention.

If you spend less time waiting for IO, adding more jobs has a less
pronounced or even negative effect, because CPU contention replaces the
positive "work while the other job waits on IO" effect. This is the basic
underlying principle, which the handbook doesn't really point out.

Seems to me the pluses and minuses of increasing n are:
+ More chances to do work when other processes are waiting on IO.
- CPU contention resulting in context switches and other wasted cycles due
  to extra scheduling overhead (probably negligible, maybe significant with
  high HZ in kernel config).
- Memory contention (aka usage).

It might be worth decreasing the number recommended somewhat, but I think
j = ncpu is too small for a general recommendation, because unless you are
memory tight there is very little harm in increasing the number. I'd
suspect j = 2 * ncpu or even j = ncpu + 1 are better rules of thumb.
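
For comparison, a small Python sketch that prints the candidate rules of
thumb for the machine it runs on. The function name is made up for
illustration, and os.cpu_count() reports logical CPUs, so a Hyperthreading
box will show a higher ncpu than its number of real CPUs.

```python
import os


def rule_of_thumb_jobs(ncpu=None):
    """Return the -j value suggested by each rule of thumb, keyed by rule."""
    if ncpu is None:
        # cpu_count() can return None on unusual platforms; fall back to 1.
        ncpu = os.cpu_count() or 1
    return {
        "ncpu": ncpu,
        "ncpu + 1": ncpu + 1,
        "2 * ncpu": 2 * ncpu,
    }


if __name__ == "__main__":
    for rule, jobs in rule_of_thumb_jobs().items():
        print(f"{rule:>9}: make -j{jobs} buildworld")
```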

A better formula would take average IO throughput and latency rates from
bonnie++, amount of available memory, and the number and speed of CPUs. A
perl script that measures these numbers and determines the optimal setting
is left as an exercise to the reader. Extra credit - code it in C and get
it integrated in -CURRENT so that "make buildworld" automagically calls
"make -j$n real_buildworld" with the optimal value of n :-)

My results, for what it's worth:
Specs: Athlon XP 2500+, 512 MB of 333 MHz DDR RAM.
/usr/obj is a gvinum raid0 (striped) volume of two SATA disks.
/usr/src is on a gvinum raid1 (mirrored) volume of two PATA disks.
options HZ=1000 in the kernel config, pretty vanilla besides that.
in make.conf: CFLAGS=-O2 -pipe -march=athlon-xp
CXXFLAGS empty due to a bug with memoization last time I tried a compile...

make -j1 buildworld:
real    64m54.298s
user    52m56.915s
sys     9m13.041s

make -j2 buildworld:
real    67m55.816s
user    56m20.778s
sys     10m20.247s

make -j3 buildworld:
real    70m53.936s
user    59m2.447s
sys     10m43.325s

make -j4 buildworld:
real    72m25.904s
user    60m19.098s
sys     10m59.492s
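
To make the trend easier to read, here is a short Python sketch that
converts the timings above to seconds and reports, for each -j value, the
slowdown relative to -j1 and the share of wall-clock time spent on the CPU
(user + sys over real). The helper names are made up for illustration; the
numbers are the ones listed above.

```python
import re

# (real, user, sys) as printed by time(1) for each -j value above.
TIMINGS = {
    1: ("64m54.298s", "52m56.915s", "9m13.041s"),
    2: ("67m55.816s", "56m20.778s", "10m20.247s"),
    3: ("70m53.936s", "59m2.447s", "10m43.325s"),
    4: ("72m25.904s", "60m19.098s", "10m59.492s"),
}


def to_seconds(stamp):
    """Convert a time(1) stamp such as '64m54.298s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def slowdown_percent(jobs):
    """Extra wall-clock time of -j<jobs> relative to -j1, in percent."""
    base = to_seconds(TIMINGS[1][0])
    return (to_seconds(TIMINGS[jobs][0]) - base) / base * 100


def cpu_share(jobs):
    """Fraction of wall-clock time accounted for by user + sys CPU time."""
    real, user, sys_ = (to_seconds(t) for t in TIMINGS[jobs])
    return (user + sys_) / real


if __name__ == "__main__":
    for jobs in sorted(TIMINGS):
        print(f"-j{jobs}: {slowdown_percent(jobs):+5.1f}% vs -j1, "
              f"CPU share {cpu_share(jobs):.1%}")
```

The CPU share stays above 95% in every run, which would be consistent with
this build being mostly CPU-bound rather than IO-bound on this machine,
though that is a reading of four data points, not a proof.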


-- 
Brian Szymanski
ski@indymedia.org



