FreeBSD Mail Archives

Date:      Thu, 4 Jan 2007 09:02:52 -0500
From:      "Jamie Bowden" <ragnar@sysabend.org>
To:        "soralx@cydem.org" <soralx@cydem.org>
Cc:        freebsd-chat@freebsd.org
Subject:   Re: Venting my frustration with FreeBSD
Message-ID:  <d6895b7d0701040602y18d76e1dja3c121520c36db51@mail.gmail.com>
In-Reply-To: <20061211075839.11bc0900@freen0de>
References:  <200612041443.15154.josh@tcbug.org> <200612061006.56852.jhb@freebsd.org> <20061206134536.0c775367@freen0de> <200612061805.05727.jhb@freebsd.org> <20061211075839.11bc0900@freen0de>

On 12/11/06, soralx@cydem.org <soralx@cydem.org> wrote:
>
> > > > 512-way machine?  Scaling on a 512-way machine is quite a
> > > > different ball of wax from scaling on 4-way, and scaling up to 32
> > > > and 64 is going to be another ball of wax as well.
> > > can you give a few examples how scaling ability can be a function of
> > > the number of cores? seems like my curiosity exceeds my imagination
> > > today -- can't come up with any good reasons why this is true :)
> >
> > You may make different tradeoffs.  For example, on a 4-cpu system, it
> > may be fine to have certain data structures shared across CPUs and
> > protected via a lock which avoids the overhead of multiple copies and
> > complexity of updating multiple copies of a data structure.  However,
> > with a 512-way system you may have to resort to using duplicated
> > per-cpu (or maybe per-cpu group) copies of a structure because the
> > tradeoffs are different.
>
> Well, I see what you mean. However, as for this example, it should be
> possible to always share data between CPU groups (that can be sized
> dynamically), right? Thus, given an optimal dynamics algorithm,
> performance would always be close to best possible? More generally,
> it seems that some code may often be added to make the scaling
> ability more or less independent of the quantity of processing units.
> I still believe that an operating system that scales close to linearly
> is possible. The question is, how big an overhaul FreeBSD needs for a
> jump start to becoming of interest in the areas where performance &
> scalability matter?
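
To make the quoted tradeoff concrete, here is a rough sketch of the two
designs: one structure behind a single lock versus a copy per CPU (or
per CPU group).  The class names and the explicit `cpu` argument are
made up for illustration; this is not FreeBSD kernel code, only a toy
model of where the contention goes.

```python
import threading


class SharedCounter:
    """One value, one lock: every CPU contends on the same lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def add(self, cpu, n=1):
        # The cpu argument is ignored; all callers serialize here.
        with self._lock:
            self._value += n

    def read(self):
        with self._lock:
            return self._value


class PerCpuCounter:
    """One slot per CPU (or CPU group): an update touches only the
    local slot, but a read has to visit and sum every slot."""

    def __init__(self, ncpus):
        self._locks = [threading.Lock() for _ in range(ncpus)]
        self._slots = [0] * ncpus

    def add(self, cpu, n=1):
        with self._locks[cpu]:
            self._slots[cpu] += n

    def read(self):
        total = 0
        for i, lock in enumerate(self._locks):
            with lock:
                total += self._slots[i]
        return total
```

With 4 CPUs the single lock is cheap and simple; with hundreds, the
per-CPU version trades memory and a slower read for updates that do not
fight over one lock.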

Building a system that will scale up isn't simply a question of
software.  It's also a question of hardware architecture.  I have,
sitting about 16" to my left, a box of CDs containing Irix 6.5.30.  I
can use that set to install on anything from a single-processor MIPS4
(64-bit clean, MIPS R8000 and above) system all the way up to a
2048-processor Origin3x00, and it will be a single system image.

SGI get away with this thanks to ccNUMA (cache-coherent Non-Uniform
Memory Access), which modularizes processor boards and the memory on
them into discrete units.  Each processor board or brick, depending on
what hardware you're playing with, has (n) processors on it, and some
amount of system RAM.  When you run a program and load its associated
data into memory, it's spread across multiple CPU boards/bricks
(depending on how much memory it uses, of course).  The running
program actually moves across processors to the one closest to where
your data is loaded in memory.  The MIPS procs themselves have only L1
cache and each board/brick has a shared L2 cache.  You no longer have
to worry about cache coherence (at least, you have to worry a lot
less).
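
As a rough illustration of "run on the processor closest to the data",
here is a small sketch.  The function name, the distance table, and the
page counts are all invented for the example; this is not how Irix
actually makes the decision, only the general idea of weighing where a
program's memory lives against how far each node is from it.

```python
def best_node(distance, pages_on_node):
    """Return the node with the lowest total cost of reaching the data.

    distance[a][b] is the relative cost for a CPU on node a to touch
    memory on node b; pages_on_node[b] is how many of the program's
    pages currently live on node b.
    """
    def cost(cpu_node):
        return sum(distance[cpu_node][mem_node] * pages
                   for mem_node, pages in enumerate(pages_on_node))

    return min(range(len(distance)), key=cost)


# Two bricks: local access costs 1, remote costs 3 (made-up numbers).
distance = [[1, 3],
            [3, 1]]

# Most of the program's pages sit on node 1, so node 1 wins.
print(best_node(distance, [100, 900]))  # prints 1
```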

Here's a quick look at an Origin350 I have handy:

8:47am banshee  /home/jamie %hinv
4 800 MHZ IP35 Processors
CPU: MIPS R16000 Processor Chip Revision: 2.2
FPU: MIPS R16010 Floating Point Chip Revision: 2.2
Main memory size: 4096 Mbytes
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
Secondary unified instruction/data cache size: 8 Mbytes
<elided>

Those four processors are on a single board and share the L2 cache and
all system RAM.  The same setup for an 8 processor system would
require two chassis and a Craylink cable.  So you'd have two of those,
and a running application would jump across processor boards if your
application was large enough that its data wouldn't fit in all the RAM
on a single board.  Onyx/Origin3x00 series are full rack sized systems
(the Origin 3x0s are 2 RU boxes) and hold 64 procs (128?  It's been a
while since I looked at the specs on a maxed out 3800) in a single
rack, which can be Craylinked together in the same fashion.

SGI is dead, long live SGI.

Jamie Bowden
-- 
"It was half way to Rivendell when the drugs began to take hold"
Hunter S Tolkien "Fear and Loathing in Barad Dur"
Iain Bowen <alaric@alaric.org.uk>

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?d6895b7d0701040602y18d76e1dja3c121520c36db51>
