Date:      Tue, 21 May 1996 18:40:09 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        imp@village.org (Warner Losh)
Cc:        davem@caip.rutgers.edu, terry@lambert.org, jehamby@lightside.com, jkh@time.cdrom.com, current@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject:   Re: Congrats on CURRENT 5/1 SNAP...
Message-ID:  <199605220140.SAA02800@phaeton.artisoft.com>
In-Reply-To: <199605220117.TAA01774@rover.village.org> from "Warner Losh" at May 21, 96 07:17:54 pm

> Don't know if clustering is the answer because then you get a lot of
> data copying overhead that is free on a shared memory MP box.  Does
> lock contention get so bad that you actually win by data copying for
> huge numbers of CPUs?  I don't know, but it is certainly an area of
> research that will be fruitful for a few years.

I haven't responded to the other post yet myself because of the
time needed to give it fair treatment.

I dislike the clustering solution at a gut level.  I have to say
I agree that the data copying overhead is a big loss, especially
if you plan to migrate processes between clusters (anyone else
ever run Amoeba?).

The locking overhead only applies to contended memory access;
using per processor pools instead of a Solaris/SVR4 pure SLAB
allocator resolves most of the locking issues.
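
Roughly what I mean, in C -- a toy user-space sketch, not anybody's
actual kernel code; NCPU, BATCH, and the curcpu argument are stand-ins
for whatever the kernel really provides:

/*
 * Per-processor front end on a locked global pool.  The common-case
 * allocation and free never touch shared state; the global lock is
 * only taken to refill a whole batch at once.
 */
#include <pthread.h>
#include <stdlib.h>

#define NCPU    4
#define BATCH   32              /* objects moved per refill */

struct obj { struct obj *next; };

static struct obj      *global_free;            /* shared; needs the lock */
static pthread_mutex_t  global_lock = PTHREAD_MUTEX_INITIALIZER;

/* One private free list per CPU; no lock if a CPU only touches its own. */
static struct obj *percpu_free[NCPU];

void *
obj_alloc(int curcpu)
{
        struct obj *o = percpu_free[curcpu];

        if (o != NULL) {                        /* common case: no contention */
                percpu_free[curcpu] = o->next;
                return (o);
        }

        /* Miss: take the lock once and pull BATCH objects to this CPU. */
        pthread_mutex_lock(&global_lock);
        for (int i = 0; i < BATCH && global_free != NULL; i++) {
                o = global_free;
                global_free = o->next;
                o->next = percpu_free[curcpu];
                percpu_free[curcpu] = o;
        }
        pthread_mutex_unlock(&global_lock);

        o = percpu_free[curcpu];
        if (o == NULL)
                return (malloc(sizeof(*o)));    /* pool empty; grow it */
        percpu_free[curcpu] = o->next;
        return (o);
}

void
obj_free(void *p, int curcpu)
{
        struct obj *o = p;

        o->next = percpu_free[curcpu];          /* free back to the local list */
        percpu_free[curcpu] = o;
}

The point being that the lock is only touched one time in BATCH
allocations, so contention stops scaling with the allocation rate.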

The locking overhead scales with bus contention and with the
cache invalidation and/or update traffic needed for distributed
cache coherency.  This is more a design flaw in the SMP model
used by Solaris/SVR4 than an inherent flaw in the idea of
shared memory SMP.

Hybrid NUMA architectures using per processor page pools instead
of sharing all memory are a big win (Sequent proved this; where
Sequent screwed up was in not doing real FS multithreading).  Even
so, Sequent was able to scale a BSD-derived OS to 32 processors
without a lot of undue overhead.


The idea of SLAB allocation (as treated in the Vahalia book) is
a good one, but it probably wants to be implemented on top of
a per processor pool global memory allocator to make it less
contentious.
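
Something like this, as a sketch -- the names kmem_cache_* and
percpu_page_alloc() are made up here, not the SVR4/Solaris interfaces,
and a real slab keeps the freelist linkage out of the constructed
object instead of overwriting part of it the way this does:

/* SLAB-style object cache sitting on top of a per-CPU page pool. */
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE 4096

/* Stand-in for the per processor page pool above; here it just malloc()s. */
static void *
percpu_page_alloc(int curcpu)
{
        (void)curcpu;
        return (malloc(PAGE_SIZE));
}

struct kmem_cache {
        size_t   objsize;               /* must be >= sizeof(void *) */
        void   (*ctor)(void *);         /* run once per object, per the SLAB idea */
        void    *freelist;              /* constructed, ready-to-use objects */
};

static void
cache_grow(struct kmem_cache *c, int curcpu)
{
        char *page = percpu_page_alloc(curcpu);
        size_t n = PAGE_SIZE / c->objsize;

        /* Carve the page into objects, construct each once, thread them on. */
        for (size_t i = 0; i < n; i++) {
                void *o = page + i * c->objsize;

                if (c->ctor != NULL)
                        c->ctor(o);
                *(void **)o = c->freelist;
                c->freelist = o;
        }
}

void *
kmem_cache_alloc(struct kmem_cache *c, int curcpu)
{
        void *o;

        if (c->freelist == NULL)
                cache_grow(c, curcpu);
        o = c->freelist;
        c->freelist = *(void **)o;
        return (o);
}

void
kmem_cache_free(struct kmem_cache *c, void *o)
{
        /* Object goes back still constructed; no destructor work until
         * the backing page is reclaimed, which is the SLAB win. */
        *(void **)o = c->freelist;
        c->freelist = o;
}

The per processor page pool feeds the slabs, so slab growth doesn't
hit a global lock either, and you keep the constructed-object caching
that makes the SLAB idea attractive in the first place.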


The idea of distributed cluster computing only works for small
domain distribution; the net is currently too unreliable (and
too slow) for any serious process migration over a big domain.
And the net is probably only going to get worse in the short
term.  By the time it's fixed, I suspect content servers will
have a big edge over compute servers as a desirable resource.



					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


