From owner-freebsd-hackers Tue May 21 18:45:23 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id SAA24487 for hackers-outgoing; Tue, 21 May 1996 18:45:23 -0700 (PDT)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211]) by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id SAA24482; Tue, 21 May 1996 18:45:21 -0700 (PDT)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id SAA02800; Tue, 21 May 1996 18:40:10 -0700
From: Terry Lambert
Message-Id: <199605220140.SAA02800@phaeton.artisoft.com>
Subject: Re: Congrats on CURRENT 5/1 SNAP...
To: imp@village.org (Warner Losh)
Date: Tue, 21 May 1996 18:40:09 -0700 (MST)
Cc: davem@caip.rutgers.edu, terry@lambert.org, jehamby@lightside.com, jkh@time.cdrom.com, current@FreeBSD.ORG, hackers@FreeBSD.ORG
In-Reply-To: <199605220117.TAA01774@rover.village.org> from "Warner Losh" at May 21, 96 07:17:54 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> Don't know if clustering is the answer because then you get a lot of
> data copying overhead that is free on a shared memory MP box.  Does
> lock contention get so bad that you actually win by data copying for
> huge numbers of CPUs?  I don't know, but it is certainly an area of
> research that will be fruitful for a few years.

I haven't responded to the other post yet myself because of the time
needed to give it fair treatment.  I dislike the clustering solution
at a gut level.

I have to say I agree that the data copying overhead is a big loss,
especially if you plan to migrate processes between clusters (anyone
else ever run Amoeba?).

The locking overhead only applies to contended memory access; using
per processor pools instead of a Solaris/SVR4 pure SLAB allocator
resolves most of the locking issues.  The locking overhead scales
relative to bus contention and cache invalidation and/or update
overhead for distributed cache coherency.  This is more a design flaw
in the SMP model used by Solaris/SVR4 than it is an inherent flaw in
the idea of shared memory SMP.

Hybrid NUMA architectures using per processor page pools instead of
sharing all memory are a big win (Sequent proved this; where Sequent
screwed up was in not doing real FS multithreading).  Even so, Sequent
was able to scale a BSD-derived OS to 32 processors without a lot of
undue overhead.

The idea of SLAB allocation (as treated in the Vahalia book) is a good
one, but it probably wants to be implemented on top of a global memory
allocator fronted by per processor pools, to make it less contentious.

The idea of distributed cluster computing only works for small domain
distribution; the net is currently too unreliable (and too slow) for
any serious process migration over a big domain.  And the net is
probably only going to get worse in the short term.  By the time it's
fixed, I suspect content servers will have a big edge over compute
servers as a desirable resource.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
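
P.S.: To make the per processor pool idea above concrete, here is a
minimal userland sketch in C.  It is only an illustration: pthreads
and an explicit "cpu" argument stand in for real per-CPU kernel
context, and NCPU, BATCH, pool_alloc, pool_free, and the rest are
names I made up for the sketch; this is not Sequent's, Sun's, or
FreeBSD's actual allocator code.  The point is just that the common
alloc/free path never touches the global lock; the lock is taken only
to move whole batches of objects between a local list and the global
pool.

/*
 * Sketch: a fixed-size-object allocator with a per-processor free
 * list in front of a single locked global pool.  The common path
 * touches only the caller's own list, so it takes no lock and
 * generates no cross-processor cache traffic; the global mutex is
 * hit only when a local list runs dry or grows too fat.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NCPU      4       /* assumed number of processors */
#define OBJ_SIZE  128     /* fixed object size for this pool */
#define BATCH     32      /* objects moved per lock acquisition */

struct obj {
	struct obj *next;
};

static struct obj *global_pool;         /* shared; guarded by lock */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/* One private free list per processor; no lock needed to touch it. */
static struct obj *local_pool[NCPU];
static int local_count[NCPU];

/* Refill a local list with one batch, taken under the global lock. */
static void
refill(int cpu)
{
	pthread_mutex_lock(&global_lock);
	for (int i = 0; i < BATCH; i++) {
		struct obj *o = global_pool;
		if (o == NULL) {
			/* Global pool dry: fall back to malloc(). */
			o = malloc(OBJ_SIZE);
			if (o == NULL)
				break;
		} else {
			global_pool = o->next;
		}
		o->next = local_pool[cpu];
		local_pool[cpu] = o;
		local_count[cpu]++;
	}
	pthread_mutex_unlock(&global_lock);
}

/* Common-path allocation: lock-free pop from our own list. */
void *
pool_alloc(int cpu)
{
	if (local_pool[cpu] == NULL)
		refill(cpu);
	struct obj *o = local_pool[cpu];
	if (o != NULL) {
		local_pool[cpu] = o->next;
		local_count[cpu]--;
	}
	return o;
}

/* Common-path free: lock-free push; drain a batch back to the
 * global pool only when the local list passes its high-water mark. */
void
pool_free(int cpu, void *p)
{
	struct obj *o = p;

	o->next = local_pool[cpu];
	local_pool[cpu] = o;
	if (++local_count[cpu] > 2 * BATCH) {
		pthread_mutex_lock(&global_lock);
		for (int i = 0; i < BATCH; i++) {
			o = local_pool[cpu];
			local_pool[cpu] = o->next;
			o->next = global_pool;
			global_pool = o;
			local_count[cpu]--;
		}
		pthread_mutex_unlock(&global_lock);
	}
}

int
main(void)
{
	/* Single-threaded smoke test standing in for per-CPU callers. */
	void *p = pool_alloc(0);
	printf("cpu 0 got %p, local count %d\n", p, local_count[0]);
	pool_free(0, p);
	printf("freed, local count %d\n", local_count[0]);
	return 0;
}

Moving objects BATCH at a time is what keeps the lock (and the cache
lines behind it) off the common path; the 2 * BATCH high-water mark
just keeps one busy processor from hoarding the whole pool.  A SLAB
layer would slot in underneath, handing out and reclaiming whole
slabs instead of malloc() doing it.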