From: Isaac Levy <ike@lesmuug.org>
Date: Mon, 26 Sep 2005 14:16:31 -0400
To: Brian Candler
Cc: freebsd-isp@freebsd.org, freebsd-cluster@freebsd.org
Subject: Re: Options for synchronising filesystems
List-Id: Clustering FreeBSD <freebsd-cluster@freebsd.org>

Hi Brian, All,

This email has one theme: GEOM! :)

On Sep 24, 2005, at 10:10 AM, Brian Candler wrote:

> Hello,
>
> I was wondering if anyone would care to share their experiences in
> synchronising filesystems across a number of nodes in a cluster. I can
> think of a number of options, but before changing what I'm doing at the
> moment I'd like to see if anyone has good experiences with any of the
> others.
> The application: a clustered webserver. The users' CGIs run in a chroot
> environment, and these clearly need to be identical (otherwise a CGI
> running on one box would behave differently when running on a different
> box). Ultimately I'd like to synchronise the host OS on each server too.
>
> Note that this is a single-master, multiple-slave type of filesystem
> synchronisation I'm interested in.

I just wanted to throw out some quick thoughts on a totally different
approach which nobody has really explored in this thread- solutions which
are production-level software. (Sorry if I'm repeating things or giving
out info y'all already know:)

--
GEOM:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/geom-intro.html

The core disk I/O framework for FreeBSD, as of 5.x, led by PHK:
http://www.bsdcan.org/2004/papers/geom.pdf

This framework itself is not as useful to you as the utilities which make
use of it:

--
GEOM Gate:
http://kerneltrap.org/news/freebsd?from=20

A network device-level client/server disk mapping tool. (A VERY IMPORTANT
COMPONENT: it's reportedly faster and more stable than NFS has ever been,
so people have immediately and happily deployed it in production systems!)

--
Gvinum and Gmirror:
http://people.freebsd.org/~rse/mirror/
http://www.ie.freebsd.org/doc/en_US.ISO8859-1/books/handbook/geom.html

(Sidenote: even Greg Lehey, the original author of Vinum, has stated that
it's better to use the GEOM-based tools than Vinum for the foreseeable
future.)
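To make the GEOM Gate piece concrete: mirroring a local disk against a
remote one boils down to a handful of commands. This is a rough sketch
from memory- the addresses, device names, and the gm0 label are just
examples, so check the ggated(8), ggatec(8), and gmirror(8) man pages
before trying it on anything you care about:

```shell
# --- on the server box: export /dev/ad2 read/write to one client ---
echo "192.168.0.2 RW /dev/ad2" >> /etc/gg.exports
ggated                                     # start the GEOM Gate daemon

# --- on the client box: map the remote disk as a local device ---
ggatec create -o rw 192.168.0.1 /dev/ad2   # prints the new device, e.g. ggate0

# mirror the client's local drive against the remote one
gmirror load                               # load geom_mirror into the kernel
gmirror label -v -b round-robin gm0 /dev/ad2 /dev/ggate0
newfs /dev/mirror/gm0
mount /dev/mirror/gm0 /mnt
```

From there /dev/mirror/gm0 behaves like any other disk, and GEOM keeps
the two sides in sync underneath it.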
--
In a nutshell, to address your needs, let me toss out the following
example setup. I know of one web shop in Canada which is running 2
machines for every virtual cluster, in the following configuration:

- 2 servers, 4 SATA drives per box, a quad copper/ethernet gigabit NIC
  in each box
- each drive is mirrored using gmirror, over the gigabit ethernet NICs
- each box runs Vinum RAID5 (via gvinum) across the 4 mirrored drives

The drives are then sliced appropriately, and server resources are
distributed across the boxes- with various slices mounted on each box.
The folks I speak of simply have a suite of failover shell scripts
prepared, in the event of a machine experiencing total hardware failure.
Pretty tough stuff, very high-performance, and CHEAP.

--
With that, I'm working towards similar setups, oriented around redundant
jailed systems, with the eventual goal of tying CARP (from pf) into the
mix to make for nearly-instantaneous jailed failover redundancy- (but
it's going to be some time before I have what I want worked out for
production on my own).

Regardless, it's worth tapping into the GEOM dialogues, as there are many
new ways of working with disks coming into existence- and the GEOM
framework itself provides an EXTREMELY solid base to bring 'exotic' disk
configurations up to production level quickly. (Also noteworthy: there
are a couple of encrypted disk systems based on GEOM emerging now too...)

--
Hope all that helps,
Best,
.ike
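P.S. For anyone wanting to peek ahead at the CARP piece: once CARP is in
your FreeBSD version, the core of it is just both boxes claiming a shared
virtual IP in rc.conf. A rough sketch (the vhid, password, and addresses
are placeholders- see carp(4) for the real details):

```shell
# /etc/rc.conf fragment -- both boxes claim the same virtual IP;
# the box with the lower advskew wins the master election
cloned_interfaces="carp0"
ifconfig_carp0="vhid 1 pass s3kr1t advskew 0 192.168.0.10/24"
# on the backup box, use a higher advskew (e.g. 100) instead
```

Clients talk to the virtual IP, so when the master box dies the backup
takes over the address almost immediately- which is what makes the
jailed-failover idea above feasible.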