From owner-freebsd-cluster Sat Nov 9 8: 6:42 2002 Delivered-To: freebsd-cluster@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DD27837B401 for ; Sat, 9 Nov 2002 08:06:39 -0800 (PST) Received: from grant.org (grant.org [206.190.164.98]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3C58A43E42 for ; Sat, 9 Nov 2002 08:06:39 -0800 (PST) (envelope-from mgrant@splat.grant.org) Received: from splat.grant.org (mgrant@splat.grant.org [213.39.2.177]) by grant.org (8.12.6/8.12.6) with ESMTP id gA9G6V57024801 for ; Sat, 9 Nov 2002 11:06:32 -0500 (EST) (envelope-from mgrant@splat.grant.org) Received: (from mgrant@localhost) by splat.grant.org (8.11.6+Sun/8.11.6) id gA9G4wW28126; Sat, 9 Nov 2002 17:04:58 +0100 (MET) Date: Sat, 9 Nov 2002 17:04:58 +0100 (MET) Message-Id: <200211091604.gA9G4wW28126@splat.grant.org> From: Michael Grant To: freebsd-cluster@FreeBSD.ORG Subject: clustering freebsd Sender: owner-freebsd-cluster@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG I'm new to the list (but not new to unix!) I've been running freebsd for years now on a box I colo. I've got some clients and sell some services on my box. I'm becomming very interested in creating a smallish cluster of machines to make my little operation more reliable. One of the big things that cause me down time is upgrading the OS. I'm also worried about hardware failure (which luckily hasn't happened to me yet...) I too would like to achieve at least 5 nines. I read all the archives of this list back to january 2002. Andy's phase-2 project definitely sounds cool. Let's say I have a cluster of n machines. Some of those n machines may be running a web server, some a shell server, some mail server, some pop/imap mail servers...etc. How is an incoming connection sent to the right machine? 
It seems like there needs to be a single machine in front of the cluster to send connections the right way. Isn't this a single point of failure?

If you do have multiple machines answering requests, how is this done? With multiple IP addresses? I know one can specify multiple A records in DNS and that it'll do a sort of round-robin. But does this work well? What if one of the machines is down and a caching DNS server returns the IP address of one of the down machines? It seems you would then need to start modifying the DNS zone to take out the down machines, and use a low TTL. This starts to get ugly quickly.

The second problem I have been thinking about is shared disk. I read a post by someone who also had this concern. One obvious way to solve the shared disk problem is to have another box which has a bunch of disks in a RAID configuration, and mount the disks via NFS. This disk box would probably need to be highly available, with redundant power supplies and the like.

However, I'm not so convinced that a third disk box is the right answer. I'd like to see something which could mirror (in real time) a file system over the LAN, thus keeping 2+ disks in sync just like a RAID array spread over multiple systems. Does such a thing exist? After hours of searching, I could find nothing that did this.

There seem to be essentially 2 types of clustering:

1) hot spare failovers
2) multiple machines operating in parallel

(Perhaps someone could enlighten me if there are proper names for these.)

It would seem that Andy's phase 2 is more like #2 and his phase 1 is more like #1 above. I'm definitely more interested in #2. I'm very interested to find something which lets me run n machines to provide a bunch of services. I don't mind if they all look like one machine or several at this point; I'm not sure if that's important to me.
What's important to me at the moment is that if I have a user on one machine that goes down that they can get right back on another machine and get at their mail or files. Of if someone is surfing our site, they just automatically get files from the server that's up. So, after thinking about dns headaches and single machines in front of a cluster, I'm totally exasperated to figure out what the right thing to do is. Does anyone know of some list of clustering software? Is there anything I can use today to do #2 that runs on freebsd (or other bsd systems)? Sorry for such a long rant, but I hope that this sparks some chitchat on this otherwise seemingly dead list. Michael Grant To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-cluster" in the body of the message From owner-freebsd-cluster Sat Nov 9 10:46:29 2002 Delivered-To: freebsd-cluster@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0F0D937B401 for ; Sat, 9 Nov 2002 10:46:27 -0800 (PST) Received: from smtp09.wxs.nl (smtp09.wxs.nl [195.121.6.38]) by mx1.FreeBSD.org (Postfix) with ESMTP id AEB3943E6E for ; Sat, 9 Nov 2002 10:46:25 -0800 (PST) (envelope-from akruijff@dds.nl) Received: from cybertron.kruijff ([213.10.151.186]) by smtp09.wxs.nl (Netscape Messaging Server 4.15) with ESMTP id H5BO5C01.BDJ; Sat, 9 Nov 2002 19:46:24 +0100 Date: Sat, 9 Nov 2002 19:45:13 +0100 From: Alex X-Mailer: The Bat! 
(v1.53d) Reply-To: Alex X-Priority: 3 (Normal) Message-ID: <4532786804.20021109194513@dds.nl> To: Michael Grant Cc: freebsd-cluster@FreeBSD.ORG Subject: Re: clustering freebsd In-Reply-To: <200211091604.gA9G4wW28126@splat.grant.org> References: <200211091604.gA9G4wW28126@splat.grant.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-cluster@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

Hello/Beste Michael,

Saturday, November 09, 2002, 5:04:58 PM, you wrote:

> I'm new to the list (but not new to unix!) I've been running freebsd
> for years now on a box I colo. I've got some clients and sell some
> services on my box. I'm becomming very interested in creating a
> smallish cluster of machines to make my little operation more
> reliable.
> One of the big things that cause me down time is upgrading the OS.
> I'm also worried about hardware failure (which luckily hasn't happened
> to me yet...) I too would like to achieve at least 5 nines.
> I read all the archives of this list back to january 2002. Andy's
> phase-2 project definitely sounds cool.
> Let's say I have a cluster of n machines. Some of those n machines
> may be running a web server, some a shell server, some mail server,
> some pop/imap mail servers...etc. How is an incoming connection sent
> to the right machine? It seems like that there needs to be a single
> machine in front of the cluster to send connections the right way,
> isn't this a single point of failure?

If this is a real problem, then look into high-availability servers. Basically you have two servers: if the first one goes down, the second takes over.

> If you do have multiple machines answering requests, how's this done?
> With multiple IP addresses?

I think you are looking for a virtual IP address. When the first server goes down, the second takes over the virtual IP address.

> I know one can specify multiple A
> records in DNS and that it'll do a sort of round-robin. But does this
> work well? What if one of the machines is down and a caching dns
> server returns an ip address of one of the down machines? Seems like
> you need then to start modifying the dns zone to take out the down
> machines and use a low ttl. This starts to get ugly quickly.
> Second problem I have been thinking about is shared disk. I read a
> post by someone who also had this concern. One obvious way to solve
> the shared disk problem is to have another box which has a bunch of
> disks in a RAID configuration, and mount the diks via nfs. This disk
> box would probably need to be highly available with redundant power
> supplies and the like.
> However, I'm not so convinced that a third disk box is the right
> answer. I'd like to see something which could mirror (in real time) a
> file system over the lan, thus keeping 2+ disks in sync just like a
> RAID array spread over multiple systems. Does such a thing exist?
> After hours of searching, I could find nothing that did this.

Keeping disks in sync is asking for trouble, but it can be done. Something like NFS is the most failure-proof. I once heard a rumor about the possible existence of something like a physically shared disk. This seems the best option, but also the most expensive.

> There seems to be essentially 2 types of clustering:
> 2) multiple machines operating in parallel

Beowulf clusters. Usually one master with a keyboard and monitor, and multiple slaves without them. All are dedicated servers.

> 1) hot spare failovers

Cow(s) = Cluster Of Workstation(s). They used to have nothing to do during the night.

> (Perhaps someone could enlighten me if there are proper names for
> these).
> It would seem that Andy's phase 2 is more like #2 and his phase 1 is
> more like #1 above. I'm definltey more insterested in #2. I'm very
> interested to find something which lets me run n machines to provide a
> a bunch of services.
I don't mind if they all look like one machine > or several at this point, I'm not sure if that's important to me. > What's important to me at the moment is that if I have a user on one > machine that goes down that they can get right back on another machine > and get at their mail or files. Of if someone is surfing our site, > they just automatically get files from the server that's up. > So, after thinking about dns headaches and single machines in front of > a cluster, I'm totally exasperated to figure out what the right thing > to do is. > Does anyone know of some list of clustering software? Is there > anything I can use today to do #2 that runs on freebsd (or other bsd > systems)? For most application it means rewriting the software for the use in a cluster. Check the port system for the strings cluster, MPI, there is thirty option but i forgot this. (you find it on some site looking for MPI; could be MVP but i'm not sure). -- Best regards/Met vriendelijke groet, Alex To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-cluster" in the body of the message From owner-freebsd-cluster Sat Nov 9 11: 6:43 2002 Delivered-To: freebsd-cluster@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 449EC37B401 for ; Sat, 9 Nov 2002 11:06:42 -0800 (PST) Received: from mgr2.xmission.com (mgr2.xmission.com [198.60.22.202]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4C31C43E42 for ; Sat, 9 Nov 2002 11:06:41 -0800 (PST) (envelope-from glewis@eyesbeyond.com) Received: from mail by mgr2.xmission.com with spam-scanned (Exim 3.35 #1) id 18Aavz-0004Bz-02 for freebsd-cluster@freebsd.org; Sat, 09 Nov 2002 12:06:20 -0700 Received: from [207.135.128.145] (helo=misty.eyesbeyond.com) by mgr2.xmission.com with esmtp (Exim 3.35 #1) id 18Aavw-00044a-02; Sat, 09 Nov 2002 12:06:17 -0700 Received: (from glewis@localhost) by misty.eyesbeyond.com (8.11.6/8.11.6) id gA9J6B878926; Sun, 10 Nov 2002 
05:36:11 +1030 (CST) (envelope-from glewis@eyesbeyond.com) X-Authentication-Warning: misty.eyesbeyond.com: glewis set sender to glewis@eyesbeyond.com using -f Date: Sun, 10 Nov 2002 05:36:11 +1030 From: Greg Lewis To: Alex Cc: Michael Grant , freebsd-cluster@FreeBSD.ORG Subject: Re: clustering freebsd Message-ID: <20021110053611.A78898@misty.eyesbeyond.com> References: <200211091604.gA9G4wW28126@splat.grant.org> <4532786804.20021109194513@dds.nl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <4532786804.20021109194513@dds.nl>; from akruijff@dds.nl on Sat, Nov 09, 2002 at 07:45:13PM +0100 X-Spam-Status: No, hits=-3.8 required=8.0 tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, SIGNATURE_SHORT_DENSE,SPAM_PHRASE_01_02,USER_AGENT, USER_AGENT_MUTT,X_AUTH_WARNING version=2.43 X-Spam-Level: Sender: owner-freebsd-cluster@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Sat, Nov 09, 2002 at 07:45:13PM +0100, Alex wrote: > > Does anyone know of some list of clustering software? Is there > > anything I can use today to do #2 that runs on freebsd (or other bsd > > systems)? > > For most application it means rewriting the software for the use in a > cluster. > > Check the port system for the strings cluster, MPI, there is > thirty option but i forgot this. (you find it on some site looking for > MPI; could be MVP but i'm not sure). PVM. There are variants of MPI too, MPICH, LAM, etc. There are even variants for specific hardware like Myrinet. You may also want to look into the PBS port, although its sort of old and should be updated to a more recent OpenPBS version (I have 2.3.14, which is free from the nastier license terms). There is also a SLURM port that has appeared recently and someone was working on a SGE port too. 
These are more HPC clustering tools though and you seem to be looking in the HA space instead. "Clustering software" is rather a broad category. There is software in that category that runs on FreeBSD, but you'll need to be more specific to get better answers :). I don't believe FreeBSD currently has anything for transparent process migration like BProc or MOSIX (although IIRC MOSIX was originally written for BSD). -- Greg Lewis Email : glewis@eyesbeyond.com Eyes Beyond Web : http://www.eyesbeyond.com Information Technology FreeBSD : glewis@FreeBSD.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-cluster" in the body of the message From owner-freebsd-cluster Sat Nov 9 11:47:40 2002 Delivered-To: freebsd-cluster@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 13D4E37B40F for ; Sat, 9 Nov 2002 11:47:39 -0800 (PST) Received: from vps.vitalit.com (vps.vitalit.com [64.105.194.69]) by mx1.FreeBSD.org (Postfix) with ESMTP id 521B143E75 for ; Sat, 9 Nov 2002 11:47:38 -0800 (PST) (envelope-from car@vitalit.com) Received: from LAPTOP (gso167-138-145.triad.rr.com [24.167.138.145]) by vps.vitalit.com (8.12.3/8.12.3) with SMTP id gA9JlapU033470 for ; Sat, 9 Nov 2002 14:47:37 -0500 (EST) (envelope-from car@vitalit.com) Message-ID: <006101c28828$bb7ebd70$0501000a@LAPTOP> From: "Rouzer, Charles A (Chuck)" To: Subject: iSCSI driver (Re: clustering freebsd) Date: Sat, 9 Nov 2002 14:46:43 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-freebsd-cluster@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG You can have two machines share a single physical SCSI bus which 
would allow two machines to share a RAID or JBOD. IMO, it would be risky to load balance between the two machines running the same applications because of data consistency. It would be good for fail-over though. Unfortunately there aren't many open-source apps (databases being the most critical) that provide high availability. I do believe PostgreSQL is working to provide replication, but this isn't needed if FreeBSD eventually gets its own iSCSI driver. I checked with the author of Vinum (FreeBSD software RAID). You could create remote (WAN) synchronous mirroring by using iSCSI and software RAID (Vinum) for mirroring the data between the local drive and a remote iSCSI drive. Chuck > > Second problem I have been thinking about is shared disk. I read a > > post by someone who also had this concern. One obvious way to solve > > the shared disk problem is to have another box which has a bunch of > > disks in a RAID configuration, and mount the diks via nfs. This disk > > box would probably need to be highly available with redundant power > > supplies and the like. > > > However, I'm not so convinced that a third disk box is the right > > answer. I'd like to see something which could mirror (in real time) a > > file system over the lan, thus keeping 2+ disks in sync just like a > > RAID array spread over multiple systems. Does such a thing exist? > > After hours of searching, I could find nothing that did this. > > Keeping disk in sync is is asking for trouble, but it can be done. > Something like NFS is the most fail proof. I ones heared a rumor about > the possible existence of something like physical shared disk. This > seems the best option but also the most expensive. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-cluster" in the body of the message From owner-freebsd-cluster Sat Nov 9 13:14:48 2002 Delivered-To: freebsd-cluster@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C06ED37B404 for ; Sat, 9 Nov 2002 13:14:45 -0800 (PST) Received: from grant.org (grant.org [206.190.164.98]) by mx1.FreeBSD.org (Postfix) with ESMTP id 29D9243E4A for ; Sat, 9 Nov 2002 13:14:45 -0800 (PST) (envelope-from mgrant@splat.grant.org) Received: from splat.grant.org (mgrant@splat.grant.org [213.39.2.177]) by grant.org (8.12.6/8.12.6) with ESMTP id gA9LEW57028347 for ; Sat, 9 Nov 2002 16:14:33 -0500 (EST) (envelope-from mgrant@splat.grant.org) Received: (from mgrant@localhost) by splat.grant.org (8.11.6+Sun/8.11.6) id gA9LDFG28495; Sat, 9 Nov 2002 22:13:15 +0100 (MET) Date: Sat, 9 Nov 2002 22:13:15 +0100 (MET) Message-Id: <200211092113.gA9LDFG28495@splat.grant.org> From: Michael Grant To: freebsd-cluster@FreeBSD.ORG Subject: Re: clustering freebsd Sender: owner-freebsd-cluster@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Alex wrote: > For most application it means rewriting the software for the use in a > cluster. That's a shame, that will make it much much harder, I don't really want to start messing with apache, imapd, or stuff like that where I would end up having to support a parrallel version. Surely there must be an easier way? > Check the port system for the strings cluster, MPI, there is > thirty option but i forgot this. (you find it on some site looking for > MPI; could be MVP but i'm not sure). then Greg Lewis wrote: > PVM. There are variants of MPI too, MPICH, LAM, etc. There are even > variants for specific hardware like Myrinet. 
> > You may also want to look into the PBS port, although its sort of old > and should be updated to a more recent OpenPBS version (I have 2.3.14, > which is free from the nastier license terms). There is also a SLURM > port that has appeared recently and someone was working on a SGE port > too. > > These are more HPC clustering tools though and you seem to be looking > in the HA space instead. Wow, alphabet soup! I admit that I've never heard of MPI, PVM, MPICH, LAM, or HPC. I have heard of beowulf. I'll do some more reading on these. Thanks. > "Clustering software" is rather a broad category. There is software > in that category that runs on FreeBSD, but you'll need to be more > specific to get better answers :). It's possible that what I want doesn't exist (yet). I would like to make something highly reliable, but not necessarily something that involves failover to a hot spare. In my mind, I'd rather have 2 or more boxes there available to answer requests rather than one sitting there uselessly until the other fails. I imagine having multiple machines, they may be running an identical image of the OS or they may each have their own OS, at any rate, I don't want to take down the entire cluster to upgrade the OS. I imagine the user's files available on all machines in the cluster. This means that if they log into a shell on one box, they modify a file, the modifications are seen by the other boxes. For example, they modify their web page, all the web servers see the change. They delete a message from their mailbox, all servers in the cluster see the change. So either the data is mirrored in real time or there's some other back end disk farm somewhere. Whether the sharing is done via iSCSI or double-ended SCSI or a LAN mirror, I'm not sure what's best. The iSCSI stuff sounds pretty neat if you could actually mount the disk read/write on more than one machine and run Vinum on them as Charles Rouzer says. 
I had never heard of this, but it sounds like what I had in mind to share the user data between machines. I have seen double-ended SCSI used in a failover situation where only one machine mounts the SCSI disk(s) read/write. I saw this at Sun nearly 10 years ago.

> I don't believe FreeBSD currently has anything for transparent process
> migration like BProc or MOSIX (although IIRC MOSIX was originally
> written for BSD).

As for process migration, that would be nice. Andy's phase-2 stuff sounds like it will do that someday. As of today, I could certainly live with having to kick the shell users off and having them log back into another server. IMAP and web users probably wouldn't notice a thing if the server they were connected to went down, maybe just a slight timeout before the clients reconnected to one of the up servers.

It's not my goal to build a cluster to run some massively parallel program to grind out an answer. My goal is oriented towards spreading out the workload amongst multiple boxes and creating something that's very highly available.

I also do not want to modify each individual application to run on the cluster. I want to run stuff that's out of the box, out of the ports collection, or something that I can just type "make install" for. Imagine if I had to modify apache by hand every time a new version of apache came out! Ok, that's not a reasonable example. Imagine that I use something from the ports collection that nobody took the time to modify to run on some freebsd cluster. I'd have to modify it and probably end up supporting it. That could be a nightmare.

Does this rambling help you understand more what I'm looking for? Does it exist today, or is Andy's phase-2 stuff what I want?

Still unanswered: for a cluster of machines, is there some front-end machine that somehow directs requests around, like NAT/IP masquerading, to the back-end machines? Alex mentioned "Virtual IP addresses". Is this something that happens at the router/hub level?
Thanks for your responses, this is getting interesting!

Michael Grant

From owner-freebsd-cluster Sat Nov 9 14:25:30 2002 Delivered-To: freebsd-cluster@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0654537B401 for ; Sat, 9 Nov 2002 14:25:29 -0800 (PST) Received: from vps.vitalit.com (vps.vitalit.com [64.105.194.69]) by mx1.FreeBSD.org (Postfix) with ESMTP id 139D343E42 for ; Sat, 9 Nov 2002 14:25:28 -0800 (PST) (envelope-from car@vitalit.com) Received: from LAPTOP (gso167-138-145.triad.rr.com [24.167.138.145]) by vps.vitalit.com (8.12.3/8.12.3) with SMTP id gA9MPQpU070633 for ; Sat, 9 Nov 2002 17:25:26 -0500 (EST) (envelope-from car@vitalit.com) Message-ID: <007601c2883e$c6d8ebd0$0501000a@LAPTOP> From: "Rouzer, Charles A (Chuck)" To: References: <200211092113.gA9LDFG28495@splat.grant.org> Subject: Re: clustering freebsd Date: Sat, 9 Nov 2002 17:24:26 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-freebsd-cluster@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

Michael,

I would suggest looking at the current load balancing, fail-over, and high availability (in order of complexity) articles available on the internet, in which people explain how they built their clusters and cover the basics of clustering. Find solutions that could work for you and implement them on your FreeBSD platform.
There really isn't a "here's a FreeBSD fail-over cluster solution", because it is usually a custom environment like the setup below, though you might find some tools that have been ported to FreeBSD to help you get to where you want to be.

One of the biggest issues to overcome is data integrity and consistency. You can't have two machines trying to update the same data at once. You can set this up now with shared SCSI, but I am planning on setting up something like this when a decent synchronous network file system is available:

Two machines, each running separately with its own active users and applications, each having local access to a mirror of the other machine's data. When the online machine notices that the offline machine has been down for n minute(s), it will mount the local mirror, bring up virtual interfaces, and start all processes with their configurations. The offline machine would ideally also have "shutdown" procedures in the event that the outage was network related.

A recovery procedure would involve updating and then mounting the local mirror on the offline machine, killing the processes on the online machine, removing/creating virtual interfaces, and starting the processes on the recovered machine.
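The takeover and recovery sequence described above could be sketched roughly as follows. This is only an illustration in Python, not an existing FreeBSD tool: the class name, the step names, and the `grace_seconds` parameter (standing in for the "n minute(s)") are all hypothetical, and the sketch only decides *which* steps to run. Actually mounting the mirror, configuring interfaces, resyncing data on recovery, and guarding against a network-only outage are left out.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical step names; each would map to a site-specific script.
TAKEOVER_STEPS = [
    "mount_local_mirror",
    "bring_up_virtual_interfaces",
    "start_peer_services",
]
RELEASE_STEPS = [
    "stop_peer_services",
    "remove_virtual_interfaces",
    "unmount_local_mirror",
]


@dataclass
class FailoverMonitor:
    """Decides when the surviving machine takes over its peer's role."""

    grace_seconds: float                 # how long the peer may be down
    down_since: Optional[float] = None   # time the peer was first seen down
    holding_peer_role: bool = False      # True after a takeover

    def observe(self, peer_alive: bool, now: float) -> List[str]:
        """Feed one health-check result; return the steps to run, if any."""
        if peer_alive:
            self.down_since = None
            if self.holding_peer_role:
                # Peer has recovered: hand its services back.
                self.holding_peer_role = False
                return list(RELEASE_STEPS)
            return []
        if self.down_since is None:
            self.down_since = now
        waited = now - self.down_since
        if not self.holding_peer_role and waited >= self.grace_seconds:
            self.holding_peer_role = True
            return list(TAKEOVER_STEPS)
        return []
```

A caller would poll the peer (ping, TCP connect, serial heartbeat, whatever fits) on a timer and pass each result to `observe()`. Keeping the decision separate from the actions makes the timing logic easy to test without touching real disks or interfaces.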
Chuck To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-cluster" in the body of the message From owner-freebsd-cluster Sat Nov 9 14:30:30 2002 Delivered-To: freebsd-cluster@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8685437B401 for ; Sat, 9 Nov 2002 14:30:28 -0800 (PST) Received: from yello.shallow.net (yello.shallow.net [203.18.243.120]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9F9C143E4A for ; Sat, 9 Nov 2002 14:30:20 -0800 (PST) (envelope-from joshua@shallow.net) Received: by yello.shallow.net (Postfix, from userid 1001) id F12C62A5B; Sun, 10 Nov 2002 09:30:08 +1100 (EST) Date: Sun, 10 Nov 2002 09:30:08 +1100 From: Joshua Goodall To: Michael Grant Cc: freebsd-cluster@FreeBSD.ORG Subject: Re: clustering freebsd Message-ID: <20021109223008.GE33758@roughtrade.net> References: <200211092113.gA9LDFG28495@splat.grant.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200211092113.gA9LDFG28495@splat.grant.org> User-Agent: Mutt/1.5.1i Sender: owner-freebsd-cluster@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Sat, Nov 09, 2002 at 10:13:15PM +0100, Michael Grant wrote: > It's possible that what I want doesn't exist (yet). I would like to > make something highly reliable, but not necessarily something that > involves failover to a hot spare. In my mind, I'd rather have 2 or > more boxes there available to answer requests rather than one sitting > there uselessly until the other fails. Typically one would specify a load balancer sitting in front of the application servers. The LB is responsible for handing off inbound connections to a destination server, using a variety of algorithms and detecting server failures. Where I have used them, I've preferred the appliance-style dedicated LB hardware (e.g. 
the Foundry ServerIron). There are a couple of software tools in the ports collection (net/balance and net/loadd) but I've not tried them.

In a true HA situation, I'd deploy two LBs that share a virtual IP address (e.g. via VRRP).

Behind the LBs is where server clustering comes in. Right now, none of the following are possible with FreeBSD out of the box:

a) Shared-mount filesystems
b) Distributed resource locking
c) Cluster membership service
d) Atomically reliable group communications
e) Single-system-image management

Together these five elements form the basis of most application clustering toolkits (e.g. TruCluster, Sun Cluster, VMS, Oracle Parallel Server).

You can do some of the above at the application layer with software tools. The Spread toolkit (www.spread.org or ports/net/spread, of which I'm the maintainer) can be used to synthesise application clusters. It'll give you (c) and (d). Spread is one of the two targets of the postgresql-r project. See also http://www.spread.org/

You'll usually need to modify applications to be cluster-aware, however, and that's not trivial. The multi-CPU systems where you don't need to are basically NUMA machines. I believe ccNUMA on FreeBSD is a long-term goal of Andy Sporner's, and good luck to him.

Joshua

--
Joshua Goodall joshua@roughtrade.net
"Your byte hit ratio is weak, old man"
"If you cache me now, I will dump more core than you can possibly imagine"
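For readers wondering what "handing off inbound connections ... and detecting server failures" amounts to, here is a minimal sketch in Python. It is not how the ServerIron, net/balance, or net/loadd work internally; the class and function names are made up for illustration, round-robin is just one of the possible algorithms, and the health check is assumed to be a plain TCP connect.

```python
import socket
from typing import Callable, List, Optional, Tuple

Backend = Tuple[str, int]  # (host, port)


def tcp_alive(backend: Backend, timeout: float = 1.0) -> bool:
    """Health check: a backend counts as up if a TCP connect succeeds."""
    try:
        with socket.create_connection(backend, timeout=timeout):
            return True
    except OSError:
        return False


class RoundRobinBalancer:
    """Picks the next live backend in rotation, skipping dead ones."""

    def __init__(self, backends: List[Backend],
                 is_alive: Callable[[Backend], bool] = tcp_alive) -> None:
        self.backends = list(backends)
        self.is_alive = is_alive
        self._next = 0  # index of the next backend to try

    def pick(self) -> Optional[Backend]:
        """Return a live backend, or None if every backend is down."""
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if self.is_alive(candidate):
                return candidate
        return None
```

A real balancer would check health in the background rather than on every pick, and would then proxy or NAT the connection to the chosen backend. The balancer itself is still a single point of failure, which is why two of them sharing a virtual IP are suggested above.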