From: Andy Sporner
Date: Wed, 19 Jun 2002 10:59:26 +0200
To: Derek Barrett, freebsd-cluster
Subject: Re: Application cluster
Message-ID: <3D1047EE.4000505@nentec.de>
References: <20020618172808.25913.qmail@graffiti.net>

Hi Derek,

>hahahahaha well as a fellow American then I should
>have replied, "Thanks partner! USA!"
>
Well, I am relieved to see that this is still something that exists!
Today I left the main FreeBSD hackers list because I found it to be
a shade too cut-throat. These days, with what little time I have, the
last thing I want is not to be taken seriously. I regard email as a
primary means of communication, and for some people to just pretend the
mail never arrived is hard to tolerate (especially when I send it
directly to the individual--2-3 times). I have the feeling that people
are too childish to face things directly. That is one of the reasons I
am working here--it is a better environment.
-- but enough flaming for the moment ;-)

>
>I don't think you should dismiss your scripts that
>"only start and stop" as being laughable. To me,
>that's 75% of the battle. I know I've spent hours
>at times just getting my startup scripts to work
>properly, missing a switch here or there, the trial
>and error involved in that is a lot sometimes. And
>getting a RELIABLE method of monitoring the other
>servers has still been a challenge for everyone.
>
Thanks for your compliment...

>
>Truly, getting a failover
>server to successfully take over means:
>
>1) Reduced late night phone calls
>2) Not having to make as many late night phone calls :-D
>
I used to work at Hyatt's central computer division in Chicago, where I
was on "pager duty" many times, so yes, I can sympathize with you!

>
>And most of these types
>of scripts depend on having a second network card
>and a serial cable as well. The Linux HA
>servers even have a controlling server for the entire
>cluster called a Director. That your mechanism goes
>across a network card is nice, the less overhead, the better.
>
I allow the configuration of network addresses for each node. A
heartbeat message is sent out over any and all links that are present
for the server. The whole thing with the serial cable seems rather
archaic. I mean, if the networking layer has failed, the server is
probably not that useful anyway!

>
>I mean, a couple thousand dollar hardware failover solution
>is nice, but so would a Ferrari as a company car. I recently worked
>in a high uptime environment, and every single server there had
>an identical backup, run by a hardware failover switch, and
>let me tell you, I got really SPOILED. The amount of
>stress relief that those failover switches provided made troubleshooting
>and maintenance a breeze.
>
Funny thing: I worked for about 5 years with Sequent clusters, and in
their earlier versions (< 2.0) the stand-alone machine was more reliable
than the same machine in a cluster.
At that time they never really got more than 2 nodes working right.
That is why I started with 3 nodes from the beginning: it changes the
dynamics remarkably, and these same dynamics work very well on 2 nodes
too.

>
>
>Let me see what I can come up with for a place for you to post your file.
>
Meanwhile I need to dust off the work I was doing (I had modularized it
with DSO support and added some process stats collecting so that you can
monitor processes on any node of the cluster from one point). I also
improved the build environment--it was previously very rickety.

Andy
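
A rough sketch of the heartbeat idea mentioned earlier in this message, where a node sends a heartbeat over every link configured for it. This is only an illustration, not the actual cluster code: the UDP transport, the port number, the payload (just the node name), the timeout, and the names send_heartbeat and PeerTable are all assumptions made up for the example.

```python
import socket
import time

# Hypothetical values: the message gives no port, wire format, or timeout.
HEARTBEAT_PORT = 9999
DEAD_AFTER = 3.0  # seconds without a heartbeat before a peer counts as down


def send_heartbeat(node_name, links, port=HEARTBEAT_PORT):
    """Send one heartbeat datagram out of every configured link.

    links is a list of (local_addr, peer_addr) pairs, one per interface.
    Returns the local addresses on which the send succeeded.
    """
    sent = []
    for local_addr, peer_addr in links:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            # Binding to the local address uses that link's address as source.
            sock.bind((local_addr, 0))
            sock.sendto(node_name.encode("ascii"), (peer_addr, port))
            sent.append(local_addr)
        except OSError:
            # A dead link must not stop heartbeats on the remaining links.
            pass
        finally:
            sock.close()
    return sent


class PeerTable:
    """Remembers when each peer was last heard from, on any link."""

    def __init__(self, dead_after=DEAD_AFTER):
        self.dead_after = dead_after
        self.last_seen = {}

    def record(self, node_name, now=None):
        """Note that a heartbeat from node_name just arrived."""
        self.last_seen[node_name] = time.monotonic() if now is None else now

    def alive(self, node_name, now=None):
        """True if node_name sent a heartbeat within the timeout."""
        now = time.monotonic() if now is None else now
        seen = self.last_seen.get(node_name)
        return seen is not None and (now - seen) <= self.dead_after
```

With this layout a peer counts as alive as long as a heartbeat arrives on at least one link, so losing a single network card does not trigger a failover by itself.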