From owner-freebsd-cluster Thu Mar 2 7:29:53 2000
Delivered-To: freebsd-cluster@freebsd.org
Received: from symbion.srrc.usda.gov (symbion.srrc.usda.gov [199.133.86.40])
    by hub.freebsd.org (Postfix) with ESMTP id 52BF137BEF6
    for ; Thu, 2 Mar 2000 07:29:43 -0800 (PST)
    (envelope-from gjohnson@nola.srrc.usda.gov)
Received: (from glenn@localhost)
    by symbion.srrc.usda.gov (8.9.3/8.9.3) id JAA73396
    for cluster@freebsd.org; Thu, 2 Mar 2000 09:29:27 -0600 (CST)
    (envelope-from glenn)
From: Glenn Johnson
Date: Thu, 2 Mar 2000 09:29:27 -0600
To: cluster@freebsd.org
Subject: Is there a cluster manager for FreeBSD that does job migration?
Message-ID: <20000302092927.A73201@symbion.srrc.usda.gov>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
Sender: owner-freebsd-cluster@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

I am currently using FreeBSD-3.4 with clusterit-1.3 on a small 4 node
cluster. I have mpich and pvm installed for parallel programs. I need
to expand the capabilities of this cluster and add another 12 nodes
to it. I do not think that the current tools I have (clusterit and
GNQS) can meet the needs I have. The parallel part of the cluster is
not a problem but I need to set it up for High Throughput Computing as
the majority of the jobs are single processor jobs, but there are a
lot of them. The availability of the cluster will also be widened to
include other users at my facility besides myself. This means that a
cluster that presents itself to the user as a single machine would be
beneficial. Also, there will be some nodes on the cluster that will
be in and out of the cluster as they are called upon for other needs
periodically so the ability to move jobs between nodes is important.

In my Internet search on this topic I have concluded that I need
something like MOSIX or Condor or a fairly sophisticated batch queuing
system like PBS or maybe DQS.
My preference would be MOSIX but of course it is only available for
Linux, as is Condor. I looked into Treadmarks
but I do not think that would be the best tool. I came across some
references to something that Sarnoff had done but all of the links to it
are stale and therefore I could not really get any useful information on
that.

Can anyone provide any information on what is possible with FreeBSD as
related to High Throughput Computing with good cluster management
software? I would be willing to sacrifice some "ease of use" features in
the cluster management software if it meant that I would not have to
switch to Linux.

Thanks in advance.

--
Glenn Johnson
Technician
USDA, ARS, SRRC
New Orleans, LA

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-cluster" in the body of the message

From owner-freebsd-cluster Thu Mar 2 9:47:56 2000
Delivered-To: freebsd-cluster@freebsd.org
Received: from acl.lanl.gov (acl.lanl.gov [128.165.147.1])
    by hub.freebsd.org (Postfix) with ESMTP id 9719237BD8C
    for ; Thu, 2 Mar 2000 09:47:49 -0800 (PST)
    (envelope-from rminnich@lanl.gov)
Received: from mini.acl.lanl.gov (root@mini.acl.lanl.gov [128.165.147.34])
    by acl.lanl.gov (8.8.8/8.8.5) with ESMTP id KAA300043;
    Thu, 2 Mar 2000 10:47:46 -0700 (MST)
Received: from localhost (rminnich@localhost)
    by mini.acl.lanl.gov (8.9.3/8.8.8) with ESMTP id KAA28902;
    Thu, 2 Mar 2000 10:47:46 -0700
X-Authentication-Warning: mini.acl.lanl.gov: rminnich owned process doing -bs
Date: Thu, 2 Mar 2000 10:47:45 -0700 (MST)
From: "Ronald G. Minnich"
X-Sender: rminnich@mini.acl.lanl.gov
To: Glenn Johnson
Cc: cluster@FreeBSD.ORG
Subject: Re: Is there a cluster manager for FreeBSD that does job migration?
In-Reply-To: <20000302092927.A73201@symbion.srrc.usda.gov>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-cluster@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Thu, 2 Mar 2000, Glenn Johnson wrote:

> I am currently using FreeBSD-3.4 with clusterit-1.3 on a small 4 node
> cluster. I have mpich and pvm installed for parallel programs. I need
> to expand the capabilities of this cluster and add another 12 nodes
> to it. I do not think that the current tools I have (clusterit and
> GNQS) can meet the needs I have. The parallel part of the cluster is
> not a problem but I need to set it up for High Throughput Computing as
> the majority of the jobs are single processor jobs, but there are a
> lot of them. The availability of the cluster will also be widened to
> include other users at my facility besides myself. This means that a
> cluster that presents itself to the user as a single machine would be
> beneficial. Also, there will be some nodes on the cluster that will
> be in and out of the cluster as they are called upon for other needs
> periodically so the ability to move jobs between nodes is important.

I'd look into PBS, that's what we use here and overall it's probably the
best you're going to do. It has some pretty neat capabilities which I just
learned about. For example, once you've scheduled the nodes you can rsh to
them. You'll see why this matters if you look at my new home page and
check out vex, the Vector EXecute tool.

> but I do not think that would be the best tool. I came across some
> references to something that Sarnoff had done but all of the links to it
> are stale and therefore I could not really get any useful information on
> that.

That was me, and I've moved. The links should be to
www.acl.lanl.gov/~rminnich.

You can do everything in freebsd just as well as you can do in Linux.
You lose some things -- some commercial tools such as PGI Fortran won't
run on FreeBSD, you don't get giganet, etc. -- but you get the freebsd
advantages, which are considerable. I'm doing Linux full time now, since
that's what the DOE has picked, but I haven't forgot my good times using
freebsd on 80 machines.

BTW, the stuff on my web page does scale to 161 nodes -- I've tested that.
So scaling on 16 nodes is quite good.

ron