From owner-freebsd-cluster@FreeBSD.ORG Mon Apr 19 20:32:11 2004
Return-Path:
Delivered-To: freebsd-cluster@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 887D416A4CE;
    Mon, 19 Apr 2004 20:32:11 -0700 (PDT)
Received: from TRANG.nuxi.com (trang.nuxi.com [66.93.134.19])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 679A743D49;
    Mon, 19 Apr 2004 20:32:11 -0700 (PDT)
    (envelope-from obrien@NUXI.com)
Received: from dragon.nuxi.com (obrien@localhost [127.0.0.1])
    by TRANG.nuxi.com (8.12.11/8.12.10) with ESMTP id i3K3WA2t098483;
    Mon, 19 Apr 2004 20:32:10 -0700 (PDT)
    (envelope-from obrien@dragon.nuxi.com)
Received: (from obrien@localhost)
    by dragon.nuxi.com (8.12.11/8.12.11/Submit) id i3K3W9n5098482;
    Mon, 19 Apr 2004 20:32:09 -0700 (PDT)
    (envelope-from obrien)
Date: Mon, 19 Apr 2004 20:32:08 -0700
From: "David O'Brien"
To: Jeffrey Racine
Message-ID: <20040420033208.GB98258@dragon.nuxi.com>
References: <024f01c41ffa$029327e0$0c03a8c0@internal.thebeatbox.org>
    <1081775064.990.13.camel@x1-6-00-b0-d0-c2-67-0e.twcny.rr.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1081775064.990.13.camel@x1-6-00-b0-d0-c2-67-0e.twcny.rr.com>
User-Agent: Mutt/1.4.1i
X-Operating-System: FreeBSD 5.2-CURRENT
Organization: The NUXI BSD Group
X-Pgp-Rsa-Fingerprint: B7 4D 3E E9 11 39 5F A3 90 76 5D 69 58 D9 98 7A
X-Pgp-Rsa-Keyid: 1024/34F9F9D5
cc: freebsd-amd64@freebsd.org
cc: freebsd-cluster@freebsd.org
Subject: Re: LAM MPI on dual processor opteron box sees only one cpu...
X-BeenThere: freebsd-cluster@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: obrien@freebsd.org
List-Id: Clustering FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 20 Apr 2004 03:32:11 -0000

On Mon, Apr 12, 2004 at 09:04:24AM -0400, Jeffrey Racine wrote:
> Hi Roland.
>
> I do get CPU #1 launched. This is not the problem.
>
> The problem appears to be with the way that current is scheduling.
>
> With mpirun -np 2 I get the job running on CPU 0 (two instances on one
> proc). However, it turns out that with np 4 I get the job running on CPU
> 0 and 1 though with 4 instances (and associated overhead). Here is top
> for np 4... notice that in the C column it is using both procs.
>
>   PID USERNAME PRI NICE  SIZE   RES STATE C  TIME   WCPU    CPU COMMAND
> 96090 jracine  131    0 7148K 2172K CPU1  1  0:19 44.53% 44.53% n_lam
> 96088 jracine  125    0 7148K 2172K RUN   0  0:18 43.75% 43.75% n_lam
> 96089 jracine  136    0 7148K 2172K RUN   1  0:19 42.19% 42.19% n_lam
> 96087 jracine  135    0 7188K 2248K RUN   0  0:19 41.41% 41.41% n_lam
>
>
> One run (once when I rebooted lam) did allocate the job correctly with
> np 2, but this is not in general the case. On other systems I use,
> however, they correctly farm out np 2 to CPU 0 and 1...
>
> Thanks, and any suggestions welcome.

1. Please don't top-post -- it loses context. This is a Unix list, not
   a Mikeysoft one.

2. Have you tried with the 4.4BSD scheduler vs. the "ULE" scheduler?
   To test, replace:
       options SCHED_ULE     # ULE scheduler
   with
       options SCHED_4BSD    # 4BSD scheduler

-- David

From owner-freebsd-cluster@FreeBSD.ORG Wed Apr 21 04:40:16 2004
Return-Path:
Delivered-To: freebsd-cluster@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 8DB2516A4CE;
    Wed, 21 Apr 2004 04:40:16 -0700 (PDT)
Received: from ms-smtp-01.nyroc.rr.com (ms-smtp-01.nyroc.rr.com [24.24.2.55])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 0EECF43D5D;
    Wed, 21 Apr 2004 04:40:16 -0700 (PDT)
    (envelope-from jracine@maxwell.syr.edu)
Received: from [24.59.145.52] (syr-24-59-145-52.twcny.rr.com [24.59.145.52])
    i3LBeCdd018121; Wed, 21 Apr 2004 07:40:12 -0400 (EDT)
From: Jeffrey Racine
To: obrien@freebsd.org
In-Reply-To: <20040420033208.GB98258@dragon.nuxi.com>
References: <024f01c41ffa$029327e0$0c03a8c0@internal.thebeatbox.org>
    <1081775064.990.13.camel@x1-6-00-b0-d0-c2-67-0e.twcny.rr.com>
    <20040420033208.GB98258@dragon.nuxi.com>
Content-Type: text/plain
Organization: Syracuse University
Message-Id: <1082547606.31496.3.camel@x1-6-00-b0-d0-c2-67-0e.twcny.rr.com>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6
Date: Wed, 21 Apr 2004 07:40:06 -0400
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: Symantec AntiVirus Scan Engine
cc: freebsd-amd64@freebsd.org
cc: freebsd-cluster@freebsd.org
Subject: Re: LAM MPI on dual processor opteron box sees only one cpu...
X-BeenThere: freebsd-cluster@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Clustering FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 21 Apr 2004 11:40:16 -0000

Hi David.

Thanks for your response.

With the 4BSD scheduler, things run as expected... lam with 2 processors
always fires up cpu 0 and 1 and distributes the load evenly...

  PID USERNAME PRI NICE  SIZE   RES STATE C  TIME   WCPU    CPU COMMAND
  838 jracine  101    0 5632K 2012K CPU0  0  0:02 58.94%  5.62% n_lam
  839 jracine  101    0 5616K 1968K RUN   1  0:02 57.91%  5.52% n_lam

-- Jeff

From owner-freebsd-cluster@FreeBSD.ORG Wed Apr 21 04:50:27 2004
Return-Path:
Delivered-To: freebsd-cluster@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 91EA516A4CE;
    Wed, 21 Apr 2004 04:50:27 -0700 (PDT)
Received: from ms-smtp-04.nyroc.rr.com (ms-smtp-04.nyroc.rr.com [24.24.2.58])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 2696D43D3F;
    Wed, 21 Apr 2004 04:50:27 -0700 (PDT)
    (envelope-from jracine@maxwell.syr.edu)
Received: from [24.59.145.52] (syr-24-59-145-52.twcny.rr.com [24.59.145.52])
    i3LBoNMY003602; Wed, 21 Apr 2004 07:50:24 -0400 (EDT)
From: Jeffrey Racine
To: obrien@freebsd.org
In-Reply-To: <20040420033208.GB98258@dragon.nuxi.com>
References: <024f01c41ffa$029327e0$0c03a8c0@internal.thebeatbox.org>
    <1081775064.990.13.camel@x1-6-00-b0-d0-c2-67-0e.twcny.rr.com>
    <20040420033208.GB98258@dragon.nuxi.com>
Content-Type: text/plain
Organization: Syracuse University
Message-Id: <1082548217.31726.1.camel@x1-6-00-b0-d0-c2-67-0e.twcny.rr.com>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6
Date: Wed, 21 Apr 2004 07:50:18 -0400
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: Symantec AntiVirus Scan Engine
cc: freebsd-amd64@freebsd.org
cc: freebsd-cluster@freebsd.org
Subject: Re: LAM MPI on dual processor opteron box sees only one cpu...
X-BeenThere: freebsd-cluster@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Clustering FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 21 Apr 2004 11:50:27 -0000

Hi David.

It runs fine with the 4BSD scheduler and distributes the load evenly...
here is top with 4BSD doing the scheduling...

  PID USERNAME PRI NICE  SIZE   RES STATE C  TIME   WCPU    CPU COMMAND
 6105 jracine  107    0 5616K 1968K CPU0  0  0:06 95.70% 21.19% n_lam
 6104 jracine  107    0 5632K 2012K RUN   1  0:06 95.48% 21.14% n_lam

Thanks ever so much for your kind response.

-- Jeff

From owner-freebsd-cluster@FreeBSD.ORG Thu Apr 22 04:34:39 2004
Return-Path:
Delivered-To: freebsd-cluster@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 28B6516A4CE
    for ; Thu, 22 Apr 2004 04:34:39 -0700 (PDT)
Received: from mail.fibertel.com.ar (mta4.fibertel.com.ar [24.232.0.164])
    by mx1.FreeBSD.org (Postfix) with ESMTP id E54DD43D31
    for ; Thu, 22 Apr 2004 04:34:38 -0700 (PDT)
    (envelope-from diabolo@fibertel.com.ar)
Received: from [10.0.10.2] (200.89.154.233)
    by mail.fibertel.com.ar (7.0.019) (authenticated as diabolo)
    id 4084459D001039A6 for freebsd-cluster@freebsd.org;
    Thu, 22 Apr 2004 08:34:38 -0300
From: Diabolo
To: freebsd-cluster@freebsd.org
Content-Type: text/plain
Message-Id: <1082634059.1117.5.camel@debianito.inferno>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6
Date: Thu, 22 Apr 2004 08:40:59 -0300
Content-Transfer-Encoding: 7bit
Subject: Best tool
X-BeenThere: freebsd-cluster@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Clustering FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 22 Apr 2004 11:34:39 -0000

Hi guys,

What is the best tool for building a cluster like openMosix on Linux?
Is LAM the best?

Regards.
Diabolo.-

From owner-freebsd-cluster@FreeBSD.ORG Thu Apr 22 05:43:19 2004
Return-Path:
Delivered-To: freebsd-cluster@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id E9ACE16A4CE
    for ; Thu, 22 Apr 2004 05:43:19 -0700 (PDT)
Received: from mwinf0403.wanadoo.fr (smtp4.wanadoo.fr [193.252.22.27])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 6462B43D5F
    for ; Thu, 22 Apr 2004 05:43:19 -0700 (PDT)
    (envelope-from saut@mip.ups-tlse.fr)
Received: from banquo.homeip.net (AToulouse-104-1-4-23.w80-11.abo.wanadoo.fr [80.11.126.23])
    by mwinf0403.wanadoo.fr (SMTP Server) with ESMTP id 2B5D35000694;
    Thu, 22 Apr 2004 14:43:18 +0200 (CEST)
Received: from mip.ups-tlse.fr (banquo.homeip.net [127.0.0.1])
    by banquo.homeip.net (Postfix) with ESMTP id 4D3C698;
    Thu, 22 Apr 2004 14:43:21 +0200 (CEST)
Message-ID: <4087BDE9.2090002@mip.ups-tlse.fr>
Date: Thu, 22 Apr 2004 14:43:21 +0200
From: Olivier Saut
User-Agent: Mozilla Thunderbird 0.5 (X11/20040405)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Diabolo
References: <1082634059.1117.5.camel@debianito.inferno>
In-Reply-To: <1082634059.1117.5.camel@debianito.inferno>
X-Enigmail-Version: 0.83.6.0
X-Enigmail-Supports: pgp-inline, pgp-mime
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
cc: freebsd-cluster@freebsd.org
Subject: Re: Best tool
X-BeenThere: freebsd-cluster@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Clustering FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 22 Apr 2004 12:43:20 -0000

Diabolo wrote:
> Hi guys what is the best tool to make cluster like OPENMOSIX in linux
> LAM its the best???

AFAIK there is no equivalent to Mosix on FreeBSD (though I think Mosix
was initially developed on a BSD OS).
There is no checkpointing in FBSD but there is a working implementation
in DragonFly BSD (which I'd like to see backported to FBSD but I am far
from being able to do the job).

If you just want to run MPI apps, ports/net/mpich runs fine. I have also
worked flawlessly with LAM but it is not in the ports system.

Regards,
- Olivier

From owner-freebsd-cluster@FreeBSD.ORG Thu Apr 22 05:53:19 2004
Return-Path:
Delivered-To: freebsd-cluster@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 838A516A4CE
    for ; Thu, 22 Apr 2004 05:53:19 -0700 (PDT)
Received: from otter3.centtech.com (moat3.centtech.com [207.200.51.50])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 00E5C43D46
    for ; Thu, 22 Apr 2004 05:53:19 -0700 (PDT)
    (envelope-from anderson@centtech.com)
Received: from centtech.com (neutrino.centtech.com [10.177.171.220])
    by otter3.centtech.com (8.12.3/8.12.3) with ESMTP id i3MCrIE8088077
    for ; Thu, 22 Apr 2004 07:53:18 -0500 (CDT)
    (envelope-from anderson@centtech.com)
Message-ID: <4087C033.5010501@centtech.com>
Date: Thu, 22 Apr 2004 07:53:07 -0500
From: Eric Anderson
User-Agent: Mozilla Thunderbird 0.5 (X11/20040406)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: freebsd-cluster@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Clustering NFS servers
X-BeenThere: freebsd-cluster@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Clustering FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 22 Apr 2004 12:53:19 -0000

Has anyone done anything like this? Has anyone heard of Polycom's
clustered NFS solution? We are a heavy NFS usage shop, and I'd like to
distribute my load.

Eric

--
------------------------------------------------------------------
Eric Anderson     Sr. Systems Administrator     Centaur Technology
Today is the tomorrow you worried about yesterday.
------------------------------------------------------------------

From owner-freebsd-cluster@FreeBSD.ORG Sat Apr 24 04:17:43 2004
Return-Path:
Delivered-To: freebsd-cluster@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id BC6FE16A4CE
    for ; Sat, 24 Apr 2004 04:17:43 -0700 (PDT)
Received: from ms-smtp-03.nyroc.rr.com (ms-smtp-03.nyroc.rr.com [24.24.2.57])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 1513F43D5C
    for ; Sat, 24 Apr 2004 04:17:43 -0700 (PDT)
    (envelope-from jracine@maxwell.syr.edu)
Received: from [24.59.145.52] (syr-24-59-145-52.twcny.rr.com [24.59.145.52])
    i3OBHef2013110 for ; Sat, 24 Apr 2004 07:17:40 -0400 (EDT)
From: Jeffrey Racine
To: freebsd-cluster@freebsd.org
In-Reply-To: 1082634059.1117.5.camel@debianito.inferno
Content-Type: text/plain
Organization: Syracuse University
Message-Id: <1082805406.14410.1.camel@x1-6-00-b0-d0-c2-67-0e.twcny.rr.com>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6
Date: Sat, 24 Apr 2004 07:16:46 -0400
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: Symantec AntiVirus Scan Engine
Subject: Best tool
X-BeenThere: freebsd-cluster@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Clustering FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 24 Apr 2004 11:17:43 -0000

Hi.

Olivier Saut wrote:
> I have also worked flawlessly with LAM but it is not in the ports system.

Just a quick note to point out that LAM is indeed in the ports system...

/usr/ports/net/lam

-- Jeff
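
A note on the scheduler thread above ("LAM MPI on dual processor opteron
box sees only one cpu..."): the placement Jeff reads off the C column of
top can also be tallied mechanically. What follows is only a minimal
sketch in Python, assuming the 12-column SMP top layout quoted in that
thread. The function name procs_per_cpu and the TOP_ROWS sample are made
up for illustration and are not part of LAM or FreeBSD.

```python
from collections import Counter

# Sample rows copied from the "np 4" run under ULE quoted in the thread.
TOP_ROWS = """\
96090 jracine 131 0 7148K 2172K CPU1 1 0:19 44.53% 44.53% n_lam
96088 jracine 125 0 7148K 2172K RUN 0 0:18 43.75% 43.75% n_lam
96089 jracine 136 0 7148K 2172K RUN 1 0:19 42.19% 42.19% n_lam
96087 jracine 135 0 7188K 2248K RUN 0 0:19 41.41% 41.41% n_lam
"""


def procs_per_cpu(top_text, command="n_lam"):
    """Count rows of `command` per CPU number, read from the C column."""
    counts = Counter()
    for line in top_text.splitlines():
        fields = line.split()
        # Assumed layout (hypothetical helper, matches the thread only):
        # PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND
        if len(fields) != 12 or not fields[0].isdigit():
            continue  # skip the header row and anything that does not fit
        if fields[11] == command:
            counts[int(fields[7])] += 1  # fields[7] is the C column
    return dict(counts)


if __name__ == "__main__":
    print(procs_per_cpu(TOP_ROWS))
```

For the rows above, both CPU 0 and CPU 1 come out with two n_lam
processes each, which matches what Jeff describes for np 4. Feeding it
the rows from an np 2 run would show whether both ranks landed on the
same CPU.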