From: Jeffrey Racine
To: obrien@freebsd.org
Cc: freebsd-amd64@freebsd.org, freebsd-cluster@freebsd.org
Subject: Re: LAM MPI on dual processor opteron box sees only one cpu...
Date: Wed, 21 Apr 2004 07:40:06 -0400
Organization: Syracuse University
Message-Id: <1082547606.31496.3.camel@x1-6-00-b0-d0-c2-67-0e.twcny.rr.com>
In-Reply-To: <20040420033208.GB98258@dragon.nuxi.com>
References: <024f01c41ffa$029327e0$0c03a8c0@internal.thebeatbox.org> <1081775064.990.13.camel@x1-6-00-b0-d0-c2-67-0e.twcny.rr.com> <20040420033208.GB98258@dragon.nuxi.com>

Hi David.

Thanks for your response. With the 4BSD scheduler, things run as
expected... lam with 2 processors always fires up cpu 0 and 1 and
distributes the load evenly...

  PID USERNAME PRI NICE  SIZE   RES STATE C   TIME   WCPU    CPU COMMAND
  838 jracine  101    0 5632K 2012K CPU0  0   0:02 58.94%  5.62% n_lam
  839 jracine  101    0 5616K 1968K RUN   1   0:02 57.91%  5.52% n_lam

-- Jeff

On Mon, 2004-04-19 at 23:32, David O'Brien wrote:
> On Mon, Apr 12, 2004 at 09:04:24AM -0400, Jeffrey Racine wrote:
> > Hi Roland.
> >
> > I do get CPU #1 launched. This is not the problem.
> >
> > The problem appears to be with the way that current is scheduling.
> >
> > With mpirun np 2 I get the job running on CPU 0 (two instances on one
> > proc). However, it turns out that with np 4 I get the job running on CPU
> > 0 and 1 though with 4 instances (and associated overhead). Here is top
> > for np 4... notice that in the C column it is using both procs.
> >
> >   PID USERNAME PRI NICE  SIZE   RES STATE C   TIME   WCPU    CPU COMMAND
> > 96090 jracine  131    0 7148K 2172K CPU1  1   0:19 44.53% 44.53% n_lam
> > 96088 jracine  125    0 7148K 2172K RUN   0   0:18 43.75% 43.75% n_lam
> > 96089 jracine  136    0 7148K 2172K RUN   1   0:19 42.19% 42.19% n_lam
> > 96087 jracine  135    0 7188K 2248K RUN   0   0:19 41.41% 41.41% n_lam
> >
> >
> > One run (once when I rebooted lam) did allocate the job correctly with
> > np 2, but this is not in general the case. On other systems I use,
> > however, they correctly farm out np 2 to CPU 0 and 1...
> >
> > Thanks, and any suggestions welcome.
>
> 1. Please don't top-post -- it loses context. This is a Unix list, not
>    a Mikeysoft one.
>
> 2. Have you tried with the 4.4BSD scheduler vs. the "ULE" scheduler?
>    To test, replace:
>        options SCHED_ULE     # ULE scheduler
>    with
>        options SCHED_4BSD    #4BSD scheduler
>
> -- David
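
For anyone repeating this comparison, the check done by eye above (reading the C column of top to see which CPUs the n_lam processes land on) can be sketched in a few lines of Python. This is only an illustration, not something from the thread: the function name cpus_in_use is made up, and it assumes the 12-column top layout pasted in this message, with C as the eighth column.

```python
from collections import Counter


def cpus_in_use(top_lines, command="n_lam"):
    """Count processes per CPU from top(1) rows.

    Hypothetical helper. Assumes the column layout shown in this thread:
    PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND
    """
    counts = Counter()
    for line in top_lines:
        fields = line.split()
        # Skip the header row, blank lines, and rows for other commands.
        if len(fields) != 12 or not fields[0].isdigit() or fields[-1] != command:
            continue
        counts[int(fields[7])] += 1  # index 7 is the C (CPU number) column
    return dict(counts)


# The two rows from the SCHED_4BSD run quoted above.
sample = [
    "PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND",
    "838 jracine 101 0 5632K 2012K CPU0 0 0:02 58.94% 5.62% n_lam",
    "839 jracine 101 0 5616K 1968K RUN 1 0:02 57.91% 5.52% n_lam",
]
print(cpus_in_use(sample))
```

One key per CPU means the ranks are spread out; a single key holding both processes would correspond to the behaviour originally reported under ULE with np 2.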