Date:      Tue, 28 Dec 2004 05:37:36 +0000 (UTC)
From:      Kris Kennaway <kris@FreeBSD.org>
To:        ports-committers@FreeBSD.org, cvs-ports@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   cvs commit: ports/Tools/portbuild/scripts getmachine
Message-ID:  <200412280537.iBS5bauB093350@repoman.freebsd.org>

kris        2004-12-28 05:37:36 UTC

  FreeBSD ports repository

  Added files:
    Tools/portbuild/scripts getmachine 
  Log:
  Overhaul of the job scheduler.  The new scheduler runs builds
  synchronously instead of probabilistically scheduling jobs, which
  means that the job load on a machine never exceeds a desired
  threshold, and we can preferentially use faster machines when they are
  available.  This has a dramatic effect on package build throughput,
  although I don't yet have precise measurements of the performance
  improvements.
  
  Specifically, the changes are:
  
  * Introduce the new variable maxjobs in portbuild.  This replaces the
  build scheduling weights previously listed in the mlist file, which
  now changes format to list the build machines only, ranked in order of
  preference for job dispatches (i.e. faster machines first).
  
  * The ${arch}/queue directory is used to list machines available for
  jobs (file content is the number of jobs currently running on the
  machine).  Changes to files in this directory are serialized using
  lockf on the .lock file.
  
  * Claim a machine with the getmachine script, with the .lock held.
  This picks the machine with the fewest jobs running; if several
  machines have equal load, the one listed highest in the mlist file
  wins.  The job counter is incremented, and the file is removed if
  the counter reaches ${maxjobs} for that machine.  If all machines are
  busy, sleep for 15 seconds and retry.
  
  * After we have claimed a machine, we run claim-chroot on it to claim
  an empty chroot, as before.  If the claim fails, release the job from
  the queue with the releasemachine script and retry after a 15 second
  wait.
  
  * When the build is finished, decrement the job counter with the
  releasemachine script, with .lock held.
  
  * The checkmachines script now exists only to poll the load averages
  for admin convenience (every 2 minutes), and to ping for unreachable
  machines.  When a machine cannot be reached, remove the entry in the
  queue directory to stop further job dispatches to it.  This needs more
  work to deal with reinitialization of machines after they become
  available again.
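
  The claim and release steps described above can be sketched roughly as
  follows.  This is an illustration only, not the committed getmachine or
  releasemachine scripts: it is written in Python rather than sh, it uses
  fcntl.flock in place of the lockf utility, and the names queue_lock,
  pick_machine, claim_machine and release_machine are hypothetical.  It
  also assumes that maxjobs is available as a per-machine mapping, and
  that a machine whose queue file is missing at release time was
  previously full.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def queue_lock(queue_dir):
    """Hold an exclusive lock on queue_dir/.lock while the block runs."""
    with open(os.path.join(queue_dir, ".lock"), "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def pick_machine(loads, mlist):
    """Return the least-loaded machine; ties go to the earliest mlist entry."""
    candidates = [m for m in mlist if m in loads]
    if not candidates:
        return None  # every machine is busy or unavailable
    # min() returns the first minimal item, so mlist order breaks ties.
    return min(candidates, key=lambda m: loads[m])


def claim_machine(queue_dir, mlist, maxjobs):
    """Claim one job slot; return the machine name, or None if all are busy."""
    with queue_lock(queue_dir):
        loads = {}
        for m in mlist:
            path = os.path.join(queue_dir, m)
            if os.path.exists(path):
                with open(path) as f:
                    loads[m] = int(f.read().strip() or 0)
        machine = pick_machine(loads, mlist)
        if machine is None:
            return None
        jobs = loads[machine] + 1
        path = os.path.join(queue_dir, machine)
        if jobs >= maxjobs[machine]:
            os.remove(path)  # machine is full: take it out of the queue
        else:
            with open(path, "w") as f:
                f.write(str(jobs))
        return machine


def release_machine(queue_dir, machine, maxjobs):
    """Decrement the job counter for a machine once its build has finished."""
    with queue_lock(queue_dir):
        path = os.path.join(queue_dir, machine)
        if os.path.exists(path):
            with open(path) as f:
                jobs = int(f.read().strip() or 0)
        else:
            # Assumption: a missing file means the machine was at maxjobs.
            jobs = maxjobs[machine]
        with open(path, "w") as f:
            f.write(str(max(jobs - 1, 0)))
```

  A caller would loop on claim_machine, sleeping 15 seconds whenever it
  returns None, and call release_machine if claim-chroot fails or when
  the build completes.  The sketch does not distinguish a full machine
  from one that checkmachines removed as unreachable, which is the
  reinitialization gap noted above.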
  
  Revision  Changes    Path
  1.1       +53 -0     ports/Tools/portbuild/scripts/getmachine (new)
