From owner-freebsd-current@FreeBSD.ORG  Wed Aug 15 19:18:52 2007
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DF0EB16A468
	for <freebsd-current@freebsd.org>; Wed, 15 Aug 2007 19:18:51 +0000 (UTC)
	(envelope-from kris@obsecurity.org)
Received: from elvis.mu.org (elvis.mu.org [192.203.228.196])
	by mx1.freebsd.org (Postfix) with ESMTP id C5F0E13C457
	for <freebsd-current@freebsd.org>; Wed, 15 Aug 2007 19:18:51 +0000 (UTC)
	(envelope-from kris@obsecurity.org)
Received: from rot26.obsecurity.org (elvis.mu.org [192.203.228.196])
	by elvis.mu.org (Postfix) with ESMTP id 91ABD1A4D7C;
	Wed, 15 Aug 2007 12:17:23 -0700 (PDT)
Received: by rot26.obsecurity.org (Postfix, from userid 1001)
	id 8D556C3EC; Wed, 15 Aug 2007 15:18:50 -0400 (EDT)
Date: Wed, 15 Aug 2007 15:18:50 -0400
From: Kris Kennaway <kris@obsecurity.org>
To: Erik Cederstrand <erik@cederstrand.dk>
Message-ID: <20070815191850.GA74746@rot26.obsecurity.org>
References: <46C2C19D.9090700@cederstrand.dk>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="9jxsPFA5p3P2qPhR"
Content-Disposition: inline
In-Reply-To: <46C2C19D.9090700@cederstrand.dk>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-current@freebsd.org
Subject: Re: Feedback for performance tracker
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Aug 2007 19:18:52 -0000


--9jxsPFA5p3P2qPhR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Aug 15, 2007 at 11:04:29AM +0200, Erik Cederstrand wrote:
> Hi!
>
> This autumn, we have decided to grab the Performance Tracker entry[1]
> from the project ideas page and give it a spin as a subject for our
> thesis at the IT University of Copenhagen. The tracker intends to fill a
> hole in the range of tinderboxes and automatic stress/regression tests
> that FreeBSD already has.
>
> The initial idea is to have a small collection of servers constantly
> performing benchmarks and publishing the results to a server with a web
> interface.
>
> Before we start coding, we'd like to ask a couple of questions:
>
> 1) Which benchmarks would you like to see being run?
> 2) Which tests do you perform regularly, which the tracker could automate?
> 3) Which features in the web interface would you find most helpful?
>
> Also, we'd greatly appreciate pointers to previous work in the area.
>
> We welcome all comments and suggestions, but please bear in mind that we
> only have around 3 months full-time to develop the tracker.

Hi,

Thanks for your interest in the project.  I have some recommendations
for how to approach it:

* Don't focus on the individual benchmarks; focus instead on the
  framework for accumulating and analysing the data.  There are lots
  of benchmarks we may want to plug into this over time, so developing
  a flexible and extensible system for doing this is more important
  than any given benchmark.

* I imagine a system where data from benchmark systems (which will be
  geographically remote) is fed into a database that tracks multiple
  data sets over time.  A front end would provide an interface into
  this database and allow for various analyses and visualizations of
  the data.

* The system should allow for annotation of data, for example to
  provide explanations for sudden jumps in performance when they are
  understood.

* Data sets may be multi-dimensional (e.g. tracking a performance
  metric like network throughput as various parameters like packet
  size, number of concurrent streams, etc., are changed).  In most
  cases we are also interested in changes over time.

* There may be parametric and non-parametric variables.  An example of
  a parametric variable would be "size of a network packet" (i.e. a
  numerical parameter which takes values over some range).  A
  non-parametric variable might be "kernel built with option X, or
  option Y, or option Z".  It makes sense to visualize parametric
  data as a continuous function, e.g. by plotting it as a curve on a
  graph or by fitting the data to a function.  It makes less sense to
  treat non-parametric data that way.

* Data sets are typically noisy.  They need to be analysed with
  statistical techniques to extract a signal (if any), which will
  usually be tiny over short periods but may accumulate over longer
  ones.  A background in statistics will be most useful here.

* An ideal front-end would be able to apply appropriate statistical
  and data visualization techniques to cross-sections of the data to
  answer questions like "have there been any statistically significant
  changes to this data set (or subset) over time, and if so, when did
  they occur?".

* There is likely to be significant prior art in all of this, but I
  don't know what any of it is.  The HDF data format
  http://hdf.ncsa.uiuc.edu/ and related tools might be interesting to
  investigate, but I don't really know anything about it, so it might
  be too heavy-weight.  Perhaps some of our scientific computing users
  can make some suggestions.

* Start small.  You should keep an eye on the bigger picture, such as
  what I suggest, but don't try to bite it all off at once.  For
  example, you could start by limiting yourself to recording and
  analysing data sets that contain only a single data point changing
  over time (while hopefully not limiting future expansion), because
  even that will be a useful beginning.
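
To make the database idea a bit more concrete, here is a minimal
sketch in Python using the standard sqlite3 module.  Everything in it
is an assumption for illustration: the table and column names, the
record(), annotate() and series() helpers, and the choice of SQLite
itself.  It stores one row per measurement, any number of parametric
(numeric) or non-parametric (text) variables per measurement, and
free-form annotations:

```python
import sqlite3

# Hypothetical schema: every table and column name is illustrative only.
SCHEMA = """
CREATE TABLE dataset (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    unit TEXT NOT NULL
);
CREATE TABLE measurement (
    id         INTEGER PRIMARY KEY,
    dataset_id INTEGER NOT NULL REFERENCES dataset(id),
    taken_at   TEXT NOT NULL,   -- ISO 8601 date, so text order = time order
    host       TEXT NOT NULL,   -- which remote benchmark machine sent it
    value      REAL NOT NULL
);
-- One row per variable per measurement, so a data set can have any
-- number of dimensions.  num_value holds parametric variables,
-- text_value holds non-parametric ones; exactly one of them is set.
CREATE TABLE variable (
    measurement_id INTEGER NOT NULL REFERENCES measurement(id),
    name           TEXT NOT NULL,
    num_value      REAL,
    text_value     TEXT,
    CHECK ((num_value IS NULL) <> (text_value IS NULL))
);
CREATE TABLE annotation (
    dataset_id INTEGER NOT NULL REFERENCES dataset(id),
    taken_at   TEXT NOT NULL,
    note       TEXT NOT NULL
);
"""


def _is_numeric(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def _dataset_id(db, name):
    row = db.execute("SELECT id FROM dataset WHERE name = ?", (name,)).fetchone()
    if row is None:
        raise KeyError(name)
    return row[0]


def record(db, dataset, unit, taken_at, host, value, **variables):
    """Store one measurement together with the variables it was taken under."""
    db.execute("INSERT OR IGNORE INTO dataset(name, unit) VALUES (?, ?)",
               (dataset, unit))
    cur = db.execute(
        "INSERT INTO measurement(dataset_id, taken_at, host, value) "
        "VALUES (?, ?, ?, ?)",
        (_dataset_id(db, dataset), taken_at, host, value))
    m_id = cur.lastrowid
    for name, v in variables.items():
        num = v if _is_numeric(v) else None
        text = None if _is_numeric(v) else str(v)
        db.execute("INSERT INTO variable VALUES (?, ?, ?, ?)",
                   (m_id, name, num, text))
    return m_id


def annotate(db, dataset, taken_at, note):
    """Attach an explanation (e.g. for a sudden jump) to a point in time."""
    db.execute("INSERT INTO annotation VALUES (?, ?, ?)",
               (_dataset_id(db, dataset), taken_at, note))


def series(db, dataset, **fixed):
    """Return [(taken_at, value), ...] for the cross-section where every
    variable named in `fixed` has the given value."""
    sql = ("SELECT m.taken_at, m.value FROM measurement m "
           "JOIN dataset d ON d.id = m.dataset_id WHERE d.name = ?")
    args = [dataset]
    for name, v in fixed.items():
        col = "num_value" if _is_numeric(v) else "text_value"
        sql += (" AND EXISTS (SELECT 1 FROM variable v"
                " WHERE v.measurement_id = m.id"
                f" AND v.name = ? AND v.{col} = ?)")
        args += [name, v if _is_numeric(v) else str(v)]
    sql += " ORDER BY m.taken_at, m.id"
    return db.execute(sql, args).fetchall()
```

A cross-section such as series(db, "tcp-throughput", packet_size=64)
then yields a single value changing over time, which is the simple
case worth starting with.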

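As a rough illustration of the "did anything change, and when?"
question, the sketch below scans every possible split point in a
single time series, scores each split with Welch's t statistic, and
uses a permutation test to estimate how often pure noise would produce
a split at least as extreme.  This is just one simple approach, not a
recommendation; the function names and defaults are hypothetical, and
it assumes at most one change and independent measurements:

```python
import math
import random


def _mean_var(xs):
    """Sample mean and unbiased sample variance (needs len(xs) >= 2)."""
    n = len(xs)
    m = sum(xs) / n
    return m, sum((x - m) ** 2 for x in xs) / (n - 1)


def welch_t(before, after):
    """Welch's t statistic for mean(after) - mean(before)."""
    m_a, v_a = _mean_var(before)
    m_b, v_b = _mean_var(after)
    diff = m_b - m_a
    se2 = v_a / len(before) + v_b / len(after)
    if se2 == 0:
        # Both segments are constant: no change, or an unmistakable one.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(se2)


def best_split(values, min_seg=3):
    """Return (index, t) for the split with the largest |t|.

    `index` is the position of the first point of the "after" segment.
    Returns (None, 0.0) if the series is too short to split.
    """
    best_idx, best_t = None, 0.0
    for i in range(min_seg, len(values) - min_seg + 1):
        t = welch_t(values[:i], values[i:])
        if best_idx is None or abs(t) > abs(best_t):
            best_idx, best_t = i, t
    return best_idx, best_t


def detect_change(values, min_seg=3, trials=2000, seed=0):
    """Return (index, t, p) for the most likely single change point.

    p is the estimated probability that a random reordering of the same
    values yields a best split at least as extreme.  Comparing against
    the best split of each shuffle accounts for having tried every
    split point.
    """
    idx, t = best_split(values, min_seg)
    if idx is None:
        return None, 0.0, 1.0
    rng = random.Random(seed)
    shuffled = list(values)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        _, t_perm = best_split(shuffled, min_seg)
        if abs(t_perm) >= abs(t):
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return idx, t, (hits + 1) / (trials + 1)
```

A small p suggests a real shift at the reported index, which is then
a natural place to attach an annotation once the cause is understood.
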
Kris

--9jxsPFA5p3P2qPhR
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFGw1GaWry0BWjoQKURAvZIAKCEyR9SB5tDenW3IBeAL5bkAtAyfQCgyI6w
Vpy5c3M0b/d8GPu82akJfmg=
=Ou/p
-----END PGP SIGNATURE-----

--9jxsPFA5p3P2qPhR--