From owner-freebsd-hackers@FreeBSD.ORG  Tue Jan  7 17:01:39 2014
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
MIME-Version: 1.0
X-Received: by 10.52.157.68 with SMTP id wk4mr9308191vdb.19.1389114098215;
 Tue, 07 Jan 2014 09:01:38 -0800 (PST)
Sender: asomers@gmail.com
Received: by 10.58.57.163 with HTTP; Tue, 7 Jan 2014 09:01:38 -0800 (PST)
In-Reply-To: <CADyfeQUwmGnNVjExJGTwzTaTh9VgDgzcX0JNUvOcnpkZ7RK5gg@mail.gmail.com>
References: <lah8s3$8ur$1@ger.gmane.org>
 <CADyfeQUwmGnNVjExJGTwzTaTh9VgDgzcX0JNUvOcnpkZ7RK5gg@mail.gmail.com>
Date: Tue, 7 Jan 2014 10:01:38 -0700
X-Google-Sender-Auth: 6U-crYLiw7k8-P2Pwa0VZROfcRI
Message-ID: <CAOtMX2iPTrdWvHA3-GTbGUnBw17BDe6qNi2YfZJKcfLa+L7tqg@mail.gmail.com>
Subject: Re: Continual benchmarking / regression testing?
From: Alan Somers <asomers@freebsd.org>
To: Julio Merino <julio@meroh.net>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>,
 Ivan Voras <ivoras@freebsd.org>
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>

On Tue, Jan 7, 2014 at 9:11 AM, Julio Merino <julio@meroh.net> wrote:
> On Tue, Jan 7, 2014 at 4:09 PM, Ivan Voras <ivoras@freebsd.org> wrote:
>> Hello,
>>
>> Is someone working on a continual benchmarking / regression testing
>> project for FreeBSD? I seem to recall there was a post several months
>> ago but I can't find it.
>
> See http://wiki.freebsd.org/TestSuite for the current efforts.

I think that Kyua is less than ideal for benchmarking.  It could be
extended, but there are two fundamental differences between a test
framework and a benchmark framework:

1) Benchmarks are slow.  Not only that, but they usually come with a
bewildering array of options (file size, I/O size, etc.) that
exponentially increase the time required to do a comprehensive run of
all available tests with all available options.  So, you don't want to
run all of the benchmarks all of the time.  In contrast, tests are
usually fast, and you usually want to run all of them all of the time.

2) Tests usually have a binary output.  Did it pass or didn't it?
Kyua has a few other possible outcomes (expected failure, skipped,
broken), but it's still a short list.  In contrast, benchmarks usually
have a variable output, expressed as one (or more) real numbers.
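
To make the second difference concrete, here is a minimal sketch (not
code from Kyua or from any existing benchmark framework; every class
and field name below is hypothetical).  A test outcome fits in a small
enumeration, while a benchmark result needs a metric name, a unit, and
one or more real-valued samples:

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean, stdev
from typing import List, Tuple


class TestOutcome(Enum):
    """The short list of outcomes a test can have (hypothetical names)."""
    PASSED = "passed"
    FAILED = "failed"
    EXPECTED_FAILURE = "expected_failure"
    SKIPPED = "skipped"
    BROKEN = "broken"


@dataclass
class BenchmarkResult:
    """A benchmark produces real numbers, not a verdict."""
    metric: str            # e.g. "sequential write throughput"
    unit: str              # e.g. "MB/s"
    samples: List[float]   # one value per repetition

    def summary(self) -> Tuple[float, float]:
        """Return (mean, sample standard deviation) of the samples."""
        spread = stdev(self.samples) if len(self.samples) > 1 else 0.0
        return mean(self.samples), spread
```

A pass/fail report only has to count outcomes; a benchmark report has
to carry the numbers themselves so that they can be compared across
runs.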

IMO, the extensions that would be required for Kyua to function as a
benchmark framework would be too intrusive; they would make it more
difficult to maintain Kyua's role as a test framework, and add nothing
to Kyua's testing abilities.  I think that a separate benchmarking
framework would be better.

The best benchmarking framework that I know of is the Phoronix Test
Suite (http://www.phoronix-test-suite.com/).  It's cross-platform, it
has a decent report generator, including a public list of results at
http://openbenchmarking.org/, and a huge library of benchmark
programs.  However, it has several drawbacks.  Many of the benchmark
programs are of poor quality IMHO; they seem to get committed
without sufficient analysis to make sure that they're testing
something useful.  Also, while the PTS does some hardware profiling
before each run (see representative output at
http://openbenchmarking.org/result/1401071-UT-BUKOWSKIW54 ), it is
insufficient for a truly scientific analysis of the hardware's
contribution to the scores.  For example, there is no way to query
openbenchmarking.org to see a graph of all the results for test X on
systems with CPU Y and hard drive Z and RAM speed Q vs the amount of
installed memory, with multiple results plotted as range bars.  I
would really like to be able to do that.  In fact, the cross-platform
nature of the PTS makes it harder to collect such information.
Finally, the PTS doesn't have any ability to run tests on a cluster of
machines.  That is critical for testing any subsystem that involves
networking, for example NFS.
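
As an illustration of the kind of query described above, here is a
rough sketch.  It assumes each result is a dictionary with a "test"
name, a "score", and a "hardware" profile; that layout and all of the
key names are invented for this example and are not an
openbenchmarking.org API.  It filters on the fixed hardware and
returns a (min, max) range bar per installed-memory size:

```python
from collections import defaultdict


def range_bars(results, test, cpu, disk, ram_speed_mhz):
    """Group matching scores by installed memory; return (min, max) per size."""
    by_memory = defaultdict(list)
    for r in results:
        hw = r["hardware"]
        if (r["test"] == test
                and hw["cpu"] == cpu
                and hw["disk"] == disk
                and hw["ram_speed_mhz"] == ram_speed_mhz):
            by_memory[hw["memory_gb"]].append(r["score"])
    # Sort by memory size so the bars plot left to right.
    return {mem: (min(scores), max(scores))
            for mem, scores in sorted(by_memory.items())}
```

This only works if every result carries a detailed, uniformly
structured hardware profile, which is exactly what is missing today.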

For these reasons, I set out to write my own framework.  At a very
high level, it provides a framework that handles common functionality
like reporting results, commanding slave nodes, profiling the system
hardware, etc.  The individual benchmark programs are each written as
ruby scripts that are executed by the framework.  Importantly, the
framework does not include any kind of built-in sequencer.  There is
no way to say "run all benchmarks".  I envision that a technician
would be responsible for selecting which benchmarks to run with which
configuration options based on an organization's current needs.  In a
CI setting, there would be a short sh script that would run several
benchmarks in series.  In any case, the result report format does not
assume anything about how the tests were sequenced.  Each result
enters the database as a separate record with full information about
its configuration and the hardware and software environment under
which it ran.
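
The one-record-per-result idea can be sketched as follows.  This is
not the framework's actual code (that is written in Ruby); the SQLite
schema, the function names, and the JSON encoding are all assumptions
made for this illustration:

```python
import json
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a results database with one row per result."""
    db = sqlite3.connect(path)
    db.execute("""
        CREATE TABLE IF NOT EXISTS results (
            id          INTEGER PRIMARY KEY,
            benchmark   TEXT NOT NULL,
            config      TEXT NOT NULL,  -- JSON: options used for this run
            environment TEXT NOT NULL,  -- JSON: hardware/software profile
            metric      TEXT NOT NULL,
            value       REAL NOT NULL
        )""")
    return db


def record_result(db, benchmark, config, environment, metric, value):
    """Store one self-describing result; nothing refers to a 'run' or sequence."""
    cur = db.execute(
        "INSERT INTO results (benchmark, config, environment, metric, value) "
        "VALUES (?, ?, ?, ?, ?)",
        (benchmark,
         json.dumps(config, sort_keys=True),
         json.dumps(environment, sort_keys=True),
         metric,
         value))
    db.commit()
    return cur.lastrowid
```

Because each row carries its own configuration and environment, it
makes no difference whether a result came from a technician's one-off
run or from a CI script that ran several benchmarks in series.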

Unfortunately, my framework is extremely incomplete.  It's not even
good enough for internal use, much less a wider audience.  And I fear
that my bosses won't give me any more time to work on it.  It's also
written in Ruby and uses STAF to command slave nodes, which the
FreeBSD community might not be excited about.  However, if there is
any interest, I can ask for permission to share my design as a
starting point for a more general framework.

-Alan
