From: Alfred Perlstein
Organization: FreeBSD
Date: Wed, 24 Sep 2014 17:49:48 -0700
To: freebsd-performance@FreeBSD.org
Subject: Re: I like iostat, but...
Message-ID: <542366AC.3030700@freebsd.org>
In-Reply-To: <20140924150915.GC1221@albert.catwhisker.org>

On 9/24/14 8:09 AM, David Wolfskill wrote:
> On Tue, Sep 23, 2014 at 12:29:22AM +0300, Andriy Gapon wrote:
>> On 23/09/2014 00:22, David Wolfskill wrote:
>>> ... I rather wish I could get the same information via sysctl.  (Well,
>>> something seems to be available via the "opaque" kern.devstat.all
>>> sysctl(8) variable, but sysctl(8) doesn't display all of it, and parsing
>>> it seems as if that would require knowledge about the internals of the
>>> system where the data were acquired.)
>> Perhaps sysutils/devstat could be of help?
>> ...
> On Tue, Sep 23, 2014 at 08:56:54AM +0200, Borja Marcos wrote:
>> ...
>> Reading sysctl from a small C program is not hard at all. I did it for devilator (a data collector for Orca). And
>> there's a lot of data available. An advantage is that you avoid launching several processes.
>> ...
> On Tue, Sep 23, 2014 at 10:45:14AM +0200, Borja Marcos wrote:
>> ...
>> Anyway, for disk stats GEOM offers a nice API. You can get delays per GEOM provider, bandwidths, etc.
> On Tue, Sep 23, 2014 at 11:38:44AM +0300, Stefan Parvu wrote:
>> ...
>> I gave up parsing sysctl via Perl for disks and network devices. It would be
>> nice to have devstat working properly via sysctl for disk devices, the same way
>> kern.cp_times does. Currently there is no simple way to extract per-disk stats from
>> sysctl as a Perl or sh consumer, unless we build a C module to do that.
>
> Folks, I appreciate the suggestions, but they address problems other
> than the one I am trying to solve.
>
> In particular:
> * I require that the tool must only depend on components of base FreeBSD;
>   thus, I don't need to perturb the system I want to measure by
>   installing otherwise unneeded software on it.
>
> * I am using a shell script (which uses date, sysctl, netstat, and awk)
>   so I don't need to recompile my data acquisition tool.
>
> * The tasks I am trying to measure are software builds similar to
>   a stable/10 "make universe" -- but about 2 - 3 times the duration
>   (and reading and writing a significantly larger amount of data).
>
>   Thus, the number of additional processes caused by my probe firing
>   even as often as once/second is lost in the noise.
>
>> ...
>>> If iostat(8) could be taught to (optionally) provide a timestamp, that
>>> might suffice.
>> In fact, all userland performance tools should be able to produce timestamped
>> CSV records that can be used for capacity planning and fed to an
>> analytics product.
>> Something like:
>>
>> 1411445589:4.01:16.03:383.97:1.93:0.29:1.68:0.11:95.99:0.00:0.00:0.00:0.00:123.00:229516.00:570992.00
>>
>> or something like iostat:
>>
>> 1411445733:ada0:0.00:0.00:0.00:0.00:0.00:0.00:0.00
>>
>> where the first field would always be the timestamp, in Unix time. It is not that complicated,
>> but it does not exist.
>
> The output of my shell script may be described as:
>
> # ::= newline
> # ::= | tab
> # ::= :
> # ::= [_0-9a-zA-Z][-_.0-9a-zA-Z]+
> # ::= [^\t\n]*
>
> # Each record will have a field with a tag called "time"; the
> # associated value will be the number of seconds since the Epoch,
> # but will be coerced as necessary to ensure that it is monotonically
> # increasing.
>
> # Note that the colon (":") is a valid character in the value part
> # of a field (but not in the tag).  Further, there is no whitespace
> # on either side of that delimiting colon: everything to the left of the
> # colon (up to, but not including, the previous tab, if any) is part of the
> # tag; everything to the right of the colon (up to, but not including,
> # the next tab or newline) is part of the value.
>
> A prior version of the script output CSV; for this version, I chose to
> use the above format because I had situations where some fields showed
> up on some lines, but not on others.  That tends to make the use of CSV
> a bit problematic.  (On a machine where I post-process the collected
> data, I have some Perl that reads the above format, creates new field
> names as necessary to cope with multivariate data (e.g., kern.cp_time or
> vm.loadavg), and generates CSV from the result.)
>
>> ...
>> My idea is to have standard data recorders (sysrec, cpurec, nicrec, diskrec)
>> which can extract overall system consumption and per-device statistics.
>> http://www.systemdatarecorder.org/recording/agents.html
>> ...
> That looks interesting and useful for a broad class of similar problems.
>
> However, as far as I can tell, it is not suitable for the problem(s) I
> am trying to solve in this case.
>
> Basically, I have something that works "well enough" for things
> like CPU counters, memory usage (at the rather coarse granularity
> that top(1) provides, vs. "vmstat -m" output), load averages, and
> NIC counters, and is readily extensible to any univariate (or simple
> list of multivariate) (non-opaque) sysctl OIDs.  I'd like to be
> able to include information from the I/O subsystem -- in particular,
> data that is accessible from "iostat -x".
>

David: check this out:
https://github.com/alfredperlstein/eagleeye

It makes time series data out of a number of tools. It's not really up to date, but you can probably pull useful code from it.

Also, there was the "machine readable output from utilities" GSoC project that I still need to merge, but that is a larger project.

-Alfred
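
For reference, the tab-separated tag:value record format David describes in the quoted text could be read and turned into CSV roughly as follows. This is only a minimal Python sketch, not David's Perl post-processor. The function names (parse_record, flatten, to_csv), the numbered column names such as kern.cp_time.0, and the rule that only all-numeric, whitespace-separated values count as multivariate are all assumptions made for illustration.

```python
import csv
import io
import re

# Assumed to be the tag pattern (the fourth production quoted in the thread).
TAG_RE = re.compile(r"[_0-9a-zA-Z][-_.0-9a-zA-Z]+")


def parse_record(line):
    """Split one tab-separated record into a {tag: value} dict."""
    fields = {}
    for field in line.rstrip("\n").split("\t"):
        # Only the first colon delimits; later colons belong to the value.
        tag, sep, value = field.partition(":")
        if not sep or not TAG_RE.fullmatch(tag):
            raise ValueError(f"malformed field: {field!r}")
        fields[tag] = value
    return fields


def _is_number(text):
    try:
        float(text)
    except ValueError:
        return False
    return True


def flatten(fields):
    """Expand all-numeric multi-part values into tag.0, tag.1, ... (assumed naming)."""
    flat = {}
    for tag, value in fields.items():
        # Braces are stripped so a value like "{ 0.5 0.4 0.3 }" also splits.
        parts = value.strip("{} ").split()
        if len(parts) > 1 and all(_is_number(p) for p in parts):
            for i, part in enumerate(parts):
                flat[f"{tag}.{i}"] = part
        else:
            flat[tag] = value
    return flat


def to_csv(lines):
    """Build CSV text; fields missing from a record become empty cells."""
    rows = [flatten(parse_record(line)) for line in lines if line.strip()]
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    sample = [
        "time:1411445589\tkern.cp_time:10 2 30 4 500\n",
        "time:1411445590\tkern.cp_time:11 2 31 4 510\tnote:a:b\n",
    ]
    print(to_csv(sample), end="")
```

Collecting the column set across all records before writing is what lets fields appear on some lines and not on others, which is the problem David mentions with emitting CSV directly.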