Date:      Fri, 06 Jul 2007 09:23:29 -0700
From:      Garrett Cooper <youshi10@u.washington.edu>
To:        Tim Kientzle <kientzle@freebsd.org>
Cc:        ports@freebsd.org, hackers@freebsd.org
Subject:   Re: Finding slowdowns in pkg_install (continuations of previous threads)
Message-ID:  <468E6C81.4060908@u.washington.edu>
In-Reply-To: <468E60E9.80507@freebsd.org>
References:  <468C96C0.1040603@u.washington.edu>	<468C9718.1050108@u.washington.edu> <468E60E9.80507@freebsd.org>

Tim Kientzle wrote:
>>>    I'm currently running a gamut of tests (500 tests, per package -- 
>>> 128 total on my server), and outputting all data to CSV files to 
>>> interpret later, using another Perl script to interpret calculated 
>>> averages and standard deviations.
>
> Excellent!  Much-needed work.
>
>>>    Using basic printf(3)'s with clock_gettime(2) I have determined 
>>> that the majority of the issues are disk-bound (as Tim Kientzle put 
>>> it).
>
> Next question:  What are those disk operations and are any
> of them redundant?
>
>>> The scope of my problem is not to analyze tar,...
>
> I've spent the last three years+ doing exactly that.
> Make sure you're using the newest bsdtar/libarchive,
> which has some very noticeable performance improvements.
>
>>> but I've discovered that a lot of time is spent in reading and 
>>> interpreting the +CONTENTS and related files (most notably in 
>>> parsing commands to be honest).
>
> Oh?  That's interesting.  Is data being re-parsed (in which case
> some structural changes to parse it once and store the results
> may help)?  Or is the parser just slow?
>
>>>    Will post more conclusive results tomorrow once all of my results 
>>> are available.
>
> I don't follow ports@ so didn't see these "conclusive results"
> of yours.  I'm very interested, though.
>
> Tim Kientzle
Some extra notes:
    -My tests are still running but are almost done. Unfortunately I
won't be able to post any results before tonight, since I'm leaving for
work now. The run is taking a lot longer than I originally expected (it
has produced several gigabytes of log and CSV files... eep).
    -I placed the timing printf's around what I considered the
pkg_install-specific sensitive areas, i.e. the locations where tar is
run or the meta files are processed.
    -I tried a small buffering technique around the +CONTENTS file
parsing function: read in 10 lines at once, parse those 10 lines, and
repeat, instead of reading 1 line, parsing it, and repeating. Most of
the time it yielded good results (9 times out of 10 the buffered
version beat the unbuffered one). Given that success, I'm going to try
implementing the file reading in terms of fgetc(3), so that a number of
lines is read in all at once, and see what happens (my hunch is that
those results may be more favorable).
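
For anyone who wants to try the same comparison, here is a rough sketch of
the idea. It is only an illustration under some assumptions, not the
pkg_install code: parse_line, parse_unbatched, parse_batched and time_parse
are hypothetical names, the parser stand-in merely counts lines that begin
with '@', and the 10-line batch, the 1024-byte line limit and
CLOCK_MONOTONIC are arbitrary choices.

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime(2) on some systems */
#include <stdio.h>
#include <time.h>

#define BATCH_LINES	10	/* lines read before parsing (assumed) */
#define MAX_LINE	1024	/* longest line handled (assumed) */

/* Elapsed seconds between two clock_gettime(2) samples. */
static double
elapsed_sec(const struct timespec *start, const struct timespec *end)
{
	return (double)(end->tv_sec - start->tv_sec) +
	    (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Hypothetical stand-in for the real parser: 1 for an '@' command line. */
static int
parse_line(const char *line)
{
	return line[0] == '@';
}

/* Read one line, parse it, repeat. Returns the number of command lines. */
static long
parse_unbatched(FILE *fp)
{
	char line[MAX_LINE];
	long ncmds = 0;

	while (fgets(line, sizeof(line), fp) != NULL)
		ncmds += parse_line(line);
	return ncmds;
}

/* Read up to BATCH_LINES lines, parse the whole batch, repeat. */
static long
parse_batched(FILE *fp)
{
	static char batch[BATCH_LINES][MAX_LINE];
	long ncmds = 0;
	int i, n;

	do {
		for (n = 0; n < BATCH_LINES; n++)
			if (fgets(batch[n], MAX_LINE, fp) == NULL)
				break;
		for (i = 0; i < n; i++)
			ncmds += parse_line(batch[i]);
	} while (n == BATCH_LINES);	/* a short batch means end of file */
	return ncmds;
}

/* Time one parse function over a file; prints a CSV row "label,seconds". */
static long
time_parse(const char *label, const char *path, long (*fn)(FILE *))
{
	struct timespec t0, t1;
	FILE *fp;
	long ncmds;

	if ((fp = fopen(path, "r")) == NULL)
		return -1;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	ncmds = fn(fp);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	fclose(fp);
	printf("%s,%.9f\n", label, elapsed_sec(&t0, &t1));
	return ncmds;
}
```

Calling time_parse("unbatched", path, parse_unbatched) and then
time_parse("batched", path, parse_batched) on the same +CONTENTS file
gives one CSV row per variant. Since stdio already buffers reads
internally, any difference measured this way probably says more about how
the parse calls are grouped than about disk I/O.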
Thanks,
-Garrett


