Date:      Sat, 14 Jul 2007 04:04:21 -0700
From:      Garrett Cooper <youshi10@u.washington.edu>
To:        Tim Kientzle <kientzle@freebsd.org>
Cc:        ports@freebsd.org, hackers@freebsd.org, krion@freebsd.org
Subject:   Re: Finding slowdowns in pkg_install (continuations of previous threads)
Message-ID:  <4698ADB5.7080600@u.washington.edu>
In-Reply-To: <4697A210.2020301@u.washington.edu>
References:  <468C96C0.1040603@u.washington.edu>	<468C9718.1050108@u.washington.edu>	<468E60E9.80507@freebsd.org>	<468E6C81.4060908@u.washington.edu>	<468E7192.8030105@freebsd.org>	<4696C0D2.6010809@u.washington.edu> <4697A210.2020301@u.washington.edu>

Garrett Cooper wrote:
> Garrett Cooper wrote:
>> Tim Kientzle wrote:
>>>>    -I tried ... buffering ...  the +CONTENTS file parsing function, 
>>>> and the
>>>> majority of the time it yielded good results ....
>>>
>>> One approach I prototyped sometime back was to use
>>> libarchive in pkg_add as follows:
>>>   * Open the archive
>>>   * Read +CONTENTS directly into memory (it's
>>> guaranteed to always be first in the archive)
>>>   * Parse all of +CONTENTS at once
>>>   * Continue scanning the archive, disposing
>>> of each file as it appears in the archive.
>>>
>>> Based on my experience with this, I would
>>> suggest you just read all of +CONTENTS
>>> directly into memory at once and parse
>>> the whole thing in a single shot.
>>> fopen(), then fstat() to get the size,
>>> then allocate a buffer and read the whole
>>> thing, then fclose().  You can then
>>> parse it all at once.
>>>
>>> As a bonus, your parser then becomes a nice
>>> little bit of reusable code that reads
>>> a block of memory and returns a structure describing
>>> the package metadata.
>>>
>>> Tim Kientzle
>> I'm not 100% sure because I'm not comparing apples (virtual disk on 
>> desktop via VMware) to apples (real disk on server), but I'm showing 
>> a 2.5-fold speedup after adding the simple parser:
>>
>> Virtual disk:
>>        4.42 real         1.37 user         1.47 sys
>>
>> Real disk:
>>       10.26 real         5.36 user         0.99 sys
>>
>> I'll run a battery of tests to confirm whether that's the case.
>>
>> Be back with results in a few more days.
>>
>> -Garrett
> Hello,
>    As promised, here are some results for my work:
>
>    By modifying the parser and heuristics in plist_cmd I appear to
> have reduced all figures (except plist_cmd, which I will note later)
> well below their original values. The only drawback is that I appear
> to have triggered a bug in either malloc'ing memory, printf/varargs,
> or transferring large amounts of data via pipes: some of my debug
> messages are making it into plist_cmd(..) from obtainbymatch(..),
> which accounts for the 3-fold increase in reported plist_cmd(..)
> iterations.
>
>    I'm going to try replacing the debug commands with standard print 
> statements wherever possible, then replace all tar commands with 
> libarchive APIs, and see if the problem solves itself.
>
> Notes:
> 1. This sample is based off x11-libs/atk.
> 2. It isn't the final set of results.
> 3. Graphs coming soon (need to simulate values in Excel on work 
> machine and convert to screenshots later on when I have a break -- 
> thinking around noon). I'll repost when I have them available.
> 4. CSV files available at: 
> http://students.washington.edu/youshi10/posted/atk-results.tgz.
I've posted HTML results of the interpreted spreadsheet at
<http://students.washington.edu/posted/atk.htm>. I'll provide commentary
tomorrow after I get some sleep.
-Garrett
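
For reference, the read-it-all-at-once approach Tim describes above
(fopen(), fstat() for the size, one buffer, one read, fclose()) might
look roughly like the sketch below. This is not the pkg_install code:
read_whole_file() and count_plist_lines() are hypothetical names used
only for illustration, and the "parser" is a stand-in that merely
distinguishes '@' command lines from plain file-name lines in a
+CONTENTS-style buffer.

```c
#define _POSIX_C_SOURCE 200809L	/* for fileno() under strict C modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: read an entire file into one malloc'd,
 * NUL-terminated buffer.  Stores the byte count in *len_out.
 * Returns NULL on any error; the caller frees the buffer.
 */
char *
read_whole_file(const char *path, size_t *len_out)
{
	struct stat sb;
	FILE *fp;
	char *buf;
	size_t len;

	assert(path != NULL && len_out != NULL);

	if ((fp = fopen(path, "rb")) == NULL)
		return (NULL);
	/* fstat() on the underlying descriptor gives the file size. */
	if (fstat(fileno(fp), &sb) != 0 || sb.st_size < 0) {
		fclose(fp);
		return (NULL);
	}
	len = (size_t)sb.st_size;
	if ((buf = malloc(len + 1)) == NULL) {
		fclose(fp);
		return (NULL);
	}
	/* One read for the whole file; a short read counts as failure. */
	if (fread(buf, 1, len, fp) != len) {
		free(buf);
		fclose(fp);
		return (NULL);
	}
	fclose(fp);
	buf[len] = '\0';
	*len_out = len;
	return (buf);
}

/*
 * Hypothetical stand-in for a real parser: walk the in-memory buffer
 * line by line and count '@' command lines versus file-name lines.
 * Empty lines are ignored.
 */
void
count_plist_lines(const char *buf, size_t len, size_t *cmds, size_t *files)
{
	size_t i = 0;

	*cmds = 0;
	*files = 0;
	while (i < len) {
		size_t start = i;

		while (i < len && buf[i] != '\n')
			i++;
		if (i > start) {
			if (buf[start] == '@')
				(*cmds)++;
			else
				(*files)++;
		}
		if (i < len)
			i++;	/* step over the newline */
	}
}
```

Since the parsing step only ever sees a block of memory and a length,
the same routine should work whether the bytes came from a file on disk
or from an archive entry read into memory.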


