Date:      Wed, 13 Aug 2008 15:17:35 -0700
From:      Jason Evans <jasone@FreeBSD.org>
To:        Kris Kennaway <kris@FreeBSD.org>
Cc:        freebsd-performance@freebsd.org, Robert Watson <rwatson@FreeBSD.org>, Tim Traver <tt-list@simplenet.com>, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: 7.0 CPU and Memory Performance
Message-ID:  <48A35D7F.3010805@FreeBSD.org>
In-Reply-To: <48A332FC.20600@FreeBSD.org>
References:  <48A1F379.2040805@simplenet.com>	<alpine.BSF.1.10.0808130939100.70092@fledge.watson.org> <48A33015.2080900@simplenet.com> <48A332FC.20600@FreeBSD.org>

Kris Kennaway wrote:
> Tim Traver wrote:
>> And here is the run of the ubench.5.4 binary:
>> FreeBSD 7.0 - CPU 139,623 - MEM - 207,180
>>
>> And a rerun of the FreeBSD 7.0 ubench making sure there is absolutely 
>> no activity on the box
>> FreeBSD 7.0 - CPU 200,562 - MEM - 107,695
>>
>> That run is a little better than the previous one, but there seems to 
>> still be quite a difference in the memory tests...
>>
>> Does that show anything ????
> 
> It shows that if there is a difference it is probably in userland, not 
> the kernel.  The obvious guess is the new malloc in 7.0.  As for whether 
> it indicates a bug, someone would have to look more closely at what 
> ubench does.  The author's description of his benchmark doesn't inspire 
> confidence: it does "rather senseless memory allocation and memory to 
> memory copying operations for another 3 mins concurrently using several 
> processes".

The ubench memory benchmark operates almost entirely on 1024B buffers, 
which is nearly worst case for jemalloc.  Also, its memory use 
fluctuates wildly, in a pattern that causes a lot of dirty page flushing 
and chunk map/unmap activity.  That is where most of the difference is; 
jemalloc is more aggressive/effective in returning pages to the VM than 
is phkmalloc.  In order to verify the cause of the performance 
difference, I ran ubench (on an 8-current system) with 
MALLOC_OPTIONS=7F6K (avoid flushing dirty pages, and use 64-MiB chunks 
in order to avoid repeatedly mapping/unmapping chunks), and the ubench 
memory benchmark sped up by ~51%.  With the default configuration, 
jemalloc was ~13% slower than phkmalloc, but with 7F6K it was ~31% 
faster than phkmalloc.
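
As a quick consistency check on the figures above, the sketch below expands
the option string and recomputes the relative numbers. It assumes that a
digit prefix in MALLOC_OPTIONS repeats the following flag that many times and
that the default chunk size is 1 MiB; both are inferred from "6K" giving
64-MiB chunks, not from a reading of the allocator source. The helper name
expand_options is made up for illustration.

```python
import re

def expand_options(opts):
    """Count how many times each flag is applied, e.g. "7F6K" -> F x7, K x6.

    Assumption: a numeric prefix is a repetition count for the next flag.
    """
    counts = {}
    for reps, flag in re.findall(r"(\d*)([A-Za-z])", opts):
        counts[flag] = counts.get(flag, 0) + (int(reps) if reps else 1)
    return counts

# Assumed default chunk size (MiB); each "K" doubles it.
DEFAULT_CHUNK_MIB = 1

counts = expand_options("7F6K")
chunk_mib = DEFAULT_CHUNK_MIB * 2 ** counts["K"]

# Relative throughput with phkmalloc normalized to 1.0.
default_jemalloc = 1.0 - 0.13             # ~13% slower than phkmalloc
tuned_jemalloc = default_jemalloc * 1.51  # ~51% speedup with 7F6K

print(f"chunk size with 6K: {chunk_mib} MiB")
print(f"tuned jemalloc vs phkmalloc: {tuned_jemalloc:.2f}x")
```

Under these assumptions the three percentages agree with each other: a ~51%
speedup applied to a result ~13% below phkmalloc lands at roughly 1.31 times
phkmalloc, which matches the ~31% figure.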

One possible factor for stock FreeBSD 7.0 is a scalability issue that I 
MFC'ed a fix for in r176922 on 7 March (shortly after the 7.0 release). 
There's also a non-trivial overall performance improvement that I'm 
planning to MFC this week.

I encourage you to find some better way of testing memory performance 
than ubench.  Generic malloc benchmarking is *hard*.  The most effective 
approach for someone not specifically interested in allocators is to 
benchmark the actual applications that will be run in production.  If 
you find that jemalloc performs poorly in such circumstances, please let 
me know the details so that I can look into possible improvements.
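
One way to act on this advice is to time the real workload under different
allocator settings rather than relying on a synthetic benchmark. The sketch
below is a minimal harness, not a rigorous methodology: time_command and
compare are hypothetical helper names, the workload command is whatever
application matters in production, and MALLOC_OPTIONS only has an effect on
systems whose malloc reads that variable.

```python
import os
import subprocess
import time

def time_command(cmd, malloc_options=None, runs=3):
    """Run cmd several times and return the best wall-clock time in seconds.

    malloc_options=None runs with MALLOC_OPTIONS unset (allocator defaults).
    """
    env = dict(os.environ)
    if malloc_options is None:
        env.pop("MALLOC_OPTIONS", None)
    else:
        env["MALLOC_OPTIONS"] = malloc_options
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True)
        timings.append(time.perf_counter() - start)
    # The minimum is less sensitive to background noise than the mean.
    return min(timings)

def compare(cmd, settings=(None, "7F6K"), runs=3):
    """Map each MALLOC_OPTIONS setting to its best time for cmd."""
    return {s: time_command(cmd, s, runs) for s in settings}
```

For example, compare(["./my_server_benchmark"]) (a placeholder command)
returns a dictionary keyed by None and "7F6K". Wall-clock time is only one
signal; peak memory use and behavior under concurrent load matter too and
are not captured here.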

Thanks,
Jason


