Date:      Thu, 06 Mar 2008 14:21:53 -0500
From:      gnn@freebsd.org
To:        Jason Evans <jasone@freebsd.org>
Cc:        current@freebsd.org
Subject:   Re: Differences in malloc between 6 and 7?
Message-ID:  <7iejanmze6.wl%gnn@neville-neil.com>
In-Reply-To: <47CD9F87.4000509@freebsd.org>
References:  <677e3b3e0802280915x3f29e79cqe6093b5d7bfba975@mail.gmail.com> <7ifxv7pnei.wl%gnn@neville-neil.com> <47CD9F87.4000509@freebsd.org>

At Tue, 04 Mar 2008 11:14:15 -0800,
Jason Evans wrote:
> 
> gnn@freebsd.org wrote:
> > One of the folks I'm working with found this.  The following code,
> > which yes, is just an example, is 1/2 as fast on 7.0-RELEASE as on
> > 6.3.  Where should I look to find out why?
> 
> There is a definite performance problem in arena_run_alloc(), but I'm 
> happy to report that it was fixed in -current a while back.  I plan to 
> MFC malloc to RELENG_7 within the next few weeks.
> 

Great!

> In a nutshell, the arena_run_alloc() performance problem is due to
> using a linear search to find sufficiently large runs of mapped (but
> currently unused) pages.  There are caching mechanisms that speed up
> the searches to some degree, but there are still some linear aspects
> to the algorithm, so as memory usage increases, the searches take
> progressively longer.  In -current, this problem is solved by
> maintaining red-black trees, so that arena_run_alloc() does an
> O(lg n) tree search, rather than an O(n) iterative search.
> 
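
To make the O(n) versus O(lg n) difference concrete, here is a minimal
sketch in C.  It is not jemalloc's code: run_t, find_run_linear() and
find_run_sorted() are hypothetical names, and a sorted array with
binary search stands in for the red-black tree purely to keep the
example short.  Both functions return the index of the smallest run
that is large enough, or -1 if none fits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a free run: just a size in pages. */
typedef struct {
    size_t npages;
} run_t;

/* O(n): examine every run and remember the smallest one that fits. */
static long find_run_linear(const run_t *runs, size_t n, size_t need)
{
    long best = -1;

    assert(runs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (runs[i].npages >= need &&
            (best < 0 || runs[i].npages < runs[(size_t)best].npages))
            best = (long)i;
    }
    return best;
}

/*
 * O(lg n): assumes runs[] is sorted by ascending npages.  A lower-bound
 * binary search finds the first run with npages >= need.  An ordered
 * tree answers the same query in the same number of steps while also
 * allowing cheap insertion and removal, which a sorted array does not.
 */
static long find_run_sorted(const run_t *runs, size_t n, size_t need)
{
    size_t lo = 0, hi = n;

    assert(runs != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (runs[mid].npages < need)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < n ? (long)lo : -1;
}
```

With runs of 1, 2, 4, 8 and 16 pages and a request for 3 pages, both
functions pick the 4-page run at index 2; only the number of
comparisons differs as the list grows.
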
> It's worth mentioning that the benchmark is of marginal use, due to
> a simple (but common) flaw.  At a minimum, a malloc benchmark should
> touch all allocated memory at least once.  Otherwise, the benchmark
> is IMO too far removed from reality to measure anything of value,
> since memory access patterns look nothing like those of an actual
> application that dynamically allocates memory.  Both phkmalloc and
> jemalloc use data structures that are mostly disjunct from the
> allocations (no headers), so the benchmark never even faults most
> pages in.  This is especially true for phkmalloc, so jemalloc is
> unjustly penalized.  If we were to include, say, dlmalloc in this
> comparison, it would be even more heavily penalized due to touching
> the pages while modifying allocation headers.
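
For reference, a benchmark body that touches what it allocates could
look roughly like the sketch below.  This is not the original test
program (which is not shown in this message); alloc_touch_free() and
its parameters are hypothetical.  It writes every byte of each block so
the pages are actually faulted in, then reads one byte back per block so
the writes have a visible effect.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical benchmark body: allocate `count` blocks of `size` bytes,
 * write to every byte of each block, then free everything.
 *
 * Returns the sum of one byte read back from each block, so the stores
 * are not trivially dead.  An aggressive optimizer may still need a
 * stronger barrier than this.
 */
static unsigned long alloc_touch_free(size_t count, size_t size)
{
    unsigned long sum = 0;
    size_t done = 0;
    void **blocks = calloc(count, sizeof(*blocks));

    if (blocks == NULL)
        return 0;

    for (size_t i = 0; i < count; i++) {
        blocks[i] = malloc(size);
        if (blocks[i] == NULL)
            break;                      /* stop at the first failure */
        memset(blocks[i], 0xA5, size);  /* touch all allocated memory */
        done++;
    }

    for (size_t i = 0; i < done; i++) {
        if (size > 0)
            sum += ((unsigned char *)blocks[i])[size - 1];
        free(blocks[i]);
    }
    free(blocks);
    return sum;
}
```

Timing would wrap a call such as alloc_touch_free(100000, 4096); the
counts and sizes here are placeholders, not values from the original
benchmark.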

Fair enough, I'll pass that on.

Best,
George