Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 11 Dec 2005 19:27:39 -0800
From:      Jason Evans <jasone@canonware.com>
To:        Johan Bucht <bucht@acc.umu.se>
Cc:        current@freebsd.org
Subject:   Re: New libc malloc patch
Message-ID:  <CFDC323A-F36B-4358-B636-8465E763E7C3@canonware.com>
In-Reply-To: <439CEB74.9080505@acc.umu.se>
References:  <20051212014852.GA8775@shaka.acc.umu.se> <9FAD2B4B-C167-42D7-A8E7-BE03F4C07543@canonware.com> <439CEB74.9080505@acc.umu.se>

next in thread | previous in thread | raw e-mail | index | archive | help
On Dec 11, 2005, at 7:16 PM, Johan Bucht wrote:
> Jason Evans wrote:
>>
>> On Dec 11, 2005, at 5:48 PM, Johan Bucht wrote:
>>> * Saw you used a tree to look up the allocation sizes at free, this
>>> * differs from how I and Solaris libumem does it. Instead I went for
>>> * prepending a 8/16 byte tag containing the size of the  
>>> allocation for
>>> * lockless size lookup.
>>
>> Actually, my implementation prepends a 4 byte tag that contains   
>> allocated/free flags, and the size of the allocation.  The trees  
>> are  only used to look up the sizes of huge allocations (> 8 MB on  
>> 32-bit  platforms, and > 256 MB on 64-bit platforms).  Huge  
>> allocations start  at chunk boundaries, so prepending a tag would  
>> require an entire  extra page (and would cause potentially serious  
>> external  fragmentation on 32-bit systems).
>>
> Isn't 8 byte alignment expected by some applications?

Yes, 8 or 16 byte alignment is expected (in fact I'm of the opinion  
that we should be using 16 byte alignment on i386 due to SSE2/SSE3).   
If I understand your question correctly, you're asking how I get away  
with 4 byte tags.  Consider that (assuming 8-byte quantum) a malloc 
(16) call must actually be serviced by at least 24 bytes internally,  
in order to pad to the next quantum size.  If I used 8 byte tags,  
then malloc(17) would require 32 bytes internally.  By using 4 byte  
tags, malloc(13)..malloc(20) can be serviced by 24 bytes internally.   
At least one implementation (the OS X malloc implementation) uses 2  
byte tags, but this has two problems: 1) unaligned memory access is  
slow on some architectures, and 2) it's too small to handle large  
object sizes, so a different allocation strategy has to be used  
starting at ~12 KB.

> How do you know if a allocation is huge if you don't have a tag?

I know an allocation is huge if its base address is chunk-aligned.   
The actual size is stored in a red-black tree node, which is  
allocated separately.

Jason



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CFDC323A-F36B-4358-B636-8465E763E7C3>