Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 2 Jun 2001 10:47:56 -0700 (PDT)
From:      Matt Dillon <dillon@earth.backplane.com>
To:        Ian Dowse <iedowse@maths.tcd.ie>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: UFS large directory performance 
Message-ID:  <200106021747.f52Hluf03989@earth.backplane.com>
References:   <200106021207.aa78407@salmon.maths.tcd.ie>

next in thread | previous in thread | raw e-mail | index | archive | help

:
:Thanks for the comments :-) Yes, malloc pool fragmentation is a
:problem. I think that it can be addressed to some extent by using
:a 2-level system (an array of pointers to fixed-size arrays) instead
:of a single large array, but I'm open to any better suggestions.

    I do precisely this for the swap meta-support structures associated
    with VM objects.  It works very well and it means you can use the
    zone allocator for the fixed size structures.

:If the second-level array size was fixed at around 4k, that would
:keep the variable-length first-level array small enough not to
:cause too many fragmentation issues. The per-DIRBLKSIZ free space
:summary array is probably relatively okay as it is now.

    4K = (IA32) 1024 directory entries / 2 (approximate hash slop) = 
    512 directory entries?  That seems quite reasonable but I would use
    a smaller block size.. 512 bytes (remember, with zalloc there is no
    overhead for allocating smaller structures.  Read on!

    I would further recommend a (dynamic) array of pointers at the first
    level as part of the summary structure.  Any given array entry would
    either point to the second level array (the 512 byte allocations),
    be NULL (no second level array was necessary), or be (void *)-1 which
    would indicate that the second level array was reclaimed for other
    uses.

    During a lookup your hash algorithm would operate as per normal but
    when it skips to the next top level array index it would test for
    NULL (search ends, entry not found) and (void *)-1.  (void *)-1 would
    indicate 'search ends but the result is indeterminant, you have to
    rescan the directory'.

    By using a smaller block size for the second level array you create
    more slots in the first level array which gives the system a better
    chance of reusing a second level array block for other purpopses without
    seriously compromising performance for file creates, deletions, and
    lookups on the directory.  e.g. lower chance of the lookup hitting the
    (void *)-1 reclaim mark in the first level array.

:The other main issue, that of discarding old hashes when the memory
:limit is reached, may be quite tricky to resolve. Any approach
:based on doing this from ufsdirhash_build() is likely to become a
:locking nightmare. My original idea was to have ufsdirhash_build()
:walk a list of other inodes with hashes attached and free them if
:possible, but that would involve locking the other inode, so things
:could get messy.
:
:Ian

    If the zone allocator is used for the second level block allocations
    it shouldn't be a problem.  You can (had better be able to!) put a mutex
    around zone frees in -current.

						-Matt
    

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200106021747.f52Hluf03989>