Date:      Mon, 16 Apr 2001 15:52:58 -0700 (PDT)
From:      Matt Dillon <dillon@earth.backplane.com>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        "Justin T. Gibbs" <gibbs@scsiguy.com>, Doug Barton <DougB@DougBarton.net>, "'current@freebsd.org'" <current@FreeBSD.ORG>
Subject:   Re: FW: Filesystem gets a huge performance boost 
Message-ID:  <200104162252.f3GMqwG83808@earth.backplane.com>
References:   <Pine.BSF.4.21.0104161724210.6152-100000@besplex.bde.org>


:> It just seems inelegant to have a system that, on paper, is
:> so inefficient.  Can't we do better?
:
:Sure.  Don't discard buffer contents when recycling a B_MALLOC'ed buffer,
:but manage it using a secondary buffer cache that doesn't have as much
:overhead as the primary one (in particular, don't reserve BKVASIZE bytes
:of kernel virtual address space for each secondary buffer).  This would
:be even more inelegant, and more complicated, but not so inefficient.
:
:Bruce

    Well, I think the last few years have proven that B_MALLOC buffers
    are essentially unmanageable.  Even if you were to come up with the
    perfect algorithm, KVM just doesn't scale to physical memory the way
    it should.  Only physical memory scales to physical memory, and that
    means the VM Page cache.

    We could conceivably use the VM object representing the filesystem
    block device, which normally only holds cylinder group bitmaps and inodes,
    and use it to back piecemeal buffer cache mappings for directories
    (at least as long as we do not allow mmap()ing of directories, which
    would make this impossible).  The backing pages would still be 4K,
    and we would have to be extremely careful with the valid and
    dirty bits in the vm_page_t so as not to infringe on adjacent file
    fragments (which could be mmap'd), but now the 4K of backing store
    would be able to cache up to 8 small directories that happen to reside
    in the same filesystem block.
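
    The bookkeeping hazard described above can be illustrated with a
    small sketch.  It is not kernel code: it assumes that the valid and
    dirty fields are bitmasks with one bit per 512-byte chunk of a 4K
    page (which is where the figure of 8 would come from), and the names
    range_bits, Page, mark_valid and mark_dirty are hypothetical.

```python
PAGE_SIZE = 4096   # size of the backing page, as stated in the message
DEV_BSIZE = 512    # ASSUMPTION: one valid/dirty bit covers 512 bytes


def range_bits(offset: int, size: int) -> int:
    """Bitmask of the DEV_BSIZE chunks touched by [offset, offset + size)."""
    if size <= 0:
        return 0
    if offset < 0 or offset + size > PAGE_SIZE:
        raise ValueError("range must lie within a single page")
    first = offset // DEV_BSIZE
    last = (offset + size - 1) // DEV_BSIZE
    # Set bits first..last inclusive.
    return ((1 << (last + 1)) - 1) & ~((1 << first) - 1)


class Page:
    """Hypothetical stand-in for the valid/dirty state of one page."""

    def __init__(self) -> None:
        self.valid = 0
        self.dirty = 0

    def mark_valid(self, offset: int, size: int) -> None:
        self.valid |= range_bits(offset, size)

    def mark_dirty(self, offset: int, size: int) -> None:
        # Only the chunks belonging to this range change; neighbouring
        # fragments in the same page keep their own state.
        self.dirty |= range_bits(offset, size)


# Eight 512-byte chunks fit in one 4K page.
CHUNKS_PER_PAGE = PAGE_SIZE // DEV_BSIZE

page = Page()
page.mark_valid(0, 512)       # small directory in chunk 0
page.mark_valid(1024, 512)    # another small directory in chunk 2
page.mark_dirty(1024, 512)    # modify only the second directory
```

    Under these assumptions, dirtying the second directory sets only
    bit 2 of the dirty mask, so a file fragment cached in chunk 1 of
    the same page would not be marked valid or written back by mistake.
    Getting every such range computation right across the whole buffer
    cache is the part that would make the approach complex.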

    The above would be an extremely complex solution and I wouldn't want
    to implement it for that reason.  A separately managed buffer cache
    is also a complex solution because in order to be effective it needs
    to be scalable (as the current B_MALLOC is not).

    Even though the potential wastage with the current solution seems high,
    the actual impact on the system is low.  I have yet to see any
    detrimental results in my own testing.  Anyone can test -- simply turn
    on the vmiodirenable sysctl and have at it!
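
    To put a rough number on the "on paper" wastage, here is a small
    back-of-the-envelope sketch.  It assumes that each cached directory
    is backed by whole 4K pages of its own, and the function name
    page_waste and the directory sizes are hypothetical, not measurements
    from any system.

```python
PAGE_SIZE = 4096  # size of one backing page, as stated in the message


def page_waste(dir_sizes):
    """Return (wasted_bytes, wasted_fraction) for the given directory sizes.

    ASSUMPTION: every directory is rounded up to whole pages and shares
    its pages with nothing else.
    """
    pages = sum(-(-size // PAGE_SIZE) for size in dir_sizes)  # ceiling division
    backing = pages * PAGE_SIZE
    wasted = backing - sum(dir_sizes)
    return wasted, (wasted / backing if backing else 0.0)


# Hypothetical mix of directory sizes in bytes.
sample = [512, 512, 1024, 2048, 8192]
wasted, fraction = page_waste(sample)
print(f"{wasted} bytes unused, {fraction:.1%} of the backing pages")
```

    A single 512-byte directory leaves 3584 of its 4096 bytes unused
    under this model, which is the worst case.  Whether that matters in
    practice depends on how much of the page cache directories actually
    occupy, which is the question only testing can answer.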

						-Matt

