Date:      Tue, 21 Mar 2000 22:17:52 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Paul Richards <paul@originative.co.uk>
Cc:        Richard Wendland <richard@netcraft.com>, Alfred Perlstein <bright@wintelcom.net>, Poul-Henning Kamp <phk@critter.freebsd.dk>, current@FreeBSD.ORG, fs@FreeBSD.ORG
Subject:   Re: FreeBSD random I/O performance issues
Message-ID:  <200003220617.WAA86154@apollo.backplane.com>
References:  <200003220022.AAA28786@ns0.netcraft.com> <38D833BC.A082DF09@originative.co.uk>

:written immediately which is 8750/10000 writes.
:
:When the write size drops below the filesystem block size then the
:clustering code never gets called because the buffers are just marked
:dirty and cached.
:
:I think if we fixed the issue of writing out full blocks this behaviour
:would stop but I also think the clustering code could do with a fix. It
:should at least check to see if there is a cluster being built when the
:blockno is 0 and push it out. Possibly though it'd be better to not push
:out clusters of only one block and just leave them in the cache.

    Hmm.  Your analysis is correct but I don't think it's worth
    fixing the block-is-0 case.   It may be worth revisiting the
    write-behind code to try to give it the ability to better discern
    random I/O from sequential I/O (e.g. perhaps it should ignore unaligned
    full blocks).
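
    As a rough illustration of what "discerning random I/O from sequential
    I/O" could mean, here is a minimal sketch in Python.  It is not FreeBSD
    code: the class name, the per-file run counter and the threshold of 2
    are all assumptions made for this example.  A write counts as sequential
    only if it starts exactly where the previous write ended.

```python
class SequentialDetector:
    """Hypothetical per-file heuristic; not the actual FreeBSD logic."""

    def __init__(self, threshold=2):
        self.next_offset = None     # where a sequential write would start
        self.run = 0                # consecutive sequential writes seen
        self.threshold = threshold  # assumed value, chosen for illustration

    def observe(self, offset, length):
        """Record one write; return True if the stream looks sequential."""
        if self.next_offset is not None and offset == self.next_offset:
            self.run += 1
        else:
            self.run = 0            # any jump resets the run
        self.next_offset = offset + length
        return self.run >= self.threshold


detector = SequentialDetector()
# Back-to-back 4096-byte appends: sequential from the third write on.
appends = [detector.observe(off, 4096) for off in (0, 4096, 8192, 12288)]
# Scattered offsets never build up a run.
scattered = [detector.observe(off, 4096) for off in (500000, 20480, 900000)]
```

    Under a heuristic like this, write-behind would only be considered for
    streams where observe() returns True, and random writers would be left
    to the delayed-write path.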

    It is perfectly ok for dirty blocks to remain in the buffer cache.  In
    fact, it's *optimal* to leave them in the buffer cache as long as the
    buffer cache does not get saturated with them.  The buffer cache is
    perfectly capable of clustering delayed writes.  Also, the filesystem 
    syncer comes along every 30 seconds or so anyway and flushes everything
    out.
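
    To make "clustering delayed writes" concrete, the sketch below groups a
    set of dirty block numbers into contiguous runs, each of which could in
    principle go out as one larger I/O.  It is a toy model in Python, not
    the buffer cache implementation; the function name and the max_cluster
    limit of 16 blocks are assumptions for the example.

```python
def cluster_runs(dirty_blocks, max_cluster=16):
    """Group dirty block numbers into (start, count) contiguous runs.

    max_cluster caps the run length; 16 is an assumed value, not a
    FreeBSD constant.
    """
    runs = []
    for blk in sorted(set(dirty_blocks)):
        # Extend the current run if blk is the next block and there is room.
        if runs and blk == runs[-1][0] + runs[-1][1] and runs[-1][1] < max_cluster:
            runs[-1][1] += 1
        else:
            runs.append([blk, 1])
    return [tuple(r) for r in runs]


# Seven dirty blocks collapse into four write operations.
example_runs = cluster_runs([7, 3, 4, 5, 20, 21, 9])
```

    The longer dirty blocks are allowed to sit in the cache, the more chance
    neighbouring blocks have to become dirty as well and join the same run.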

    What the write-behind code tries to do is to prevent the buffer cache 
    from being saturated with dirty buffers and to smooth out disk write
    I/O.  It makes the assumption that write-behind data is not typically
    accessed by the program immediately after being written -- an assumption
    that winds up being incorrect in the DBM case you tested, which results
    in stalls because the buffer / VM pages are locked during the write I/O.
    The stalls are *not* due to the I/O itself but instead are due to side
    effects of the I/O being in-progress.  If a user program doesn't access
    any of the information it recently wrote the whole mechanism winds up
    operating asynchronously in the background.  If a user program does
    access it, the write-behind mechanism breaks down and you get a stall.

    The most common dirty-data case the filesystem has to deal with is 
    appending to a file -- that is, doing piecemeal sequential writes.  There
    are virtually no other cases which have the ability to saturate the
    buffer cache.  This is why the write-behind code only tries to handle
    the piecemeal-write-flush-full-blocks case.
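
    The piecemeal-append case can be sketched with a toy model, again in
    Python and again not the real code.  The assumptions: an 8192-byte
    block size, and a policy that pushes a block out as soon as a write
    reaches the end of that block while leaving every other touched block
    dirty in the cache.  The simulate() function and its return format are
    hypothetical.

```python
BLOCK_SIZE = 8192  # assumed filesystem block size, for illustration only


def simulate(writes, block_size=BLOCK_SIZE):
    """Toy write-behind model.

    writes is an iterable of (offset, length) pairs with length > 0.
    Returns (flushed, dirty): blocks pushed out immediately, in order,
    and blocks left dirty in the cache.
    """
    dirty = set()
    flushed = []
    for offset, length in writes:
        end = offset + length
        first = offset // block_size
        last = (end - 1) // block_size
        for blk in range(first, last + 1):
            dirty.add(blk)
            # The write reached the end of this block: treat it as full.
            if end >= (blk + 1) * block_size:
                dirty.discard(blk)
                flushed.append(blk)
    return flushed, sorted(dirty)


# Twenty 1024-byte appends: blocks 0 and 1 fill up and are pushed out,
# block 2 is only half written and stays dirty.
appends = [(i * 1024, 1024) for i in range(20)]
flushed, dirty = simulate(appends)

# Two small scattered writes never complete a block, so nothing is pushed.
random_flushed, random_dirty = simulate([(100, 50), (9000, 10)])
```

    In this model the appender never accumulates more than one dirty block,
    which is the saturation-avoidance effect described above, while the
    small random writer is left entirely to the delayed-write path.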

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>






