Date:      Mon, 31 Mar 2008 16:06:28 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Bakul Shah <bakul@bitblocks.com>
Cc:        Christopher Arnold <chris@arnold.se>, Martin Fouts <mfouts@danger.com>, qpadla@gmail.com, arch@freebsd.org, Poul-Henning Kamp <phk@phk.freebsd.dk>, freebsd-arch@freebsd.org
Subject:   Re: Flash disks and FFS layout heuristics 
Message-ID:  <200803312306.m2VN6SRa029758@apollo.backplane.com>
References:  <20080331223846.CFD975BAE@mail.bitblocks.com>

:[Poul, use positive encouragement and you'd inspire a lot more
:people!]
:
:Note that in effect this is exactly what zfs does. Update of
:any block implies finding a new place for the updated copy,
:which means the block pointing to it must be also updated,
:which means a new place for it etc. etc.
:
:But hey, I spent just a few minutes sketching out the idea so
:it is possible I missed a whole bunch of things!  If I was
:actually implementing this (which I am tempted to...) I'd
:certainly want to know what others did.
:
:One thing I forgot to add: I'd let the lower level handle bad
:block forwarding and wear levelling (like on the m-tron
:device).

    This is my understanding of what ZFS does too, and I considered it
    when I was designing HAMMER.  I ultimately decided not to go that
    route because I was worried that relocating every updated block would
    destroy seek locality of reference on disk (i.e. read/access
    performance).  Seek locality of reference is of course very important
    for a disk-based filesystem but not so important for a flash-based
    filesystem.
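
    To make the relocation scheme discussed above concrete, here is a
    minimal sketch of a copy-on-write update in a toy block tree.  It is
    not ZFS code: the BlockStore class, the cow_update function, and the
    tuple-of-addresses block layout are all hypothetical names invented
    for illustration.  It only shows that rewriting one leaf forces a new
    copy of every block on the path up to the root.

```python
# Hypothetical toy model of copy-on-write block updates; not ZFS code.
class BlockStore:
    def __init__(self):
        self.blocks = {}     # address -> block contents
        self.next_addr = 0   # next free address (a real allocator is far smarter)

    def alloc(self, contents):
        """Store contents in a fresh block and return its address."""
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = contents
        return addr


def cow_update(store, root, path, new_data):
    """Rewrite the leaf reached from root via path; return the new root address.

    Interior blocks are tuples of child addresses and leaves are bytes.
    No existing block is overwritten: the leaf and each ancestor get a new copy.
    """
    if not path:
        return store.alloc(new_data)
    children = list(store.blocks[root])
    children[path[0]] = cow_update(store, children[path[0]], path[1:], new_data)
    return store.alloc(tuple(children))


if __name__ == "__main__":
    store = BlockStore()
    leaf_a = store.alloc(b"a")
    leaf_b = store.alloc(b"b")
    mid = store.alloc((leaf_a, leaf_b))
    root = store.alloc((mid,))
    before = len(store.blocks)
    new_root = cow_update(store, root, [0, 1], b"B")
    # One leaf changed, but the leaf, its parent, and the root were all relocated.
    print("blocks allocated by the update:", len(store.blocks) - before)
```

    In this toy model the new copies land wherever the allocator happens
    to put them, which is the source of the seek-locality concern on
    rotating disks.  The old root still describes the old tree, so the
    previous version stays readable until its blocks are reclaimed.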

    The one hard part I have left to do in HAMMER is the UNDO meta-data log.
    Or, more precisely, the recover-on-mount code for the UNDO meta-data log.
    Everything else is done and working.  I knew it would be the hardest part
    of the filesystem when I ultimately decided not to go ZFS's route.

    The UNDO log costs basically one seek-write per fsync, or one per
    filesystem flush (every 30 seconds on BSDs)... not too bad,
    particularly because it stores only meta-data changes and not
    data changes.  Ultimately I think I can make it worthwhile by including
    data elements for small seek/write/fsync sequences in the UNDO record
    and just syncing it, which would be awesome for database applications.
    I have no immediate plans to do that, though.
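
    As a rough illustration of the general undo-logging idea, the sketch
    below models meta-data that is overwritten in place after the old
    values have been saved in a single log write.  It is an assumption-laden
    toy and not HAMMER's actual design or on-disk format: the
    UndoLoggedMeta class, its method names, the MISSING marker, and the
    crash_after parameter are all hypothetical.

```python
# Hypothetical toy model of an UNDO log for meta-data; not HAMMER code.
MISSING = object()  # marks a key that did not exist before the change


class UndoLoggedMeta:
    def __init__(self):
        self.media = {}      # meta-data as stored on the media
        self.undo_log = []   # UNDO records already written to the media
        self.pending = {}    # in-memory changes not yet flushed

    def modify(self, key, value):
        """Buffer a meta-data change in memory."""
        self.pending[key] = value

    def flush(self, crash_after=None):
        """Flush pending changes; crash_after=N simulates a crash after N in-place writes."""
        # Step 1: one log write saves the old value of every key about to change.
        self.undo_log = [(k, self.media.get(k, MISSING)) for k in self.pending]
        # Step 2: overwrite the meta-data in place.
        for n, (key, value) in enumerate(self.pending.items()):
            if crash_after is not None and n >= crash_after:
                self.pending = {}  # memory contents are lost in the crash
                return False
            self.media[key] = value
        # Step 3: every write landed, so the UNDO records are no longer needed.
        self.undo_log = []
        self.pending = {}
        return True

    def recover(self):
        """Run at mount time: roll back a partially applied flush, newest record first."""
        for key, old in reversed(self.undo_log):
            if old is MISSING:
                self.media.pop(key, None)
            else:
                self.media[key] = old
        self.undo_log = []


if __name__ == "__main__":
    fs = UndoLoggedMeta()
    fs.modify("inode1", "v1")
    fs.modify("inode2", "v1")
    fs.flush()
    fs.modify("inode1", "v2")
    fs.modify("inode2", "v2")
    fs.flush(crash_after=1)   # only inode1 reaches the media before the crash
    fs.recover()              # both inodes are back at their pre-flush values
    print(fs.media)
```

    Because blocks are overwritten where they already live, this style
    of logging keeps the on-disk layout intact, at the price of needing
    recovery code at mount time.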

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>


