Date:      Sat, 29 Mar 2008 14:33:20 -0700
From:      Bakul Shah <bakul@bitblocks.com>
To:        Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc:        arch@freebsd.org
Subject:   Re: Flash disks and FFS layout heuristics 
Message-ID:  <20080329213320.387805B42@mail.bitblocks.com>
In-Reply-To: Your message of "Fri, 28 Mar 2008 19:09:28 -0000." <6472.1206731368@critter.freebsd.dk> 

On Fri, 28 Mar 2008 19:09:28 -0000 Poul-Henning Kamp <phk@phk.freebsd.dk>  wrote:
> 
> I've laid my hands on a "M-Tron MOBI3000 32GB" flash disk (2.5" format,
> it'll be in my laptop before long :-)
> 
> Here is a naive benchmark sequence, comparing it to a WD Raptor
> (<WDC WD360GD-00FNA0 35.06K35>)
> 
>                                Flash            Disk		
> ---------------------------------------------------------------
> Empty fsck:                        0.83            2.47    -66%
> restore -rf                      839            1251	   -33%
> loaded fsck:                      10.34           78.81	   -87%
> dump 0f /dev/null:               563.21         1094.91	   -49%
> ---------------------------------------------------------------
> 
> So far so good, it's clearly the seektime that dominates the
> flash-advantage.
> 
> But this reproducible observation by fsck concerns me a bit:
> 
>    Flash:  (205727 frags,  896270 blocks, 1.4% fragmentation)
> 
>    Disk:   (197095 frags, 1193944 blocks, 1.1% fragmentation)
>
> It might indicate that the flash is fast enough to confuse some of
> FFS's layout heuristics.
> 
> Any aspiring filesystem hackers should start to consider the
> implications for filesystem layout, if there is in essence no
> seek-time penalty for reads and a fair seek penalty for writes.

On a flash "disk" the write penalty has to do with the large
erase block size.  We can confirm this by looking at the MOBI
disk's datasheet: it can do about 130 IOPS (I/O operations per
second) for random writes at either a 512 B or 4 KB block size,
but about 16,500 IOPS for sequential 4 KB writes.  Presumably it
can coalesce sequential writes into bigger blocks, but not
random writes.
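A back-of-the-envelope calculation on those datasheet numbers makes
the gap concrete:

    130 IOPS    x 4 KB  ~=  0.5 MB/s   (random writes)
    16500 IOPS  x 4 KB  ~= 64 MB/s     (sequential writes)

That is roughly a factor of 125, consistent with each random write
paying something like an erase-block-sized penalty.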

Given this, does it even make sense to use the FFS layout?
For best performance, ideally all writes would happen
sequentially, with occasional fix-ups of the super block etc.
Even inodes that changed should be laid out sequentially.
Basically you just write the journal and fix up its roots, so
that on reboot you can quickly discover the filesystem structure!
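To make the write pattern I mean concrete, here is a rough sketch in
plain C -- nothing to do with the actual FFS or journaling code; the
file name, offsets and struct layout are all made up.  Records are
only ever appended, and the one thing rewritten in place is a small
root record that points at the current head of the log:

/*
 * Sketch of an append-only log with a fixed-location root record.
 * On reboot, reading the root is enough to find the end of the log.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define LOG_FILE   "log.img"
#define ROOT_OFF   0            /* fixed location of the root record */
#define LOG_START  4096         /* log records start after the root  */

struct root {                   /* the only thing rewritten in place */
        uint64_t head;          /* byte offset of the end of the log */
        uint64_t generation;    /* bumped on every root update       */
};

int
main(void)
{
        struct root r = { LOG_START, 0 };
        char rec[512];
        int fd, i;

        fd = open(LOG_FILE, O_RDWR | O_CREAT, 0644);
        if (fd < 0) {
                perror("open");
                return (1);
        }

        /* Append a few records sequentially -- the flash-friendly part. */
        for (i = 0; i < 8; i++) {
                memset(rec, 'a' + i, sizeof(rec));
                if (pwrite(fd, rec, sizeof(rec), (off_t)r.head) < 0) {
                        perror("pwrite record");
                        return (1);
                }
                r.head += sizeof(rec);
        }

        /* Occasional fix-up of the root, analogous to the super block. */
        r.generation++;
        if (pwrite(fd, &r, sizeof(r), ROOT_OFF) < 0) {
                perror("pwrite root");
                return (1);
        }
        if (fsync(fd) < 0)
                perror("fsync");

        printf("log head now at %llu (generation %llu)\n",
            (unsigned long long)r.head, (unsigned long long)r.generation);
        close(fd);
        return (0);
}

The obvious cost is that cold data ends up scattered on read-back,
but with essentially no seek-time penalty for reads that is exactly
the trade-off a flash disk should be happy to make.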


