Date:      Mon, 31 Mar 2008 12:51:35 -0700
From:      "Martin Fouts" <mfouts@danger.com>
To:        "Matthew Dillon" <dillon@apollo.backplane.com>, <qpadla@gmail.com>
Cc:        Christopher Arnold <chris@arnold.se>, arch@freebsd.org, freebsd-arch@freebsd.org
Subject:   RE: Flash disks and FFS layout heuristics
Message-ID:  <B95CEC1093787C4DB3655EF330984818051D09@EXCHANGE.danger.com>
In-Reply-To: <200803311915.m2VJFSoR027593@apollo.backplane.com>
References:  <20080330231544.A96475@localhost> <200803310135.m2V1ZpiN018354@apollo.backplane.com> <B95CEC1093787C4DB3655EF330984818051D03@EXCHANGE.danger.com> <200803312125.29325.qpadla@gmail.com> <200803311915.m2VJFSoR027593@apollo.backplane.com>


> -----Original Message-----
> From: Matthew Dillon [mailto:dillon@apollo.backplane.com]
>
>     Hamming codes (ECC codes) are very fragile beasts.  While
> they are in the same family as a CRC it is a really bad idea to
> try to use the ECC code as your CRC which is why I recommended
> against it in my previous posting.

True, but when you're working with a part that does ECC in HW, you're
stuck with the ECC it does.
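
To make the "fragile" point concrete, here is a minimal sketch using a
textbook Hamming(7,4) code. This is an assumption chosen for
illustration; it is not the ECC that any particular NAND part or
controller implements. A single flipped bit is corrected, but two
flipped bits yield a non-zero syndrome that the decoder "corrects" into
the wrong data, which is why such a code makes a poor integrity check
on its own.

```python
DATA_POS = (3, 5, 6, 7)   # 1-based codeword positions that carry data bits
PARITY_POS = (1, 2, 4)    # 1-based codeword positions that carry parity bits


def syndrome(bits):
    """XOR together the positions of all set bits; 0 means a valid codeword."""
    s = 0
    for pos in range(1, 8):
        if bits[pos]:
            s ^= pos
    return s


def encode(nibble):
    """Encode a 4-bit value into a 7-bit codeword (list index 0 is unused)."""
    bits = [0] * 8
    for i, pos in enumerate(DATA_POS):
        bits[pos] = (nibble >> i) & 1
    s = syndrome(bits)
    for pos in PARITY_POS:
        if s & pos:
            bits[pos] = 1   # setting this parity bit cancels that syndrome bit
    return bits


def decode(bits):
    """Return (nibble, syndrome), flipping the bit the syndrome points at."""
    bits = list(bits)
    s = syndrome(bits)
    if s:
        bits[s] ^= 1        # assumes exactly one bit is wrong
    nibble = sum(bits[pos] << i for i, pos in enumerate(DATA_POS))
    return nibble, s


word = encode(0b1011)

one_error = list(word)
one_error[6] ^= 1           # single bit flip: recoverable

two_errors = list(word)
two_errors[3] ^= 1          # two bit flips: decoder miscorrects
two_errors[5] ^= 1
```

Decoding `one_error` recovers the original nibble. Decoding `two_errors`
reports a syndrome just as if one bit had flipped, and returns different
data with no indication that anything went wrong.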

>     I've written numerous filesystems, including a NOR flash
> filesystem (whose characteristics are somewhat different due to the
> availability of byte-write).  In my opinion, designing a filesystem
> *specifically* for NAND flash is a mistake because the technology is
> rapidly evolving and such a filesystem would wind up being obsolete
> in fairly short order.

Well, those of us who are shipping devices with flash parts in them have
a somewhat different view on that, which is why I've worked on three
NAND specific file systems in the last four years. Two of those are in
use in shipping devices, and are expected to be in use for five or more
years.


>     For example, the simple addition of some front-end non-volatile
> cache, such as a dime-cap-backed static ram, would have a very serious
> effect on any such filesystem design.

Yes.  However since the phone market makes such a change very unlikely,
because of cost pressures, it's not one we take into consideration.

> It is far far better to design the filesystem around generally desired
> characteristics, such as good write locality of reference (though,
> again, indices still have to be updated and those usually do not have
> good locality of reference).

You've talked yourself into pretty much the same mistake that led to
jffs2, which turned out to be a terrible idea.

>     DragonFly's HAMMER has pretty good write-locality of
> reference but still does random updates for B-Tree indices and things
> like the mtime and atime fields.  It also uses numerous blockmaps that
> could make direct use of a flash sector-mapping translation layer (1).
> It might be adaptable.

You are pretty much describing the data structures that have made jffs2
such a poor performer.

>
>     (1) A flash sector-mapping translation layer gives a
> filesystem the ability to use 'named block numbers'.  For example, the
> NOR filesystem I did used 32 bit named block numbers regardless of the
> size of the flash (which was typically only 2MB).  The filesystem
> topology was actually encoded into the block number itself.  In other
> words, the filesystem is not bound to a linear range of block numbers
> it is simply bound

Works OK for NOR. Has interesting problems, mainly with maintaining the
block number map reliably in storage, when attempted on NAND.

>     What does this mean?  This means that what you really
> want to do is not necessarily write a filesystem that is explicitly
> designed for NAND operation, but instead write a filesystem that is
> explicitly designed to run on top of an abstracted topology (such as
> one where you can have named block numbers), and which generally has
> the desired features for locality of reference.  Such a filesystem
> would not become obsolete anywhere near as quickly as a nand-specific
> filesystem would and rebuilding an abstracted topology (whose
> underlying code would become obsolete as the technology changes) is a
> whole lot easier than redesigning a filesystem.

There's really only one topology that's efficient for a NAND device, and
that's to do log-like writing coupled with garbage collection.
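
As a rough illustration of log-like writing coupled with garbage
collection, here is a toy in-memory model. The class `LogFlash`, its
geometry, and its greedy "most stale pages" victim choice are all
assumptions made for this sketch; it is not the design of jffs2 or of
any shipping file system. The model only enforces the two NAND rules
that matter here: a page is programmed once after an erase, and erases
happen a whole block at a time.

```python
class LogFlash:
    """Toy NAND model: append-only page writes plus greedy garbage collection."""

    def __init__(self, blocks=4, pages_per_block=4):
        self.ppb = pages_per_block
        # Each page is None (erased) or a (logical_id, data) tuple.
        self.pages = [[None] * pages_per_block for _ in range(blocks)]
        self.live = {}                      # logical_id -> (block, page)
        self.free = list(range(1, blocks))  # fully erased blocks
        self.head = (0, 0)                  # next page to program
        self.erase_count = [0] * blocks

    def _append(self, lid, data):
        block, page = self.head
        if page == self.ppb:                # current block is full
            block, page = self.free.pop(0), 0
        assert self.pages[block][page] is None, "page programmed twice"
        self.pages[block][page] = (lid, data)
        self.live[lid] = (block, page)      # any older copy is now stale
        self.head = (block, page + 1)

    def write(self, lid, data):
        # Keep one erased block in reserve so collection has room to copy into.
        if self.head[1] == self.ppb and len(self.free) <= 1:
            self.collect()
        self._append(lid, data)

    def read(self, lid):
        block, page = self.live[lid]
        return self.pages[block][page][1]

    def _stale(self, block):
        return sum(
            1
            for p, entry in enumerate(self.pages[block])
            if entry is not None and self.live.get(entry[0]) != (block, p)
        )

    def collect(self):
        """Pick the block with the most stale pages, relocate its live pages
        to the head of the log, then erase it."""
        candidates = [
            b for b in range(len(self.pages))
            if b != self.head[0] and b not in self.free
        ]
        victim = max(candidates, key=self._stale)
        for p, entry in enumerate(self.pages[victim]):
            if entry is not None and self.live.get(entry[0]) == (victim, p):
                self._append(*entry)
        self.pages[victim] = [None] * self.ppb
        self.erase_count[victim] += 1
        self.free.append(victim)


flash = LogFlash(blocks=3, pages_per_block=4)
flash.write("cold", "c")                    # written once, must survive collection
for i in range(100):
    flash.write(i % 3, f"v{i}")             # three hot ids rewritten repeatedly
```

The sketch raises `IndexError` if the live data no longer fits, and it
ignores wear leveling, bad blocks, and power-loss recovery, which is
where most of the real complexity lives.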


> I am quite partial to the named-block concept, I really
> think it's the best way to go for flash filesystem design.  The flash
> already has to have a sector-translation mechanism, making the jump to
> a full blown named-block model is only a small additional step.

The devil in the details of your naming scheme turns out to be managing
the name translation information within the NAND storage itself. This is
the source of significant performance problems in jffs2, for example,
and accounts for a huge amount of code complexity in the commercial
system I work with.
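
One way to see the cost is a sketch of rebuilding a name-to-page map at
mount time. It assumes, purely for illustration, that every programmed
page records a (name, sequence number) pair in its spare area and that
the newest sequence number wins. The function `rebuild_map` and that
layout are hypothetical, not the format used by jffs2 or by the
commercial system mentioned above.

```python
def rebuild_map(pages):
    """Rebuild name -> physical address by scanning every page.

    `pages` yields (phys_addr, oob) pairs, where oob is None for an erased
    page or a (name, seq) tuple for a programmed one (hypothetical layout).
    """
    newest = {}                             # name -> (seq, phys_addr)
    for addr, oob in pages:
        if oob is None:
            continue                        # erased page, nothing recorded
        name, seq = oob
        if name not in newest or seq > newest[name][0]:
            newest[name] = (seq, addr)      # later write supersedes earlier
    return {name: addr for name, (seq, addr) in newest.items()}


scan = [
    (0, (0x1234, 1)),   # old copy of block 0x1234
    (1, (0xBEEF, 2)),
    (2, None),          # erased page
    (3, (0x1234, 3)),   # newer copy of block 0x1234
]
block_map = rebuild_map(scan)
```

Because the map can only be recovered by reading every page, the scan
grows with the size of the device. Avoiding that means storing the map
itself in flash, and then the map's own updates are subject to the same
out-of-place write and erase constraints as the data.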
