From: Bakul Shah
To: Poul-Henning Kamp
Cc: arch@freebsd.org
Subject: Re: Flash disks and FFS layout heuristics
Date: Sat, 29 Mar 2008 14:33:20 -0700
Message-Id: <20080329213320.387805B42@mail.bitblocks.com>
In-reply-to: Your message of "Fri, 28 Mar 2008 19:09:28 -0000." <6472.1206731368@critter.freebsd.dk>
List-Id: Discussion related to FreeBSD architecture

On Fri, 28 Mar 2008 19:09:28 -0000 Poul-Henning Kamp wrote:
>
> I've laid my hands on a "M-Tron MOBI3000 32GB" flash disk (2.5" format,
> it'll be in my laptop before long :-)
>
> Here is a naive benchmark sequence, comparing it to a WD Raptor
> ()
>
>                          Flash      Disk
> ---------------------------------------------------------------
> Empty fsck:               0.83      2.47    -66%
> restore -rf             839       1251      -33%
> loaded fsck:             10.34     78.81    -87%
> dump 0f /dev/null:      563.21   1094.91    -49%
> ---------------------------------------------------------------
>
> So far so good, it's clearly the seektime that dominates the
> flash-advantage.
>
> But this reproducible observation by fsck concerns me a bit:
>
> Flash: (205727 frags, 896270 blocks, 1.4% fragmentation)
>
> Disk: (197095 frags, 1193944 blocks, 1.1% fragmentation)
>
> I might indicate that the flash is fast enough to confuse some of
> FFS's layout heuristics.
>
> Any aspiring filesystems hackers should start to consider the
> implications for filesystemlayout, if there is in essence no
> seek-time penalty for reads and a fair seek pentalty for writes.

On a flash "disk" the write penalty comes from the large erase
block size. We can confirm this by looking at the MOBI disk's
datasheet: it can do 130 IOPS (I/O operations per second) for random
writes at a 512B or 4KB block size, but 16500 IOPS for sequential
writes at a 4KB block size. Presumably it can coalesce sequential
writes into bigger blocks but cannot do the same for random writes.

Given this, does it even make sense to use the FFS layout? For
best performance, ideally all writes happen sequentially, with
occasional fix-ups of the superblock etc. Even inodes that changed
should be laid out sequentially. Basically you just write the
journal and fix up its roots so that on reboot you can quickly
discover the filesystem structure!
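
To put the datasheet figures in perspective, here is a quick
back-of-the-envelope calculation. It is only a sketch: the names are
mine, and it assumes that IOPS times block size is a fair estimate of
sustained write throughput, ignoring any caching in the controller.

```python
# IOPS figures as quoted from the MOBI datasheet in this message.
RANDOM_WRITE_IOPS = 130        # random writes, 512B or 4KB blocks
SEQUENTIAL_WRITE_IOPS = 16500  # sequential writes, 4KB blocks
BLOCK_SIZE = 4 * 1024          # bytes per 4KB write


def throughput_bytes_per_sec(iops, block_size):
    """Estimated bytes written per second (assumption: iops * block size)."""
    return iops * block_size


random_bps = throughput_bytes_per_sec(RANDOM_WRITE_IOPS, BLOCK_SIZE)
sequential_bps = throughput_bytes_per_sec(SEQUENTIAL_WRITE_IOPS, BLOCK_SIZE)
ratio = SEQUENTIAL_WRITE_IOPS / RANDOM_WRITE_IOPS

print(f"random 4KB writes:     {random_bps / 1024:.0f} KiB/s")
print(f"sequential 4KB writes: {sequential_bps / (1024 * 1024):.1f} MiB/s")
print(f"sequential/random:     {ratio:.0f}x")
```

Under that assumption, random 4KB writes come to roughly 520 KiB/s
against about 64 MiB/s for sequential ones, a factor of about 127.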