From owner-freebsd-geom@FreeBSD.ORG  Sun Feb 10 14:30:06 2008
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A491B16A421
	for <freebsd-geom@hub.freebsd.org>;
	Sun, 10 Feb 2008 14:30:06 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 8B1D213C45A
	for <freebsd-geom@hub.freebsd.org>;
	Sun, 10 Feb 2008 14:30:06 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m1AEU6n9014099
	for <freebsd-geom@freefall.freebsd.org>; Sun, 10 Feb 2008 14:30:06 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m1AEU6NQ014096;
	Sun, 10 Feb 2008 14:30:06 GMT (envelope-from gnats)
Date: Sun, 10 Feb 2008 14:30:06 GMT
Message-Id: <200802101430.m1AEU6NQ014096@freefall.freebsd.org>
To: freebsd-geom@FreeBSD.org
From: Volker <volker@vwsoft.com>
Cc: 
Subject: Re: bin/110705: gmirror control utility does not exit with correct
 exit status
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Volker <volker@vwsoft.com>
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 10 Feb 2008 14:30:06 -0000

The following reply was made to PR bin/110705; it has been noted by GNATS.

From: Volker <volker@vwsoft.com>
To: bug-followup@FreeBSD.org, tom@tomjudge.com
Cc:  
Subject: Re: bin/110705: gmirror control utility does not exit with correct
 exit status
Date: Sun, 10 Feb 2008 15:28:28 +0100

 MFC to RELENG_6 missing! If done, this PR can be closed.

From owner-freebsd-geom@FreeBSD.ORG  Sun Feb 10 14:58:09 2008
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 233DB16A468;
	Sun, 10 Feb 2008 14:58:09 +0000 (UTC)
	(envelope-from rafan@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 0DFB013C4EE;
	Sun, 10 Feb 2008 14:58:09 +0000 (UTC)
	(envelope-from rafan@FreeBSD.org)
Received: from freefall.freebsd.org (rafan@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m1AEw8Vx015479;
	Sun, 10 Feb 2008 14:58:08 GMT
	(envelope-from rafan@freefall.freebsd.org)
Received: (from rafan@localhost)
	by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m1AEw89G015475;
	Sun, 10 Feb 2008 14:58:08 GMT (envelope-from rafan)
Date: Sun, 10 Feb 2008 14:58:08 GMT
Message-Id: <200802101458.m1AEw89G015475@freefall.freebsd.org>
To: tom@tomjudge.com, rafan@FreeBSD.org, freebsd-geom@FreeBSD.org
From: rafan@FreeBSD.org
Cc: 
Subject: Re: bin/110705: gmirror control utility does not exit with correct
	exit status
X-List-Received-Date: Sun, 10 Feb 2008 14:58:09 -0000

Synopsis: gmirror control utility does not exit with correct exit status

State-Changed-From-To: patched->closed
State-Changed-By: rafan
State-Changed-When: Sun Feb 10 14:58:08 UTC 2008
State-Changed-Why: 
Patch committed in RELENG_[67] and HEAD. Thanks!

http://www.freebsd.org/cgi/query-pr.cgi?pr=110705

From owner-freebsd-geom@FreeBSD.ORG  Sun Feb 10 14:58:38 2008
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 06CC816A41B;
	Sun, 10 Feb 2008 14:58:38 +0000 (UTC)
	(envelope-from rafan@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id E559B13C47E;
	Sun, 10 Feb 2008 14:58:37 +0000 (UTC)
	(envelope-from rafan@FreeBSD.org)
Received: from freefall.freebsd.org (rafan@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m1AEwbdq015526;
	Sun, 10 Feb 2008 14:58:37 GMT
	(envelope-from rafan@freefall.freebsd.org)
Received: (from rafan@localhost)
	by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m1AEwbWi015522;
	Sun, 10 Feb 2008 14:58:37 GMT (envelope-from rafan)
Date: Sun, 10 Feb 2008 14:58:37 GMT
Message-Id: <200802101458.m1AEwbWi015522@freefall.freebsd.org>
To: tom@tomjudge.com, rafan@FreeBSD.org, freebsd-geom@FreeBSD.org
From: rafan@FreeBSD.org
Cc: 
Subject: Re: bin/110705: gmirror control utility does not exit with correct
	exit status
X-List-Received-Date: Sun, 10 Feb 2008 14:58:38 -0000

Synopsis: gmirror control utility does not exit with correct exit status

State-Changed-From-To: closed->patched
State-Changed-By: rafan
State-Changed-When: Sun Feb 10 14:58:22 UTC 2008
State-Changed-Why: 
Oops, this requires an MFC to 6.

http://www.freebsd.org/cgi/query-pr.cgi?pr=110705

From owner-freebsd-geom@FreeBSD.ORG  Mon Feb 11 11:07:05 2008
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4C7DB16A418
	for <freebsd-geom@hub.freebsd.org>;
	Mon, 11 Feb 2008 11:07:05 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 352E413C474
	for <freebsd-geom@hub.freebsd.org>;
	Mon, 11 Feb 2008 11:07:05 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m1BB75j9007384
	for <freebsd-geom@FreeBSD.org>; Mon, 11 Feb 2008 11:07:05 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m1BB742a007380
	for freebsd-geom@FreeBSD.org; Mon, 11 Feb 2008 11:07:04 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 11 Feb 2008 11:07:04 GMT
Message-Id: <200802111107.m1BB742a007380@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
	owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@FreeBSD.org>
To: freebsd-geom@FreeBSD.org
Cc: 
Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org
X-List-Received-Date: Mon, 11 Feb 2008 11:07:05 -0000

Current FreeBSD problem reports
Critical problems
Serious problems

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/73177   geom       kldload geom_* causes panic due to memory exhaustion
o kern/76538   geom       [gbde] nfs-write on gbde partition stalls and continue
o kern/83464   geom       [geom] [patch] Unhandled malloc failures within libgeo
o kern/84556   geom       [geom] GBDE-encrypted swap causes panic at shutdown
o kern/87544   geom       [gbde] mmaping large files on a gbde filesystem deadlo
s kern/89102   geom       [geom_vfs] [panic] panic when forced unmount FS from u
o bin/90093    geom       fdisk(8) incapable of altering in-core geometry
o kern/90582   geom       [geom_mirror] [panic] Restore cause panic string (ffs_
o kern/98034   geom       [geom] dereference of NULL pointer in acd_geom_detach 
o kern/104389  geom       [geom] [patch] sys/geom/geom_dump.c doesn't encode XML
o kern/113419  geom       [geom] geom fox multipathing not failing back
o kern/113957  geom       [gmirror] gmirror is intermittently reporting a degrad
o kern/115572  geom       [gbde] [patch] gbde partitions fail at 28bit/48bit LBA
o kern/120021  geom       net-p2p/qbittorrent crashes system when it works thoug
o kern/120231  geom       [geom] GEOM_CONCAT error adding second drive

15 problems total.

Non-critical problems

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o bin/78131    geom       gbde "destroy" not working.
o kern/79251   geom       [2TB] newfs fails on 2.6TB gbde device
o kern/94632   geom       [geom] Kernel output resets input while GELI asks for 
f kern/105390  geom       [geli] filesystem on a md backed by sparse file with s
o kern/107707  geom       [geom] [patch] [request] add new class geom_xbox360 to
p bin/110705   geom       gmirror control utility does not exit with correct exi
o kern/113837  geom       [geom] unable to access 1024 sector size storage
o kern/113885  geom       [geom] [patch] improved gmirror balance algorithm
o kern/114532  geom       [geom] GEOM_MIRROR shows up in kldstat even if compile
o kern/115547  geom       [geom] [patch] [request] let GEOM Eli get password fro
o kern/119743  geom       [geom] geom label for cds is keeped after dismount and
o kern/120044  geom       [msdosfs] [geom] incorrect MSDOSFS label fries adminis
f kern/120091  geom       [geom] [geli] [gjournal] geli does not prompt for pass

13 problems total.


From owner-freebsd-geom@FreeBSD.ORG  Wed Feb 13 02:06:45 2008
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 281DE16A421
	for <freebsd-geom@freebsd.org>; Wed, 13 Feb 2008 02:06:45 +0000 (UTC)
	(envelope-from dwiest@vailsys.com)
Received: from cprobd02.vailsys.com (cprobd02.vailsys.com [63.210.102.130])
	by mx1.freebsd.org (Postfix) with ESMTP id 03AB013C46E
	for <freebsd-geom@freebsd.org>; Wed, 13 Feb 2008 02:06:44 +0000 (UTC)
	(envelope-from dwiest@vailsys.com)
Received: from dpfuser01.vail (dpfuser01.vail [192.168.129.103])
	by cprobd02.vailsys.com (Postfix) with ESMTP id 06FBACE53A
	for <freebsd-geom@freebsd.org>; Tue, 12 Feb 2008 19:35:25 -0600 (CST)
Received: from dfwdamian.vail (dfwdamian.vail [192.168.129.233])
	by dpfuser01.vail (Postfix) with ESMTP id CDE195C90
	for <freebsd-geom@freebsd.org>; Tue, 12 Feb 2008 19:35:24 -0600 (CST)
Received: (from dwiest@localhost)
	by dfwdamian.vail (8.13.8/8.13.8/Submit) id m1D1ZOf1096305
	for freebsd-geom@freebsd.org; Tue, 12 Feb 2008 19:35:24 -0600 (CST)
	(envelope-from dwiest@vailsys.com)
X-Authentication-Warning: dfwdamian.vail: dwiest set sender to
	dwiest@vailsys.com using -f
Date: Tue, 12 Feb 2008 19:35:24 -0600
From: Damian Wiest <dwiest@vailsys.com>
To: freebsd-geom@freebsd.org
Message-ID: <20080213013524.GE82589@dfwdamian.vail>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.2.3i
Subject: GEOM related panic during install
X-List-Received-Date: Wed, 13 Feb 2008 02:06:45 -0000

I recently tried to install FreeBSD 6.3 on an Intel-based server and
encountered a GEOM-related panic during the boot process.  The symptoms
are very similar to those reported here:

http://www.nabble.com/bypassing-gmirror-to-recover-filesystems-to9302550.html#a9302550

Here's the relevant output from the boot process:

ad4: 476940MB <WDC WD5000YS-01MPB1 09.02E09> at ata2-master SATA150
ad4: 976773168 sectors [969021C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad4
ad4: Intel check1 failed
ad4: Adaptec check1 failed
ad4: LSI (v3) check1 failed
ad4: LSI (v2) check1 failed
ad4: FreeBSD check1 failed
ata3-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad6: 476940MB <WDC WD5000YS-01MPB1 09.02E09> at ata3-master SATA150
ad6: 976773168 sectors [969021C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad6
ad6: Intel check1 failed
ad6: Adaptec check1 failed
ad6: LSI (v3) check1 failed
ad6: LSI (v2) check1 failed
ad6: FreeBSD check1 failed
WARNING: Device name truncated! (ad6p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57)
WARNING: Device name truncated! (ad6p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57p57)
... [warning repeats many, many times] ...
Fatal double fault
rip = 0xffffffff803ee5d0
rsp = 0xffffffffb2162fc0
rbp = 0xffffff0076a74680
panic: double fault
Uptime: 54s
Cannot dump. No dump device defined.
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.

I believe that the panic occurs while geom is tasting the system's disks.
AFAIK, the disks were new, but someone here had configured the BIOS to 
use the onboard soft-RAID controller to mirror the drives.  I disabled 
this setting before beginning the install, so I suspect that's how the 
label on ad6 got messed up.

What's the proper way of recovering from this situation?  I can't simply
pull the offending disk, boot into FreeBSD, reinsert the disk and then
use dd to zero the label because x86/amd64 servers won't notice the new 
disk.  I ended up using a Solaris install CD to write a new label to each
disk, but I suppose I could build a custom kernel that does not contain
any of the geom modules and use that as a fixit disk.  Do I just need
to use boot option 6 and then have the loader unload any modules?

-Damian

From owner-freebsd-geom@FreeBSD.ORG  Wed Feb 13 10:18:19 2008
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 69DFE16A417
	for <freebsd-geom@freebsd.org>; Wed, 13 Feb 2008 10:18:19 +0000 (UTC)
	(envelope-from gcubfg-freebsd-geom@m.gmane.org)
Received: from ciao.gmane.org (main.gmane.org [80.91.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id E487113C47E
	for <freebsd-geom@freebsd.org>; Wed, 13 Feb 2008 10:18:18 +0000 (UTC)
	(envelope-from gcubfg-freebsd-geom@m.gmane.org)
Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1JPEgu-0002rZ-48
	for freebsd-geom@freebsd.org; Wed, 13 Feb 2008 10:18:12 +0000
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-geom@freebsd.org>; Wed, 13 Feb 2008 10:18:12 +0000
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-geom@freebsd.org>; Wed, 13 Feb 2008 10:18:12 +0000
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-geom@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Wed, 13 Feb 2008 11:19:56 +0100
Lines: 60
Message-ID: <foug4s$9hf$1@ger.gmane.org>
References: <20080213013524.GE82589@dfwdamian.vail>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature";
	boundary="------------enigD945FF17E44D55F7D8106797"
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Thunderbird 2.0.0.6 (X11/20071022)
In-Reply-To: <20080213013524.GE82589@dfwdamian.vail>
X-Enigmail-Version: 0.95.0
Sender: news <news@ger.gmane.org>
Subject: Re: GEOM related panic during install
X-List-Received-Date: Wed, 13 Feb 2008 10:18:19 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigD945FF17E44D55F7D8106797
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

Damian Wiest wrote:

> I believe that the panic occurs while geom is tasting the system's disks.
> AFAIK, the disks were new, but someone here had configured the BIOS to
> use the onboard soft-RAID controller to mirror the drives.  I disabled
> this setting before beginning the install, so I suspect that's how the
> label on ad6 got messed up.
>
> What's the proper way of recovering from this situation?  I can't simply
> pull the offending disk, boot into FreeBSD, reinsert the disk and then
> use dd to zero the label because x86/amd64 servers won't notice the new
> disk.  I ended up using a Solaris install CD to write a new label to each
> disk, but I suppose I could build a custom kernel that does not contain
> any of the geom modules and use that as a fixit disk.  Do I just need
> to use boot option 6 and then have the loader unload any modules?

The problem here is that even if you do remove optional GEOM
modules/classes from the kernel, you'll still be left with the GEOM
framework which does the initial tasting, which you can't remove because
it's the kernel's interface to the drives. Also, the "Intel check1
failed" messages are from the ATA driver, as it tries to recognize
BIOS/soft-raid configurations, and you can't remove that. It (the ATA
driver) is also the probable cause of the panic here.

It would be useful if you tried to debug the problem in the driver - try
and download a recent snapshot of 8-current, with debugging enabled, and
see if you can get a backtrace on panic which would help fix the driver.

Other than that, you'll probably have to boot another OS (Linux,
Solaris, etc.) and use dd to clear the first few and the last few
sectors of the drives.
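
For illustration only, the "clear the first few and the last few
sectors" step could be sketched as below. This is a minimal sketch, not
a tested recovery tool: the function name wipe_edges, the 512-byte
sector size, and the default of 16 sectors are all assumptions, and
whether seeking to the end reports the device size depends on the OS
and the device node. It is destructive, so try it on a scratch file
first.

```python
import os

SECTOR_SIZE = 512  # assumed sector size; check the real drive before use


def wipe_edges(path, sectors=16, sector_size=SECTOR_SIZE):
    """Zero the first and last `sectors` sectors of `path`.

    `path` may be a regular file or a device node. Returns the size in
    bytes that was detected by seeking to the end.
    """
    span = sectors * sector_size
    with open(path, "r+b") as dev:
        size = dev.seek(0, os.SEEK_END)      # position at end == total size
        head = min(span, size)               # never write past the end
        dev.seek(0)
        dev.write(b"\0" * head)
        tail_start = max(size - span, head)  # do not re-zero the head region
        dev.seek(tail_start)
        dev.write(b"\0" * (size - tail_start))
        dev.flush()
        os.fsync(dev.fileno())               # push the zeros to the medium
    return size
```

Called on a scratch image as wipe_edges("disk.img", sectors=4), it
leaves everything between the two zeroed regions untouched.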


--------------enigD945FF17E44D55F7D8106797
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHssRNldnAQVacBcgRAnfvAKC7dQpMXCtnwhseUlrgCrvuCprsYQCcC8d1
DYykWGQhuLKaLw2k/XKQp5A=
=exne
-----END PGP SIGNATURE-----

--------------enigD945FF17E44D55F7D8106797--


From owner-freebsd-geom@FreeBSD.ORG  Fri Feb 15 19:00:38 2008
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: geom@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2F9E816A41A
	for <geom@FreeBSD.org>; Fri, 15 Feb 2008 19:00:38 +0000 (UTC)
	(envelope-from xcllnt@mac.com)
Received: from smtpoutm.mac.com (smtpoutm.mac.com [17.148.16.80])
	by mx1.freebsd.org (Postfix) with ESMTP id 1283B13C4F0
	for <geom@FreeBSD.org>; Fri, 15 Feb 2008 19:00:37 +0000 (UTC)
	(envelope-from xcllnt@mac.com)
Received: from mac.com (asmtp007-s [10.150.69.70])
	by smtpoutm.mac.com (Xserve/smtpout017/MantshX 4.0) with ESMTP id
	m1FIcxCQ021944
	for <geom@FreeBSD.org>; Fri, 15 Feb 2008 10:38:59 -0800 (PST)
Received: from mini-g4.jnpr.net (natint3.juniper.net [66.129.224.36])
	(authenticated bits=0)
	by mac.com (Xserve/asmtp007/MantshX 4.0) with ESMTP id m1FIcuBL026461
	(version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO)
	for <geom@FreeBSD.org>; Fri, 15 Feb 2008 10:38:57 -0800 (PST)
Message-Id: <4A4329EB-B8EF-4CDA-98C0-4753289C4788@mac.com>
From: Marcel Moolenaar <xcllnt@mac.com>
To: geom@FreeBSD.org
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v919.2)
Date: Fri, 15 Feb 2008 10:38:55 -0800
X-Mailer: Apple Mail (2.919.2)
Cc: 
Subject: Brainstorm: NAND flash
X-List-Received-Date: Fri, 15 Feb 2008 19:00:38 -0000

All,

I've been thinking about supporting NAND flash for disk storage
and I've come up with some initial thoughts for people to shoot
at. The intent of this thread is to align thoughts and for
people to tell me that they already have an implementation ;-)


NAND class
----------

NAND flash devices present themselves to the GEOM layer with
GEOMs of the NAND class. This is similar to HDDs presenting
themselves with GEOMs of the DISK class. The idea here is that
we need some additional I/O requests and also want to be able
to distinguish flash from disks.

GEOMs of class NAND don't have the mediasize and sectorsize
attributes (or they have them with value 0). The mediasize is
dependent upon the number of bad blocks, which is not being
dealt with at this level. NANDs don't have sectors.
Attributes of this class include:
	blockcount - the raw number of blocks
	blocksize - the number of bytes or pages in a block
	pagesize - the number of bytes in a page
	oobsize - the number of bytes per page used for OOB

The NAND class supports BIO_DELETE. It'll also need something
for random access to the OOB data. For this we can introduce
BIO_READOOB and BIO_WRITEOOB. These allow byte-wise I/O. The
standard BIO_READ and BIO_WRITE operate on pages by default.

With the above, we have raw access to the NAND flash. That is
before any wear-leveling or sector mapping happens. A device
special file corresponding to GEOMs of this class can be used
by diagnostics and/or initialization tools.
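
To make the four attributes concrete, here is a small sketch in Python
rather than kernel C. The class name NandGeometry and the example chip
dimensions are assumptions for illustration; blocksize is taken to be
in bytes, although the list above also allows it to be counted in
pages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NandGeometry:
    blockcount: int  # raw number of erase blocks
    blocksize: int   # bytes per block (assumed; could also be pages)
    pagesize: int    # bytes of main data per page
    oobsize: int     # bytes of OOB area per page

    @property
    def pages_per_block(self):
        return self.blocksize // self.pagesize

    @property
    def raw_bytes(self):
        # Addressable main-array bytes, bad blocks included.
        return self.blockcount * self.blocksize

    @property
    def oob_bytes(self):
        # Total OOB bytes across every page of every block.
        return self.blockcount * self.pages_per_block * self.oobsize

    def page_offset(self, block, page):
        # Byte offset of a page within the raw main array.
        if not (0 <= block < self.blockcount):
            raise ValueError("block out of range")
        if not (0 <= page < self.pages_per_block):
            raise ValueError("page out of range")
        return block * self.blocksize + page * self.pagesize


# Hypothetical chip: 2048 blocks of 128 KiB, 2 KiB pages, 64 OOB bytes.
chip = NandGeometry(blockcount=2048, blocksize=128 * 1024,
                    pagesize=2048, oobsize=64)
```

For this hypothetical chip, chip.raw_bytes is 256 MiB and
chip.pages_per_block is 64.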

Open issue: do we want this GEOM to deal with bad blocks?


WEARLEVEL class
---------------

GEOMs of the WEARLEVEL class (further referred to as WL class)
will taste GEOMs of the NAND class. In particular, they will
use the blockcount, blocksize, pagesize and oobsize in order
to determine whether a GEOM is suitable. The tasting process
will read OOB data to determine if wear-leveling is used. As
such, wear-leveling needs to be set up. For this a geom(8)
library exists. GEOMs of this class export the same variables
as GEOMs of the NAND class, but also have a non-0 mediasize.

The primary purpose of the WL class is to present a NAND
flash device for which wear-leveling is not a concern
and that does not have any bad blocks. It can implement
different policies, such as block-based wear-leveling or
page-based wear-leveling, all configurable through geom(8).
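
A toy model of one possible block-based policy is sketched below. It is
not a proposed implementation: the class name BlockWearLeveler and the
"pick the least-erased free block" rule are assumptions chosen only to
show how bad blocks can be hidden and erase counts kept close together.

```python
class BlockWearLeveler:
    """Toy block-based wear-leveling: logical blocks float over physical ones."""

    def __init__(self, blockcount, bad_blocks=()):
        bad = set(bad_blocks)
        # Bad blocks never enter the pool, so callers never see them.
        self.erase_count = {b: 0 for b in range(blockcount) if b not in bad}
        self.free = set(self.erase_count)
        self.mapping = {}  # logical block -> physical block

    def write_block(self, logical):
        """Rewrite a logical block; returns the physical block chosen."""
        if not self.free:
            raise RuntimeError("no free physical block available")
        # Policy (assumed): least-erased free block, lowest number on ties.
        target = min(self.free, key=lambda b: (self.erase_count[b], b))
        self.free.remove(target)
        old = self.mapping.get(logical)
        self.mapping[logical] = target
        if old is not None:
            # The previous copy is stale: erase it and return it to the pool.
            self.erase_count[old] += 1
            self.free.add(old)
        return target


wl = BlockWearLeveler(4, bad_blocks=[1])
placements = [wl.write_block(0) for _ in range(5)]
```

Rewriting the same logical block five times spreads the writes over
physical blocks 0, 2 and 3 and never touches bad block 1.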


NANDDISK class
--------------

GEOMs of the NANDDISK class (bad name, I know) attach to GEOMs
of either NAND or WEARLEVEL classes and present a provider that
looks like a "regular" disk. It has the mediasize and sectorsize
attributes and not any of the blockcount, blocksize, pagesize
or oobsize attributes. Also BIO_READOOB and BIO_WRITEOOB are not
supported, though BIO_DELETE may be.

The primary purpose of this class is to provide standard sector
mapping for file systems that are not designed for NAND flash.
The mapping can be trivial.
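
As an illustration of what a trivial mapping could look like, the
sketch below turns a sector number into a (block, page, offset) triple
by plain division. The function name map_sector and the default sizes
(512-byte sectors, 2 KiB pages, 64 pages per block) are assumptions,
not part of the proposal.

```python
def map_sector(sector, sectorsize=512, pagesize=2048, pages_per_block=64):
    """Map a disk sector number to (block, page, byte offset within page)."""
    if sector < 0:
        raise ValueError("sector must be non-negative")
    if pagesize % sectorsize:
        raise ValueError("pagesize must be a multiple of sectorsize")
    byte_offset = sector * sectorsize
    # Which page the sector falls in, and where inside that page.
    page_index, offset_in_page = divmod(byte_offset, pagesize)
    # Which erase block that page belongs to.
    block, page = divmod(page_index, pages_per_block)
    return block, page, offset_in_page
```

With the assumed sizes, sector 5 lands at byte 512 of page 1 in block 0,
and sector 256 is the first sector of block 1.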


NANDSIM class
-------------

Not needed in production, but it would be good to have a GEOM
that simulates a NAND flash and that keeps statistics. It is
configured by geom(8) and needs a provider for actual storage.
As such, you can use an underlying MD for storage and present
the GEOM layer with a NAND flash device. Statistics include
such things as erase count per block, read and write counts
per block or page.
Other features could include the simulation of power loss to
test algorithms used for wear-leveling and/or sector mapping.
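
A user-space sketch of the simulator idea follows, with a bytearray
standing in for the backing provider. The class name NandSim and its
method names are assumptions. It models only two well-known NAND
properties: erasing sets a whole block to 0xFF, and programming a page
can only clear bits. Per-block counters stand in for the statistics;
power-loss simulation is left out.

```python
class NandSim:
    """Minimal NAND simulator with per-block erase/read/write statistics."""

    def __init__(self, blockcount, pages_per_block, pagesize):
        self.pages_per_block = pages_per_block
        self.pagesize = pagesize
        # Erased NAND reads back as all ones.
        self.store = bytearray(b"\xff" * (blockcount * pages_per_block * pagesize))
        self.erase_count = [0] * blockcount
        self.read_count = [0] * blockcount
        self.write_count = [0] * blockcount

    def _start(self, block, page):
        return (block * self.pages_per_block + page) * self.pagesize

    def read_page(self, block, page):
        start = self._start(block, page)
        self.read_count[block] += 1
        return bytes(self.store[start:start + self.pagesize])

    def write_page(self, block, page, data):
        if len(data) != self.pagesize:
            raise ValueError("data must be exactly one page")
        start = self._start(block, page)
        # Programming can only turn 1 bits into 0 bits, hence the AND.
        for i, byte in enumerate(data):
            self.store[start + i] &= byte
        self.write_count[block] += 1

    def erase_block(self, block):
        start = self._start(block, 0)
        length = self.pages_per_block * self.pagesize
        self.store[start:start + length] = b"\xff" * length
        self.erase_count[block] += 1
```

Writing a page twice without an erase in between yields the bitwise AND
of both writes, which is the kind of behavior a wear-leveling layer
would need to be tested against.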


Let the discussion begin...

-- 
Marcel Moolenaar
xcllnt@mac.com


From owner-freebsd-geom@FreeBSD.ORG  Fri Feb 15 23:46:05 2008
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: geom@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 23A5E16A41B
	for <geom@FreeBSD.org>; Fri, 15 Feb 2008 23:46:05 +0000 (UTC)
	(envelope-from xcllnt@mac.com)
Received: from smtpoutm.mac.com (smtpoutm.mac.com [17.148.16.73])
	by mx1.freebsd.org (Postfix) with ESMTP id 1388A13C457
	for <geom@FreeBSD.org>; Fri, 15 Feb 2008 23:46:04 +0000 (UTC)
	(envelope-from xcllnt@mac.com)
Received: from mac.com (asmtp004-s [10.150.69.67])
	by smtpoutm.mac.com (Xserve/smtpout010/MantshX 4.0) with ESMTP id
	m1FNk4pp000921; Fri, 15 Feb 2008 15:46:04 -0800 (PST)
Received: from mini-g4.jnpr.net (natint3.juniper.net [66.129.224.36])
	(authenticated bits=0)
	by mac.com (Xserve/asmtp004/MantshX 4.0) with ESMTP id m1FNk2m7001882
	(version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO);
	Fri, 15 Feb 2008 15:46:03 -0800 (PST)
Message-Id: <A7D3ED75-BC10-4E15-BABE-7182995FAC61@mac.com>
From: Marcel Moolenaar <xcllnt@mac.com>
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
In-Reply-To: <93634.1203118109@critter.freebsd.dk>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v919.2)
Date: Fri, 15 Feb 2008 15:46:02 -0800
References: <93634.1203118109@critter.freebsd.dk>
X-Mailer: Apple Mail (2.919.2)
Cc: geom@FreeBSD.org
Subject: Re: Brainstorm: NAND flash
X-List-Received-Date: Fri, 15 Feb 2008 23:46:05 -0000


On Feb 15, 2008, at 3:28 PM, Poul-Henning Kamp wrote:

> In message <4A4329EB-B8EF-4CDA-98C0-4753289C4788@mac.com>,
> Marcel Moolenaar writes:
>
>> GEOMs of class NAND don't have the mediasize and sectorsize
>> attributes (or they have them with value 0). The mediasize is
>> dependent upon the number of bad blocks, which is not being
>> dealt with at this level.
>
> Mediasize is about addressability, not about usability, so this
> assumption is wrong.
>
> A GEOM provider is just an addressable array of sectors, it
> doesn't guarantee that you can read them all or write them
> all, as is indeed the case when your disk develops a bad sector.
>
> NAND is only special due to the OOB stuff, the main page array
> is just a pretty spotty disk, for all GEOM cares.

The reason I thought this was good is that disks are
shipped without bad blocks visible to the "application".
That is: the norm is no bad blocks. With NAND flash
the norm is that bad blocks are part of the deal. I thought
that dealing with bad blocks explicitly for NAND would
level the playing field and make it more consistent...

>> dealt with at this level. NANDs don't have sectors.
>> Attributes of this class include:
>> 	blockcount - the raw number of blocks
>
> This goes in mediasize (as a byte count)
>
>> 	blocksize - the number of bytes or pages in a block
>
> This goes in sectorsize.

Can't this cause race conditions?

Suppose there happens to be an MBR in the first page at
offset 0. The MBR class could end up taking the provider,
when a wear-leveling geom should really take it.
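
To illustrate the concern, a naive signature check is sketched below.
It tests only the standard 0x55 0xAA boot signature at bytes 510-511 of
the first sector; the function name looks_like_mbr is an assumption,
and the real GEOM MBR class may well validate more than this.

```python
MBR_SIGNATURE = b"\x55\xaa"  # boot signature at bytes 510-511 of sector 0


def looks_like_mbr(first_sector):
    """Return True if the buffer carries the MBR boot signature."""
    return len(first_sector) >= 512 and first_sector[510:512] == MBR_SIGNATURE


# A raw 2 KiB NAND page that happens to hold an old MBR image at offset 0.
page = bytearray(b"\xff" * 2048)
page[510:512] = MBR_SIGNATURE
stale_mbr_found = looks_like_mbr(bytes(page))
```

Any class that tastes the raw provider with a check like this would
accept the page, regardless of what a wear-leveling layer has stored in
the OOB area.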

>> Open issue: do we want this GEOM to deal with bad blocks?
>
> I'm not sure I understand this question.  GEOM doesn't know about
> bad blocks, if you try to use them, GEOM happily transports the
> resulting error code back, but it does not care if the error code
> is "read error" or "values of beta gives rise to dom!"

See above.

>> NANDDIDK class
>> -------------
>
>> The primary purpose of this class is to provide standard sector
>> mapping for file systems that are not designed for NAND flash.
>> The mapping can be trivial.
>
> I don't understand why this would be necessary, this is normally
> done in the wearleveling class (for reasons that should be obvious),
> so why do you want to split it into a separate class ?

I'm ignorant of the obviousness of why sector mapping and
wear-leveling are to be done at the same time...

...and I presume you can't elaborate...


-- 
Marcel Moolenaar
xcllnt@mac.com


From owner-freebsd-geom@FreeBSD.ORG  Fri Feb 15 23:48:21 2008
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: geom@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9D7AB16A417
	for <geom@FreeBSD.org>; Fri, 15 Feb 2008 23:48:21 +0000 (UTC)
	(envelope-from phk@critter.freebsd.dk)
Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222])
	by mx1.freebsd.org (Postfix) with ESMTP id 63C7013C4E3
	for <geom@FreeBSD.org>; Fri, 15 Feb 2008 23:48:20 +0000 (UTC)
	(envelope-from phk@critter.freebsd.dk)
Received: from critter.freebsd.dk (unknown [192.168.61.3])
	by phk.freebsd.dk (Postfix) with ESMTP id 34EC217104;
	Fri, 15 Feb 2008 23:28:30 +0000 (UTC)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
	by critter.freebsd.dk (8.14.2/8.14.2) with ESMTP id m1FNSTXj093635;
	Fri, 15 Feb 2008 23:28:29 GMT (envelope-from phk@critter.freebsd.dk)
To: Marcel Moolenaar <xcllnt@mac.com>
From: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
In-Reply-To: Your message of "Fri, 15 Feb 2008 10:38:55 PST."
	<4A4329EB-B8EF-4CDA-98C0-4753289C4788@mac.com> 
Date: Fri, 15 Feb 2008 23:28:29 +0000
Message-ID: <93634.1203118109@critter.freebsd.dk>
Sender: phk@critter.freebsd.dk
Cc: geom@FreeBSD.org
Subject: Re: Brainstorm: NAND flash 
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 15 Feb 2008 23:48:21 -0000

In message <4A4329EB-B8EF-4CDA-98C0-4753289C4788@mac.com>, Marcel Moolenaar writes:

>GEOMs of class NAND don't have the mediasize and sectorsize
>attributes (or they have them with value 0). The mediasize is
>dependent upon the number of bad blocks, which is not being
>dealt with at this level. 

Mediasize is about addressability, not about usability, so this
assumption is wrong.

A GEOM provider is just an addressable array of sectors, it
doesn't guarantee that you can read them all or write them
all, as is indeed the case when your disk develops a bad sector.

NAND is only special due to the OOB stuff, the main page array
is just a pretty spotty disk, for all GEOM cares.

>dealt with at this level. NANDs don't have sectors.
>Attributes of this class include:
>	blockcount - the raw number of blocks

This goes in mediasize (as a byte count)

>	blocksize - the number of bytes or pages in a block

This goes in sectorsize.

>	pagesize - the number of bytes in a page
>	oobsize - the number of bytes per page used for OOB

These two are secondary attributes which are not likely to change
easily for a given NAND, so they should be handled by BIO_GETATTR
(as "NAND::PAGESIZE" and "NAND::OOBSIZE" for instance).
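
To make the mapping above concrete, here is a minimal sketch in C of
how raw NAND geometry could translate into the two provider fields.
It is not code from any FreeBSD tree: the struct, its field names and
both functions are hypothetical, and pages_per_block is an assumption,
since the proposal leaves open whether blocksize is counted in bytes
or in pages.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a raw NAND part. */
struct nand_geometry {
	uint32_t blockcount;		/* raw number of erase blocks */
	uint32_t pages_per_block;	/* assumption: blocksize given in pages */
	uint32_t pagesize;		/* bytes per page, OOB not included */
	uint32_t oobsize;		/* OOB bytes per page */
};

/* sectorsize: one erase block in bytes; the OOB area is not counted. */
static uint32_t
nand_sectorsize(const struct nand_geometry *g)
{
	assert(g->pages_per_block > 0 && g->pagesize > 0);
	return (g->pages_per_block * g->pagesize);
}

/* mediasize: everything addressable, bad blocks included, in bytes. */
static uint64_t
nand_mediasize(const struct nand_geometry *g)
{
	return ((uint64_t)g->blockcount * nand_sectorsize(g));
}
```

With 1024 blocks of 64 pages of 2048 bytes, this gives a sectorsize
of 128 KB and a mediasize of 128 MB, whether or not every block is
usable.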

>For this we can introduce
>BIO_READOOB and BIO_WRITEOOB.

Yes, this sounds sensible.

The original plan is that all BIO_ operation codes are powers of two
and that providers carry a bitmap of the ones they support
(G_PF_CANDELETE is a mistake in this respect), so this shouldn't be
a problem.
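
A minimal sketch in C of the power-of-two scheme, for illustration
only.  The OP_* names and their values are made up for this example
and are not the kernel's BIO_* definitions; the two OOB entries in
particular are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only: each operation is a distinct power of two. */
enum {
	OP_READ     = 0x01,
	OP_WRITE    = 0x02,
	OP_DELETE   = 0x04,
	OP_GETATTR  = 0x08,
	OP_READOOB  = 0x10,	/* hypothetical new operation */
	OP_WRITEOOB = 0x20	/* hypothetical new operation */
};

/* An operation code is well-formed if exactly one bit is set. */
static int
op_is_single(uint32_t op)
{
	return (op != 0 && (op & (op - 1)) == 0);
}

/* A provider supports an operation if its bit is set in the bitmap. */
static int
provider_supports(uint32_t supported, uint32_t op)
{
	assert(op_is_single(op));
	return ((supported & op) != 0);
}
```

A consumer such as a wear-leveling class could then check for
OP_READOOB and OP_WRITEOOB in one mask test before attaching, and no
per-operation flag is needed.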

In general we should not introduce new BIO_ operations
without reason, but these two are very reasonable.

>With the above, we have raw access to the NAND flash. That is
>before any wear-leveling or sector mapping happens. A device
>special file corresponding to GEOMs of this class can be used
>by diagnostics and/or initialization tools.

Yes, given suitable ioctls to geom_dev, for the new BIO_*OOB.

>Open issue: do we want this GEOM to deal with bad blocks?

I'm not sure I understand this question.  GEOM doesn't know about
bad blocks: if you try to use them, GEOM happily transports the
resulting error code back, but it does not care whether the error
code is "read error" or "values of beta gives rise to dom!"

>WEARLEVEL class
>---------------

Sounds good.   I'm under NDA on M-Systems' algorithm and Sandisk
is suing left and right on those patents.

>NANDDIDK class
>-------------

>The primary purpose of this class is to provide standard sector
>mapping for file systems that are not designed for NAND flash.
>The mapping can be trivial.

I don't understand why this would be necessary.  This is normally
done in the wear-leveling class (for reasons that should be obvious),
so why do you want to split it into a separate class?

>NANDSIM class
>-------------
>
>Not needed in production, but it would be good to have a GEOM
>that simulates a NAND flash and that keeps statistics.

A very good idea.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-geom@FreeBSD.ORG  Sat Feb 16 00:33:09 2008
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: geom@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2FB8416A418
	for <geom@FreeBSD.org>; Sat, 16 Feb 2008 00:33:09 +0000 (UTC)
	(envelope-from phk@critter.freebsd.dk)
Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222])
	by mx1.freebsd.org (Postfix) with ESMTP id AB82913C45A
	for <geom@FreeBSD.org>; Sat, 16 Feb 2008 00:33:08 +0000 (UTC)
	(envelope-from phk@critter.freebsd.dk)
Received: from critter.freebsd.dk (unknown [192.168.61.3])
	by phk.freebsd.dk (Postfix) with ESMTP id 3BE5817104;
	Sat, 16 Feb 2008 00:33:07 +0000 (UTC)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
	by critter.freebsd.dk (8.14.2/8.14.2) with ESMTP id m1G0X6tr094216;
	Sat, 16 Feb 2008 00:33:06 GMT (envelope-from phk@critter.freebsd.dk)
To: Marcel Moolenaar <xcllnt@mac.com>
From: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
In-Reply-To: Your message of "Fri, 15 Feb 2008 15:46:02 PST."
	<A7D3ED75-BC10-4E15-BABE-7182995FAC61@mac.com> 
Date: Sat, 16 Feb 2008 00:33:06 +0000
Message-ID: <94215.1203121986@critter.freebsd.dk>
Sender: phk@critter.freebsd.dk
Cc: geom@FreeBSD.org
Subject: Re: Brainstorm: NAND flash 
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 16 Feb 2008 00:33:09 -0000

In message <A7D3ED75-BC10-4E15-BABE-7182995FAC61@mac.com>, Marcel Moolenaar writes:

>> Mediasize is about addressability, not about usability, so this
>> assumption is wrong.
>>
>> A GEOM provider is just an addressable array of sectors, it
>> doesn't guarantee that you can read them all or write them
>> all, as is indeed the case when your disk develops a bad sector.
>>
>> NAND is only special due to the OOB stuff, the main page array
>> is just a pretty spotty disk, for all GEOM cares.
>
>The reason I thought this was good is that disks are
>shipped without bad blocks visible to the "application".
>That is: the norm is no bad blocks. With NAND flash
>the norm is that bad blocks are part of the deal. I thought
>that dealing with bad blocks explicitly for NAND would
>level the playing field and make it more consistent...

Well, if you want to take that route, you should not use
GEOM to connect the wear-leveling to the NAND flash in
the first place.

Which option you prefer there is sort of a toss-up.

Putting it under GEOM gives you devices in /dev and other benefits;
using a private interface allows you to get it more precisely
tailored to your needs.

I would say put it under GEOM, the bad blocks will not trouble GEOM,
and should somebody get perfect NAND (or care to handle the bad
blocks otherwise), they can stick their filesystem there directly,
if they don't need to write to it too much.

>>> dealt with at this level. NANDs don't have sectors.
>>> Attributes of this class include:
>>> 	blockcount - the raw number of blocks
>>
>> This goes in mediasize (as a byte count)
>>
>>> 	blocksize - the number of bytes or pages in a block
>>
>> This goes in sectorsize.
>
>Can't this cause race conditions?
>
>Suppose there happens to be a MBR in the first page at
>offset 0. The MBR class could end up taking the provider,
>when a wear-leveling geom should really take it.

At the moment the wear-leveling opens the NAND device for writing,
the MBR would get spoiled and disappear.

And the chance of MBR finding its metadata in the right physical
sector is pretty small to begin with if the wear leveling is worth
anything.

Of course if you do simple bad-block substitution, the chance would
be close to certainty, but the MBR would still get spoiled, so that
would still work.

>I'm ignorant of the obviousness of why sector mapping and
>wear-leveling are to be done at the same time...
>
>...and I presume you can't elaborate...

No I can't.

But I can tell you something about filesystems under BSD license
which might interest you.

Imagine you implement a filesystem, that allocates space in
512 byte sectors, even though the underlying device has a
(much) larger sector size.[1]

To reduce the amount of disk-I/O, you would obviously want
to avoid doing
	read 64k block
	modify 512 bytes of those
	write 64k block
	read same 64k block
	modify some other 512 bytes of those
	write 64k block again
In particular if writes were very slow or otherwise expensive.

You would of course do this by implementing, as UNIX has always
done, a buffer-cache that does the logical/physical translation.
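
As a minimal sketch of that idea in C: a one-entry write-back cache
that absorbs 512-byte writes and touches the device only in whole
64k blocks.  Everything here (struct names, the one-entry cache, the
in-memory "device" and its counters) is an assumption made for the
example; a real buffer cache holds many blocks and is far more
involved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLKSIZE		65536u	/* device sector size */
#define FRAGSIZE	512u	/* filesystem allocation unit */
#define NBLOCKS		4u

struct dev {
	uint8_t data[NBLOCKS][BLKSIZE];
	unsigned reads, writes;		/* whole-block device I/O counters */
};

/* One-entry write-back cache. */
struct cache {
	struct dev *dev;
	int valid, dirty;
	uint32_t blkno;
	uint8_t buf[BLKSIZE];
};

/* Push the cached block back to the device if it was modified. */
static void
cache_flush(struct cache *c)
{
	if (c->valid && c->dirty) {
		memcpy(c->dev->data[c->blkno], c->buf, BLKSIZE);
		c->dev->writes++;
		c->dirty = 0;
	}
}

/* Write one 512-byte fragment, addressed by logical fragment number. */
static void
cache_write_frag(struct cache *c, uint32_t fragno, const uint8_t *src)
{
	uint32_t blkno = fragno / (BLKSIZE / FRAGSIZE);
	uint32_t off = (fragno % (BLKSIZE / FRAGSIZE)) * FRAGSIZE;

	assert(blkno < NBLOCKS);
	if (!c->valid || c->blkno != blkno) {
		cache_flush(c);		/* evict the previous block */
		memcpy(c->buf, c->dev->data[blkno], BLKSIZE);
		c->dev->reads++;
		c->blkno = blkno;
		c->valid = 1;
	}
	memcpy(c->buf + off, src, FRAGSIZE);
	c->dirty = 1;
}
```

Two fragment writes into the same 64k block followed by one flush
cost one device read and one device write, instead of the two reads
and two writes in the sequence listed above.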

BUT, imagine now as a complication, that your filesystem was
log-structured in somewhat the same hacked up way that Margo Seltzer
did with LFS.

The idea behind LFS is important in this context:  The objective
was to gain write speed by always writing sequentially and basically
treat the disk as a circular buffer, hoping that the RAM cache would
limit the amount of seeks for reading, and that the disk would have
enough free space to reduce the workload of the cleaner process.

The trouble with that, of course, is that both assumptions were wrong
until RAM and disk exploded in size just a few years ago.  On a
95% filled filesystem, LFS sank under the weight of the cleaner,
and RAM was never big enough to cache all you wanted, and it doesn't
help until the second access anyway.

The other important aspect of an LFS is that you need a "cleaner"
process to run ahead of the write pointer and scavenge space.
If it finds a fully used big block, it leaves it alone, but if
it finds a 64k block with only 512 bytes of data, it copies
those 512 bytes into the write stream so it can mark the 64k
block as free, and recycle it.
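
The cleaner's decision for a single block can be sketched in C as
below.  This is only an illustration of the rule just described, not
LFS code: the struct, the per-block live count and the function name
are all hypothetical, and the actual copying of data is reduced to a
counter.

```c
#include <assert.h>
#include <stdint.h>

#define FRAGS_PER_BLOCK	128u	/* 64k block / 512-byte fragments */

struct segment {
	uint32_t live;	/* live 512-byte fragments in this 64k block */
};

/*
 * Clean one block ahead of the write pointer.  Returns the number of
 * fragments that had to be copied into the write stream.
 */
static uint32_t
clean_segment(struct segment *s)
{
	uint32_t copied;

	assert(s->live <= FRAGS_PER_BLOCK);
	if (s->live == FRAGS_PER_BLOCK)
		return (0);	/* fully used: leave it alone */
	copied = s->live;	/* live data moves to the head of the log */
	s->live = 0;		/* block is now free and can be recycled */
	return (copied);
}
```

The return value is the cleaner's write cost for that block, which is
why a nearly full filesystem, where most blocks are almost but not
quite fully live, keeps the cleaner so busy.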

Margo's LFS was a fiasco, but we can still learn from it:

The source of trouble, as far as I have been able to find out, is
that the filesystem naming layer (in her case UFS) needs a logical
block number which must be determined before the physical block
number has been allocated, so the logical block number must be
translated to a physical number through some sort of means or table.

You obviously would _not_ want two copies of the data in the cache,
one under the logical and one under the physical blocknumber, so
you have to pick one or the other.

Margo's choice for the easy solution to the logical/physical mapping
problem in LFS sucked badly when it came to writing the "cleaner"
process: A mapping that gives you only a logical->physical translation
cheaply, but requires you to read many blocks of disk to reverse
the mapping, doesn't help you when you read a physical sector and
need to find out whether it is in use, and where it belongs in the
logical space.

Which is exactly what the cleaner needs to do.
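
To illustrate the point (this is not how LFS or any existing
wear-leveling code does it), here is a minimal sketch in C of a map
that is kept in both directions, so the cleaner's question is a
single array lookup.  All names and table sizes are hypothetical, and
the sketch assumes the log never writes over a physical block that is
still live.

```c
#include <assert.h>
#include <stdint.h>

#define NLOGICAL	8u
#define NPHYSICAL	16u
#define UNMAPPED	UINT32_MAX

struct blockmap {
	uint32_t l2p[NLOGICAL];		/* logical -> physical */
	uint32_t p2l[NPHYSICAL];	/* physical -> logical, for the cleaner */
};

static void
map_init(struct blockmap *m)
{
	uint32_t i;

	for (i = 0; i < NLOGICAL; i++)
		m->l2p[i] = UNMAPPED;
	for (i = 0; i < NPHYSICAL; i++)
		m->p2l[i] = UNMAPPED;
}

/* Point a logical block at a new physical block, updating both tables. */
static void
map_set(struct blockmap *m, uint32_t lbn, uint32_t pbn)
{
	assert(lbn < NLOGICAL && pbn < NPHYSICAL);
	assert(m->p2l[pbn] == UNMAPPED);	/* never overwrite live data */
	if (m->l2p[lbn] != UNMAPPED)
		m->p2l[m->l2p[lbn]] = UNMAPPED;	/* old copy is now garbage */
	m->l2p[lbn] = pbn;
	m->p2l[pbn] = lbn;
}

/* The cleaner's question: is this physical block live, and whose is it? */
static uint32_t
map_owner(const struct blockmap *m, uint32_t pbn)
{
	assert(pbn < NPHYSICAL);
	return (m->p2l[pbn]);
}
```

With only l2p[], answering map_owner() means scanning the whole
forward table, which on disk translates into reading many blocks.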

I believe in the end her choice made it so damn hard that the cleaner
never happened during the time she took an interest in LFS (exactly
until she got her PhD, I believe?)  Ousterhout had some very good
and relevant, but harsh words for her about that.

(Sprite's LFS, by Ousterhout, is also worth a study; it was better
designed, but also more narrowly tailored to the Sprite OS, and thus
we cannot learn as much from it today.)

This is all from memory, I haven't bothered to look up the LFS source
code or the correspondence on Ousterhout's page, so some details may
be slightly off, for which I apologize.

Poul-Henning

[1] It's interesting that Sun gave up on this and had to get
special firmware for CD-ROM drives, but that's an entirely
different story and not relevant :-)

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.