From: Marcel Moolenaar
Date: Wed, 25 Aug 2010 08:58:16 -0700
Message-id: <34EF2360-1B68-4E0C-8CCE-409CE141D0B8@mac.com>
To: "freebsd-arch@FreeBSD.org Arch"
Subject: RFC: root mount enhancement (round 2)

Summary of round 1:

1. A ramdisk root file system (whether pre-loaded by the loader or
   compiled into the kernel) allows any and all file systems to be
   mounted as root (in theory).
   One can populate the ramdisk with whatever tools one needs to set
   up the storage solution and mount file systems.

2. Negative experiences with the ramdisk root file system as a general
   approach for mounting a root file system have been expressed.

3. A well-defined and simple recursive algorithm that the kernel uses
   for finding (nested) root file systems has not been shot down, but
   needs to handle the power of GEOM better.

See also:
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=5942+0+current/freebsd-arch

Round 2 preamble:

Let me mention a problem with the currently implemented root mount
logic as a reminder that something needs to be fixed, even if we
don't want to enhance:

A USB disk cannot always be used as a root file system, because the
USB stack releases the root mount lock after creating the umass
device but before CAM has created the corresponding da device. The
kernel tries to mount from /dev/da0 before the device exists, fails,
and then drops into the root mount prompt. Often the story ends here
-- with failure.

The root mount enhancement intends to solve this scenario by
specifically waiting for the mentioned device/path before moving on
to the next alternative.

Round 2:

The logic remains mostly the same as described in round 1, but gains
a directive and limited variable substitution. These are added to
decouple the mount directive (${FS}:${DEV}) from the creation of the
memory disk so that GEOM can do its thing. As such, the creation of a
memory disk is now a separate directive:

    .md

To mount the memory disk (UFS in the example), use:

    ufs:/dev/md#

Here md# refers to the md unit created by the last .md directive.

Since the logic is for mounting the root file system only, a .md
directive implicitly detaches and releases the previously created md
device before creating a new one. In other words: the enhancement is
not for creating a bunch of md devices. Should this be relaxed so
that any number of md devices can be created before we try a root
mount?
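
To make the two mechanisms above concrete (waiting for the named
device before moving on to the next alternative, and replacing md#
with the unit of the last .md device), here is a small user-space
Python sketch. It is not kernel code and not the proposed
implementation. The function names (substitute_md_unit,
wait_for_device, pick_root), the poll limit and the exists() callback
are all hypothetical and exist only to model the behaviour.

```python
from typing import Callable, List, Optional, Tuple


def substitute_md_unit(directive: str, md_unit: Optional[int]) -> str:
    """Replace 'md#' with the unit of the last .md device (hypothetical helper)."""
    if "md#" not in directive:
        return directive
    if md_unit is None:
        raise ValueError("md# used before any .md directive created a device")
    return directive.replace("md#", "md%d" % md_unit)


def wait_for_device(path: str, exists: Callable[[str], bool],
                    max_polls: int) -> bool:
    """Poll until the device path shows up or the poll budget runs out.

    A real implementation would sleep or block on an event between
    polls; this sketch just calls exists() repeatedly so that it
    stays deterministic.
    """
    for _ in range(max_polls):
        if exists(path):
            return True
    return False


def pick_root(directives: List[str], exists: Callable[[str], bool],
              md_unit: Optional[int] = None,
              max_polls: int = 10) -> Optional[Tuple[str, str]]:
    """Return (fs, dev) for the first directive whose device appears."""
    for directive in directives:
        resolved = substitute_md_unit(directive, md_unit)
        fs, sep, dev = resolved.partition(":")
        if not (fs and sep and dev):
            raise ValueError("not a fs:dev mount directive: %r" % directive)
        # Wait for this specific device before trying the next alternative.
        if wait_for_device(dev, exists, max_polls):
            return fs, dev
    return None
```

With an exists() callback that only reports /dev/da0 on its third
call, pick_root(["ufs:/dev/da0", "ufs:/dev/ada0"], exists) still
settles on /dev/da0 instead of giving up after the first failed
lookup, which is the USB case described in the preamble.
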

When the md device appears, GEOM gets to taste the provider and all
kinds of interesting things can happen. By decoupling the creation of
the md device from the mount directive, it's trivial to handle
arbitrarily complex GEOM graphs. For example:

    ufs:/dev/md#s1a
    ufs:/dev/md#.uzip
    ...

For completeness, the syntax of the configuration file (in some weird
hybrid regex-based specification that is sloppy about spaces) to make
sure things get fleshed out enough for review:

    <.mount.conf> : (^$)*
     : | |
     : '#'.*
     :
     : | | | | |
     : ':'
     : |
     : | ','
     : | '=' | ".md"
     : |
     : | ','
     : "nocompress"    # compress is default
     | "nocluster"     # cluster is default
     | "async"
     | "readonly"
     : ".ask"
     : "wait"
     : "onfail"
     : "panic"         # default
     | "reboot"
     | "retry"
     | "continue"
     : ".init"
     : | ':'

To reiterate: the logic is recursive. After mounting some file system
as root, the kernel will follow the directives in /.mount.conf (if
the file exists) for remounting the root file system. At each
iteration the kernel will remount devfs under /dev and remount the
current root file system under /.mount within the new root file
system.

Thoughts?

-- 
Marcel Moolenaar
xcllnt@mac.com
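
The recursion described above can be modelled in a few lines of
user-space Python. This is only an illustrative sketch under stated
assumptions: file systems are plain dictionaries keyed by device
path, only fs:dev mount directives are interpreted (every directive
starting with a dot is skipped), and the function name
follow_root_mounts and the max_depth guard are hypothetical. The
sketch also simply stops when no directive can be satisfied, rather
than modelling the onfail actions.

```python
from typing import Dict, List


def follow_root_mounts(filesystems: Dict[str, Dict[str, str]],
                       first_root: str, max_depth: int = 8) -> List[str]:
    """Return the chain of devices mounted as root, in order.

    filesystems maps a device path to the files it contains
    (path -> text). Both the mapping and max_depth are assumptions
    made for this model only.
    """
    chain = [first_root]
    current = first_root
    while True:
        conf = filesystems[current].get("/.mount.conf")
        if conf is None:
            return chain  # no /.mount.conf: this root is final
        next_root = None
        for line in conf.splitlines():
            line = line.strip()
            # Skip empty lines, comments and dot-directives (.md, .ask, ...).
            if not line or line.startswith("#") or line.startswith("."):
                continue
            fs, sep, dev = line.split()[0].partition(":")
            if sep and dev in filesystems:
                next_root = dev  # first mountable alternative wins
                break
        if next_root is None:
            return chain  # simplification: keep the current root
        if len(chain) >= max_depth:
            raise RuntimeError("too many nested root mounts")
        chain.append(next_root)
        current = next_root
```

For a disk whose /.mount.conf names a uzip image that itself has no
/.mount.conf, the function returns both devices in mount order and
then stops, mirroring the "repeat until the file does not exist"
behaviour.
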