From: Marcel Moolenaar
Date: Tue, 6 Feb 2007 09:38:13 -0800
To: Poul-Henning Kamp
Cc: geom@FreeBSD.org
Subject: Re: New g_part class
Message-Id: <835A2C66-BBEB-4A19-B6A3-A60E17572604@mac.com>
In-Reply-To: <89489.1170747375@critter.freebsd.dk>

On Feb 5, 2007, at 11:36 PM, Poul-Henning Kamp wrote:

> Considering the fact that editing can be done equally well in
> userland, what is the rationale or benefit of putting the code into
> the kernel, to deal with very infrequent
> operations to change the disk-layout ?

Editing is of course still done in user space. The application is
responsible for providing the right values to the kernel, and the
kernel will simply fail the operation if something is not correct.

The reason why the kernel supports the application with these verbs
is simple: the kernel needs to be involved because the application
cannot write to disk directly in all cases. With the kernel involved,
we can have as many ad-hoc verbs as there are partitioning schemes, or
we can have a single partitioning GEOM capable of handling various
on-disk schemes. Since every partitioning scheme has the same
fundamental purpose, a single GEOM maximizes code reuse and allows us
to have a single tool to handle all known schemes. The latter is
already a need: sysinstall.

> My second concern is if we might still have to replicate all the
> error detection in userland, if we want to retain the option for
> atomic changes, ie: allowing users to specify a set of changes (with
> disklabel -e for instance) before committing them all ?

The verbs change the in-memory data only. A commit is needed to write
the data to disk. An undo verb exists to revert the in-memory data to
match what's on disk. This not only allows complex operations to be
written to disk in an atomic fashion, but also supports applications
like sysinstall, where everything is prepared up-front and disks are
written only after the user gives the final go-ahead. The added
complexity to support this is minimal. The benefits are numerous.
Atomicity is one of them.

> Third, I doubt this will prove as useful as expected in writing
> partitioning tools. For instance, how will the partitioning tool
> know about the geometry/alignment restrictions of MBR ?

A simple query is all that it takes. The application does not have to
know about geometry, only about partition alignment. The GEOM can
provide this to the application at runtime, based on the geometry of
the disk.
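
As an illustration of that division of labor, here is a minimal C
sketch of what an application might do with an alignment value once it
has queried it. Everything in it is an assumption made for the example:
the part_range structure, the align_range() and range_is_aligned()
helpers, and the use of sectors as the unit are hypothetical and are
not part of g_part or of any existing interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a requested partition, in sectors. */
struct part_range {
	uint64_t start;	/* first sector of the partition */
	uint64_t size;	/* number of sectors */
};

/*
 * Application-side helper (hypothetical): fit a requested range to the
 * alignment reported for the disk. The start is rounded up and the end
 * is rounded down, so the result never grows beyond the request.
 * Returns 0 on success, or -1 if no aligned range fits.
 * Arithmetic overflow near UINT64_MAX is not handled in this sketch.
 */
static int
align_range(struct part_range *r, uint64_t align)
{
	uint64_t start, end;

	assert(align > 0);
	start = (r->start + align - 1) / align * align;
	end = (r->start + r->size) / align * align;
	if (end <= start)
		return (-1);
	r->start = start;
	r->size = end - start;
	return (0);
}

/*
 * Strict check (hypothetical): nonzero only if both the start and the
 * size are multiples of the alignment. A scheme that refuses to adjust
 * requests could fail the operation when this returns zero.
 */
static int
range_is_aligned(const struct part_range *r, uint64_t align)
{
	assert(align > 0);
	return (r->start % align == 0 && r->size % align == 0);
}
```

With an alignment of 63 sectors (an illustrative value only), a
request for start 100 and size 1000 is adjusted to start 126 and size
945, which then passes the strict check.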

> If you study libdisk, you will find that there are a couple of DWIW
> functions that translate the user's wish for a NN MB size thing into
> a properly aligned and sized thing for the MBR. Where does that
> functionality live in this situation ?

Libdisk is badly designed (if at all) and badly implemented, but the
DWIM/DWIW functionality is in the right place in libdisk. It's the
application that should exhibit artificial intelligence (if at all),
not the kernel.

> Does the kernel return "no good, try these parameters instead ?" or
> does it silently truncate and align ?

I think the kernel will error. There's no use case for this because
APM and GPT don't have this restriction. Obviously, the MBR
partitioning scheme will have to enforce this. It can return an error
or round the start up and the size down to make it all aligned. I
favor erroring.

> So I would advocate that you try to implement the MBR method next
> and then do a prototype disk-editor utility, so we can see if this
> actually makes life easier or not.

I will write an application first. There's no partitioning tool for
PowerPC, and I have had a PR open for a while now to rewrite gpt(8) to
use ctl requests. That drove me to implement g_part in the first
place.

>> schemes like MBR, BSD, SUN and/or PC98.
>
> BSD labels represent a particularly nasty case, because of the
> possibility that the label sector is inside a partition. I will
> advocate that if we go this direction, we should not migrate BSD,
> but leave it behind to die, eventually.

I wouldn't mind if BSD labels die. At this time the g_part class
already supports the notion of leaf partitioning schemes, of which
BSD will be one. Leaf schemes cannot have sub-partitions. This would
be enough to prevent infinite nesting. That the metadata can be
within the first partition is not a problem for the g_part class
proper. It doesn't know what the on-disk metadata looks like.
It only knows the beginning and end of the usable disk space that can
be partitioned, and it doesn't care whether the beginning is at
offset 0.

-- 
Marcel Moolenaar
xcllnt@mac.com