From: Terry Lambert
Message-Id: <199802152341.QAA08289@usr01.primenet.com>
Subject: Soft updates: OPTIONS Vs. TUNABLES, and a plan...
To: garbanzo@hooked.net (Alex)
Date: Sun, 15 Feb 1998 23:41:43 +0000 (GMT)
Cc: brian@worldcontrol.com, current@FreeBSD.ORG
In-Reply-To: from "Alex" at Feb 15, 98 11:32:06 am

> Yes, just run make depend first.  Can one enable soft updates in the
> fstab, or does one have to use tunefs, and will that stay after reboots?

It's a tunable.  The reason is that any existing writes that are
outstanding are dependencies.  The update mount to read-only does
a lot of work to make sure the dependencies are synchronized out
for the read-only case (and then it still manages to write the
superblock on mount).

Discussion:

I don't necessarily agree with the rationale here, but the reasoning
was that the FS should not be mounted at the time of the change
because of the additional gross hacks that the R/O update causes.

Not doing the hack is problematic; one wonders how a root FS can
ever be tuned after the system has been booted.

Mounting read-only is *not* sufficient.  This is because there exist
read dependencies for read-before-write operations.  You can probably
get away with it, but you are flying by the seat of your pants if
you do so.

My preferred approach is to get rid of the VFS_MOUNTROOT code path
separation in VFS_MOUNT (my patches to get rid of the physically
separate VFS_MOUNTROOT went in some time ago).  The logically
separate code path also needs to go.  The steps are:

1)  Change the way mountpoint covering works.  Instead of mounting
    something covered, mount it into an anonymous mountpoint
    structure.

2)  Move the root/non-root distinction to upper level code; the
    affected code is:

    o   processing of user arguments

        Allow passing of user arguments.  For the root mount case,
        these will be null.  In most cases, the user arguments
        common to all FS's are handled by the upper level code.

    o   mount updates

        Change updates.  Updates are now defined as "unmount/remount"
        instead of "mount -> reload".  Updates are handled uniformly
        across all FS implementations by upper level code.  The
        ffs_reload code goes away.

    o   block device lookup

        Block device lookup occurs in upper level code.  The upper
        level code makes the distinction between the root vs.
        non-root device lookup procedure.  The lookup procedure is
        invariant across all FS's.

    o   mounted from information

        The mounted from information is passed in.  It is set
        normally (in the non-read-only case!).  It is set in core,
        but the superblock is not marked dirty in the R/O case.

    o   mounted on information

        A new interface, VFS_MOUNTEDON, is exported.  This interface
        is used to get/set the "mounted on" information from the
        upper level code.  Setting results in a write of the
        superblock, even if the device is logically read-only.  It
        is not to be called if the device is read-only, except in
        the case of geometry configuration (see below).

    o   vfs_export

        The vfs_export interface is delayed until the
        VFS_MOUNTEDON/set interface is invoked by the upper level
        code.

3)  The fstab processing for mapping into the FS hierarchy by
    setting the covering vnode at the mount point occurs *after*
    the mount has completed.  It consists of:

    o   covering the mountpoint vnode

        The mountpoint vnode is covered by the mounted FS.

    o   setting the "mounted on" information for R/W FS's

        The VFS_MOUNTEDON interface is used with the "set" operation
        in order to set the mountpoint information into the FS
        superblock.  Not all FS's have superblocks, and this set is
        only applicable to R/W mounts, in any case.

    o   calling the vfs_export interface

        This interface is called to export FS's to the NFS server
        system.

Notice that most of the code dealing with mounting and mount options
becomes common code.  This makes for a smaller system when more FS's
are used, and also makes for a more robust system, at least for the
common option processing, and so on.

Notice it also allows *any* FS to be used as a root FS.  There are
still a number of hooks necessary for the NFS root case; the changes
to support these are already there, from the vfs_init changes which
were integrated some time ago.

Discussion:

You may notice that it's possible to get rid of the fstab at this
point, since the mounted-on information is sufficient to locate
mountpoints, and the VFS_MOUNTEDON/get could be used to determine
hierarchy location, and the VFS_MOUNTEDON/set could be used to
administratively set this information.

In actuality, there is still a need for an fstab in four cases:

o   Physically R/O devices

    You can't use VFS_MOUNTEDON/set on these devices.  You could,
    in practice, code this information when you burn CDROM's.  This
    would make sense, for example, for a FreeBSD CDROM with a source
    repository, which you wanted to union mount.  But not all
    CDROM's will have this information.

o   FS types without a superblock

    For FS types without a superblock, there is no "last mounted on"
    string available.  You *could* technically use a namespace
    incursion to handle this.  For example, you could create an
    otherwise illegal file name as a long name match for the MS-DOS
    volume label, and store this information there.  It would be
    transparent to MS-DOS/Windows95.  Alternately, you could
    actually make a namespace incursion and steal an unlikely (and
    hidden) filename.  The Linux UMSDOS uses "__LINUX_" (why they
    didn't use "_UMSDOS_", I don't know.  Maybe they didn't want it
    to be OS independent.).

o   Naively duplicated devices

    It is a common practice for people to use "dd" on occasion to
    naively duplicate disk contents.  You could require that they
    use a duplication program instead (this requirement could be
    enforced by an access protocol for raw devices which the naive
    "dd" would be unable to handle).  The problem occurs when you
    have two of these devices, both claiming that they were "/usr".
    One might imagine that in this circumstance, the kernel
    programmer would be *very* tempted to declare them mirrors, and
    ignore the problem.

o   NFS (or other non-local-media FS) client mounts

    This is pretty obvious.  You can't mount what you can't see.

In typical cases, without collision, however, you could see the
following process taking place:

A)  Devices are probed true.
B)  Devices are handed to slice code.
C)  Slice code claims devices, which create more devices; goto (B).
D)  Slice code does not claim a device.  The device is defined as
    being "terminal".
E)  The device is handed to the FS mount code.
F)  The FS mount code mounts the device, R/O, into the anonymous
    mount structure list.
G)  Process is repeated until last device arrives; goto (A).
H)  All devices which are recognizable as FS's have an anonymous
    mount point.  VFS_MOUNTEDON/get is called on each device.
I)  Returned paths are sorted by length, smallest to largest.  This
    is sufficient because of pathing: a parent directory's path is
    always shorter than the path of anything mounted beneath it.
J)  Mount points are covered, starting with root, using the
    anonymous mount points' "last mounted on" information.
K)  The completed file system hierarchy is now ready for use.


                                        Terry Lambert
                                        terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
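
Steps H) through J) of the process above can be illustrated with a
small sketch.  This is only a minimal user-space sketch in C, not
kernel code and not part of the proposal: the structure
"struct anon_mount" and the function "sort_for_covering" are
hypothetical names invented for illustration, and the "mounted_on"
strings stand in for whatever a VFS_MOUNTEDON/get-style call would
return.  It shows only why sorting by path length yields a safe
covering order: a parent path is always strictly shorter than any
path beneath it, so the parent is covered first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical record for one FS sitting in the anonymous mount list.
 * Neither this structure nor its field names come from FreeBSD.
 */
struct anon_mount {
	const char *device;	/* illustrative device name */
	const char *mounted_on;	/* assumed result of a "get" of the
				 * "last mounted on" information */
};

/*
 * Order by path length, shortest first (step I).  Ties are broken
 * alphabetically only to make the ordering deterministic.
 */
int
cmp_path_len(const void *a, const void *b)
{
	const struct anon_mount *ma = a;
	const struct anon_mount *mb = b;
	size_t la = strlen(ma->mounted_on);
	size_t lb = strlen(mb->mounted_on);

	if (la != lb)
		return (la < lb) ? -1 : 1;
	return strcmp(ma->mounted_on, mb->mounted_on);
}

/*
 * Sort the anonymous mounts into the order in which step J) would
 * cover their mount points: "/" first, deeper paths later.
 */
void
sort_for_covering(struct anon_mount *mounts, size_t n)
{
	assert(mounts != NULL || n == 0);
	qsort(mounts, n, sizeof(mounts[0]), cmp_path_len);
}
```

After sorting, a caller would walk the array from the first entry
to the last and cover each mount point in turn.  The sketch does not
address the collision case described earlier (two devices both
claiming "/usr"); it would simply place them next to each other.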