Date:      Fri, 4 Oct 2002 02:03:28 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Poul-Henning Kamp <phk@critter.freebsd.dk>
Cc:        Mark Santcroos <marks@ripe.net>, <freebsd-hackers@FreeBSD.ORG>, <emulation@FreeBSD.ORG>
Subject:   Re: vmware reads disk on non-sector boundary 
Message-ID:  <20021004012906.S4315-100000@gamplex.bde.org>
In-Reply-To: <3209.1033649795@critter.freebsd.dk>

On Thu, 3 Oct 2002, Poul-Henning Kamp wrote:

> In message <20021003115649.GC584@laptop.6bone.nl>, Mark Santcroos writes:
> >On Thu, Oct 03, 2002 at 09:50:45PM +1000, Bruce Evans wrote:
> >> Unbreaking block devices would be a better solution.  Without buffering,
> >>...
> >What was the reason for the removal of block devices anyway?
> >It would be nice if you would tell me some background about that.. :)
>
> It's well documented in the mail-archives actually.
>
> The short story:

Shorter story: phk didn't like them.

> 1. We don't in general assign a special vnode type to device modes,
> instead we assign multiple device nodes, SCSI tapes is an
> example of this.
>
> 2. The vnode layer already have enough trouble aliasing /dev/fd0,
> /mnt/dev/fd0, /usr/jail/dev/fd0, /cdrom/dev/fd0 (you get the idea),
> we do not need to make it even harder by also aliasing /dev/fd0 and
> /dev/rfd0.

Aliases that differ in type are slightly easier to handle than aliases
that differ by name or mount point.  They are different devices so they
have different vnodes, and the aliasing problems for them are quite
different than the vnode aliasing problems caused by having the same
device under different mount points.  E.g., slices and partitions allow
configuring 31*7 aliases per drive for hard drives (31 slices with 7
partitions each).  Slices and partitions may overlap,
giving something more complicated than aliases.  The vnode layer doesn't
understand any of this.

> 3. Write ordering on buffered devices were unspecified.  In other
> words, you cannot use it for anything which even remotely smells
> of transactions, because you have no way to know when your writes
> have hit the disk and in which order they did so.

This is no different than for regular files.

> 4. No write errors were reported back to userland.

Actually, write errors were reported at fsync() and close() time in
the same way as for regular files.  fsync()'s handling of write errors
was broken for both regular files and buffered devices at the time
buffered devices were axed.
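
As a rough illustration of the point about where errors show up, here is
a minimal sketch for a regular file using only POSIX calls.  The helper
name write_durably() and its interface are made up for this example and
are not taken from any FreeBSD source.  A write() that succeeds has only
reached the buffer cache, so a deferred I/O error may not be visible
until fsync() or close(), and both return values have to be checked.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Write len bytes from buf to path and ask for them to be flushed.
 * Returns 0 on success, or -1 with errno set by the call that failed.
 * Hypothetical helper, for illustration only.
 */
static int
write_durably(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd, saved;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1)
		return (-1);

	/* write() may be partial or interrupted; loop until done. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			goto fail;
		}
		p += n;
		len -= (size_t)n;
	}

	/* A deferred write error may be reported only here ... */
	if (fsync(fd) == -1)
		goto fail;

	/* ... or here, so close() has to be checked as well. */
	if (close(fd) == -1)
		return (-1);
	return (0);

fail:
	/* Preserve the original errno across the cleanup close(). */
	saved = errno;
	(void)close(fd);
	errno = saved;
	return (-1);
}
```

The same pattern says nothing about the order in which the buffered
writes reach the disk; it only reports whether the flush as a whole
succeeded.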

> (Given 3 and 4, it follows that use of block devices for any sort
> of data you happen to like is a very bad idea.)

It follows similarly that use of file systems is a bad idea :-).  You
should only use databases on raw disks if you value your data.

> 5. Block devices was in the way of getting DEVFS working in an
> architecturally sane manner.

This can be considered a feature.

> So they were removed, and good riddance.
>
> If a buffered access-mode on block devices is desired, it should
> be implemented either as an ioctl controllable feature, or as
> a GEOM module.  The latter is probably by far the easiest way.

It was desired, and was sort of promised.

Bruce





