Date:      Sat, 26 Sep 1998 19:32:48 +0200
From:      Poul-Henning Kamp <phk@critter.freebsd.dk>
To:        "Justin T. Gibbs" <gibbs@narnia.plutotech.com>
Cc:        current@FreeBSD.ORG
Subject:   Re: Current is Really Broken(tm) 
Message-ID:  <8655.906831168@critter.freebsd.dk>
In-Reply-To: Your message of "Sat, 26 Sep 1998 10:52:32 MDT." <199809261652.KAA05259@narnia.plutotech.com> 

In message <199809261652.KAA05259@narnia.plutotech.com>, "Justin T. Gibbs" writes:
>In article <7171.906791511@critter.freebsd.dk> you wrote:
>>>>This is just the idea that is realized by SLICE.
>>>
>>>This is exactly the opposite of what is realized by SLICE.  SLICE
>>>does the "mount" at a deep level of the drivers (in an interrupt
>>>handler).
>> 
>> ...Which was one of the most wrong things about it.
>
>Whatever replaces it must be able to be notified of insertion and
>removal events from an interrupt context.  It can simply queue the
>notification up to be processed by a process or thread, but subsystems
>like CAM which process command completions from an SWI need to be
>able to perform notifications from low level contexts.

<ARCHITECTURE>

For any SLICE/GEOMETRY implementation, the discovery and instantiation
of the network of handlers and devices is the most tricky part,
no doubt about that.

There are two basic ways to skin that cat:

A) "The kernel knows"

This is what Julian sort of implemented in SLICE.  It is the "quick"
way but not the easy way.  The trouble is that you need to read
diskblocks from some kind of thread or event handler, examine their
contents, configure the right drivers and so on.  That sounds easy,
but is hairy.

The major problem with this approach is that you hardcode a lot of
gunk into the kernel: how does a BSD disklabel look, how does an
MBR look, and so on.

Next you need to figure out how the kernel will discover that you
muck about with a disklabel/MBR/something else.  And things go
rapidly down-hill from there.

B) "This is magic, we need a daemon"

If you do it from userland in a daemon, then the interface in the
kernel becomes much cleaner: there are no "hidden users" which do
odd things to your disk.

You can make one generic method that slices a device into several
devices, and depending on what your daemon finds, it will be
configured with the data from a disklabel, a MBR or a Mac VTOC for
that matter.
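
As a rough illustration of that split (a sketch, not anything that
exists in the tree): the daemon owns the format-specific parsing and
hands the kernel nothing but start/length pairs.  The names
slice_desc and mbr_to_slices are made up for this example; the MBR
offsets are the standard ones (four 16-byte entries at offset 446,
signature 0x55 0xAA at 510).

```c
#include <stdint.h>

/* Hypothetical, format-agnostic description of one slice: all that a
 * generic kernel slicing method would need to be told. */
struct slice_desc {
	uint64_t start;		/* first sector of the slice */
	uint64_t length;	/* number of sectors */
};

#define MBR_SECTOR_SIZE	512
#define MBR_TABLE_OFF	446
#define MBR_ENTRY_SIZE	16
#define MBR_NENTRIES	4

/* Decode a 32-bit little-endian value. */
static uint32_t
le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Daemon-side parser: turn an MBR sector into generic slice
 * descriptors.  Returns the number of slices found, or -1 if the
 * boot signature is missing.
 */
int
mbr_to_slices(const uint8_t sector[MBR_SECTOR_SIZE],
    struct slice_desc out[MBR_NENTRIES])
{
	int n = 0;

	if (sector[510] != 0x55 || sector[511] != 0xAA)
		return -1;
	for (int i = 0; i < MBR_NENTRIES; i++) {
		const uint8_t *e = sector + MBR_TABLE_OFF + i * MBR_ENTRY_SIZE;

		if (e[4] == 0)		/* partition type 0: unused entry */
			continue;
		out[n].start = le32(e + 8);	/* LBA of first sector */
		out[n].length = le32(e + 12);	/* sector count */
		n++;
	}
	return n;
}
```

A disklabel or Mac parser would fill in the same slice_desc array, so
the kernel side never learns which on-disk format it came from.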

On the other hand you get a bootstrap problem: to find / you need
to run a program (unless you cheat of course, see: "Veritas").

The issue of changing a layout is now moved from the kernel to a
daemon in user-space, which needs to sanity-check and implement
the changes people want to do.  This is a lot less hairy than
doing it in the kernel.


The second most tricky problem is open/read/write locking: can I,
considering what else is open, open this device for read/write?

Clearly most current filesystems would kindly but firmly insist
that nobody else writes to their partition while they have it
mounted.  There are on the other hand filesystems which legitimately
do allow this.

Consequently opens can be made in one of several ways:

	"read only, don't care about other users"
	"read/write, don't care about other users"
	"read only, nobody else can write"
	"read/write, nobody else can write"	

This needs to be propagated all the way down (and possibly up) through
the network of instances/devices, for approval before it can succeed.
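
A minimal sketch of such an approval pass, under loud assumptions:
the four open types are encoded as two hypothetical bits (ACC_WRITE,
ACC_NOWRITERS), the "network" is reduced to a single chain of layers,
and only the downward direction is handled.  Siblings sharing a lower
device (two partitions on one disk) would need range-aware checks
that this deliberately leaves out.

```c
#include <stdbool.h>
#include <stddef.h>

#define ACC_WRITE	0x1	/* opener wants to write */
#define ACC_NOWRITERS	0x2	/* opener insists nobody else writes */

/* One layer in the chain; all names here are illustrative only. */
struct node {
	struct node *below;	/* next lower layer, NULL at the bottom */
	int writers;		/* current opens holding ACC_WRITE */
	int nowriters;		/* current opens holding ACC_NOWRITERS */
};

/* Would this layer, given its existing opens, accept the request? */
static bool
node_allows(const struct node *n, int acc)
{
	if ((acc & ACC_WRITE) && n->nowriters > 0)
		return false;	/* somebody has forbidden writers */
	if ((acc & ACC_NOWRITERS) && n->writers > 0)
		return false;	/* somebody is already writing */
	return true;
}

/* Ask every layer for approval first; record the open only if all agree. */
bool
node_open(struct node *top, int acc)
{
	for (struct node *n = top; n != NULL; n = n->below)
		if (!node_allows(n, acc))
			return false;
	for (struct node *n = top; n != NULL; n = n->below) {
		if (acc & ACC_WRITE)
			n->writers++;
		if (acc & ACC_NOWRITERS)
			n->nowriters++;
	}
	return true;
}

/* Undo a successful node_open() made with the same acc bits. */
void
node_close(struct node *top, int acc)
{
	for (struct node *n = top; n != NULL; n = n->below) {
		if (acc & ACC_WRITE)
			n->writers--;
		if (acc & ACC_NOWRITERS)
			n->nowriters--;
	}
}
```

Checking all layers before touching any counter means a refused open
leaves no state behind that would have to be rolled back.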

Of course you can take the UNIX attitude: "This could point at your
foot, be careful..." to this problem, but that is probably neither wise
nor user-friendly.

This needs precise tracking of open-ness for vnodes.  Today this is
not implemented:

	f1 = open("/dev/rfoo", O_RDONLY);	/* device open called for READ */
	f2 = open("/dev/rfoo", O_RDWR);		/* device open called for READ+WRITE */
	close(f2);				/* "Nothing happens" */
	close(f1);				/* device close called */
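
What "precise tracking" could look like, as a sketch only: count
readers and writers separately per device, so that each open and
close can report which access the driver just gained or lost.  The
names dev_opens, dev_track_open and dev_track_close, and the
DEV_READ/DEV_WRITE bits, are invented for the example.

```c
#define DEV_READ	0x1
#define DEV_WRITE	0x2

/* Hypothetical per-device bookkeeping of outstanding opens. */
struct dev_opens {
	int readers;	/* opens that include read access */
	int writers;	/* opens that include write access */
};

/* Record an open; return the access bits the device newly gains. */
int
dev_track_open(struct dev_opens *d, int mode)
{
	int gained = 0;

	if ((mode & DEV_READ) && d->readers++ == 0)
		gained |= DEV_READ;
	if ((mode & DEV_WRITE) && d->writers++ == 0)
		gained |= DEV_WRITE;
	return gained;
}

/* Record a close; return the access bits the device no longer needs. */
int
dev_track_close(struct dev_opens *d, int mode)
{
	int lost = 0;

	if ((mode & DEV_READ) && --d->readers == 0)
		lost |= DEV_READ;
	if ((mode & DEV_WRITE) && --d->writers == 0)
		lost |= DEV_WRITE;
	return lost;
}
```

With this bookkeeping the close(f2) in the sequence above is no
longer a non-event: it reports DEV_WRITE as lost, which is the moment
a lower layer could be told that the device is read-only again.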

Summary:

Either way, you need to work out your architecture carefully, or you will
paint yourself into a corner.  The floor is btw not big, but if you don't
paint all of it, the result is not acceptable.

</ARCHITECTURE>

--
Poul-Henning Kamp             FreeBSD coreteam member
phk@FreeBSD.ORG               "Real hackers run -current on their laptop."
"ttyv0" -- What UNIX calls a $20K state-of-the-art, 3D, hi-res color terminal

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message


