Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 21 Mar 2003 23:45:47 +0100
From:      Poul-Henning Kamp <phk@phk.freebsd.dk>
To:        arch@freebsd.org
Subject:   GEOM ioctl issue.
Message-ID:  <11639.1048286747@critter.freebsd.dk>

next in thread | raw e-mail | index | archive | help

I am trying to fix some issues relating to removable devices and
GEOM, and have run into an either/or issue with GEOM's state-engine.

A bit of background:

ioctl will automagically do copyin(9)/copyout(9) of the primary argument
which often is a struct.

The way this works is that the ioctl identifier, which is an integer,
has 13 bits dedicated to the length, so primary arguments can be up
to 8191 bytes long.  (The alert reader will notice that this is one byte
short of being able to accomodate a UFS bootblock.)

The ioctl syscall handler will extract this length and two flags
which encode if the primary argument should be copied into the
kernel before and out of the kernel after handling the ioctl, and
do the appropriate copyin(9) and copyout(9).  (See <sys/ioccom.h>
for how this encoding is done.)

Think of this as a zeroth order marshalling layer.

If the primary argument is a struct containing pointer(s) to something
else, a string or maybe another structure, these secondary arguments
are not handled in the generic ioctl syscall code, it becomes the
responsibility of the ioctl implementor to explicity call
copyin(9)/copyout(9) on these arguments.

The way we have implemented copyin(9) and copyout(9) depends strongly
on the fact that they are executed in the corresponding thread, it is
not possible to access userland for one thread with copyin/out from
another thread, userland or kernel.

GEOM abandons the original thread as the very first thing, there
are a multitude of reasons for this, the primary being prevention
of kernel stack overrun and the second realible and simple serialization.

If an ioctl is called on /dev/da0, what will happen is that the ioctl
gets as far as to geom_dev.c (which is a normal device driver), where
the ioctl arguments are packed into a special kind of struct bio, which
then is shipped down-stream in the dedicated GEOM io-path threads.

It will relatively fast hit geom_disk.c which is the GEOM class which
interfaces to disk device drivers.  geom_disk.c recognizing the potential
need for copyin/copyout will return the request with a special error
code and a function pointer.

The special struct bio will eventually get delivered to geom_dev.c
into the original userland thread, and special case code on seeing
the magic error code will call the provided function pointer to
really execute the ioctl.

If the disk device driver at this point gets an interrupt and finds
out that its hardware has departed the system, things explode in
a nice display of fireworks, because there is no way to lock this
scenario without causing deadlocks.

So I am faced with an Either/Or decision:  Either I support ioctls
in the disk device drivers Or we get removable devices working
robustly.

Three drivers today use the ioctl interface, (notice that we are
only talking GEOM disk device drivers here, CDROM drives for instance
are not affected).

"afd" uses ioctl to implement two primitivs for opening and
closing the device tray.  "mlx" uses it for management tasks and
"da" uses it to find the corresponding "pass" device for a given
"da" device.  Close inspection of libcam has not shown that this
ioctl is actually used however.  Finally RaidFrame uses ioctl for
configuration issues.

For all I can tell, the code in "da" can be dropped and nobody will
notice.  The "mlx" and "raidframe" can probably more logically implement
a "mlx.ctl" and "rf.ctl" control device which is used soleley for
OAM issues, this would be in line with the md(4) and ccd(4) drivers.

That leaves the "afd" drivers "open/close tray" ioctls, which currently
are #ifdef'ed out pending further study of how we actually want to
interface with removable devices in general (this subject is out of
scope for this email).

On the other hand, the SANE2002 conference handed out "USB-keys"
to all in attendance and everybody and his dad has a digital camera
these days, so getting removable devices working robustly is not
even open for discussion.

So the decision is really a foregone conclusion:  We cannot possibly
support the copyin/copyout semantics of ioctls in GEOM disk device
drivers.

One option would be to simply drop the support for doing copyin/copyout
in the ioctl handing code, but this would mean that we suddenly have
two kinds of ioctls in the system, and it would leave another issue
unresolved:  the serialization between regular I/O and ioctls.

The replacement mechanism will therefore be to pass the
BIO_GETATTR/BIO_SETATTR struct bio requests through to the device
drivers and marshal ioctls into these at the geom_dev.c level
is required.

I have not completed the patches to carry this transformation out yet,
but this is the next item on my TODO list, so expect a patch to be
floated for test and review in a few days time.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?11639.1048286747>