Date:      Thu, 23 Oct 1997 16:40:01 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        hackers@FreeBSD.ORG
Subject:   Re: FreeBSD 3.0 kernel API ?!
Message-ID:  <199710231640.JAA23174@usr02.primenet.com>
In-Reply-To: <199710220334.UAA23820@kithrup.com> from "Sean Eric Fagan" at Oct 21, 97 08:34:20 pm

> >> How will you deal with struct ifnet when we rename all the member
> >> variables from their current names to "opaque_variable_01" through
> >> "opaque_variable_NN"?  Even if you can depend on the structure, you
> >> can't reasonably expect the kernel internal interface to not change.
> >This sort of change I think is, to put it bluntly, fucked.
> 
> Terry's problem is that he is forgetting that non-kernel bits are part of
> the OS in unix.

I'm not forgetting this.  I'm fine with it.


> This means that some non-kernel bits have to know the format (and location)
> of some kernel data structures, sometimes.

This, I strongly disagree with.  Kernel internal structures are *internal*,
that's the whole point in calling them that.  8-).

What this means is that they should be accessed via accessor functions
instead of directly.  The best mechanism would be descriptor based to
not externalize any structure changes, *ever*.
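
To make the accessor idea concrete, here is a minimal sketch in C.  The
names ifd_open() and ifd_get_mtu(), the if_private layout, and the table
contents are all invented for illustration; none of this is an existing
FreeBSD interface.  The point is only that the caller holds a small
integer descriptor and never sees the structure itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Private layout (hypothetical stand-in for something like struct ifnet).
 * It never appears in a public header, so members can be renamed or
 * reordered without breaking any caller.
 */
struct if_private {
    char     name[16];
    unsigned mtu;
    int      in_use;
};

#define IF_MAX 4

/* Made-up sample data; unused slots are zero-initialized (in_use == 0). */
static struct if_private if_table[IF_MAX] = {
    { "lo0", 16384, 1 },
    { "ed0", 1500,  1 },
};

/* Look an interface up by name; returns a descriptor, or -1 if absent. */
int ifd_open(const char *name)
{
    for (int i = 0; i < IF_MAX; i++)
        if (if_table[i].in_use && strcmp(if_table[i].name, name) == 0)
            return i;
    return -1;
}

/* Accessor: copy the MTU out through the descriptor.  0 on success,
 * -1 on a bad descriptor or NULL output pointer. */
int ifd_get_mtu(int ifd, unsigned *mtu_out)
{
    if (ifd < 0 || ifd >= IF_MAX || !if_table[ifd].in_use || mtu_out == NULL)
        return -1;
    *mtu_out = if_table[ifd].mtu;
    return 0;
}
```

A caller written against ifd_open() and ifd_get_mtu() keeps compiling and
running even if every member of the private structure is renamed.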


> (While there are many cases where you can abstract this into a usable
> API, there are many other cases where you can't -- because what you
> want to get at is, indeed, exactly what the kernel is using.)

Then I would argue that you are cutting the interface in the wrong place:
in the middle of the iterator instead of above it.

Let's take an example: the proc struct.  I want to iterate the processes
on the system, and provide some information as a result of this iteration.
This may be because I'm the 'w' command, or it may be that I'm the 'ps'
command, or it may be that I'm a to-be-written session manager.

Right now, this can be done one of several ways:

1)	open /dev/kmem via libkvm (and need to know the proc struct
	size and layout) and grovel
2)	popen() an existing command that does #1 already (and compound
	the difficulty of fixing #1)
3)	iterate /proc (and need to have it mounted)
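
For reference, approach #3 can be sketched roughly as follows.  This
assumes procfs exposes one all-digit directory per process under its
mount point; proc_foreach() and is_pid_name() are hypothetical helper
names, not library functions.

```c
#include <ctype.h>
#include <dirent.h>
#include <stddef.h>

/* Returns 1 if name consists only of decimal digits (a pid directory),
 * 0 otherwise (including the empty string). */
int is_pid_name(const char *name)
{
    if (*name == '\0')
        return 0;
    for (; *name != '\0'; name++)
        if (!isdigit((unsigned char)*name))
            return 0;
    return 1;
}

/*
 * Calls fn(pid_string, arg) once per process directory found under
 * procdir (fn may be NULL to just count).  Returns the number of
 * process entries seen, or -1 if procdir cannot be opened, e.g.
 * because procfs is not mounted.
 */
int proc_foreach(const char *procdir,
                 void (*fn)(const char *pid, void *arg), void *arg)
{
    DIR *d = opendir(procdir);
    struct dirent *de;
    int n = 0;

    if (d == NULL)
        return -1;
    while ((de = readdir(d)) != NULL) {
        if (!is_pid_name(de->d_name))
            continue;           /* skip ".", "..", and non-pid entries */
        if (fn != NULL)
            fn(de->d_name, arg);
        n++;
    }
    closedir(d);
    return n;
}
```

Note that the caller needs no knowledge of the proc struct at all, but
proc_foreach("/proc", ...) fails outright when nothing is mounted there.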

Of these, the best programmatic interface is currently #3.  But it fails
to operate, as 'ps' currently does, on system dump images.  Let's forget
for the moment that this functionality belongs in the system dump
analysis tool instead of the regular commands.  How do we make our
putatively "new, improved" 'ps' command do these things?

The easiest way would be to stop tying the iterator to the 'ps' program
(and duplicating the code in every program like it), and instead to
provide access via a shared iterator mechanism (as in #3)... only without
depending on a canonicalizing data-exposure interface (like procfs).  You
don't want to depend on data exposure because it can only expose the
data of a running kernel.  And libkvm is a non-canonicalizing
data-exposure interface.
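
One way to picture cutting the interface above the iterator is an
operations table supplied by whatever produced the data, with the
consumer written only against that table.  This is a sketch under
assumptions: struct proc_info, struct proc_iter_ops, and the toy
array-backed "image" are invented here, and a live-kernel or dump-image
backend would fill in the same three functions differently.

```c
#include <stddef.h>

/* Canonical per-process record: all that a consumer like ps ever sees. */
struct proc_info {
    int  pid;
    char comm[17];
};

/* Iterator operations, supplied by the producer of the data. */
struct proc_iter_ops {
    void *(*open)(const void *source);
    int   (*next)(void *state, struct proc_info *out); /* 1 = record, 0 = end */
    void  (*close)(void *state);
};

/* Toy backend: the "image" is an array terminated by an entry with pid < 0. */
struct array_state {
    const struct proc_info *cur;
};

static void *array_open(const void *source)
{
    static struct array_state st;   /* one iterator at a time; keeps it short */
    st.cur = source;
    return &st;
}

static int array_next(void *state, struct proc_info *out)
{
    struct array_state *st = state;

    if (st->cur->pid < 0)
        return 0;
    *out = *st->cur++;              /* copy the record, then advance */
    return 1;
}

static void array_close(void *state)
{
    (void)state;                    /* nothing to release in this backend */
}

const struct proc_iter_ops array_ops = { array_open, array_next, array_close };

/* Consumer: identical no matter which backend stands behind ops. */
int count_procs(const struct proc_iter_ops *ops, const void *source)
{
    struct proc_info pi;
    void *st = ops->open(source);
    int n = 0;

    if (st == NULL)
        return -1;
    while (ops->next(st, &pi))
        n++;
    ops->close(st);
    return n;
}
```

The consumer never learns how the records were obtained, which is what
lets the same 'ps' code work against a running kernel or a dump.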

So what do you do?

The obvious solution is to somehow associate an iterator interface
with the image that creates the data.  You can do this with ELF by
making the interface, effectively, a shared library which resides in
the kernel image and exports data by descriptor.  The trick is to make
the kernel loader not load sections with a section attribute that
indicates they are this interface, and to make the dlopen() interface
take a section type argument (actually, you change dlopen() to wrap
another interface with the third argument -- otherwise you lose
backward compatibility and have to do things like actually looking to
the future -- ugh!).
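
The wrapping trick in the parenthetical might look something like the
sketch below.  Everything here is hypothetical: xdlopen() and
xdlopen_sect() stand in for dlopen() and the new three-argument
interface (renamed so they cannot collide with the real dlopen()), the
SECT_* constants are invented, and the body is a stub that only records
the request rather than mapping anything.

```c
#include <stddef.h>

#define SECT_DEFAULT 0  /* hypothetical: ordinary loadable sections */
#define SECT_KIFACE  1  /* hypothetical: kernel-image interface section */

/* Stub handle: just remembers what was asked for. */
struct xdl_handle {
    const char *path;
    int         mode;
    int         sect_type;
};

/*
 * New entry point with the extra section type argument.  A real
 * implementation would locate and map the matching ELF section; this
 * stub only records its arguments.  Returns NULL on a NULL path.
 */
void *xdlopen_sect(const char *path, int mode, int sect_type)
{
    static struct xdl_handle h;     /* single handle; sketch only */

    if (path == NULL)
        return NULL;
    h.path = path;
    h.mode = mode;
    h.sect_type = sect_type;
    return &h;
}

/* Old two-argument signature, preserved as a thin wrapper so existing
 * callers keep working unchanged. */
void *xdlopen(const char *path, int mode)
{
    return xdlopen_sect(path, mode, SECT_DEFAULT);
}

/* Test helper: report which section type a handle was opened with. */
int xdl_sect_type(const void *handle)
{
    return ((const struct xdl_handle *)handle)->sect_type;
}
```

Existing two-argument callers get the default section type without a
source change; only the new 'ps'-style consumers pass SECT_KIFACE.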


Another alternative would be to *not* ignore what we did before: remove
the system dump analysis capability from 'ps' and 'w' and ..., and put
it in a system dump analyzer tool.  Then do conversion to approach #3.
This makes procfs mandatory (I really don't think this is such a bad
thing, myself).


> This is further complicated by the fact that some utilities people have
> decided are part of the OS are not maintained by us.

These utilities must track system changes.  That's all there is to it.
If it becomes too burdensome, then FreeBSD must pay for the ability
to make changes away from the mainstream by picking up maintenance.

Can you say a.out?  ...I knew you could.  Maintenance of old GNU tools
is the payment FreeBSD must render for not going to ELF with the rest
of the world.

Similarly, maintenance of things which grovel /dev/kmem (whether or
not via libkvm is irrelevant) falls to FreeBSD as well.  Which is
the number one reason to eliminate the interface (number two is so
you don't have to rebuild libkvm and everything which uses it each
time a trivial kernel structure change takes place).


Back to the original poster:

There are two valid complaints you have, both of which require the
core team to establish policy:

1)	How do I write portable kernel code on other platforms
	which can be run on FreeBSD?

Right now, if you use kernel structures, you must define "KERNEL" (as
someone else pointed out, this is a bogus name space incursion, and
should be "_KERNEL").  This will alleviate your complaint, assuming
you are building "live" code.
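
As an illustration of what the define gates, here is a one-file sketch.
The header contents are inlined and entirely hypothetical (there is no
real <sys/widget.h>, and struct widget_softc is invented); it uses the
reserved-namespace spelling, _KERNEL, which cannot collide with an
application's own identifiers the way a bare KERNEL can.

```c
#include <stddef.h>

/* A kernel build defines this before pulling in system headers. */
#define _KERNEL

/* ---- contents of a hypothetical <sys/widget.h> ---- */

struct widget_stats {               /* always visible */
    unsigned long packets;
};

#ifdef _KERNEL
struct widget_softc {               /* visible only to kernel code */
    struct widget_stats stats;
    void               *private_state;
};
#endif

/* ---- end of hypothetical header ---- */

/* Undefined again so it cannot leak into anything else in this
 * one-file sketch; real kernel code would leave it defined. */
#undef _KERNEL

/* Code that needs the kernel-only structure compiles only because
 * _KERNEL was defined when the "header" above was processed. */
unsigned long widget_packets(const struct widget_softc *sc)
{
    return sc->stats.packets;
}
```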


2)	How do I test kernel code in user space?

The ability to test kernel code in user space is a development
environment option.  The FreeBSD core team is responsible for
decisions of policy regarding whether or not this option is to
be offered by FreeBSD.

My personal take (since I'm not one of them) is that it's a
desirable future goal to be able to develop and test kernel code in
user space, and that to some extent, this will require a conversion
to ELF to be able to externalize the kernel interfaces.  I would
like to see a formal DDI/DKI definition, and I'd like to see that
definition result in a user space test harness and transport layer.
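
A rough sketch of what such a harness could look like, assuming the
DDI/DKI is expressed as a table of services: struct kern_ops,
drv_dup_name(), and the user_* functions are all invented for
illustration and do not correspond to any existing FreeBSD definition.
The "kernel" routine calls only through the table, so the same source
can be linked against an in-kernel table or the user space one below.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical DDI/DKI surface: the only services driver code may use. */
struct kern_ops {
    void *(*kalloc)(size_t size);
    void  (*kfree)(void *p);
};

/* "Kernel" code under test: written purely against kern_ops. */
char *drv_dup_name(const struct kern_ops *k, const char *name)
{
    size_t len = strlen(name) + 1;
    char *copy = k->kalloc(len);

    if (copy != NULL)
        memcpy(copy, name, len);
    return copy;
}

/* User space transport layer: backs the services with malloc/free and
 * counts outstanding allocations so a test can detect leaks. */
static int outstanding;

static void *user_kalloc(size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        outstanding++;
    return p;
}

static void user_kfree(void *p)
{
    if (p != NULL) {
        outstanding--;
        free(p);
    }
}

const struct kern_ops user_ops = { user_kalloc, user_kfree };

/* Number of allocations made through user_ops and not yet freed. */
int harness_outstanding(void)
{
    return outstanding;
}
```

Because everything runs as an ordinary process, the routine can be
stepped through with a normal source level debugger on one machine.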


But what does this buy you beyond LKM's?

It buys you the ability to do source level debugging on a single
machine.  Right now, source level kernel debugging requires two
machines.

So the answer for right now is to use LKM's, put the code into the
running kernel, and use another machine to get source debugging.


Hopefully, this subject is now exhausted.


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


