From owner-freebsd-current  Tue Dec 14  1:49: 9 1999
Delivered-To: freebsd-current@freebsd.org
Received: from zippy.cdrom.com (zippy.cdrom.com [204.216.27.228])
	by hub.freebsd.org (Postfix) with ESMTP id 90400150BD
	for <current@FreeBSD.ORG>; Tue, 14 Dec 1999 01:49:03 -0800 (PST)
	(envelope-from jkh@zippy.cdrom.com)
Received: from zippy.cdrom.com (localhost [127.0.0.1])
	by zippy.cdrom.com (8.9.3/8.9.3) with ESMTP id BAA02687;
	Tue, 14 Dec 1999 01:49:25 -0800 (PST)
	(envelope-from jkh@zippy.cdrom.com)
To: Donn Miller <dmmiller@cvzoom.net>
Cc: Eric Jones <ejon@colltech.com>, current@FreeBSD.ORG
Subject: Re: sysinstall: is it really at the end of its lifecycle? 
In-reply-to: Your message of "Tue, 14 Dec 1999 02:36:04 EST."
             <3855F364.E66EC87B@cvzoom.net> 
Date: Tue, 14 Dec 1999 01:49:25 -0800
Message-ID: <2683.945164965@zippy.cdrom.com>
From: "Jordan K. Hubbard" <jkh@zippy.cdrom.com>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> As far as the successor to sysinstall goes, I think it would be
> nice to have both a console version and an X version, with some X
> toolkit such as Lesstif or Qt, or Tcl/Tk.  It could be a lot like
> RedHat's "linuxconf", where you can use it as both an installer
> or system administration tool.

Which is about correct, though there's a volume of details behind your
conceptualization of the system in outline form there. :-)

To really understand where we're trying to go, however, it's somewhat
helpful to take a good look at where we are now, i.e. stuck with our
dear friends sysinstall and the pkg_install suite.

The sysinstall program is basically nothing more than a monolithic
C program which knows how to do the following "special" things:

o Run as init, if so invoked, and do the special inity-things that must
  be done for an interactive program to subsequently function properly
  (see system.c:systemInitialize).

o How to stomp on disks directly for the purposes of partitioning,
  writing MBRs, etc.  Most of this is already abstracted by libdisk(3)
  (see also disks.c and label.c).

o How to newfs and/or mount filesystems of many different types
  (UFS, DOS, NFS, CD, floppy, etc) and how to use them as
  installation media (see media.c and friends).

o How to configure network interfaces (with or without DHCP) and
  how to use FTP servers as media devices, much of the latter
  being abstracted by ftpio(3) (see tcpip.c and network.c).

o How to read FreeBSD "distributions" (gzipped, split tarballs with
  external property (.inf) files, essentially) and extract them to
  a mounted hierarchy of UFS partitions (see dist.c).

o How to read /usr/ports/INDEX files and enough about the internals of
  packages to get the pkg_install suite to jump through many hoops you'd
  rather just not know about (see index.c and package.c).

o How to spit out hosts, resolv.conf and rc.conf files from internal
  variable state, allowing dialogs to be constructed which front-end
  much of the contents of these files (see config.c and variable.c).

o How to use the dialog(3) library in ways that should not be
  discussed within earshot of small children (see dmenu.c).

All of these capabilities add up to a composite picture with a
number of deep and irredeemable flaws.

Let's take the UI, for example.  Even in a system as simple as
sysinstall, we have 2 screens open: the primary interaction screen on
VTY1 and the debugging information screen on VTY2 (not counting the
possible child holoshell on VTY4 or a child ppp session on VTY3).  We
put them on separate VTYs because there is no clever multithreaded UI
here which allows such output to scroll along in one window while
doing other things in another; it's basically a single thread of
control per VTY.  Anyway, this works great for most things but causes
problems the minute you want to install a package which is
interactive.  Most packages just cause pkg_add to spew various bits of
diagnostic output which you'd otherwise be perfectly happy to go onto
VTY2, but if a package suddenly takes it into its head to bring up a
menu, that's also going to go on VTY2 since sysinstall has no idea in
advance that this package might have something meaningful to say and
it's going to route its stdin and stdout to VTY2 as always.
Unfortunately, that causes consternation on the part of the user who's
staring at VTY1 meanwhile and wondering why the package is taking so
damn *long* to install.

I've seen novices wait so long that medical intervention was necessary
in order to save them, leaving us unlikely to win any Tog Awards for
interface design in such cases.  Sure, I can hear you yelling at those
novices from here: "JUST SWITCH TO THE OTHER VTY AND *LOOK*, YOU
CHEESEHEADS!", but it's never that simple.  Users will be users and
even our own annoying common-sense tells us that both pkg_add and
sysinstall should really be "rendezvousing" on the user, wherever the
user's eyeballs might happen to be at the moment, and bringing up the
menu in question.  It shouldn't be over on the debugging screen and if
pkg_add were somehow a library linked into sysinstall and using the
same common UI API, initialized by sysinstall to point to VTY1 at
startup time, well then by golly that would Just Work(tm) and life
would be good again.

That's one of the design precepts of the New System, in fact.  There
is one common UI abstraction which sysinstall II (hereafter referred
to as Setup) and the new package system both use.  The generic UI
front-end API is "bound" at runtime to a back-end implementation
class, the two currently supported ones being Qt and Turbovision (the
reference implementation for the common UI stuff is all written in
C++), and everything pops up in the appropriate UI environment from
that point forward.  Our test code checks for $DISPLAY and does the
appropriate Qt magic in that case, otherwise it binds in Turbovision.
In theory, one could even write a back-end class which talked to a
browser.  Scary. :)
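
For illustration only, the runtime binding described above could be
sketched like this.  The reference code is C++; Python is used here
purely as sketch notation, every class and function name below is made
up, and the two back-ends are stubs that return strings rather than
calling Qt or Turbovision:

```python
import os
from abc import ABC, abstractmethod


class UIBackend(ABC):
    """Back-end interface that each concrete UI implementation fills in."""

    @abstractmethod
    def menu(self, title, items):
        """Show a menu and return a description of what was shown."""


class QtBackend(UIBackend):
    # Stand-in for a graphical back-end; no real Qt calls are made.
    def menu(self, title, items):
        return f"[Qt] {title}: {', '.join(items)}"


class TurbovisionBackend(UIBackend):
    # Stand-in for a text-mode back-end; no real Turbovision calls are made.
    def menu(self, title, items):
        return f"[TV] {title}: {', '.join(items)}"


def bind_backend(environ=None):
    """Choose a back-end once, at startup: a non-empty DISPLAY selects the
    X back-end, anything else falls back to the console one."""
    environ = os.environ if environ is None else environ
    return QtBackend() if environ.get("DISPLAY") else TurbovisionBackend()


class UI:
    """Generic front-end shared by the installer and the package code."""

    def __init__(self, backend):
        self._backend = backend

    def menu(self, title, items):
        return self._backend.menu(title, items)
```

Both the installer and a package's install routine would hold the same
UI object, so a menu raised by either one lands wherever the back-end
was pointed at startup.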

Having a proper UI also means a lot more than just being able to see
your packages when they try and interact with you at install time; it
means that many things which were previously ruled out by the actual
dialog(3) UI itself become possible.  Ever wonder why there's no "Back"
button in many of sysinstall's dialogs, for example?  Because the
dialog code only supports 2 buttons in menu dialogs, and supporting an
arbitrary number of buttons would have required radical reworking of
the thing. :) That is only one of the many limitations of the dialog
library, and all of them have hampered sysinstall's progress to some
extent.  A properly designed callback-driven UI would make it possible,
for example, to do extraction of packages in a separate thread and allow
the user to continue to do other things in the UI while extraction is
taking place.  I won't even get into the possibilities of actually
being able to use the mouse, or going towards something closer to a
tiled multi-panel interface which makes it more obvious at a glance
just where you're going and where you've been in the installation.
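
A rough sketch of the "extract in a separate thread, keep the UI
responsive" idea, again in Python as sketch notation.  The function
names are invented and the "extraction" is a placeholder loop; the
point is only that the worker reports progress through a callback and
the UI loop is free to do other work between events:

```python
import queue
import threading


def extract_package(names, on_progress):
    """Placeholder extraction: report each file through a callback."""
    for done, name in enumerate(names, start=1):
        on_progress(name, done, len(names))


def run_ui_loop(names):
    """Run extraction in a worker thread while the UI thread drains events."""
    events = queue.Queue()
    worker = threading.Thread(
        target=extract_package,
        args=(names, lambda *event: events.put(event)),
    )
    worker.start()

    log = []
    while worker.is_alive() or not events.empty():
        try:
            name, done, total = events.get(timeout=0.05)
        except queue.Empty:
            continue  # nothing to report yet; a real UI would service input here
        log.append(f"{done}/{total} {name}")
    worker.join()
    return log
```

Calling run_ui_loop(["bin/sh", "bin/ls"]) returns the progress lines in
the order the worker produced them.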

Another place where sysinstall went wrong, sadly of necessity, is in
drawing the dividing line between what a package is supposed to do and
what sysinstall itself is supposed to do.  Because the FreeBSD
distribution format is kind of brain-dead in being a split gzipped tar
file, a format which does not lend itself to being randomly accessed or
selectively decompressed, sysinstall assumes that there's no decent
information about distributions available from outside and takes all
responsibility for getting the right bits to the right places, doing
so by consulting a fair bit of in-built knowledge of what FreeBSD
distributions look like and which ones should go into what
directories.  This naturally causes divergence problems as FreeBSD
distributions evolve and it means that sysinstall has to be cognizant
of a lot more special details than it really should about FreeBSD's
distribution format.  Changing distributions into packages doesn't
really help us either since packages are also gzipped tarballs and,
what's worse, the pkg_install system attempts to unpack a package into
a temporary directory before moving it to its final location given
that many packages only install a certain subset of themselves.  This
is fine if you're installing a 300K package like bash, but a 70MB
package like bin might run your /var/tmp directory a bit low. :)

Clearly, neither format was really designed properly for the idea of
potentially massive packages which also had certain "smarts" about
what to do before, during and after their installation.  Starting
over, we clearly see that the zip file format offers quite a bit more
in terms of compression, random access and even in-place updating or
deletion of contents (not offered with tar).  Furthermore, zip files
offer "comment fields" on a per-archive and per-entry basis, leaving
us with room to store MD5/pgp signatures, extended attribute information,
etc.  That is why the new package system is based on the zip archive
format and uses a library (well, currently two, one in C and one in
C++) to abstract away the details of zipfile innards.  A given zip
file need also not be extracted in its entirety now, making the
concept of "fat packages" (multiple architecture support) and such
possible.
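
As a small illustration of those zip properties, here is a sketch using
Python's standard zipfile module (not the C or C++ library mentioned
above).  Storing an MD5 digest in the per-entry comment and laying out
a "fat" package as one directory prefix per architecture are both
assumptions made for the example, not the actual package format:

```python
import hashlib
import io
import zipfile


def build_package(files):
    """Return zip archive bytes whose entries carry an MD5 in their comment."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.comment = b"pkg-format: sketch"  # archive-level comment field
        for name, data in files.items():
            info = zipfile.ZipInfo(name)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.comment = hashlib.md5(data).hexdigest().encode()  # per-entry
            zf.writestr(info, data)
    return buf.getvalue()


def read_verified(archive_bytes, name):
    """Pull out a single entry, without unpacking the rest, and check it."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        info = zf.getinfo(name)
        data = zf.read(info)
        if hashlib.md5(data).hexdigest().encode() != info.comment:
            raise ValueError(f"checksum mismatch for {name}")
        return data


def entries_for_arch(archive_bytes, arch):
    """List only the entries under one architecture prefix (assumed layout)."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return [n for n in zf.namelist() if n.startswith(arch + "/")]
```

Because the archive carries a central directory, read_verified() can go
straight to the one entry it wants, which is exactly what a split
gzipped tarball cannot offer.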

In order to ensure that the package's installation routines call the
common UI routines for all their interaction needs (remember the VTY2
scenario), a package's installation script is also now assumed to be a
secure TCL script rather than being the arbitrary executable it is
now.  This has a number of implications even more important than
simple interface unification, of course, most of them in the realm of
security.

As previously mentioned, a package is currently installed by unpacking
it to a temporary location and then running a number of optional
(i.e. if found) scripts before finally going through the package's
"packing list" file for guidance on what to do with the rest of it.  A
packing list can also contain (and cause to run) in-line shell
commands and such during pkg_add's parsing of it, so for the purpose
of simplification we'll refer to the packing list as another "script."

Now every one of these scripts runs as the user running pkg_add and,
in most cases, that's root.  That means that a package's install
routine can turn right around and "rm -rf /" and you probably won't
even notice it happening until stuff starts disappearing out from
under your feet.  Sure, you can pgp sign trusted packages, but what
happens when somebody in a trusted position makes a mistake?  Maybe an
install script wants to "rm -rf /${TMPJUNKDIR}" in order to be a good
citizen after it's done but its not-so-good programmer forgot what
happens when you don't spell TEMPJUNKDIR correctly and the variable
expands to nothing (another bad reason to prepend the /, but I digress :).
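
To make that failure mode concrete, here is a tiny Python sketch that
mimics how sh (without "set -u") expands an unset variable to nothing,
plus the sort of guard a careful cleanup step would want.  Both
shell_expand and cleanup_target are made-up helper names, and nothing
here actually deletes anything:

```python
import re


def shell_expand(text, variables):
    """Expand ${NAME} roughly as sh does: unset names become ''."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda match: variables.get(match.group(1), ""),
        text,
    )


def cleanup_target(template, variables):
    """Return the path a cleanup step would remove, refusing '/' or ''."""
    path = shell_expand(template, variables)
    if path.strip("/") == "":
        raise ValueError(f"refusing to remove {path!r}")
    return path
```

With only TEMPJUNKDIR set, shell_expand("/${TMPJUNKDIR}", ...) yields
"/", which is the whole problem; cleanup_target() raises instead of
handing that path to anything destructive.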

Clearly, it should be possible to prevent an installation script from
doing *anything* it pleases, even if the system grants the privilege,
and that's where the idea for using secure TCL comes in.  The language
is comparatively small and easy to embed, meaning the results will
actually still fit on an mfsroot floppy (sorry PERL addicts), and it's
not a dialect of lisp or forth, forestalling the usual religious
flame-wars which occur when either of those two languages is
mentioned.  Somewhat more importantly, the evolution that secure TCL
went through in order to run TCL inside of a web browser plug-in
(TCLets) gives us the ability to provide a very selective set of
operations for packages, any of which can be disabled, overridden or
made more picky based on how much we trust the specific package or all
packages in general.  A file::create routine, for example, could be
set to create files in any directory, to create them only in
directories found in ${file_create_authorized_dirs}, to create them
based on the opinion of an external hook function, etc etc.
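
The real thing would be done with secure TCL, but the shape of such a
policy-gated primitive can be sketched in Python.  Everything here
(Sandbox, allow_under, the Denied exception) is hypothetical naming,
POSIX-style absolute paths are assumed, and file_create only records
the request instead of touching the disk.  The restrict() method
anticipates the "more but not less restricted" nesting discussed in the
next paragraph:

```python
import os.path


class Denied(Exception):
    """Raised when a package asks for something its policy forbids."""


def allow_anywhere(path):
    return True


def allow_under(authorized_dirs):
    """Build a policy accepting only absolute paths inside the given dirs."""
    roots = [os.path.normpath(d) for d in authorized_dirs]

    def policy(path):
        if not os.path.isabs(path):
            return False
        target = os.path.normpath(path)  # also collapses any ".." components
        return any(os.path.commonpath([root, target]) == root for root in roots)

    return policy


class Sandbox:
    """Holds the primitives a package script is allowed to call."""

    def __init__(self, create_policy, created=None):
        self._create_policy = create_policy
        self.created = [] if created is None else created

    def file_create(self, path):
        if not self._create_policy(path):
            raise Denied(path)
        self.created.append(path)  # a real primitive would create the file here

    def restrict(self, extra_policy):
        """Derive a child that must pass this policy AND the extra one,
        so a child can tighten what its parent allows but never loosen it."""
        parent = self._create_policy
        return Sandbox(lambda p: parent(p) and extra_policy(p), self.created)
```

Swapping allow_under([...]) for allow_anywhere, or for a function that
consults an external hook, changes the trust level without changing the
package script at all.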

Because any interpreter can create more (but not less) restricted
versions of itself, a package can even elect to override certain
system functions with even more anal versions in order to "firewall"
some suspect piece of 3rd-party installation code which a port
maintainer found along with the original bits.  The possibilities are
pretty endless and the name of the game is basically to know for
certain what a package is doing every step of the way.  Along the same
lines, you can even write your various primitives to push "undo
information" for themselves before committing whatever actions they're
designed to commit, allowing arbitrary roll-back of an installation
procedure in case of error or even somewhat later in the package's
lifetime ("damn, this is broken, give me the previous version!").

Making packages this much more "intelligent" and endowed with a more
powerful UI abstraction for getting information from the user would
also allow us to finally solve the problem of where to put
configuration dialogs and important setup checklists.  It's always
been clear that they belong in the packages themselves, not the
installers of the packages, but since we've always taken a very
non-interactive approach to packaging we wound up instead with a
situation where sysinstall, for example, used to ask setup questions
for Apache.  It required constant updating to remain in sync with the
apache server's configuration directory location and such and it
sucked so much, in fact, that I eventually took that setup screen back
out of sysinstall again.  Other similar ones remain, however, too
useful to remove from sysinstall but not actually belonging *in*
sysinstall, they belong in the bindist or the XFree86 distribution or
wherever, those distribution formats simply being too stupid to manage
this on their own.

I could go on and on about how braindead the current packaging system
is, how updates and version-specific dependencies and occlusion and
garbage collection and roll-back and all that good stuff just isn't
implemented at all, but that'd be like kicking the bleached and dried
skeleton of a dead horse. :-) Suffice it to say, we're certainly
hoping to address all of those problems in whatever we eventually come
up with as "the new package system."  I've described some of the
existing parts already and outlined many others which exist only
loosely in my mind and the minds of a few others.  I've also only
touched on some of it, some other (future) topics being the concept of
"media maps" which tell where all the various little packages are and
what they depend on, the problems constituted in probing for
installation media devices and device discovery in general in the
installer, etc.

We also need to discuss the ways and means of creating not so much an
installer but an installation "nucleus" around which we also have a
general script execution and menu-generation framework which makes it
easy for other people to write "configurators" in secure TCL which
take on the job of configuring some utility like, say, Samba.  When
you pkg_add samba.zip in such a system, it runs its configurator to
generate the initial smb.conf file but also drops a copy of the
configuration script into some special config directory under the
Networking category.  Now the next time the user fires up the system
configuration tool and goes to the Networking section, they see Samba
there as a new item and clicking on it will bring up the configuration
tool again (perhaps in the same form, perhaps not).  If Samba is
deleted from the system, the corresponding item goes away along with
the configuration script and I'm sure you all get the idea at this
point.  No more monolithic prototypes!  Framework!  Frame-work!
Frame-work! [jkh jumps up on a chair and begins waving his hands
enthusiastically before losing his balance and toppling over with an
abrupt scream].
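
For the skeptics, the Samba scenario above boils down to something like
the following Python sketch.  ConfigRegistry and its methods are
invented names, the "config directory" is just an in-memory dict, and a
configurator is any callable standing in for a secure TCL script:

```python
class ConfigRegistry:
    """Tracks the configurator each installed package dropped, by category."""

    def __init__(self):
        self._by_category = {}

    def install(self, package, category, configurator):
        configurator()  # initial run, e.g. to generate the first smb.conf
        self._by_category.setdefault(category, {})[package] = configurator

    def remove(self, package):
        """Drop the package's configurator, and its menu item with it."""
        for category in list(self._by_category):
            self._by_category[category].pop(package, None)
            if not self._by_category[category]:
                del self._by_category[category]

    def menu(self, category):
        """Items the system configuration tool would show for a category."""
        return sorted(self._by_category.get(category, {}))

    def configure(self, package, category):
        """Re-run a package's configurator from the configuration tool."""
        return self._by_category[category][package]()
```

The configuration tool never needs to know about Samba in advance; it
only lists whatever the installed packages registered.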

- Jordan

