From owner-freebsd-arch  Tue Mar 27 16:49:57 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132])
	by hub.freebsd.org (Postfix) with ESMTP id 841EE37B718
	for <freebsd-arch@freebsd.org>; Tue, 27 Mar 2001 16:49:50 -0800 (PST)
	(envelope-from tlambert@usr01.primenet.com)
Received: (from daemon@localhost)
	by smtp02.primenet.com (8.9.3/8.9.3) id RAA11400;
	Tue, 27 Mar 2001 17:42:58 -0700 (MST)
Received: from usr01.primenet.com(206.165.6.201) via SMTP
	by smtp02.primenet.com, id smtpdAAATqaqpw;
	Tue Mar 27 17:42:54 2001
Received: (from tlambert@localhost)
	by usr01.primenet.com (8.8.5/8.8.5) id RAA26184;
	Tue, 27 Mar 2001 17:49:44 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <200103280049.RAA26184@usr01.primenet.com>
Subject: Re: configuration files
To: jonathan@graehl.org (Jonathan Graehl)
Date: Wed, 28 Mar 2001 00:49:43 +0000 (GMT)
Cc: tlambert@primenet.com (Terry Lambert),
	freebsd-arch@FreeBSD.ORG (freebsd-Arch)
In-Reply-To: from "Jonathan Graehl" at Mar 27, 2001 01:06:14 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > A real problem with XML is that there is very little in the
> > way of schema standardization out there (the one big exception
> > is EDI of financial data, which was standardized almost the
> > day XML came out).
>
> This is not a problem with the XML spec; what would you change
> about it that would solve this problem?

Require that the schema be published whenever data is published, so
that someone who wants to interpret the data has the necessary
schema at hand to do the job.

As it stands, it's permissible for programs to have implied
contracts about schema, which means they will only interoperate
with each other, unless they are reverse-engineered.

> I believe, for example, that standardization of XML DTDs for
> product descriptions of components (chips, fasteners, etc.) has
> been moving forward.

That's the whole "B2B Exchange" fallacy.  I'm convinced that it is
going nowhere.  It effectively requires that fastener suppliers be
willing to publish data on an exchange where the primary determining
factor between vendors is the lowest commodity price.

There are now two companies that I'm aware of who are trying to
"value add" to this model, by using additional information, such as
delivery dates, order bundling, and so on.  But as things currently
stand, you would have to be a fool to be a supplier participating
in one of these exchanges, where the deciding factor will end up
being price.

> I don't see why the DTD couldn't allow specification of actions to
> perform when modifying document subtrees.

The question is not whether it could, but whether you can rely on
it doing so.

> If the XML file were the primary source of configuration
> information for my program, I would not want any accessor/mutator
> functions at all; I would simply provide validation constraints
> and construct what I need from the configuration file on startup,
> or when signaled.

This gets back to the cached configuration data problem.  For the
sendmail example, the host name might change; sendmail would not be
aware of the change, and would therefore answer connection requests
with incorrect data.
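To make the caching problem concrete, here is a rough sketch in C
(the daemon itself is invented for illustration, but it is roughly
the pattern sendmail follows with data parsed from sendmail.cf):
it caches the host name at startup, and keeps answering with the
stale value until something explicitly tells it to reload:

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Sketch only: cache the host name once, the way sendmail
     * caches data parsed from sendmail.cf at startup. */
    static volatile sig_atomic_t reload_wanted;
    static char cached_host[256];

    static void
    hup_handler(int sig)
    {
        reload_wanted = 1;      /* defer real work to the main loop */
    }

    static void
    refresh_config(void)
    {
        /* A real daemon would re-parse its whole config here. */
        gethostname(cached_host, sizeof(cached_host));
    }

    int
    main(void)
    {
        signal(SIGHUP, hup_handler);
        refresh_config();       /* initial load, at daemon startup */
        for (;;) {
            if (reload_wanted) {
                reload_wanted = 0;
                refresh_config();
            }
            /* If the host name changed and nobody sent a SIGHUP,
             * every answer from here on is wrong. */
            printf("220 %s ESMTP\n", cached_host);
            sleep(10);
        }
    }

The point being that the cache is only as fresh as the last signal;
the daemon has no way of knowing that the underlying data changed.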
Similarly, the need to HUP a bind serving 10,000 domains takes bind
out of service for the duration of the reload (DNS UPDATE can help
here, but only if you hack bind to permit zone creation/deletion via
DNS UPDATE).  The problem with this approach is that you are, in
effect, making your server unreliable in exchange for making the
changes take effect more quickly.

Really, you are interested in hard and soft service
interdependencies, which are very difficult to express in XML, and
which are equally difficult to express in such a way as to ensure
ordering, atomicity, and idempotence for transactions.

At Whistle/IBM GSB, there were serious interaction and QOS issues
that derived from the "restartd" architecture.  It was a good idea
whose execution lacked the necessary sophistication.  It was in the
process of being redesigned -- with the configuration store fronted
by a single protocol point.  I didn't entirely agree with the
design, but it was a far sight better than its predecessor, and
solved many issues.

Link management in the InterJet is what I would call a data
interface, rather than a procedural/functional interface.  Because
everything was under our control, we avoided the many issues
surrounding the use of data interfaces and schema synchronization;
but you have only to look as far as "ps", "netstat", and "libkvm"
for validation of the idea that data interfaces are inherently
untrustworthy.

In 20/20 hindsight, there should have been a link management daemon,
which sat on a link management device and decided, based on who was
asking, whether the policy currently in effect should permit or
prohibit the request; attempts to connect a socket (which, if
permitted, would result in a demand link-up) should be
administratively denied in the kernel, and should return an error
indicating administrative denial.

It is simply not acceptable to have to take on the maintenance
burden of teaching every open source package on the planet how to do
its own link management, without a centralized policy enforcement
mechanism.  By the same token, a centralized policy enforcement
mechanism is needed for UNIX configuration data as well.  Vanilla
XML, without a protocol, API, or other procedural access point to
provide subschema enforcement of your validation constraints,
suffers many of the same weaknesses we fought.

> > For that reason, it's generally much more useful to have a
> > protocol gating access to your data model, rather than just a
> > raw data model sitting around somewhere, with no way to demark
> > transactions into "do these changes atomically, and generate
> > the new sendmail.cf file only once, please".
>
> Suggestions?  I agree that a uniform protocol for making
> configuration changes active is as important as a uniform format
> for describing them (although the action to make the changes
> active could itself be described in the schema).  I would think
> the editor would signal the running daemon to reconfigure itself
> from the source data, after initiating any post-modification steps
> indicated in the XML configuration file.

ACAP is one option.  My personal preference would be LTAP, if the
LDAP server modifications for asynchronous notification could be
pried free from Lucent.  LDAP would work, if a simple transaction
extension were added (nesting would not be required, if data were
properly normalized).  The back end data store for the protocol
server could be XML (or ROT-13 binary files; it doesn't matter).
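To illustrate the transaction demarcation, here is a sketch of the
client side of a hypothetical line-oriented configuration protocol
(the BEGIN/SET/COMMIT verbs, the key names, and the socket path are
all invented for illustration; ACAP and LDAP each have their own
framing).  The server sees one transaction, and regenerates
sendmail.cf exactly once, at COMMIT:

    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int
    main(void)
    {
        static const char *script[] = {
            "BEGIN\n",
            "SET mail.masquerade_as example.org\n",
            "SET mail.relay_host smtp.example.org\n",
            "COMMIT\n",     /* one rewrite, one notification */
        };
        struct sockaddr_un sun;
        size_t i;
        int fd;

        if ((fd = socket(PF_LOCAL, SOCK_STREAM, 0)) < 0) {
            perror("socket");
            return (1);
        }
        memset(&sun, 0, sizeof(sun));
        sun.sun_family = AF_LOCAL;
        strncpy(sun.sun_path, "/var/run/confd.sock",
            sizeof(sun.sun_path) - 1);
        if (connect(fd, (struct sockaddr *)&sun, sizeof(sun)) < 0) {
            perror("connect");
            return (1);
        }

        /* Everything between BEGIN and COMMIT is one transaction;
         * the server rewrites sendmail.cf at COMMIT, not per SET. */
        for (i = 0; i < sizeof(script) / sizeof(script[0]); i++)
            write(fd, script[i], strlen(script[i]));

        close(fd);
        return (0);
    }

Whether the store behind the protocol point is XML or something else
is, as I said, invisible to the client.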
I have to say I'm violently opposed to signalling as a catch-all; I
understand that it's the standard means of informing a daemon of a
configuration change, but an overly simplistic, low granularity
approach is exactly what must be avoided.

As another example from the InterJet: it ran a split horizon DNS.
When the link came up, it changed the IP address in the external
DNS; in most cases there was no reason for this.  In the case of a
static IP address, the address would never change.  In the case of
a dynamic IP address, the address would change so often as to make
it useless to run a DNS on the external port at all.  If this had
to run, the most correct approach would be to use DNS UPDATE.

Because the external mail server (which was used for fingerd, the
connection-up SMTP trigger, and (later) ODMR) depended on external
DNS, when external DNS had to be restarted to change the IP address,
so did mail and other services dependent upon DNS services.  But
sendmail did not care about _every_ DNS change: it only cared about
A record (canonical server name) changes.  The system, however, was
insufficiently granular to distinguish different types of data
changes in DNS (see the postscript for a sketch of the kind of
granularity I mean), and so customers suffered the consequences of
their mail server being hit over the head with a brickbat, and
therefore not being up at precisely the time that the ISP mail
server was attempting to contact them.

> > What this basically means is that it's great, if you are doing
> > code that you don't expect to interoperate with anyone else's
> > code, and less great otherwise.
>
> XML gives you a data interchange format, not an actual protocol.
> Shared XML schemas can be extended while still allowing for base
> interoperability.

I'm (I guess) known for the statement "standard plus extensions is
not standard".  I think that deviations from standards render code
practically useless.

> > The primary reason I see it being used in places like IBM is
> > that it can tunnel RPC calls and other data over HTTP, which
> > people tend to let through firewalls.  In other words, it is
> > capable of routing around anal retentive security types, who
> > live in deathly fear of FTP and DNS.  IMO, XML was practically
> > invented just to get around IBM network security.
>
> I don't understand: why is XML necessary to tunnel your
> application-specific messages over HTTP?  If I wanted to bypass
> IBM network security for my application, I could roll my own data
> interchange format that happens to look like HTTP requests/replies
> involving the usual port.  XML-RPC-HTTP facilities may have been
> designed to bypass crude network filtering, but XML was not.

I was being facetious.  I don't see XML being a panacea.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
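P.S.: A sketch of the per-key notification I mean, as opposed to a
blanket SIGHUP (the key names and the dispatch table are invented
for illustration): each consumer registers only for the keys it
actually cares about, so a PTR record change does not knock over
the mail server.

    #include <stdio.h>
    #include <string.h>

    struct subscription {
        const char *prefix;             /* config key prefix */
        void (*notify)(const char *);   /* consumer callback */
    };

    static void
    sendmail_notify(const char *key)
    {
        printf("sendmail: refreshing for %s only\n", key);
    }

    static void
    named_notify(const char *key)
    {
        printf("named: reloading zone affected by %s\n", key);
    }

    static struct subscription subs[] = {
        { "dns.a.", sendmail_notify },  /* A record changes only */
        { "dns.",   named_notify },     /* any DNS change at all */
    };

    static void
    publish_change(const char *key)
    {
        size_t i;

        for (i = 0; i < sizeof(subs) / sizeof(subs[0]); i++)
            if (strncmp(key, subs[i].prefix,
                strlen(subs[i].prefix)) == 0)
                subs[i].notify(key);
    }

    int
    main(void)
    {
        publish_change("dns.ptr.gw");     /* named only */
        publish_change("dns.a.mailhost"); /* both fire */
        return (0);
    }


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message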