Date:      Sun, 16 Jun 2002 15:46:51 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Mike Makonnen <makonnen@pacbell.net>
Cc:        current@FreeBSD.ORG, danny@cs.huji.ac.il, gordont@gnf.org
Subject:   Re: HEADS UP: rc.d is in the tree
Message-ID:  <3D0D155B.1ACEF5E8@mindspring.com>
References:  <E17IrYC-000NFi-00@cse.cs.huji.ac.il> <20020614142308.7ddeaed0.makonnen@pacbell.net> <3D0A6E7B.F243329A@mindspring.com> <20020615121247.A6971@dragon.nuxi.com> <3D0B9A60.A4A816A4@mindspring.com> <20020615144656.06f8404d.makonnen@pacbell.net> <3D0BBD43.6623BCBD@mindspring.com> <20020616054030.29e6ed35.makonnen@pacbell.net>

Mike Makonnen wrote:
> So, what you're describing is a chicken and egg problem. The solution
> is simple, the sysadmin decides which one he wants to start first by
> fiddling with the REQUIRE and BEFORE lines, or the script can make
> use of the force_depend() subroutine to start required services
> that aren't already started.

Not that simple, still.

The dependency list is assumed to be a DAG -- a Directed Acyclic
Graph -- and the purpose of rcorder is to perform a topological
sort (tsort) on the graph, so that a breadth-first descent will
result in everything starting in the proper order.
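To make the acyclicity assumption concrete, here is a minimal sketch of a topological sort (Kahn's algorithm) that refuses to produce an order when the input contains a cycle. It is not rcorder's actual implementation; the function name `start_order` and the service names are hypothetical, chosen only for illustration.

```python
from collections import deque

def start_order(requires):
    """Return a start order for `requires` (service -> set of services it needs).

    Raises ValueError if some services can never become ready, which
    happens exactly when the dependency graph contains a cycle.
    """
    # Collect every service mentioned, as a key or as a requirement.
    services = set(requires)
    for deps in requires.values():
        services |= set(deps)

    # pending[s] = number of requirements of s that have not started yet.
    pending = {s: len(set(requires.get(s, ()))) for s in services}
    # dependents[d] = services waiting on d.
    dependents = {s: [] for s in services}
    for s, deps in requires.items():
        for d in set(deps):
            dependents[d].append(s)

    ready = deque(sorted(s for s in services if pending[s] == 0))
    order = []
    while ready:
        s = ready.popleft()
        order.append(s)
        for t in sorted(dependents[s]):
            pending[t] -= 1
            if pending[t] == 0:
                ready.append(t)

    if len(order) != len(services):
        # These are on a cycle or downstream of one.
        stuck = sorted(s for s in services if pending[s] > 0)
        raise ValueError("cannot order (cycle involved): " + ", ".join(stuck))
    return order

# Hypothetical example services.
print(start_order({"network": set(), "named": {"network"}, "sendmail": {"named"}}))
# prints ['network', 'named', 'sendmail']
```

Feeding it `{"a": {"b"}, "b": {"a"}}` raises `ValueError`, which is the situation the rest of this message is about.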

There are two problems here:

1)	Starting something isn't enough.  There's "started" and
	there's "available".  Restarting a DNS server with about
	50,000 domains and all the accompanying records is going
	to take on the order of 3 minutes (this is from personal
	experience; this is why DJBDNS is insufficient, and this
	is also why DNSUPDAT is a good thing, and why zone creation
	is something you should hack in).

2)	The problem I'm describing is a cycle in the graph.  The
	rcorder *depends* on the dependency graph being acyclic;
	I'm telling you that it's cyclic, unless you take into
	account the strength of dependencies... i.e. in addition
	to "started" and "available", there's also "actively used".
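The "started" versus "available" distinction in point 1 can be sketched as a readiness poll: after launching a service, keep probing it until it actually answers, or give up after a timeout. The helper name `wait_until_available`, its parameters, and the probe are all hypothetical; a real probe might send a DNS query for a known record.

```python
import time

def wait_until_available(probe, timeout=300.0, interval=1.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll `probe()` until it returns True or `timeout` seconds elapse.

    Returns True if the service became available, False on timeout.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)

# Stand-in probe: reports "not ready" twice, then "ready".
answers = iter([False, False, True])
print(wait_until_available(lambda: next(answers), interval=0.0))
# prints True
```

A dependent service would be started only after this returns True, rather than as soon as the process it depends on has been forked.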

It looks like I need to give you a more complex case.

Consider the case of a dial on demand gateway device with interior
network connectivity and exterior connectivity (right now, all we
care about is that there are two interfaces: inside-facing and
outside-facing), which is running:

o	An external DNS server
o	An internal DNS server
o	An internal SMTP server
o	An external SMTP (firewalled) server

When the exterior interface is brought up, all services which are
dependent on the external IP address must be reconfigured and
restarted.  Obviously, this must include the external DNS server
(for right now, we will ignore the possibility of DNSUPDAT).

When this happens, all services that depend on the external DNS
server must be reconfigured and restarted.  This list includes
the external SMTP server.  Why?  The answer is that the external
SMTP server has obtained its canonical name for the external
interface -- the name it uses in the connection greeting message,
and in HELO/EHLO sent to other machines in the outbound email
case -- from the external DNS server.  If you dial into a different
IP address, then this external server's name changes, since the
reverse mappings are owned by the ISP that is permitting the
dialin.

In fact, any time you change the external DNS information at all,
you must pull a "Pol Pot" -- you must take the DNS server out into
the street, and shoot it in the back of the head so that it is
restarted with new information.  But this must become a massacre:
you have to do the same to all services that depend on the DNS.

Now consider what "demand" is in the "dial on demand" case: it's
the need to send packets off the local network.  The most common
case for this is the need to send a SYN to a remote SMTP server
from the interior SMTP server, in order to send email.  So... the
SMTP server polling interval arrives, you do a local queue run,
and decide to make contact to an exterior SMTP server.  The SMTP
server locally attempts a connect(), and the link comes up; you
shoot the external DNS server in the head.  Then you shoot the
program that made the demand in the head because its idea of its
canonical name is now incorrect.  The demand goes away.
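The self-defeating cascade can be sketched as a transitive walk over "who caches information from whom". This is a simplified model under stated assumptions: the names are hypothetical, and a single `smtp_server` stands in for the program that both makes the demand and has cached its canonical name from the external DNS server.

```python
from collections import deque

# depends_on[x] = things x has cached information from (hypothetical names).
depends_on = {
    "external_dns": {"external_ip"},
    "smtp_server": {"external_dns"},
}

def restart_set(changed, depends_on):
    """Return everything that must be restarted, transitively, when `changed` changes."""
    # Invert the map: dependents[d] = services that depend directly on d.
    dependents = {}
    for svc, deps in depends_on.items():
        for d in deps:
            dependents.setdefault(d, set()).add(svc)

    seen, queue = set(), deque([changed])
    while queue:
        cur = queue.popleft()
        for svc in dependents.get(cur, ()):
            if svc not in seen:
                seen.add(svc)
                queue.append(svc)
    return seen

demander = "smtp_server"            # its connect() brought the link up
victims = restart_set("external_ip", depends_on)
print(sorted(victims))              # prints ['external_dns', 'smtp_server']
print(demander in victims)          # prints True
```

The second line of output is the problem in miniature: the process whose traffic raised the link is in the set of processes the link-up event kills.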

What can we learn from this (other than "cached information is a
bad thing")?

That we can't get rid of circular dependencies entirely, and so
we have to base the dependencies on classification.
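One way to picture "dependencies based on classification" is to label each edge with a kind and sort on only the kinds that matter at a given stage. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the "hard"/"soft" labels, the service names, and the choice to model "the link comes up on demand from smtp" as a soft edge are illustrative assumptions, not anything rcorder provides.

```python
from graphlib import TopologicalSorter, CycleError

# (service, requirement, kind): all names and kinds are hypothetical.
edges = [
    ("external_dns", "external_link", "hard"),
    ("smtp", "external_dns", "hard"),
    ("external_link", "smtp", "soft"),  # link is raised by demand from smtp
]

def order(edges, kinds):
    """Topologically sort, honouring only edges whose kind is in `kinds`."""
    graph = {}
    for svc, req, kind in edges:
        graph.setdefault(svc, set())
        graph.setdefault(req, set())
        if kind in kinds:
            graph[svc].add(req)
    return list(TopologicalSorter(graph).static_order())

try:
    order(edges, {"hard", "soft"})
except CycleError:
    print("cyclic when every dependency is treated alike")

print(order(edges, {"hard"}))
# prints ['external_link', 'external_dns', 'smtp']
```

With every edge treated the same the sort fails; once the soft edge is set aside, an order exists again.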

Now there are ways around this and similar problems.  Most of
them involve running your own NOC to service the devices in the
field, so that you can make certain simplifying assumptions;
some of them, you could fix if you were to modify the sockets
interface to do application based connect() attempt failure
(e.g. EADMIN -- operation was administratively prohibited) on
the credential of the requesting party.

But the one thing that's obvious is that you need to be able to
discriminate dependencies with a granularity better than "dead"
vs. "alive".
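As a rough illustration of finer granularity, the states mentioned earlier ("started", "available", "actively used") can be given an ordering, so that a dependency names the level it needs instead of just "alive". The `State` enum and both helper functions are hypothetical names for this sketch.

```python
from enum import IntEnum

class State(IntEnum):
    DEAD = 0
    STARTED = 1     # process exists
    AVAILABLE = 2   # process is answering requests
    IN_USE = 3      # some client is actively relying on it

def satisfied(current, required):
    """A dependency is met once the provider has reached the required level."""
    return current >= required

def safe_to_restart(current):
    """Restarting a provider that is actively used pulls the rug from under a client."""
    return current < State.IN_USE

# A DNS server still loading its zones is STARTED but not yet AVAILABLE.
print(satisfied(State.STARTED, State.AVAILABLE))   # prints False
print(satisfied(State.AVAILABLE, State.AVAILABLE)) # prints True
print(safe_to_restart(State.IN_USE))               # prints False
```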

-- Terry
