Date: Sun, 16 Jun 2002 15:46:51 -0700
From: Terry Lambert <tlambert2@mindspring.com>
To: Mike Makonnen <makonnen@pacbell.net>
Cc: current@FreeBSD.ORG, danny@cs.huji.ac.il, gordont@gnf.org
Subject: Re: HEADS UP: rc.d is in the tree
Message-ID: <3D0D155B.1ACEF5E8@mindspring.com>
References: <E17IrYC-000NFi-00@cse.cs.huji.ac.il> <20020614142308.7ddeaed0.makonnen@pacbell.net> <3D0A6E7B.F243329A@mindspring.com> <20020615121247.A6971@dragon.nuxi.com> <3D0B9A60.A4A816A4@mindspring.com> <20020615144656.06f8404d.makonnen@pacbell.net> <3D0BBD43.6623BCBD@mindspring.com> <20020616054030.29e6ed35.makonnen@pacbell.net>

Mike Makonnen wrote:
> So, what you're describing is a chicken and egg problem. The solution
> is simple, the sysadmin decides which one he wants to start first by
> fiddling with the REQUIRE and BEFORE lines, or the script can make
> use of the force_depend() subroutine to start required services
> that aren't already started.

Not that simple, still.

The dependency list is assumed to be a DAG -- a Directed Acyclic
Graph -- and the purpose of rcorder is to perform a topological sort
(tsort) on the graph, so that a breadth-first descent will result in
everything starting in the proper order.

There are two problems here:

1)	Starting something isn't enough. There's "started" and
	there's "available". Restarting a DNS server with about
	50,000 domains and all the accompanying records is going
	to take on the order of 3 minutes (this is from personal
	experience; this is why DJBDNS is insufficient, and this
	is also why DNSUPDAT is a good thing, and why zone creation
	is something you should hack in).

2)	The problem I'm describing is a cycle in the graph. The
	rcorder *depends* on the dependency graph being acyclic;
	I'm telling you that it's cyclic, unless you take into
	account the strength of dependencies... i.e. in addition
	to "started" and "available", there's also "actively used".

It looks like I need to give you a more complex case.

Consider the case of a dial on demand gateway device with interior
network connectivity and exterior connectivity (right now, all we
care about is that there are two interfaces: inside-facing and
outside-facing), which is running:

o	An external DNS server
o	An internal DNS server
o	An internal SMTP server
o	An external SMTP (firewalled) server

When the exterior interface is brought up, all services which are
dependent on the external IP address must be reconfigured and
restarted. Obviously, this must include the external DNS server
(for right now, we will ignore the possibility of DNSUPDAT).

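To make the acyclicity assumption from problem 2 concrete, here is a
minimal sketch in Python. It is not how rcorder is implemented; it
only shows a topological sort succeeding on an acyclic REQUIRE graph
and refusing a cyclic one. The service names (ppp_link, dns_ext, smtp)
and the start_order() helper are hypothetical, and the third edge in
the cyclic graph is an assumed way of modelling "the link comes up on
demand from the SMTP server".

```python
from graphlib import TopologicalSorter, CycleError


def start_order(requires):
    """Return a valid start order, or None if the graph contains a cycle.

    `requires` maps each service to the set of services it requires.
    """
    try:
        # static_order() yields every node after all of its requirements.
        return list(TopologicalSorter(requires).static_order())
    except CycleError:
        return None


# Acyclic: the link comes up first, then DNS, then SMTP.
acyclic = {
    "ppp_link": set(),
    "dns_ext": {"ppp_link"},
    "smtp": {"dns_ext"},
}

# Cyclic (assumed model): the link itself is only brought up by SMTP demand.
cyclic = {
    "ppp_link": {"smtp"},
    "dns_ext": {"ppp_link"},
    "smtp": {"dns_ext"},
}

print(start_order(acyclic))
print(start_order(cyclic))
```

With the cyclic input there is no order to return at all, which is
the situation no amount of fiddling with REQUIRE and BEFORE lines can
resolve.
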
When this happens, all services that depend on the external DNS
server must be reconfigured and restarted. This list includes the
external SMTP server.

Why?

The answer is that the external SMTP server has obtained its
canonical name for the external interface -- the name it uses in
the connection greeting message, and in HELO/EHLO sent to other
machines in the outbound email case -- from the external DNS server.

If you dial into a different IP address, then this external server's
name changes, since the reverse mappings are owned by the ISP that
is permitting the dialin.

In fact, any time you change the external DNS information at all,
you must pull a "Pol Pot" -- you must take the DNS server out into
the street, and shoot it in the back of the head so that it is
restarted with new information. But this must become a massacre:
you have to do the same to all services that depend on the DNS.

Now consider what "demand" is in the "dial on demand" case: it's
the need to send packets off the local network. The most common
case for this is the need to send a SYN to a remote SMTP server
from the interior SMTP server, in order to send email.

So... the SMTP server polling interval arrives, you do a local queue
run, and decide to make contact with an exterior SMTP server. The
SMTP server locally attempts a connect(), and the link comes up; you
shoot the external DNS server in the head. Then you shoot the program
that made the demand in the head because its idea of its canonical
name is now incorrect. The demand goes away.

What can we learn from this (other than "cached information is a
bad thing")? That we can't get rid of circular dependencies entirely,
and so we have to base the dependencies on classification.

Now there are ways around this and similar problems.

Most of them involve running your own NOC to service the devices in
the field, so that you can make certain simplifying assumptions; some
of them, you could fix if you were to modify the sockets interface to
do application based connect() attempt failure (e.g. EADMIN --
operation was administratively prohibited) on the credential of the
requesting party.

But the one thing that's obvious is that you need to be able to
discriminate dependencies with a granularity better than "dead" vs.
"alive".

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
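
As an illustration of classifying dependencies by strength rather
than treating them as "dead" vs. "alive", here is a minimal Python
sketch. Each edge carries one of the three kinds named in the message
("started", "available", "actively used"), and only the kinds chosen
by the caller constrain start order. The Dep enum, the EDGES table,
the service names, and the order_for() helper are all hypothetical,
as is the choice of kind assigned to each edge.

```python
from enum import Enum
from graphlib import TopologicalSorter, CycleError


class Dep(Enum):
    STARTED = "started"
    AVAILABLE = "available"
    ACTIVELY_USED = "actively used"


# (dependent, dependency, kind) -- an assumed model of the gateway.
EDGES = [
    ("dns_ext", "ppp_link", Dep.AVAILABLE),   # needs the external address
    ("smtp", "dns_ext", Dep.AVAILABLE),       # needs its canonical name
    ("ppp_link", "smtp", Dep.ACTIVELY_USED),  # link is raised by SMTP demand
]


def order_for(edges, kinds):
    """Topologically sort using only the edges whose kind is in `kinds`."""
    graph = {}
    for dependent, dependency, kind in edges:
        # Register both endpoints so filtered-out edges never drop a node.
        graph.setdefault(dependent, set())
        graph.setdefault(dependency, set())
        if kind in kinds:
            graph[dependent].add(dependency)
    return list(TopologicalSorter(graph).static_order())


# Ordering on the stronger kinds only: the cycle disappears.
print(order_for(EDGES, {Dep.STARTED, Dep.AVAILABLE}))

# Treating every edge alike brings the cycle back.
try:
    order_for(EDGES, set(Dep))
except CycleError:
    print("cycle: no valid start order")
```

The design choice here is only that the sort is parameterised by
dependency kind; which kinds should constrain startup, and which
should instead trigger a reconfigure or restart later, is left open.
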