Date:      Tue, 27 Dec 2011 13:26:51 +0400
From:      Eygene Ryabinkin <rea@freebsd.org>
To:        Doug Barton <dougb@FreeBSD.org>
Cc:        Pyun Yong-Hyeon <pyunyh@gmail.com>, Brooks Davis <brooks@freebsd.org>, freebsd-rc@FreeBSD.ORG, Garrett Cooper <yanegomi@gmail.com>, Gleb Smirnoff <glebius@FreeBSD.org>, Dag-Erling Smorgrav <des@des.no>, d@delphij.net, Xin LI <delphij@delphij.net>
Subject:   Re: Annoying ERROR: 'wlan0' is not a DHCP-enabled interface
Message-ID:  <LKVlrdfIBdRPFfTmZOlaU48u3P0@g5jH1yj%2BTnAiSdLOy3xs5Jutvhc>
In-Reply-To: <4EF971E4.4050905@FreeBSD.org> <4EF96D7D.3030701@FreeBSD.org>

Mon, Dec 26, 2011 at 11:21:08PM -0800, Doug Barton wrote:
> On 12/26/2011 22:39, Eygene Ryabinkin wrote:
> > This solution will also do the work, but I am slightly concerned
> > that it will
> >
> >  - call all netif machinery for interfaces with static IPs:
>
> The machinery is not that big/complex.

That is not an argument.  It would be one if this addition brought
substantial value, so that the extra load put on the system by the
added netif invocation were worth it.

> > it will be useless for already-configured interfaces;
>
> It also won't harm anything.

It just ruined the connectivity of the workstation I am sitting in front of.
I had just changed the devd rule to
{{{
notify 0 {
        match "system"          "IFNET";
        match "type"            "LINK_UP";
        media-type              "ethernet";
        action "/etc/rc.d/netif quietstart $subsystem";
        action "logger /etc/rc.d/netif quietstart $subsystem";
};
}}}

And I started to experience endless link flaps on my static
interface:
{{{
Dec 27 11:51:59 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:02 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:02 kernel: msk0: link state changed to UP
Dec 27 11:52:02 kernel: msk0: link state changed to DOWN
Dec 27 11:52:06 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:06 kernel: msk0: link state changed to UP
Dec 27 11:52:06 kernel: msk0: link state changed to DOWN
Dec 27 11:52:09 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:09 kernel: msk0: link state changed to UP
Dec 27 11:52:09 kernel: msk0: link state changed to DOWN
Dec 27 11:52:13 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:13 kernel: msk0: link state changed to UP
Dec 27 11:52:13 kernel: msk0: link state changed to DOWN
Dec 27 11:52:18 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:18 kernel: msk0: link state changed to UP
Dec 27 11:52:18 kernel: msk0: link state changed to DOWN
Dec 27 11:52:21 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:21 kernel: msk0: link state changed to UP
Dec 27 11:52:21 kernel: msk0: link state changed to DOWN
Dec 27 11:52:25 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:25 kernel: msk0: link state changed to UP
Dec 27 11:52:25 kernel: msk0: link state changed to DOWN
Dec 27 11:52:28 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:28 kernel: msk0: link state changed to UP
Dec 27 11:52:28 kernel: msk0: link state changed to DOWN
Dec 27 11:52:32 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:32 kernel: msk0: link state changed to UP
Dec 27 11:52:32 kernel: msk0: link state changed to DOWN
}}}
This is with devd running.

With devd disabled, running 'service netif quietstart msk0' by hand
showed me the reason: 'start' for msk(4) brings the interface down
and back up again (the logs are from two invocations of netif, to
assure myself that the problem is repeatable):
{{{
Dec 27 12:31:27 kernel: msk0: link state changed to DOWN
Dec 27 12:31:31 kernel: msk0: link state changed to UP
Dec 27 12:31:35 kernel: msk0: link state changed to DOWN
Dec 27 12:31:38 kernel: msk0: link state changed to UP
}}}

So, in my case, the link-up event triggers devd and 'netif start',
which in turn triggers DOWN/UP, so we have a while(1)-type loop.
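
To make the loop concrete, here is a toy sh model of it.  It touches
no real interface: the echo lines merely stand in for devd, netif and
the kernel, and MAX_EVENTS is a made-up cap that exists only so the
demonstration terminates (the real loop has no such cap).

```shell
#!/bin/sh
# Toy model of the devd/netif feedback loop; nothing here talks to
# real hardware.  MAX_EVENTS is a hypothetical cap for the demo only.
MAX_EVENTS=5

simulate_flaps() {
    ifn=$1
    events=0
    pending=LINK_UP                 # the first, genuine link-up event
    while [ "$pending" = LINK_UP ] && [ "$events" -lt "$MAX_EVENTS" ]; do
        events=$((events + 1))
        # devd reacts to LINK_UP by running the netif action
        echo "devd: LINK_UP on $ifn -> netif quietstart $ifn"
        # on a driver like msk(4), 'start' cycles the link...
        echo "kernel: $ifn: link state changed to DOWN"
        echo "kernel: $ifn: link state changed to UP"
        # ...and that UP is delivered to devd as a new LINK_UP event
        pending=LINK_UP
    done
    echo "stopped after $events rounds (cap reached, not convergence)"
}
```

Calling 'simulate_flaps msk0' prints the same devd/DOWN/UP pattern
once per round until the cap stops it.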

This isn't a "won't harm anything" type of change, is it?

> >  - in the case of vlan interfaces, ifconfig dance will be done twice
> >    for each of them: once from the netif for the parent interface and
> >    once for each vlan in turn.
>
> Are you certain that the devd.conf trigger will fire when a vlan is up'ed?

Doug, please, do everyone a favor: if you're unsure about something,
check it yourself first.  This will greatly reduce the number of
such questions.

It is all simple: add the variable 'vlans_<ifname>' to rc.conf and set
its value to some numbers, say '1 2'.  Then add 'ifconfig_<ifname>_1'
and 'ifconfig_<ifname>_2' saying, for example, 'up'.  Run 'service
netif start <ifname>', unplug and replug the cable and watch the logs
for UP/DOWN interface notifications.  Here are mine with 2 VLANs:
{{{
Dec 27 09:43:57 kernel: sk0: link state changed to DOWN
Dec 27 09:43:57 kernel: sk0.1: link state changed to DOWN
Dec 27 09:43:57 kernel: sk0.2: link state changed to DOWN
Dec 27 09:44:00 kernel: sk0: link state changed to UP
Dec 27 09:44:00 kernel: sk0.1: link state changed to UP
Dec 27 09:44:00 kernel: sk0.2: link state changed to UP
}}}
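
For concreteness, the rc.conf lines behind the sk0 logs above would
look roughly like this.  It is only a sketch that follows the recipe
given before the logs: the parent's own ifconfig_sk0 line is omitted,
and 'up' is just the example value.

```shell
# /etc/rc.conf fragment (sketch): two VLANs on sk0, tags 1 and 2.
# The parent's own ifconfig_sk0 line is deliberately left out here.
vlans_sk0="1 2"
ifconfig_sk0_1="up"
ifconfig_sk0_2="up"
```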


> > This will just do the work that is useless in all-static configuration.
>
> I'm not sure I agree that it's useless. I can actually see this as quite
> handy. Personally I try to be in the habit of adding the configuration
> to rc.conf first, and using netif to start the interface so that I know
> for sure what will happen when that host reboots.

No problem, do it by hand.  But we're talking about devd, which will
do this automatically on every link flap event.  That's useless
and, as was demonstrated by me and Garrett, harmful.

> > Worse, this solution will ruin host's connectivity in the following
> > scenario:
> >
> >  - one runs his remote server with all static configuration and strict,
> >    default-to-deny firewall configuration (call this person "Eygene
> >    Ryabinkin");
> >
> >  - his upstream provider tells him: listen, we're rearranging our IP
> >    space and you should change IP1 to IP2;
> >
> >  - administrator is busy changing the configuration of his host; his
> >    plan is to substitute IP1 to IP2 everywhere and to reboot his
> >    machine to cleanly acquire IP2 and continue operations;
> >
> >  - he already substituted IP1 -> IP2 in rc.conf and starts poking
> >    the firewall configuration, but here comes the link down event
> >    due to the $PROVIDER who reconfigures his $CISCO or whatever;
> >
> >  - the system ends up in an unusable state, because link up event
> >    will change interface's IP, but firewall isn't ready for this
> >    and isn't allowing connections to IP2, but allows them only for
> >    IP1 that is already gone from the interface due to devd and netif
> >    script.
>
> First, I think what you're describing is a pretty small edge case.

Doug, I am sorry, but that's childish: no matter how small the
probability is, this event _will_ happen.  And it will make the
administrator in question lose connectivity to his server:
that's not just "I will lose a message in the log".  He will scratch
his head, because it is a very unnatural thing in an all-static
configuration.  Once he finds out what happened, he won't be satisfied
with the way FreeBSD works in this area, I promise.  That's not
a feature, that's a bug.

Once again, you're trying to tell me: my solution is better, but yes,
it will fail horribly in a minority of cases, up to rendering the
remote system unusable in a very unnatural way that can't be predicted
from common sense and that requires the administrator to know deeply
how devd.conf is currently organized.  All I can answer is that such
a solution (if I am correct and it will fail in such a way) is a no go
at all, unless $SOMETHING is fixed to cure the problems.

> > People may tell me that
> >
> >  - Eygene Ryabinkin should run firewall configuration whose knowledge
> >    of IP for the interface is based on the automagic like ipfw's "me"
> >    verb;
> >
> >  - Eygene Ryabinkin should not work with the remote host without access
> >    to its physical console via remote KVM or alike;
>
> Second, these are both valid points. :)

The first one actually can't be implemented in the general case,
when the interface runs many IPs and I require different firewalling
rules for different IPs: ipfw's "me" catches _all_ IPs in the system,
and if there were some macro of the sort "addrs(<ifname>)", it would
catch all IPs of the given interface, at best.  pf's bare '<ifname>'
is resolved only at ruleset load time, so it's not dynamic at all, and
the parenthesized '(<ifname>)' form, which does follow address changes,
still covers all addresses of the interface.

> > I am aware of these fine points, however my meat is that static IP
> > configuration is the _static_ one (cool assertion, isn't it?).  But it
> > has at least one consequence: people view their static IP
> > configurations as a really static ones and tend to think that only their
> > direct actions will change them.  So, any non-atomic changes in
> > configuration won't be regarded as a problem: only direct actions that
> > will initiate the reconfiguration of the network interfaces must
> > change the stuff and changes in configuration files that aren't
> > supplemented with such actions must not change anything.
>
> I agree that this change will require user education.

It shouldn't: even if the solution were 100% harmless, it has no real
gain apart from fixing the DHCP issue in a way that differs from
mine.  And it isn't harmless, that's the problem.

> However 'ifconfig down' and 'ifconfig up' are actually direct
> actions.

What are you trying to say by this?  When a firewall is involved,
it is not just 'ifconfig down', 'ifconfig up'; it will require
at least 'service ipfw/pf/ipfilter restart' and 'service routing restart'.

> Users who don't want this can simply comment out the entry
> in rc.conf, or the entry in devd.conf.

Users who want netif in their devd.conf instead of dhclient can
change devd.conf themselves.  And, given that your change
 - makes some interfaces enter endless up/down flapping;
 - makes the default route disappear;
it should be written in bold in devd.conf: "don't stick netif
here, unless you want these 'side effects'".

> > Your way to fix the problem adds the possibility of the
> > linkdown/linkup event combo to alter the configuration that is in the
> > process of being changed.  That's unexpected and one can't be ready
> > for it in all situations (though remote console will save some brain
> > cells): it depends on the external factor one can't fully control.
>
> In very rare edge cases, yes.

Damn, a rather big part of my work consists of these edge cases, when
two "unlikely" factors come together and systems behave in an
"improbable" way.  Doug, the OS and its infrastructure must be reliable
and work with POLA in mind.

> > Linkup/linkdown events aren't that rare and generally they are not
> > viewed as something unusual that will ruin people's connectivity:
> > as long as L3 layer and above will stay alive, link flaps on L2
> > shouldn't change its operations apart from outages for the flap
> > duration.
>
> But it's the combination of "unexpected L2 flap" AND "being in the
> process of making an rc.conf change" that will trigger the problem
> you describe. And once again, if the user doesn't want the change to
> take effect immediately they can comment it out. If no config exists
> for the interface, nothing bad will happen.

Have you heard about the Fukushima power plant?

> > So, my motto here is "Static is static, leave it alone and don't
> > make it to depend on the dynamic events; DHCP is the dynamic
> > protocol by its nature, so it can depend on the dynamic events".
>
> While I don't agree that the problems you're describing are enough
> of a possibility to be concerned about, the other alternative that I
> considered is for devd.conf to call a wrapper script that first
> determines whether or not it's a DHCP interface, and then calls
> rc.d/dhclient if it is. However, there are a couple of downsides to
> that. First, it's more work. :)  But seriously, one advantage of
> using netif is that it will also work with interfaces that are
> dynamically configured with IPv6. If we were to move to a wrapper
> script idea I'd like to see it support that as well as IPv4 DHCP.

This wrapper script will either duplicate the internals of dhclient
(and of whatever machinery exists for IPv6) to determine whether
dynamic configuration applies to the interface, or it will blindly
call dhclient and the other scripts and check whether they
succeeded.
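
The two options can be sketched in sh as follows.  Both functions are
illustrations, not existing rc code.  The names is_dhcp_config and
try_dhclient and the DHCLIENT_RC override are made up, the keyword
list in the first function is my assumption of what marks an
interface as DHCP-configured, and the second assumes that the
dhclient script reports "not applicable" through its exit status.

```shell
#!/bin/sh
# Option 1: duplicate the decision logic inside the wrapper.
# Assumption: the ifconfig_<ifname> value marks a DHCP interface by
# containing the word DHCP, SYNCDHCP or NOSYNCDHCP (any letter case).
is_dhcp_config() {
    # $1 is left unquoted on purpose so the value splits into words
    for word in $1; do
        case $word in
        [Dd][Hh][Cc][Pp]|[Ss][Yy][Nn][Cc][Dd][Hh][Cc][Pp]|[Nn][Oo][Ss][Yy][Nn][Cc][Dd][Hh][Cc][Pp])
            return 0 ;;
        esac
    done
    return 1
}

# Option 2: call the script blindly and look only at its exit status.
# DHCLIENT_RC is a hypothetical override so the function can be tried
# without a real /etc/rc.d/dhclient; whether the real script's exit
# status carries this meaning is an assumption.
try_dhclient() {
    "${DHCLIENT_RC:-/etc/rc.d/dhclient}" quietstart "$1" >/dev/null 2>&1
}
```

The first function has to be kept in sync with whatever the rc
scripts accept as a DHCP marker, which is exactly the duplication in
question.  The second needs no such knowledge, but it depends on the
called script being quiet and reporting its result reliably.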

I would not be against the netif route, but it creates serious
problems with
 a) unnatural behaviour of the network stack in the "edge" cases;
 b) constant link flapping for at least msk(4) interfaces;
 c) default routes (essentially, all routes that go through
    the interface in question): they just disappear, at least
    for the msk(4), sk(4) and nfe(4) interfaces.  The demo
    will be provided in my reply to your message with ID
    4EF96D7D.3030701@FreeBSD.org.

On the other hand, documenting the "quiet" semantics and using it
for dhclient to silence the error will
 - fix the issue;
 - allow the wrapper script you're talking about to be written in a
   simple way without duplicating the machinery inside dhclient.
-- 
Eygene Ryabinkin                                        ,,,^..^,,,
[ Life's unfair - but root password helps!           | codelabs.ru ]
[ 82FE 06BC D497 C0DE 49EC  4FF0 16AF 9EAE 8152 ECFB | freebsd.org ]



