Date:      Sun, 21 Apr 2002 00:30:14 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Brett Glass <brett@lariat.org>
Cc:        chat@freebsd.org
Subject:   Re: How to control address used by INADDR_ANY?
Message-ID:  <3CC26A86.B702FF8@mindspring.com>
References:  <4.3.2.7.2.20020419144005.0358c610@nospam.lariat.org> <4.3.2.7.2.20020419152309.035a96d0@nospam.lariat.org> <4.3.2.7.2.20020420112056.021aaec0@nospam.lariat.org> <4.3.2.7.2.20020420205440.021f37b0@nospam.lariat.org>

Brett Glass wrote:
> At 08:35 PM 4/20/2002, Terry Lambert wrote:
> >No, you are talking about a program that operates as a proxy;
> 
> Only in the particular case of the caching proxy. I want to
> be able to run other programs there that are not proxies at
> all and have them communicate properly with the Net.

You are still talking about outbound connections from a
machine that's supposedly acting as a router.


> >How about this: if you can come up with an algorithm that will
> >"do what I want" for you for all cases, without crippling the
> >fast path, how about you tell us, and we can think about
> >implementing it?
> 
> OK, here goes. First, let's restate the problem. The stack is
> choosing a source address for an outbound socket opened with
> a source address of INADDR_ANY based on the routing table at
> the time the socket is opened. It's saying, "Let's use one of
> our addresses on the subnet to which we'd route outbound packets
> which are headed for the destination address."

No.  It's not.  The 10.x net is *NOT* "the same subnet".  The
thing you were complaining about is that the source address
being picked is on the 10.x net.

The address to send is based on *the interface*, not on the IP
address.  I can't really stress this enough (apparently).


> As I understand it, you can do this on a Cisco router by saying
> that the interface is up but "unnumbered." But this isn't a
> perfect solution either. (For example, in my case, when
> I want to send a packet to a device inside the ISP's intranet,
> I want my machine to know that it has an address on that
> subnet, respond to ARP "who-has" requests, be able to treat
> them as local, etc.) What I want is to specify that processes on
> my local machine not use source addresses from that subnet even
> if they specify INADDR_ANY. To do this, I want to be able
> to do one of three things:

This is contradictory:

o	when I want to send a packet to a device inside the ISP's
	intranet, I want my machine to know that it has an address
	on that subnet, respond to ARP "who-has" requests, be able
	to treat them as local, etc.

o	specify that processes on my local machine not use source
	addresses from that subnet even if they specify INADDR_ANY

Pick only one.

The suggestion Matt made is to not put the IP address for the
ISP subnet on the same interface as the routable IP address.
He is correct: you need to do it that way.


> 1) Tag an address assigned to an interface as being disqualified
> from the selection process. (It might be useful to turn this bit
> on by default when one assigns an unregistered address
> to an interface.) This would solve the problem automagically
> in the case of intranets that use unregistered addresses (such
> as many corporate WANs or the network of the ISP I'm dealing with).
> But it wouldn't interfere with NAT.

I already suggested that the 10.x address should be the alias,
and that the routable address should be the non-alias address.
Have you tried this yet?

When Matt suggested the use of a /32 address... all alias addresses
are considered to be whatever the canonical netmask is, regardless
of the fact that you are required to use a 255.255.255.255 netmask
(i.e. "a /32") on all aliases.

For it to work, then, you *must* use a different interface.


> 2) Specify a default source address for processes on the local
> machine that is independent of the routing table. (An "auto" option,
> indicated by an address of 0.0.0.0, would bring back the old
> algorithm.) This would actually speed up the opening of sockets,
> since no scan of the routing table would be required to pick a
> source address for the socket. And it wouldn't violate the
> semantics of INADDR_ANY.... The process opening the socket has
> indicated no preference, so why not let the administrator specify
> one?)

This may not work for Intranet connections.  It will certainly *not*
speed the routing, since you will *always* have to create a clone
route for any connection, period, and you've just added extra
initialization requirements.

This also breaks, in the face of "most specific match" routing
rules for non-default routed networks.

You *might* be able to specify the source IP for the default
route -- this is, in fact, the first thing I suggested, when
I suggested that the inpcb route entry should be hacked -- but
it will mean slightly higher overhead on connection establishment,
in an area where connections-per-second is one of the primary
performance metrics.


> 3) Do both of the above, since each might be useful in specific
> situations.

I don't think your #1 can *ever* work like you want it to.


> >So you are trying to make the FreeBSD box act like a Cisco router.
> 
> Not really. But why should there be things that a Cisco router
> can do that a FreeBSD router can't?

I told you: there are bugs.  Patches welcome.

Also, you are trying to turn two boxes into one box, without
separating the interface space, like you are supposed to, and
without modifying the applications to explicitly ignore one
space, like you are supposed to.


> >As I said before, FreeBSD only considers the destination address
> >when deciding on a route, and this is technically the wrong thing
> >to do for this particular weird setup you have.
> 
> It's really doing something different than deciding on a route....
> It's deciding what source address to drop into the socket. Again,
> since a socket is defined by the tuple {source address, source
> port, destination address, destination port} and the source address
> must be one to which the other machine can reply (it can't
> be INADDR_ANY, which is zero), the machine must pick something
> at the time the socket is opened and then stick with it,
> come what may. Even if the routing table changes so that a
> different address would have been picked at a later time.

There's an expectation that you will not be mixing routable
and non-routable addresses on the same interface.  That means
that any IP address seen on an interface that is a route will
be reachable from the subnet represented by any of the IP
addresses on the interface.

To put this in simple terms: One interface, one subnet.


> >Another thing that would likely work is to not try to run the
> >proxy services on the same machine that's acting as the router.
> 
> It's the best place to run them, especially when one is doing
> interception caching. The CPU load from the routing is relatively
> light, so there are plenty of resources available for the caching.
> Didn't the InterJet do this?

The InterJet modified its daemons to do explicit bindings to
addresses, rather than using INADDR_ANY.

If I recall correctly, I told you to do that already.  8-).
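
For illustration, a minimal sketch of an explicit binding on an
outbound connection, in Python rather than the C a daemon of this
era would use.  The helper name connect_from() is hypothetical, and
this is not the InterJet code, just the general bind-then-connect
pattern: bind() pins the source address (port 0 leaves the port to
the kernel), so the INADDR_ANY selection never comes into play.

```python
import socket


def connect_from(source_ip, dest, timeout=5.0):
    """Open a TCP connection to dest=(host, port) with a pinned source address.

    connect_from is a hypothetical helper name used only for this sketch.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        # Port 0 lets the kernel choose an ephemeral port; only the
        # local address is fixed by this bind().
        s.bind((source_ip, 0))
        s.connect(dest)
    except OSError:
        s.close()
        raise
    return s
```

The source_ip has to be an address actually configured on the
machine, or the bind() fails.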


> >Your information is insufficient.  A block diagram, with a dotted
> >line around the blocks you expect to jam into a single FreeBSD
> >box would be useful.
> 
>                         +-----------------+
>                         |  Router         |    Routable
> Rest of world (via  ----| +--------------+|--- Subnet 1
> Subnet 10 intranet)     | |Interception  ||
>                         | |  caching     ||
>                         | |--------------||--- Routable
>                         | | RBL zone xfer||    Subnet 2
>                         | |--------------||      etc.
>                         | | DNS, DHCP    ||
>                         | |--------------||
>                         | |   sshd       ||
>                         | |--------------||
>                         | | Utilities for||
>                         | | maintenance  ||
>                         | | (e.g. ftp,   ||
>                         | |CVSup, scp    ||
>                         | +--------------+|
>                         +-----------------+

Yeah; here's your problem: the interface space isn't treated as
physically or logically separate, like it needs to be.  Either
physically separate it by using different interfaces for different
subnets (e.g. tun0 vs. de0), or logically separate it by modifying
the daemons to not bind to INADDR_ANY.

I think you also misunderstood my question, so I'll restate it:

	"How does a client installation normally look, when they
	 install it, and there's no FreeBSD box involved?"


> In other words, really just some basic firewall functions and
> interception caching. The box might not actually be running
> DNS or DHCP in all cases, but it'd be nice to be able to, so
> I've added them to the diagram. Note that to do DNS zone
> transfers, CVSup, FTP, or scp as a client, the box will need to
> be able to get to the rest of the world from local processes.

The sockets2 working group has suggested explicit "from" address
functions for handling this case.


> >It really feels like you want something to work that we will all
> >say "you can't expect that to work!",
> 
> It's perfectly reasonable to expect it to work. The only curve
> I've thrown it that it has not been able to handle is that its
> upstream network is an intranet with unroutable addresses. Not
> unusual either within ISPs or within corporate WANs.

No.  You also put the routable and non-routable addresses on
the same interface.

While it's often ignored, let me say again, for the record: it
is technically not legal to run multiple subnets on the same
wire, and, in fact, switches like the old Extreme Networks
switch that supports only a single IP per port are going to spend
a significant fraction of your bandwidth replacing cached ARP
entries on the switch ports.


> >Arrrrrrrrgh!  Why don't you just post what I post back at
> >me, instead of saying the same thing, and spin-doctoring it?
> 
> You're annoyed that I agree with you? ;-)

No, that you are paraphrasing what I said in order to try to
refute what I said.


> >> Would it affect the "fastpath?" As I understand it, a socket's source
> >> address is defined when it's opened and stays that way thereafter.
> >> (Correct me if I'm wrong there, but isn't a socket uniquely defined during
> >> its lifetime by the tuple of {source address, source port, destination
> >> address, destination port}?) All that would need to be altered would be
> >> the *initial* decision about the source address used. Right?
> >
> >Not as such.  It's uniquely identified at the host by the IP/port
> >destination tuple, *NOT* by *both* the source and destination
> >tuples.
> 
> This can't be. When several clients connect to a server's well-known
> port at the server, the sessions must be distinguished by the source
> addresses and ports. If the source addresses are allowed to change,
> the server won't know which client (or session) is which!

The initial source address decision does not take both tuples into
account.  It is a routing decision based only on the destination
tuple.  Period.
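
One way to watch that decision from userland, as a rough sketch in
Python: connect() on an unbound UDP socket sends no packet, but it
does make the kernel fill in a local address for the given
destination, which getsockname() then reports.  The function name
kernel_source_for() and the port number are arbitrary choices for
the example.

```python
import socket


def kernel_source_for(dest_ip, port=9):
    """Return the source address the kernel picks for an unbound socket
    aimed at dest_ip.  Nothing is transmitted: connect() on a datagram
    socket only records the peer and selects the local address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect((dest_ip, port))
        return s.getsockname()[0]
    finally:
        s.close()
```

Calling it with a destination that goes out via the default route
should show the address being complained about in this thread; the
only input that changes the answer is the destination.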


> >In order to hack this properly, you would have to put two more
> >compares in the tcp_output and ip_output code,
> 
> An option that assigned a fixed source address couldn't help but be
> a net win, because it would avoid a traversal of a linked list of
> routing table entries. Likewise, disqualifying an address would mean
> having a flag or leaving that routing table entry off a linked list.

You can't clone something until you find it, and you can't find it
until you traverse.

You would still have to traverse for the destination, determine that
it was via the default route, and then traverse again until you found
a source that was valid.  So for N total and M on the interface to
the default route (N >> M), then instead of O(N), we're talking
O(N) + O(M).
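
A toy model of that two-pass cost, in Python.  This is not the
FreeBSD routing code: the flat route list, the function names, and
the "first address on the interface" rule are all simplifying
assumptions made for the sketch.  It only shows that the route
lookup over all N entries has to happen first, and that the scan of
the M interface addresses for an allowed source comes on top of it.

```python
import ipaddress


def lookup_route(routes, dest):
    """First pass: most-specific (longest-prefix) match over all N routes.

    routes is a list of (cidr_string, interface_name) pairs.
    """
    dest = ipaddress.ip_address(dest)
    best = None
    for cidr, ifname in routes:
        net = ipaddress.ip_network(cidr)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, ifname)
    return best


def pick_source(routes, if_addrs, excluded, dest):
    """Second pass: scan the M addresses on the chosen interface.

    Excluded addresses are skipped only when the destination is reached
    via the default route; local and intranet destinations still get
    the first address on the interface.
    """
    match = lookup_route(routes, dest)
    if match is None:
        return None
    net, ifname = match
    via_default = net.prefixlen == 0
    for addr in if_addrs.get(ifname, []):
        if via_default and addr in excluded:
            continue
        return addr
    return None
```

Note that the exclusion cannot be applied before the route lookup,
because whether it applies at all depends on which route matched.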

Personally, I don't want to take that hit, just because someone is
unwilling to call "bind" on their outbound connections, and that call
to "bind" is being required because they are trying to illegally put
two subnets on the same interface (one routable, the other not), and
they are not using the "tun" interface for the main application that
it was designed for.


> The latter would actually speed up the scan of the list because the
> list would be shorter!

No, it wouldn't.  It would slow it down, because you are adding a
match criterion.  See above.


> And one way to implement the fixed source address
> would be to aim the pointer to the linked list at a list with exactly
> one entry: one specifying the fixed address. This would introduce no
> more compares. So, you see, there's no need to add cycles. It's just
> a matter of determining whether adding a compare would be a net win. (It
> might be, if it shortened the code path most of the time.)

No.  This will not work.  You *must* still differentiate between
connections that are to a local subnet, vs. those which must be
forwarded via the default route.  There is no other choice.

The more I think about it, the more evil your ISP's configuration
becomes.  I understand the desire to conserve the published IP
address space.  It seems like using unnumbered interfaces would
be "a" way to do this, but not "the" way to do it.


> >or you would have
> >to hack up the "clone" route to have different precedence ordering
> >of the IP addresses associated with the interface.  The new SYN
> >cache code complicates this type of hack considerably.
> 
> I don't understand. Why would this be? The SYN cache code deals with
> sockets on which the machine is listening, not outbound sockets.

Because the decision has to be made at the time the SYN-ACK is
sent by the host receiving a connection request, so that it can
claim the proper source address, of course.

You are acting like sockets are bound to interfaces, instead of
IP addresses.  While this would be incredibly useful for a lot
of things (including not having to kick your daemons in the head,
should an IP address change out from under them), this is not
how it works, except for INADDR_ANY.
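
The INADDR_ANY case on the listening side can be seen with a short
Python sketch (the function name is made up for the example, and it
uses the loopback address only so that it runs anywhere).  The
listener itself stays on the wildcard address; the accepted
connection gets a concrete local address, namely the one the client
connected to.

```python
import socket


def accept_one_on_any(connect_to="127.0.0.1"):
    """Listen on INADDR_ANY, accept one connection made to connect_to,
    and return (listener local address, accepted socket local address)."""
    lst = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lst.settimeout(5.0)
    lst.bind(("0.0.0.0", 0))  # 0.0.0.0 is INADDR_ANY; port 0 = any free port
    lst.listen(1)
    port = lst.getsockname()[1]
    cli = socket.create_connection((connect_to, port), timeout=5.0)
    conn, _peer = lst.accept()
    try:
        return lst.getsockname()[0], conn.getsockname()[0]
    finally:
        conn.close()
        cli.close()
        lst.close()
```

On a multi-homed box, passing a different local address as
connect_to changes the second value and leaves the first alone.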


-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-chat" in the body of the message