Date:      Thu, 03 Jan 2008 21:12:12 +0600
From:      "Vadim Goncharov" <vadimnuclight@tpu.ru>
To:        "Julian Elischer" <julian@elischer.org>, "Ivo Vachkov" <ivo.vachkov@gmail.com>
Cc:        arch@freebsd.org, FreeBSD Net <freebsd-net@freebsd.org>, Robert Watson <rwatson@freebsd.org>, Qing Li <qingli@freebsd.org>
Subject:   Re: resend: multiple routing table roadmap (format fix)
Message-ID:  <opt4c0imk24fjv08@nuclight.avtf.net>
In-Reply-To: <477416CC.4090906@elischer.org>
References:  <4772F123.5030303@elischer.org> <f85d6aa70712261728h331eadb8p205d350dc7fb7f4c@mail.gmail.com> <477416CC.4090906@elischer.org>

28.12.07 @ 03:19 Julian Elischer wrote:

> By the way, I might add that in the 6.x compat. version I may end up
> limiting the feature to 8 tables. This is because I need to store some
> stuff in an efficient way in the mbuf, and in a compatible manner this  
> is easiest done by stealing the top 4 bits in the mbuf flags word
> and defining them as:
>
>   #define M_HAVEFIB	0x10000000
>   #define M_FIBMASK	0x07
>   #define M_FIBNUM	0xe0000000
>   #define M_FIBSHIFT	29
>   #define m_getfib(_m, _default) (((_m)->m_flags & M_HAVEFIB) ? \
>     (((_m)->m_flags >> M_FIBSHIFT) & M_FIBMASK) : (_default))
>   #define M_SETFIB(_m, _fib) do { \
>     (_m)->m_flags &= ~M_FIBNUM; \
>     (_m)->m_flags |= (M_HAVEFIB | (((_fib) & M_FIBMASK) << M_FIBSHIFT)); \
>   } while (0)
>
> This then becomes very easy to change to use a tag or
> whatever is needed in later versions, and the number can
> be expanded past 8 predefined FIBs at that time.

If you want it to be a tag, why spend bits in m_flags rather than just
doing it as a tag from the start? Or is the 6.x (possibly 7.x too)
implementation supposed to be thrown away completely in favor of the
right thing in 8.0?
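
For reference, the flag-bit layout quoted above can be tried out in
userland. This is only a minimal sketch: struct fake_mbuf and the
fake_getfib()/fake_setfib() helpers are stand-ins invented for
illustration, not the kernel mbuf or the proposed macros themselves.
Only the constants are taken from the quoted text.

```c
#include <stdint.h>

/* Userland stand-in for the kernel mbuf; only the flags word is modelled. */
struct fake_mbuf {
	uint32_t m_flags;
};

#define M_HAVEFIB	0x10000000u	/* a FIB number has been stored */
#define M_FIBMASK	0x07u		/* three bits, so FIBs 0..7 */
#define M_FIBNUM	0xe0000000u	/* top three bits hold the number */
#define M_FIBSHIFT	29

/* Return the stored FIB, or def when no FIB has been set on the packet. */
static inline unsigned
fake_getfib(const struct fake_mbuf *m, unsigned def)
{
	if ((m->m_flags & M_HAVEFIB) == 0)
		return (def);
	return ((m->m_flags >> M_FIBSHIFT) & M_FIBMASK);
}

/* Store fib (truncated to three bits) without disturbing the other flags. */
static inline void
fake_setfib(struct fake_mbuf *m, unsigned fib)
{
	m->m_flags &= ~M_FIBNUM;
	m->m_flags |= M_HAVEFIB | ((fib & M_FIBMASK) << M_FIBSHIFT);
}
```

Note that a FIB number above 7 is silently truncated by the mask, which
is where the limit of 8 tables comes from.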

>>> This brings us as to how the correct FIB is selected for an outgoing
>>> IPV4 packet.
>>>
>>> Packets fall into one of a number of classes.
>>> 1/ locally generated packets, coming from a socket/PCB.
>>>     Such packets select a FIB from a number associated with the
>>>     socket/PCB. This in turn is inherited from the process,
>>>     but can be changed by a socket option. The process in turn
>>>     inherits it on fork. I have written a utility called setfib
>>>     that acts a bit like nice:
>>>
>>>         setfib -n 3 ping target.example.com # will use fib 3 for ping.

Pretty cool!

>>> 2/ packets received on an interface for forwarding.
>>>     By default these packets would use table 0
>>>     (or possibly a number settable in a sysctl, not yet implemented),
>>>     but prior to routing the firewall can inspect them (see below).
>>>
>>> 3/ packets inspected by a packet classifier, which can arbitrarily
>>>     associate a FIB with each one on a packet-by-packet basis.
>>>     A FIB assigned to a packet by a packet classifier
>>>     (such as ipfw) would override a FIB associated by
>>>     a more default source (such as cases 1 or 2).
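
The precedence among the three cases above could be sketched roughly as
follows. This is an illustration under assumptions, not the actual
patch: struct pkt_ctx, its field names, FIB_DEFAULT and select_fib()
are all hypothetical.

```c
#include <stdbool.h>

#define FIB_DEFAULT	0	/* case 2: table used for forwarded packets */

/* Hypothetical per-packet view; field names are illustrative only. */
struct pkt_ctx {
	bool	from_socket;	/* case 1: locally generated packet */
	int	socket_fib;	/* inherited from the process or set by sockopt */
	bool	classified;	/* case 3: a classifier such as ipfw tagged it */
	int	classifier_fib;	/* FIB chosen by the classifier */
};

/* The classifier wins over the socket, which wins over the default. */
static int
select_fib(const struct pkt_ctx *p)
{
	if (p->classified)
		return (p->classifier_fib);
	if (p->from_socket)
		return (p->socket_fib);
	return (FIB_DEFAULT);
}
```

So a packet sent by a process started with "setfib -n 3" would use FIB 3
unless a firewall rule assigned it another table.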

Sounds good. I like the idea of making routing decisions in the
firewall, so as not to duplicate kernel code and userspace utilities the
way Linux's iproute2 does (which, however, still has a few parameters of
its own and relies on firewall marks for the rest). However, I think
there are some cases where it could be done outside the firewall. For
example, an ifconfig option could make a specific FIB the default for
all packets going out from this interface's address. But this raises
another related question: Linux allows selecting a specific source IP
based on a routing table entry, that is, on the destination address
(which brings pf reply-to/route-to to mind, huh). In relation to this I
can also recall multipath routing (different metrics?), addresses from
one subnet on different interfaces (mask wider than /32), and so on.
It is also interesting how multiple FIBs would interact with host-wide
events such as ICMP redirects (which table should be updated?), the
storing of TCP stack metrics (MTU, etc.) and the hostcache, and so on.
How will these and the points above be solved?

per ifconfig (>1 host per subnet)/icmp redirects/src to prefer,  
multipath/metrics, tcp stack parameters interaction, iproute2

>>> Routing messages would be associated with their
>>> process, and thus select one FIB or another.

This is not clear. How should the 'route' command work with different
FIBs if the admin intends them to be used for forwarding rather than
strictly per process? I think a setfib option for route is more
consistent than running route under the setfib command. Also, what about
routing sockets and routing daemons: should they work with only one
table?

>>> I have not yet added the changes to ipfw.

An action modifier, like 'ipfw add count setfib 3 ip from any to any'?
There were thoughts (I heard it as a hack before multiple FIBs) about
adding, say, a 'nexthop' ipfw action, which acts like fwd but does not
accept the packet, allowing it to continue through the firewall ruleset,
thus making it more comfortable to separate routing (imagine 'nexthop
tablearg') and filtering. There are questions about both fwd and the
proposed new option: will fwd still survive? Will it change the output
interface, as in a complete rerouting before the pfil(9) hooks are
called, so that *oif is changed and can be matched in the rules below?
pf route-to/reply-to is hanging around...

>>> pf has some similar changes already but they seem to rely on
>>> the various FIBs having symbolic names. Which I do not plan to support
>>> in the first version of these changes.

I think this is what the pf team should care about, not us, as it lives
in ../contrib. Though they could use something like a sysctl with a
symbolic-name-to-system-FIB-number translator or such.
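
To make the translator idea concrete, here is a minimal userland sketch
of a name-to-number lookup. Everything in it is hypothetical: the table
contents, struct fib_name and fib_by_name() are invented for
illustration, and how such a table would actually be exported (sysctl or
otherwise) is left open.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping entry: a symbolic name and its numeric FIB. */
struct fib_name {
	const char	*name;
	int		 fib;
};

/* Example table; the names are made up for this sketch. */
static const struct fib_name fib_names[] = {
	{ "default", 0 },
	{ "uplink2", 1 },
	{ "mgmt",    2 },
};

/* Return the FIB number for name, or -1 if the name is unknown. */
static int
fib_by_name(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(fib_names) / sizeof(fib_names[0]); i++)
		if (strcmp(fib_names[i].name, name) == 0)
			return (fib_names[i].fib);
	return (-1);
}
```

With something like this, pf could keep its symbolic names while the
rest of the system deals only in numbers.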

>>> Interaction with the ARP layer/ LL layer would need to be
>>> revisited as well. Qing Li has been working on this already.

Oh yes, L2 interaction is interesting. How should it work given the
planned separation of the routing and ARP tables?

-- 
WBR, Vadim Goncharov


