Date:      Wed, 30 Dec 2009 13:30:01 -0800
From:      Julian Elischer <julian@elischer.org>
To:        Ian Smith <smithi@nimnet.asn.au>
Cc:        Luigi Rizzo <rizzo@iet.unipi.it>, net@freebsd.org
Subject:   Re: RFC: documented and actual behaviour of "ipfw tee"
Message-ID:  <4B3BC659.7010707@elischer.org>
In-Reply-To: <20091230221119.L81420@sola.nimnet.asn.au>
References:  <20091230002447.GA55727@onelab2.iet.unipi.it> <4B3AA290.8000508@elischer.org> <20091230221119.L81420@sola.nimnet.asn.au>

Ian Smith wrote:
> On Tue, 29 Dec 2009, Julian Elischer wrote:
>  > Luigi Rizzo wrote:
>  > > There is a difference between the documented and actual behaviour of
>  > > "ipfw tee" which occurs when there are multiple rules with the same
>  > > number, e.g.
>  > > 
>  > > 	rule_id number  body
>  > > 	r1      500     tee port1 dst-ip 1.2.3.0/24
>  > > 	r2      500     tee port2 dst-ip 1.2.4.0/24
>  > > 	r3      500     accept ip from any to any
>  > > 	r4      510     count ip from any to any
>  > > 
>  > > + the manpage says "processing continues with the NEXT RULE"
>  > >   (so after r1 we have r2, then r3, ...);
>  > > + the implementation behaves as "processing continues with the
>  > >   NEXT NUMBERED RULE" (ie. after 500 continues with 510).
>  > > 
>  > 
>  > TEE should go to the next RULE with the original packet, but if
>  > you reinject the tee'd copy of the packet it should go to the
>  > next rule NUMBER.
> 
> Which is what happens now, right?  Same behaviour on tee reinjection as 
> divert does seem consistent.  So if there is a problem, it's only with 
> the original packet continuing with the next rule if same-numbered?

From Luigi's description I'm not sure what happens now. :-)

The two cases are different.
Processing with the original packet acts as if the rule had done nothing.
Processing with a reinjected packet acts the same as a reinjected 
divert packet, i.e. next rule NUMBER, not next rule.
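
As a rough sketch of those two cases, here is a toy model in Python using
Luigi's example ruleset. It is not the kernel code; the names Rule,
continue_original and continue_reinjected are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    rule_id: str  # kernel-internal identity (label taken from Luigi's table)
    number: int   # user-visible rule number
    body: str


RULES = [
    Rule("r1", 500, "tee port1 dst-ip 1.2.3.0/24"),
    Rule("r2", 500, "tee port2 dst-ip 1.2.4.0/24"),
    Rule("r3", 500, "accept ip from any to any"),
    Rule("r4", 510, "count ip from any to any"),
]


def continue_original(rules, matched_index):
    """Original packet: the position in the list is still known,
    so processing moves to the very next rule."""
    nxt = matched_index + 1
    return rules[nxt] if nxt < len(rules) else None


def continue_reinjected(rules, rule_number):
    """Reinjected packet: only the rule NUMBER is known, so processing
    resumes at the first rule with a strictly higher number."""
    for rule in rules:
        if rule.number > rule_number:
            return rule
    return None


# A packet matched by r1 (the first rule numbered 500):
orig = continue_original(RULES, 0)
reinj = continue_reinjected(RULES, RULES[0].number)
print(orig.rule_id, reinj.rule_id)  # r2 r4
```

The original packet goes on to r2, while the reinjected copy skips the
remaining 500s and lands on r4.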


> 
>  > > The actual behaviour is an artifact of how "divert" is implemented:
>  > > diverted packets only carry the rule number so we cannot tell, on a
>  > > reinject, which of the rules numbered "500" matched, and we restart
>  > > from the next one. Tee was implemented as an extension of divert.
> 
> It seems fair that tee act the same as divert on reinjection, and this 
> can't be changed without breaking existing divert socket code eg natd?

It's also the only way it can work, really. On reinjection it can't tell
the difference between rules with the same number.
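
To illustrate why, here is a small Python sketch (an illustration only,
with hypothetical helper names, not how ipfw stores its rules). Looking
up a rule by number alone returns every rule sharing that number, so the
only unambiguous place to resume is the first rule with a higher number.

```python
from collections import defaultdict

# (rule_id, number) pairs from Luigi's example; rule_id exists only
# inside the kernel and never reaches the divert socket.
RULESET = [("r1", 500), ("r2", 500), ("r3", 500), ("r4", 510)]


def candidates_by_number(ruleset):
    """Group rule ids by their user-visible number."""
    by_number = defaultdict(list)
    for rule_id, number in ruleset:
        by_number[number].append(rule_id)
    return dict(by_number)


def resume_point(ruleset, number):
    """First rule id whose number is strictly greater, or None."""
    return next((rid for rid, num in ruleset if num > number), None)


index = candidates_by_number(RULESET)
ambiguous = index[500]               # three rules share number 500
resume = resume_point(RULESET, 500)  # the only unambiguous restart
print(ambiguous, resume)
```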


> 
>  > > Skipping rules in my opinion is very unintuitive, but there is
>  > > no way to fix it (unless we extend the API) as the rule_id is only
>  > > known within the kernel.
>  > > 
>  > > For 'tee' packets, however, the situation is different because the
>  > > copy of the packet that remains in the kernel does not lose knowledge
>  > > of the matching rule so we can easily continue from the very next
>  > > rule, same as it happens for dummynet packets with one_pass=0 (and
>  > > tee'd netgraph packets, which I think already do "the right thing").
> 
> Hmm.  After divert you can match 'diverted' to distinguish reinjected 
> packets later.  Does/can/should this apply to reinjected tee'd packets?

Yes. There is no such thing as a reinjected tee packet, as the user app 
can't tell whether it was diverted or teed.

> 
> Similarly perhaps, with a set of same-numbered nat rules, are mapped 
> packets 'reinjected' at the next rule, or the next higher-numbered rule?

I think NAT processing in the kernel can keep track of where it is up 
to, so next RULE (different from userland NAT via divert).


> 
>  > > Since I am doing some work in this area of the code, I'd like to ask
>  > > opinions on how to proceed:
>  > > 
>  > >     A. preserve the current behaviour and fix the manpage;
> 
> I tend to this, though probably not knowing all the ramifications, 
> especially not having played with ng_ipfw stuff at all.
> 
> So for A, here's what we have, with suggested clarification in []:
> 
>      divert port
>              Divert packets that match this rule to the divert(4) socket bound
>              to port port.  The search terminates.  [Reinjected packets continue
>              at the next higher-numbered rule.]
> 
>      tee port
>              Send a copy of packets matching this rule to the divert(4) socket
>              bound to port port.  The search continues with the next rule.
>              [Reinjected packets continue at the next higher-numbered rule.]

yes

> 
>  > >     B. fix the code to behave as the manpage says;

> 
> Seems it's already correct regarding the original packet, and just needs 
> clarifying re the reinjected packets, if I'm following this right?


I think the man page should reflect the behaviour mentioned above,
i.e. copy and original packets continue at different rules.


> 
>  > >     C. introduce a sysctl to choose between A and B.
>  > > 	Of course this moves the problem on which default
>  > > 	to choose :)

no

>  > > 
>  > > Because it is a very special case that I doubt many people have hit,
>  > > I'd be inclined to do B and consider the old behaviour a bug.

No, the original behaviour was not accidental.
They were never going to come to the same rule unless the next rule is 
on a different number.



> 
> Mike Makonnen's ipfw-classifyd can reinject packets at specified rule 
> numbers by tcp/udp port classification by updating the tag/number, and 
> has the same issue.  There was some confusion there too regarding this,
> that I think a man clarification may have helped avoid.
> 
> I'm also a bit confused by apparent overloading of one_pass function for 
> dummynet pipe, netgraph, ng_tee and now nat too.  What if you want to 
> do kernel nat but wanted one_pass behaviour for pipes?  Separate issue 
> but similar distinction between divert vs in-kernel behaviour maybe?

Yes, that has sort of worried me too, but I haven't hit it in practice 
(yet).

> 
> FWIW, Ian



