Date:      Mon, 1 Aug 2016 08:16:14 -0300
From:      "Dr. Rolf Jansen" <>
Cc:        Julian Elischer <>
Subject:   Re: ipfw divert filter for IPv4 geo-blocking
Message-ID:  <>
In-Reply-To: <>
References:  <> <> <> <> <> <> <> <> <> <> <> <> <> <> <> <>

> On 01.08.2016 at 03:17, Julian Elischer <> wrote:
> On 30/07/2016 10:17 PM, Dr. Rolf Jansen wrote:
>> I finished the work on CIDR conformity of the IP ranges tables
>> generated by the tool geoip. The main constraint is that the start and
>> end address of an IP block given by the delegation files MUST BE
>> PRESERVED during the transformation to a set of CIDR records. This
>> target is achieved by:
>>  1. Finding the largest common netmask boundary of the start address
>>     int(log2(addr_count)); then iteration like Euclid's algorithm in
>>     a GCD.
>>  2. Output the CIDR with the given start address and the masklen
>>     to the found netmask.
>>  3. If the CIDR does not match the whole original IP range then set
>>     the start address of the next CIDR block to the next boundary of
>>     the common
>>     and loop over starting at 1. until the original range has been
> check out the appletalk code I pointed out to you.. I wrote that in
> 93 or so but I remember sweating blood over it to get it right.

I read the description of the code, and the following sentence made me
doubt that aa_dorangeroute() can guarantee the main constraint mentioned
above, namely that the "start and end address of an IP block given by
the delegation files MUST BE PRESERVED". According to the description,
the start and end addresses are anything but fixed (they may even be
undefined).

   Split the range into two subranges such that the middle
   of the two ranges is the point where the highest bit of difference
   between the two addresses makes its transition.

I do not want this.
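
For comparison, here is a minimal Python sketch of a range-to-CIDR
decomposition that never moves the start or end address of the range.
It is not the code of the geoip tool. The function name range_to_cidrs
and the example addresses are assumptions made for illustration. At
every step, the block size is limited both by the alignment of the
current start address and by int(log2()) of the remaining address count.

```python
import ipaddress


def range_to_cidrs(first: str, last: str) -> list[str]:
    """Split the inclusive IPv4 range first..last into CIDR blocks.

    The first block starts exactly at `first` and the last block ends
    exactly at `last`; no address outside the range is ever covered.
    """
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    cidrs = []
    while start <= end:
        count = end - start + 1
        # Largest power of two not exceeding the remaining address count,
        # expressed as a number of host bits: int(log2(count)).
        size_bits = count.bit_length() - 1
        # Number of trailing zero bits of the start address, i.e. the
        # largest block that may begin at this address (32 for 0.0.0.0).
        align_bits = (start & -start).bit_length() - 1 if start else 32
        host_bits = min(size_bits, align_bits)
        cidrs.append(f"{ipaddress.IPv4Address(start)}/{32 - host_bits}")
        # Continue at the first address behind the block just emitted.
        start += 1 << host_bits
    return cidrs


if __name__ == "__main__":
    # A range of 768 addresses, which is not a single CIDR block.
    print(range_to_cidrs("192.0.2.0", "192.0.4.255"))
    # prints ['192.0.2.0/23', '192.0.4.0/24']
```

The union of the emitted blocks is exactly the original range, so the
result does not depend on any 'most specific rule wins' evaluation.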

>> I carefully tested the algorithm, and a table that I pipe from the new
>> geoip tool into ipfw is 100 % identical to the output of the ipfw
>> command 'table N list'.
> though that doesn't mean it is semantically identical to the original
> table due to 'most specific rule wins' behaviour.
> for example:
> if you type in ;
> -> A
> and
> -> B
> then both rules will be listed the same as what you put in
> but if you wanted to get all rules that point to A, without having
> rules that point to B, then you would have to export
>  -> A
> -> A
>  (i.e. TWO rules)

This is definitely not the use case. The data to be passed to ipfw
tables originates from the RIR delegation statistics files and is
guaranteed to be consolidated, i.e. overlaps are resolved and adjacent
ranges are joined, long before any tables for ipfw are generated. Each
range entry has a well defined, i.e. fixed, starting address, and
anything that changes the starting address of a range renders the table
useless. Every entry also has a well defined range length, which must
not be changed either, or the table would be useless as well.
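
To illustrate what "consolidated" means here, the following Python
sketch resolves overlaps and joins adjacencies in a list of ranges. It
is only an illustration under assumptions, not the code of the geoip
tool: the function name consolidate is hypothetical, the ranges are
inclusive (start, end) pairs of integer addresses, and all ranges are
assumed to belong to the same set of countries.

```python
def consolidate(ranges: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Merge overlapping or directly adjacent inclusive (start, end) ranges."""
    merged: list[tuple[int, int]] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # The range overlaps the previous one or begins right behind
            # it: extend the previous range instead of adding a new one.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    # (10, 19) and (20, 29) are adjacent, (25, 40) overlaps (20, 29).
    print(consolidate([(20, 29), (10, 19), (25, 40), (50, 60)]))
    # prints [(10, 40), (50, 60)]
```

After this step, no two ranges overlap or touch, so every start address
and every range length is unambiguous.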

In addition, we are talking about the automatic generation of thousands
of entries, and I would never rely on something like 'most specific rule
wins' behaviour. I want the behaviour to be as explicit as possible, and
for this reason I am happy with 'INPUT is 100 % identical to the

> you could also export
> -> A
> -> 0  (think of it as an "EXCEPT for these" rule)
> which is ALSO two rules but you would need to be sure that the
> receiver knows what to do with them.

In this context, the example is simply ridiculous. It sounds like you
are suggesting to fuzz the input data in order to push ipfw to its
limits. This makes life less boring, doesn't it? No, thanks.

>> It is worth noting that the original RIR delegation files already
>> contain 457 non-CIDR-conforming IPv4 ranges in a total of 165815
>> original records. I guess that this number will increase in the future,
>> because the RIRs have run out of new IPv4 ranges and are urged to
>> subdivide returned old ranges for new delegations. The above algorithm
>> is ready for this.
>> Generally, CIDR conforming tables are more than twice as large as
>> optimized (joined adjacencies) IP range tables. All said changes have
>> been pushed to GitHub already.
> Unfortunately there is no way to specify (using cidr notation) a.b.1.x
> AND a.b.2.x without including a.b.[03].x.
> if you specified the FULL table you could use the "except" feature of
> routing table behaviour where
> a.b.0.x/22  -> A
> a.b.0.x/24  -> B
> a.b.3.x/24  -> B
> gives you the same thing because of the 'most specific rule wins'
> nature of routing table evaluation.
> I believe this is the case in the tables you imported.
> the trick is to be able to take an "optimised" table such as that
> above and produce, given a required subset, just the required part,
> while changing the rules as needed on the fly to "de-optimise" them
> enough to maintain correctness.

Again, this is not the usage case.
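
For readers following the quoted example, the 'most specific rule wins'
evaluation can be sketched in Python as shown here. This is a plain
linear scan for illustration only and makes no claim about how ipfw or
the routing code implement their lookups. The function name lookup and
the 10.0.x.x addresses, standing in for a.b.x.x, are assumptions.

```python
import ipaddress


def lookup(table: dict[str, str], address: str):
    """Return the value of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    best = None  # (prefix length, value) of the best match so far
    for cidr, value in table.items():
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, value)
    return best[1] if best else None


if __name__ == "__main__":
    table = {
        "10.0.0.0/22": "A",  # covers 10.0.0.x up to 10.0.3.x
        "10.0.0.0/24": "B",  # exception for 10.0.0.x
        "10.0.3.0/24": "B",  # exception for 10.0.3.x
    }
    for ip in ("10.0.0.7", "10.0.1.7", "10.0.2.7", "10.0.3.7"):
        print(ip, "->", lookup(table, ip))
```

Only 10.0.1.x and 10.0.2.x resolve to A in this table. A table of
consolidated, non-overlapping ranges never needs this tie-breaking,
because at most one entry can match any address.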

>> I am still a little bit amazed that ipfw accepts incorrect CIDR
>> ranges and arbitrarily moves the start/end addresses in order to achieve
>> CIDR conformity, without any further notice, even though ipfw can be
>> considered quite relevant to system security. Or may I assume that
>> ipfw always knows better than the user what should be allowed or
>> denied? Otherwise, perhaps I am the only one who ever input incorrect
>> CIDR ranges for processing by ipfw.
> I answered this before but can't see the answer in my out box, plus I
> have added info..
> The ipfw code is derived from the routing code.  it is shorthand
> notation for a.b.c.d [netmask e.f.g.h ]
> there is nothing that says that a.b.c.d need be the first address in
> the range. (though some vendors may require that.)
> to quote wikipedia on the topic (yes, I know, not an authoritative
> ==== quote ====
> The address may denote a single, distinct interface address or the
> beginning address of an entire network. The maximum size of the network
> is given by the number of addresses that are possible with the
> remaining, least-significant bits below the prefix. The aggregation of
> these bits is often called the host identifier.
> For example:
>   • represents the IPv4 address and its associated routing prefix,
>     or equivalently, its subnet mask, which has 24 leading 1-bits.
> I use this all the time when parsing information that contains a
> hostname, and I know the netmask width. It saves me from having to have
> complicated shell code to pull apart the address and zero out the host
> bits of the address.

I got it. Anyway, this is no longer an issue for the new geoip table
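
The effect discussed above, i.e. that the host bits of an address given
as a.b.c.d/masklen are simply zeroed out, can be reproduced with a few
lines of Python. This is a sketch of the behaviour as described in this
thread, not code taken from ipfw. The function name normalize and the
example addresses are assumptions.

```python
import ipaddress


def normalize(cidr: str) -> str:
    """Zero out the host bits of an IPv4 address given as a.b.c.d/masklen."""
    address, masklen = cidr.split("/")
    bits = int(masklen)
    # Netmask with `bits` leading 1-bits, e.g. 0xFFFFFF00 for /24.
    mask = (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF
    network = int(ipaddress.IPv4Address(address)) & mask
    return f"{ipaddress.IPv4Address(network)}/{bits}"


if __name__ == "__main__":
    print(normalize("192.0.2.77/24"))  # prints 192.0.2.0/24
    # The standard library does the same when strict checking is off.
    print(ipaddress.ip_network("192.0.2.77/24", strict=False))
```

An entry whose start address is not on the netmask boundary is thus
silently moved to the boundary, which is why a table generator has to
emit only CIDR conforming entries if the start addresses must be
preserved.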

Best regards

