Date:      Thu, 28 Jul 2016 01:51:24 +1000 (EST)
From:      Ian Smith <smithi@nimnet.asn.au>
To:        Julian Elischer <julian@freebsd.org>
Cc:        "Dr. Rolf Jansen" <rj@obsigna.com>, Mike Makonnen <mtm@freebsd.org>, freebsd-ipfw@freebsd.org
Subject:   Re: ipfw divert filter for IPv4 geo-blocking
Message-ID:  <20160728004622.T29054@sola.nimnet.asn.au>
In-Reply-To: <4d76a492-17ae-cbff-f92f-5bbbb1339aad@freebsd.org>
References:  <61DFB3E2-6E34-4EEA-8AC6-70094CEACA72@cyclaero.com> <CAHu1Y739PvFqqEKE74BjzgLa7NNG6Kh55NPnU5MaA-8HsrjkFw@mail.gmail.com> <4D047727-F7D0-4BEE-BD42-2501F44C9550@obsigna.com> <c2cd797d-66db-8673-af4e-552dfa916a76@freebsd.org> <9641D08A-0501-4AA2-9DF6-D5AFE6CB2975@obsigna.com> <4d76a492-17ae-cbff-f92f-5bbbb1339aad@freebsd.org>

On Wed, 27 Jul 2016 10:03:01 +0800, Julian Elischer wrote:
 > On 27/07/2016 3:06 AM, Dr. Rolf Jansen wrote:
 > > > On 26.07.2016 at 13:23, Julian Elischer <julian@freebsd.org> wrote:
 > > > On 26/07/2016 1:41 AM, Dr. Rolf Jansen wrote:
 > > > > Once a week, the IP ranges are compiled from original sources into a
 > > > > binary sorted table, containing as of today 83162 consolidated range/cc
 > > > > pairs. On starting-up, the divert daemon reads the binary file in one
 > > > > block and stores the ranges into a totally balanced binary search tree.
 > > > > Looking-up a country code for a given IPv4 address in the BST takes on
 > > > > average 20 nanoseconds on an AWS-EC2 micro instance. I don't know the
 > > > > overhead of diverting, though. I guess this may be one or two orders of
 > > > > magnitude higher. Even so, I don't see any performance issues.
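
[Editorial sketch] The lookup described above can be illustrated with a sorted array and a binary search. This is not the daemon's code: it swaps the balanced binary search tree for Python's `bisect` over a sorted list, which has the same logarithmic lookup cost. The sample ranges and the name `lookup_cc` are made up for illustration and are not taken from the real database.

```python
import bisect
import ipaddress


def _ip(text):
    """Convert dotted-quad text to an integer for range comparisons."""
    return int(ipaddress.IPv4Address(text))


# Hypothetical sample data: (first address, last address, country code),
# sorted by first address and non-overlapping.
RANGES = [
    (_ip("1.0.0.0"), _ip("1.0.0.255"), "AU"),
    (_ip("2.16.0.0"), _ip("2.16.255.255"), "DE"),
    (_ip("200.0.0.0"), _ip("200.0.255.255"), "BR"),
]
STARTS = [first for first, _, _ in RANGES]


def lookup_cc(addr):
    """Return the country code whose range contains addr, or None."""
    ip = _ip(addr)
    # Index of the last range whose first address is <= ip.
    i = bisect.bisect_right(STARTS, ip) - 1
    if i >= 0 and ip <= RANGES[i][1]:
        return RANGES[i][2]
    return None
```

With roughly 83000 ranges, such a search needs about 17 comparisons per lookup.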

 > > > Yes, the diversion to user space is not a fast operation. When we wrote
 > > > it, fast was 10Mbits/sec.
 > > > The firewall tables use a radix tree (*) and might be slower than what
 > > > you have, but that might be made up for by not having to do the
 > > > divert logic. It's not entirely clear from your description why you look
 > > > up a country rather than just a pass/block result, but maybe different
 > > > sources can access different countries?

 > > The basic idea was to develop a facility for ipfw for filtering IPv4
 > > packets by country code - see: https://github.com/cyclaero/ipdb
 > > 
 > > I simply put into /etc/rc.conf:
 > > 
 > >     geod_enable="YES"
 > >     geod_flags="-a DE:BR:US"
 > > 
 > > The -a flag means that only source IP addresses from these countries are
 > > allowed (i.e. passed through the filter). I also added a -d flag, which
 > > means deny (i.e. drop packets) from the given list of countries.
 > > 
 > > With that in place, I need to add a respective divert rule to the ipfw
 > > ruleset (the divert port of the geod daemon is 8669, remembering that 8668
 > > is the port of the natd daemon):
 > > 
 > >      ipfw -q add 70 divert 8669 tcp from any to any 80,443 in recv WAN_if setup
 > > 
 > > > I did similar once using ipfw tables but couldn't find a reliable source
 > > > of data.

 > > The IP/CC database is compiled from downloads of the daily published
 > > delegation statistics files of the 5 RIRs. I consider the RIRs to be the
 > > authoritative source. Anyway, on my systems the IP/CC database is updated
 > > only weekly, although daily updating would be possible. I wrote a shell
 > > script for this that can be executed by a cron job.
 > > 
 > >      https://github.com/cyclaero/ipdb/blob/master/ipdb-update.sh
 > > 
 > > There is another tool called geoip, which I uploaded to GitHub and
 > > use for looking up country codes by IP address on the command line.
 > > 
 > >      https://github.com/cyclaero/ipdb/blob/master/geoip.c
 > > 
 > > This one could easily be extended to produce sorted IP ranges per CC that
 > > could be fed into tables of ipfw. I am thinking of adding a command line
 > > option for specifying CC's for which the IP ranges should be exported,
 > > something like:
 > > 
 > >     geoip -e DE:BR:US:IT:FR:ES
 > > 
 > > And this could print sorted IP-Ranges belonging to the listed countries.
 > > For this purpose, what would be the ideal format for directly feeding the
 > > produced output into ipfw tables?

 > The format for using tables directly is the same as that used for routing
 > tables.
 > so imagine that you had to generate a routing table that sent packets to two
 > different routers depending on their source.
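
[Editorial sketch] Since the tables take prefix entries as a routing table does, an exported first/last address range has to be split into CIDR blocks first. A minimal sketch, assuming the export yields first and last addresses per range; the function name `range_to_table_lines` and the default table number are hypothetical, while `ipaddress.summarize_address_range` is the standard-library call doing the split. The output lines follow the `table N add prefix value` form used in the rule set quoted in this message.

```python
import ipaddress


def range_to_table_lines(first, last, value, table=5):
    """Split an inclusive address range into CIDR blocks and format
    one 'table N add prefix value' line per block."""
    nets = ipaddress.summarize_address_range(
        ipaddress.IPv4Address(first), ipaddress.IPv4Address(last)
    )
    return [f"table {table} add {net} {value}" for net in nets]


# A range that is not aligned on a prefix boundary needs several entries.
for line in range_to_table_lines("10.0.0.1", "10.0.0.4", 1000):
    print(line)
```

Ranges that are already aligned, such as 1.1.2.0 through 1.1.3.255, collapse into a single entry.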
 > 
 > here's a simple rule set that filters web traffic by such a 'routing table'
 > except it's routing to two different rules. It also sorts OUTGOING web
 > traffic to the same rules.
 > 
 > ipfw -q /dev/stdin <<-DONE
 > # we hate this guy
 > table 5 add 1.1.1.0/32 1000

Yeah, I've had trouble with him too ..

 > # but allow our people to visit everyone else in that subnet
 > table 5 add 1.1.0.0/24 2000
 > # we block 1.1.2.0 through 1.1.3.255
 > table 5 add 1.1.2.0/23 1000
 > # but we allow 1.1.4.0 through to 1.1.7.255
 > table 5 add 1.1.4.0/22 2000
 > # etc
 > table 5 add 1.1.8.0/21 1000
 > table 5 add 1.2.0.0/16 1000

 > table 5 add 0.0.0.0/0 2000 # default

Now this was news to me.  It's obvious, especially knowing it uses the 
same radix table method as does routing, but never occurred to me.  Ta!
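
[Editorial sketch] The reason the 0.0.0.0/0 entry works as a default is longest-prefix matching: the most specific entry containing the address wins, and the zero-length prefix matches only when nothing else does. The sketch below reproduces that semantics with a plain linear scan over the entries quoted above; it is not how the kernel's radix tree is implemented, and the name `table_lookup` is made up.

```python
import ipaddress

# The entries from the quoted example, as prefix -> tablearg value.
TABLE = {
    "1.1.1.0/32": 1000,
    "1.1.0.0/24": 2000,
    "1.1.2.0/23": 1000,
    "1.1.4.0/22": 2000,
    "1.1.8.0/21": 1000,
    "1.2.0.0/16": 1000,
    "0.0.0.0/0": 2000,  # default
}


def table_lookup(addr, table=TABLE):
    """Return the value of the longest prefix containing addr, or None."""
    ip = ipaddress.IPv4Address(addr)
    best = None  # (prefix length, value) of the best match so far
    for cidr, value in table.items():
        net = ipaddress.IPv4Network(cidr)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, value)
    return None if best is None else best[1]
```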

 > check-state          # If we already decided what to do,  do it
 > # select out only external traffic, into direction specific rules.
 > add 400 skipto 500 ip from any to any in recv WAN_if
 > add 410 skipto 700 ip from any to any out xmit WAN_if
 > add 320 skipto 10000		# a 420 moment?
 > # Incoming packets
 > # sort tcp setup packets between rules 1000 and 2000
 > add 500 skipto tablearg tcp from table(5) to any 80,443 setup keep-state
 > add 600 skipto 10000
 > # outgoing packets
 > # sort tcp setup packets between rules 1000 and 2000
 > add 700 skipto tablearg tcp from any to table(5) 80,443 setup keep-state
 > add 800 skipto 10000
 > add 1000 drop ip from any to any
 > add 2000 allow ip from any to any
 > # further processing
 > add 10000 .. # further processing for non tcp
 > DONE
 > 
 > for full configurability you could have a rule for each country, and a number
 > for it in the table:
 > table 5 add 150.101.0.0/16 10610     # Australia
 > [...]
 > add 10610 deny tcp from any to any 443  # only allow non-encrypted web to those Aussie scum.

Indeed :)

 > add 10611 allow ip from any to any
 > 
 > then by changing the rules at that location you could change the policy for a
 > country without changing everything else.
 > (The downside is that dynamic skiptos are not very efficient, as they do a
 > linear search of the rules, whereas static skiptos cache the location of the
 > rule to skip to. It's not a terrible cost, but it needs to be kept in mind;
 > it is still faster than a divert socket.)

I forget .. is that linear search from the beginning, or from the 
position of the rule querying the table?  Just thinking about grouping 
skipto target rules to minimise traversal.  These targets in turn could 
use static skiptos that will be cached.

 > your application becomes an application for configuring the firewall.
 > (which you do by feeding commands down a pipe to ipfw, which is started as
 > 'ipfw -q /dev/stdin')
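
[Editorial sketch] The "application for configuring the firewall" idea could look roughly like this: build the table commands as text, then hand them to ipfw on standard input. The helper names `build_commands` and `apply_commands` are hypothetical. Only the text-building part can be tried anywhere; `apply_commands` assumes a FreeBSD host with ipfw loaded and root privileges, and is defined here but not called.

```python
import subprocess


def build_commands(entries, table=5):
    """Format (prefix, value) pairs as ipfw table commands, one per line."""
    return "".join(
        f"table {table} add {cidr} {value}\n" for cidr, value in entries
    )


def apply_commands(text):
    """Feed the command text to ipfw via stdin (needs root on FreeBSD)."""
    subprocess.run(
        ["ipfw", "-q", "/dev/stdin"], input=text, text=True, check=True
    )


print(build_commands([("1.1.2.0/23", 1000), ("0.0.0.0/0", 2000)]), end="")
```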

I went looking through ports for ipfw-classifyd, which attracted my 
interest in 2008, but it seems never to have made it to ports.  Written by 
Mike Makonnen <mtm@FreeBSD.Org> (cc'd), it uses divert sockets with the 
Linux-based 'l7' filters for detecting traffic from a wide array of UDP 
and TCP protocols, with the primary intent then of detecting various P2P 
traffic and shunting it through dummynet pipes for bandwidth limiting.

It used a technique of modifying the (passed) ipfw rule number so that 
when returning packets to ipfw, depending on classification, it would 
(re)start at specific rule numbers which were the values assigned to 
particular protocols of interest .. which is kind of analogous to using 
tablearg values as skipto targets.  This might be doubly slow, both from 
the diversion process and linear ruleset scan on return, but I thought 
it was a neat solution for such classification, needing to run real-time in 
userland and so not amenable to table construction (as this one is ..)

 > > > > Independent from the actual usage case (geo-blocking), let's talk about
 > > > > divert filtering in general. The original question which is still
 > > > > unanswered can be generalized to, whether "dropping/denying" a packet
 > > > > simply means 'forget about it' or whether the divert filter is required
 > > > > to do something more involved, e.g. communicate the situation somehow
 > > > > to ipfw.
 > > > there is no residual information about the packet in the kernel once it
 > > > has been passed to the  user process.
 > > > so just "forgetting to hand it back" is sufficient to drop it.

Checking up on that was the reason I looked at the ipfw-classifyd code 
again; some irrelevant packets there were just dropped, indeed.
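
[Editorial sketch] The "forgetting to hand it back" behaviour can be shown with the core of a divert filter loop: read a packet, decide, and write it back only when it should pass. This is a sketch, not the geod daemon's code. It assumes `sock` is an already opened and bound divert socket (anything with `recvfrom` and `sendto` works for trying it out), and the names `source_address`, `filter_once` and the `allowed` callback are hypothetical.

```python
import ipaddress


def source_address(packet):
    """Extract the source address from a raw IPv4 packet.
    In the IPv4 header the source address occupies bytes 12 to 15."""
    return ipaddress.IPv4Address(packet[12:16])


def filter_once(sock, allowed):
    """Handle one diverted packet. Returns True if it was reinjected.

    allowed is a callable taking an IPv4Address and returning a bool.
    """
    packet, addr = sock.recvfrom(65535)
    if len(packet) >= 20 and allowed(source_address(packet)):
        # Pass: write the packet back with the address it arrived with.
        sock.sendto(packet, addr)
        return True
    # Drop: do nothing. The kernel keeps no state for the diverted packet,
    # so not writing it back is all that is needed.
    return False
```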

 > > OK, many thanks, that just answers my original doubt. At least technically,
 > > my daemon handles packet dropping correctly, although more elegant ways
 > > can be imagined to do the same thing.
 > > 
 > > Best regards
 > > 
 > > Rolf

Interesting discussion, and thanks for info on geoip tables etc.

cheers, Ian


