Date:      Thu, 10 Aug 2006 03:27:34 -0700
From:      "Nikolas Britton" <nikolas.britton@gmail.com>
To:        "Marc G. Fournier" <scrappy@freebsd.org>
Cc:        Paul Schmehl <pauls@utdallas.edu>, freebsd-questions@freebsd.org
Subject:   Re: BSDstats Project v2.0 ...
Message-ID:  <ef10de9a0608100327r5b402d64xc4eef38a4f61ba4e@mail.gmail.com>
In-Reply-To: <ef10de9a0608091700x6cc268ear6566c26f93f1fdf0@mail.gmail.com>
References:  <20060807003815.C7522@ganymede.hub.org> <44D8EC98.8020801@utdallas.edu> <20060808201359.S7522@ganymede.hub.org> <44D91F02.90107@mawer.org> <20060808212719.L7522@ganymede.hub.org> <20060809072313.GA19441@sysadm.stc> <20060809055245.J7522@ganymede.hub.org> <44D9F9C4.4050406@utdallas.edu> <20060809130354.U7522@ganymede.hub.org> <ef10de9a0608091700x6cc268ear6566c26f93f1fdf0@mail.gmail.com>

On 8/9/06, Nikolas Britton <nikolas.britton@gmail.com> wrote:
> On 8/9/06, Marc G. Fournier <scrappy@freebsd.org> wrote:
> > On Wed, 9 Aug 2006, Paul Schmehl wrote:
> >
> > > Marc G. Fournier wrote:
> > >> On Wed, 9 Aug 2006, Igor Robul wrote:
> > >>
> > >>> On Tue, Aug 08, 2006 at 09:30:42PM -0300, Marc G. Fournier wrote:
> > >>>> Could create problems long term .. one thing I will be using the
> > >>>> IPs to do is:
> > >>>>
> > >>>> SELECT ip, count(1) FROM systems GROUP BY ip ORDER BY count DESC;
> > >>>>
> > >>>> to look for any 'abnormalities' like todays with Armenia ...
> > >>>>
> > >>>> hashing it would make stuff like that fairly difficult ...
> > >>> You can make _two_ hashes and then concatenate to form unique key.
> > >>> Then you still be able to see "a lot of single IPs". Personaly, I dont
> > >>> care very much about IP/hostname disclosure :-)
> > >>
> > >> Except that you are disclosing that each and every time you send out an
> > >> email, or hit a web site ... :)
> > >>
> > > The systems I'm concerned about are on private IP space, to not send email
> > > and don't have X installed, much less a web browser and can only access
> > > certain FreeBSD sites to update ports.  In fact, they're not even accessible
> > > from *inside* our network except from certain hosts.  In order to
> > > successfully run the stats script on these hosts, I would have to open a hole
> > > in the firewall to bsdstats.hub.org on the correct port.
> > >
> > > And yes, I *am* paranoid.  But if you really want *all* statistics you can
> > > get, then you'll have to deal with us paranoid types.  My workstation, which
> > > is on a public IP, is already registered.
> >
> > Done ... now I really hope that the US stats rise, maybe?  I have a hard
> > time believing that Russia and the Ukraine have more deployments then the
> > 'good ol'US of A' ... or do they? *raised eyebrow*
> >
> > Here is what is now stored in the database (using my IP as a basis)
> >
> > # select * from systems where ip = md5('24.224.179.167');
> >    id  |                ip                |             hostname             | operating_system |  release   | architecture | country |        report_date
> > ------+----------------------------------+----------------------------------+------------------+------------+--------------+---------+---------------------------
> >   1295 | 45c80b9266a5a6683eee9c9798bd6575 | 4a9110019f2ca076407ed838bf190017 | FreeBSD          | 6.1-RC1    | i386         | CA      | 2006-08-09 02:34:05.12579
> >      1 | 45c80b9266a5a6683eee9c9798bd6575 | 9a45e58ab9535d89f0a7d2092b816364 | FreeBSD          | 6.1-STABLE | i386         | CA      | 2006-08-09 16:01:03.34788
> >
>
> Why don't you just broadcast the ip address, it's what your doing now
> anyways. 253^4 is a very small number.
>
> infomatic# perl
> my $num = 0;
> system "date";
> while ($num <= 409715208) {
> $num++
> }
> system "date";
> Wed Aug  9 18:18:45 CDT 2006
> Wed Aug  9 18:20:48 CDT 2006
>
> 2 minutes * 10 = 20 minutes to iterate though 4 billion IP addresses
> on a very slow uni-proc system. I could even store every IP to md5
> hash using less then 222GB of uncompressed space.
>
> If you want... give me the md5 hash of a real ip address that is
> unknown to me and I will hand you the ip address in two days... or
> less. run the IP address though like this:
>
> md5 -s "xxx.xxx.xxx.xxx"
>
> I have other things to do with my time, so I don't really want to do
> this, but if that's what it takes to stop this idea dead I'll do it.
>
>

Here's a better way to explain the problem:

Let's say we need to find Marc's IP address but we only have its MD5
hash value. Some of you may think this is hard to do, but it's not. All
we need to do is compute the hash of every IP address and then match
Marc's hash against the ones in our list:

24.224.179.164 = e7e7a967c5f88d9fb10a1f22cd2133d2
24.224.179.165 = 3aa9b50aa7190f5aca1f78f075dc69c2
24.224.179.166 = c695175e48d649e3496ac715406a488d
24.224.179.167 = 45c80b9266a5a6683eee9c9798bd6575
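
The lookup above can be sketched in a few lines of Python. This is only an illustration, assuming the hash was produced from the plain dotted-quad string (as `md5 -s` does); the names md5_hex and find_ip are made up for this example.

```python
import hashlib
import ipaddress


def md5_hex(text: str) -> str:
    """Return the hex MD5 digest of a string, like `md5 -s "text"`."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


def find_ip(target_hash: str, network: str):
    """Hash every address in `network`; return the one matching target_hash.

    Returns None when no address in the network produces that hash.
    """
    for addr in ipaddress.ip_network(network):
        candidate = str(addr)
        if md5_hex(candidate) == target_hash:
            return candidate
    return None


if __name__ == "__main__":
    # Hash value taken from the list above; the /24 is an assumed search range.
    print(find_ip("45c80b9266a5a6683eee9c9798bd6575", "24.224.179.0/24"))
```

Searching a single /24 is instant; the full address space only takes longer, it is not a different kind of problem.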

So what is an IP address? Mathematically speaking, it's four base-256
numbers (octets) grouped together:

{0, ..., 255}.{0, ..., 255}.{0, ..., 255}.{0, ..., 255}

To calculate how many combinations there could be, you simply take the
base and raise it to the 4th power, since there are 4 octets.
This gives us 256^4 combinations, or 4,294,967,296 IPv4 addresses. As a
rough estimate, we can say the first number can't be 0 or 255 and the
others can't be 255; we can also rule out all 127.x.y.z loopback and
224.x.y.z - 239.x.y.z multicast addresses:

(237^1) * (254^3)

This leaves us with 3,883,734,168 valid IP addresses. We can divide
this number by 5,000 and run it through a simple Perl script to get a
time estimate of how long it will take to compute all these hashes. We
will split it into 4 parallel jobs:

# Job 1 of 4: hash the counter value as a stand-in for an IP address.
my $number = 0;
while ($number <= 194187) {
    system "md5 -s $number >> /usr/data/hashlist1";
    $number++;
}

# Job 2 of 4 (run as a separate script)
my $number = 194188;
while ($number <= 388373) {
    system "md5 -s $number >> /usr/data/hashlist2";
    $number++;
}

# Job 3 of 4 (run as a separate script)
my $number = 388374;
while ($number <= 582560) {
    system "md5 -s $number >> /usr/data/hashlist3";
    $number++;
}

# Job 4 of 4 (run as a separate script)
my $number = 582561;
while ($number <= 776747) {
    system "md5 -s $number >> /usr/data/hashlist4";
    $number++;
}
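
The job boundaries come from dividing the estimated address count by 5,000 and cutting the result into four pieces. A quick check of that arithmetic in Python, as a sketch only: split_range is a hypothetical helper, and its boundaries can differ by one from the hand-picked ones in the scripts.

```python
# Address count estimate from the message: 237 * 254^3.
total = 237 * 254**3           # 3,883,734,168
sample = round(total / 5000)   # 1/5000th of the estimate, used as the upper bound


def split_range(last: int, jobs: int):
    """Split 0..last (inclusive) into `jobs` contiguous inclusive ranges."""
    size, extra = divmod(last + 1, jobs)
    ranges, start = [], 0
    for i in range(jobs):
        # The first `extra` jobs take one additional number each.
        end = start + size + (1 if i < extra else 0) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    print(f"{total:,} addresses, sample upper bound {sample:,}")
    for first, last in split_range(sample, 4):
        print(first, last)
```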

OK, it took me 48 minutes to go through 1/5000th of the numbers using one
dual-core Xeon system. Considering that my algorithm is very
inefficient (it starts a new md5 process for every single hash) and that
this task is perfect for cluster computing, we can easily beat this
time estimate...
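
To put numbers on that, here is a rough Python sketch that extrapolates the measured 48 minutes to the whole estimate and, for comparison, times in-process hashing instead of one md5 process per value. The function name estimate_seconds and the sample size are assumptions for illustration, and the measured rate depends entirely on the machine it runs on.

```python
import hashlib
import time

TOTAL_ADDRESSES = 3_883_734_168  # address count estimate from the message

# Straight extrapolation of the measurement: 48 minutes per 1/5000th.
minutes_total = 48 * 5000                 # 240,000 minutes
days_total = minutes_total / (60 * 24)    # roughly 167 days on that one machine


def estimate_seconds(sample_size: int = 200_000,
                     total: int = TOTAL_ADDRESSES) -> float:
    """Time `sample_size` in-process MD5 hashes and scale up to `total`."""
    start = time.perf_counter()
    for n in range(sample_size):
        hashlib.md5(str(n).encode("ascii")).hexdigest()
    elapsed = time.perf_counter() - start
    return elapsed * total / sample_size


if __name__ == "__main__":
    print(f"fork-per-hash extrapolation: {days_total:.1f} days")
    print(f"in-process estimate: {estimate_seconds() / 3600:.2f} hours")
```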

If a cracker got hold of your database, he could crack all the hashes
in a short amount of time and then have the IP addresses, detailed
system version, and full hardware information for exploitation. Very,
very bad. For a hash to hide its input, the input needs to come from a
space too large to enumerate, ideally larger than the hash space
itself, for example:

asglkhasdlgkjhasldkjhadlkjfhadlgkjhsadlkgjhsadlaskjdhgqalsdjkh
in MD5 is e498d452efdfbfda87e522ff3af3b638. To crack that you have to
tackle either the hash itself or the hash input... both are extremely
large numbers and impossible to brute force using today's hardware:

16^32 is the md5 hash, hexadecimal is base 16:
16^32 = 340,282,366,920,938,463,463,374,607,431,768,211,456

26^62 is the input for the hash, {a, ..., z} is 26 letters and the
string above is 62 characters long:
26^62 = roughly 5.35 x 10^87
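
For comparison, the three search spaces can be computed exactly in Python. A small sketch, assuming a 26-letter lowercase alphabet and the 62-character example string; the variable names are made up for this illustration.

```python
import string

# Number of possible MD5 outputs: 32 hex digits, i.e. 128 bits.
md5_space = 16**32

# Number of possible 62-character strings over the letters a..z.
input_space = len(string.ascii_lowercase) ** 62

# Number of possible IPv4 addresses: four octets.
ip_space = 256**4

if __name__ == "__main__":
    print(f"IPv4 addresses : {ip_space:,}")
    print(f"MD5 outputs    : {md5_space:,}")
    print(f"62-letter input: {input_space:,}")
    print(f"MD5 outputs per IPv4 address: {md5_space // ip_space:,}")
```

The IPv4 space is the only one of the three small enough to enumerate, which is why hashing an IP address hides nothing.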


-- 
BSD Podcasts @:
http://bsdtalk.blogspot.com/
http://freebsdforall.blogspot.com/


