From owner-freebsd-security Sat Sep 21 14:25:56 1996
Return-Path: owner-security
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id OAA26796 for security-outgoing; Sat, 21 Sep 1996 14:25:56 -0700 (PDT)
Received: from ns.frihet.com (root@frihet.bayarea.net [205.219.92.1]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id OAA26241 for ; Sat, 21 Sep 1996 14:24:31 -0700 (PDT)
Received: from ns.frihet.com (tweten@localhost [127.0.0.1]) by ns.frihet.com (8.7.6/8.6.12) with ESMTP id OAA08641; Sat, 21 Sep 1996 14:23:04 -0700 (PDT)
Message-Id: <199609212123.OAA08641@ns.frihet.com>
X-Mailer: exmh version 1.6.7 5/3/96
Reply-To: "David E. Tweten"
To: newton@communica.com.au (Mark Newton)
cc: imp@village.org, spfarrel@midway.uchicago.edu, security@freebsd.org
Subject: Re: comments on the SYN attack
Date: Sat, 21 Sep 1996 14:23:04 -0700
From: "David E. Tweten"
Sender: owner-security@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

newton@communica.com.au said:

>"Putting down someone else's good idea" was not what I was doing,
>sunshine.

Well, I am a native Californian, so this may be appropriate. It's at least a lot better than "Tweety Bird," the awful nickname I had to endure through grammar school. But let us move onward to more serious things.

Mark quotes me:

>>Newly arrived packets are likely to be flood packets.

and then continues:

>When you look at the other end of the queue, however, you will find
>that the deterministic approach will ensure that it's highly likely
>that the oldest packets are illegitimate, and for sufficiently large
>queue sizes, the chances of a packet at the end of the queue being
>illegitimate are quite small.

Your assumption here is at the heart of your disagreement with Warner, and of late, with me.
If Warner will indulge me the privilege, I'll say that neither of us is particularly concerned about the case you cite above, where "the queue" is long enough, or the phony SYNs are arriving slowly enough, that there is any difference in concentration of legitimate SYNs between the ends of "the queue." Our lack of concern is based upon several reasons:

1. If "the queue" is that long, there's no flood. By any definition I understand, a "flood" denial of service attack is one that generates a high enough arrival rate of requests for service that some legitimate requests cannot be served. That's a high enough rate that the deterministic algorithm will flush any SYN before it has had time to collect an ACK.

2. Even in the high-flow but no-flood case, when there is enough time in queue under the deterministic algorithm for a legitimate SYN's ACK to arrive, the stochastic algorithm also works well enough to be acceptable. If you re-examine my table of numbers, you'll see that at the deterministic algorithm's limit, only two tries will get the stochastic algorithm a better than 50% chance of a successful connection. If you don't like those odds, two more tries will close the gap to certainty by half again, and so on. Lower flows require fewer tries. TCP does retry failed connections; the stochastic algorithm harnesses that fact to cover for it where it is weak.

3. Since no way has been found yet, even probabilistically, to distinguish between a legitimate SYN and a flood attack SYN, I want all SYNs to have a decent chance of surviving long enough to collect an ACK. That's right, phony SYNs included. I don't care if some flood attack SYN sits in my queue for a month, so long as I stand a decent chance of completing legitimate connections in spite of its presence.

So what's the upshot? At high-flow-but-no-flood, both stochastic and deterministic methods work.
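For anyone who wants to poke at my numbers rather than take them on faith, the toy model behind them can be sketched in a few lines of Python. This is a sketch under my simplifying assumptions only, not anything resembling a real stack: a full "queue" of n entries, f flood SYNs arriving during one SYN/ACK-to-ACK delay, the stochastic algorithm evicting a uniformly chosen entry on each arrival, and the deterministic algorithm evicting the oldest entry.

```python
import math

def p_deterministic(n, f):
    """Oldest-first drop: a legitimate SYN survives the round trip
    only if fewer than n SYNs arrive behind it before its ACK."""
    return 1.0 if f < n else 0.0

def p_stochastic(n, f):
    """Random drop: each of the f flood arrivals evicts a uniformly
    chosen entry from the full queue, so any fixed entry survives
    each arrival with probability (n - 1) / n."""
    return ((n - 1) / n) ** f

def retries_for_even_odds(p):
    """Smallest k with 1 - (1 - p)^k >= 0.5: the number of TCP tries
    needed for a better-than-coin-flip chance of connecting."""
    return math.ceil(math.log(0.5) / math.log(1.0 - p))

n = 100
for f in (99, 100, 200, 400, 800):
    p = p_stochastic(n, f)
    print(f, p_deterministic(n, f), round(p, 4), retries_for_even_odds(p))
```

Run it and you'll see the shape of the argument: at n = 100 the deterministic drop dies exactly at f = 100, while the stochastic drop still connects within two tries at even odds there; by f = 800 the retry count needed for even odds has climbed into the thousands, which is the soft failure discussed below.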
The deterministic method works first try, and the stochastic method may require a retry from TCP (think big -- maybe two or three). For moderate flood conditions, the deterministic algorithm fails utterly. The stochastic method continues to work. For the particular queue size I chose for running the numbers, a "moderate flood" condition seemed to cover two to four times the fake SYN arrival rate that would cause the deterministic method to fail.

Sadly, it seems that even the stochastic method isn't as resistant to flood as one would like. I would have liked to see it withstand ten times the arrival rate that will drop the deterministic method with the same "queue" size. Under the conditions I chose for running the numbers, the required retry count had begun to skyrocket by the time the flood got eight times as strong as was required to kill the deterministic algorithm.

There is still some hope. The stochastic algorithm's flood resistance compared to the deterministic algorithm's is dependent upon the size of the "queue." To see that, compare the formulas for "p" and note that a "queue" size of 1 makes both the deterministic and stochastic algorithms completely non-resistant to flood. Maybe a larger but still reasonable "queue" could make it ten times as resistant to flood as the deterministic algorithm. I'll leave that exploration to someone else.

>You've calculated p, and defined it as "the probability of a
>legitimate SYN lasting long enough to be paired with an ACK." But
>according to my reading of your derivation, what you've *actually*
>calculated is the probability of ANY single SYN in the queue lasting
>long enough to be paired with an ACK (assuming the time interval for
>the pairing of a legitimate SYN with its ACK is constant), without
>factoring in any allowance for whether that SYN is legitimate.

Exactly. My crystal ball is a little clouded when it comes to distinguishing between flood SYNs and legitimate SYNs. I'm therefore an equal opportunity SYN storer.
I'm quite literally blind to their differences. If you think about it for a moment, I believe you'll conclude that under flood conditions, so are you.

>This is an important distinction, because in a flooding situation the
>vast majority of the packets in the queue will be illegitimate, and
>the random approach makes no effort to swing the proportions of good
>SYNs to bad in any positive direction (indeed, by definition that
>proportion will remain constant if the random number generator
>delivers a flat probability distribution).

I think I can make an argument that the proportion of "good SYNs" to "bad" will be even smaller than you say under the stochastic method, and that's good. Since we have a flood condition, the deterministic method cannot match any "good SYN" with its just-arrived "ACK," so connection completion is not available to it as a method of removing "good SYNs" from the queue. Therefore, the density of good SYNs in the queue is identical to the arrival density. Under the stochastic algorithm, the occasional good SYN lasts, by chance, long enough to be matched with its ACK and removed. That provides a mechanism other than random shooting for the stochastic algorithm to remove a SYN from the queue. Since this second method is highly biased toward removing good SYNs, the concentration of good SYNs in the queue under flood conditions should actually be smaller for the stochastic algorithm than for the deterministic algorithm.

>You've defined f as the number of flood SYNs which arrive between a
>SYN and its ACK. That would suggest that the probability that a
>given packet is legitimate is close to 1/f, or 1% for f=100

With apologies to analog engineers, 1/f has nothing to do with the probability that a given SYN is legitimate. Going back to my definition of f, 1/f would be the inter-arrival time of flood SYNs, expressed in time units of the SYN/ACK-to-ACK delay for legitimate SYNs. Not a very useful number for your argument.
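Incidentally, the concentration claim above is easy to check by simulation. The sketch below uses my toy assumptions again, plus some parameters I've made up purely for illustration (a 5% good-arrival rate, exaggerated so the effect shows up quickly; a queue of 100; an ACK returning 200 arrivals later; bad SYNs never completing). Under those assumptions the random-drop queue holds a smaller fraction of good SYNs than the oldest-first queue does, precisely because completions drain the good ones out early.

```python
import random

def avg_good_fraction(n=100, f=200, g=0.05, arrivals=60_000, random_drop=True):
    """Time-averaged fraction of queue slots holding good SYNs.
    Each arrival is good with probability g; a good SYN's ACK comes
    back f arrivals later and completes it; bad SYNs never complete."""
    rng = random.Random(1)
    queue = []  # (arrival_time, is_good), oldest first
    frac_sum = 0.0
    for t in range(arrivals):
        # Complete (remove) any good SYN whose ACK has now arrived.
        queue = [(t0, ok) for (t0, ok) in queue if not (ok and t - t0 >= f)]
        if len(queue) >= n:
            # Make room: evict a random victim, or the oldest entry.
            queue.pop(rng.randrange(len(queue)) if random_drop else 0)
        queue.append((t, rng.random() < g))
        frac_sum += sum(ok for (_, ok) in queue) / len(queue)
    return frac_sum / arrivals

rnd = avg_good_fraction(random_drop=True)    # completions drain good SYNs
fifo = avg_good_fraction(random_drop=False)  # good SYNs evicted before ACK
print(round(rnd, 4), round(fifo, 4))
```

With oldest-first drop and f > n, every good SYN is evicted before its ACK can arrive, so its residence time matches a bad SYN's and the queue fraction tracks the arrival fraction; random drop lets some good SYNs complete and leave early, pulling the fraction down.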
>I say "close to" because you should have defined f as "the number of SYNs
>which arrive between a SYN and its ACK," rather than the number of flood
>SYNs.

You are due this point. I should have listed as another of my simplifying assumptions that under flood conditions, flood SYNs so outnumber legitimate SYNs that the sum of the number of good SYNs and the number of flood SYNs in any sample is approximately equal to the number of flood SYNs in the sample.

>With the definition duly corrected, the probability that a given
>packet in the input stream is legit is precisely 1/f, but even that
>doesn't accurately reflect the same probability for the domain
>representing the SYN queue in the kernel, because as SYNs are ACKed
>they are removed from the queue; this reduces the probability that
>any given packet in the queue is legitimate, but only marginally.
>Thus I conclude that that probability is *approximately* 1/f.

As I pointed out a couple of paragraphs ago, 1/f has nothing to do with your argument.

>What this says is that for n=100, the probability that any individual
>SYN will still be in the queue is 0.37 at f=100.

Right so far.

>In flooding conditions the number of legitimate SYNs will be
>vanishingly small compared to illegitimate SYNs -- Or 1% for f=100.

As I pointed out above, you cannot derive your 1% from f. For the sake of argument, I'll accept 1% as a good engineering approximation of "vanishingly small."

>Under those conditions, about 1% of that p=0.37 value would represent
>the probability that the packet that's hung around in the queue for
>long enough to be ACKed is legitimate, and hence the probability that
>a connection is successful is similarly vanishingly small.

Wrong. You are saying that the event of interest is composed of two statistically independent events.
First is the event that an arriving SYN is legitimate (to which you assign a probability of 0.01), and second is the event that an arriving SYN survives for the assumed-to-be-fixed time it would take for an ACK to arrive, were it legitimate. Instead, the event of interest is the one you listed earlier: "that any individual SYN will still be in the queue" after a SYN/ACK-to-ACK delay. The person trying to make a legitimate connection has performed a deterministic act. He has put a legitimate SYN into the queue. He did not submit a legitimate SYN with 1% probability and an attacking SYN with 99% probability. We care about the probability that it will still be there when the ACK arrives. We don't care what the probability was that it was legitimate. It's 1.0, and that's boring.

>Your odds don't sound that great anymore.

37% per retry still sounds fine to me. Even your incorrect 0.37% per retry is better than the 0% probability of success for the deterministic algorithm. I'll grant that 0.37%, while better than 0%, is not good enough. Fortunately, it's also wrong.

>I'd go as far to say that at f=100 and n=100 *both* algorithms fall
>in a screaming heap, so perhaps n needs to be increased, yes?

I wouldn't go that far, and no, not for the stochastic algorithm. I'd say that at f=100 and n=100, the deterministic algorithm "falls in a screaming heap" and the stochastic algorithm begins to fail soft. By taking advantage of TCP retries, the stochastic algorithm lasts until f gets up to 800 or so, as we continue to hold n at 100.

>And if you increase n, you provide a larger range of f for which the
>deterministic value of p is 1, yes?

Yes. You also provide a larger range of f for which the stochastic algorithm survives after the deterministic algorithm has failed. That range might even get larger faster than n does. See above.

>And either way, n doesn't need to be increased by anywhere near the
>values to which it'd need to be increased under the present
>implementation.
Agreed.

>While the deterministic approach says p=1.0, the non-deterministic
>approach gives a (63% x 1/f) chance of connection failure on the
>first try. Yuk, eh?

Leaving aside your 1/f misconception, there's no yuk at all. The 63% probability of failure on the first try, under conditions in which the deterministic algorithm is just barely surviving, turns into 40% after the second try, 25% after the third, 10% after the fifth, etc. Since rcmd(3) (as a randomly chosen example :-) tries five times, 10% is a more interesting probability of failure to consider than is 63%. Furthermore, if you increase f by one, the deterministic algorithm simply fails, irrespective of the number of retries, while the stochastic algorithm hangs in at a failure rate around 10% after five tries.

>But that's ok: I understand that you americans have become
>comfortable with the concept of friendly fire :-)

Sometimes the most embarrassing friendly fire comes from your own weapon, while aimed at your own foot. Do get well soon. :-)

>you've provided me with a statistical analysis of the algorithms
>which says exactly what I told Warner when we started this
>discussion, namely that the deterministic approach will work
>*perfectly* until the rate of incoming SYNs is sufficient to cause
>the queue to overflow in the time between the reception of a SYN and
>the reception of its ACK, that the non-deterministic approach
>contains a built-in chance of destroying an otherwise perfectly good
>SYN ...

Right so far.

>... and that the two approaches are more or less identical (allowing
>for small statistical variations).

Bang! Too bad about that foot! The part you keep missing is that built-in retries can compensate for the stochastic algorithm's failures, and that nothing can save the deterministic algorithm when f equals or exceeds n.

>I have to wonder why you bothered.

I bothered in a so-far unsuccessful attempt to make you understand the previous paragraph.
Yes, I have better things to do, but sometimes I just can't resist a juicy windmill.

>The random method will permit you to increase f by a few percentage
>points over the ceiling imposed by the deterministic method and still
>get the odd connection through.

In the case for which I ran the numbers, those "few percentage points" were an additional 300 percentage points, after which 50% of rcmd() connections would still get through. Quite a few percentage points, I'd say.

Initially quoting me, Mark continued:

>>Another good way to avoid embarassment is to check the spelling of
>>people's names before sending e-mail.

>Oh, a spelling flame.
>How trite.

Actually, I was castigating myself for having misspelled Warner's name as "Werner." While I was at it, I should have checked the spelling of "embarrassment" too. Sigh.

And, ever so slightly out of order:

>The world has enough thin-skinned brittle people already ...

Indeed.

-- 
David E. Tweten          | 2047-bit PGP Key fingerprint: | tweten@frihet.com
12141 Atrium Drive       | E9 59 E7 5C 6B 88 B8 90       | tweten@and.com
Saratoga, CA 95070-3162  | 65 30 2A A4 A0 BC 49 AE       | (408) 446-4131
Those who make good products sell products; those who don't, sell solutions.