From owner-freebsd-security  Tue Jan 25 10:52:23 2000
Delivered-To: freebsd-security@freebsd.org
Received: from lariat.lariat.org (lariat.lariat.org [206.100.185.2])
	by hub.freebsd.org (Postfix) with ESMTP id BD5CD15215
	for <security@FreeBSD.ORG>; Tue, 25 Jan 2000 10:52:20 -0800 (PST)
	(envelope-from brett@lariat.org)
Received: from workhorse (IDENT:ppp0.lariat.org@lariat.lariat.org [206.100.185.2])
	by lariat.lariat.org (8.9.3/8.9.3) with ESMTP id LAA09986;
	Tue, 25 Jan 2000 11:52:13 -0700 (MST)
Message-Id: <4.2.2.20000125113518.01a59100@localhost>
X-Sender: brett@localhost
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.2 
Date: Tue, 25 Jan 2000 11:52:11 -0700
To: Matthew Dillon <dillon@apollo.backplane.com>,
	Warner Losh <imp@village.org>
From: Brett Glass <brett@lariat.org>
Subject: Re: Merged patches 
Cc: security@FreeBSD.ORG
In-Reply-To: <200001251738.JAA04802@apollo.backplane.com>
References: <4.2.2.20000125095042.01a5aba0@localhost>
 <200001251722.KAA04527@harmony.village.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Sender: owner-freebsd-security@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

At 10:38 AM 1/25/2000, Matthew Dillon wrote:

>    So we do multiple tests, so what?  Not only will GCC potentially
>     optimize the code, 

I have never seen GCC optimize tests of the individual bits of a word
into a switch.

>but doing multiple tests means the memory references 
>     are already in the L1 cache so, frankly, I doubt you would save more
>     than a few nanoseconds glomming it all together into a switch.  

Caching isn't the issue. Conditional jumps trigger pipeline interlocks
and stalls, and a run of them back to back is the worst case. It stalls
even the best superscalar CPUs, because the pipelines get tied in knots and
a CPU can only do so much speculative execution. A switch eliminates the
pipeline "train wreck" and at the same time tests all of the bits at once
in a completely portable way. As an ASM programmer, I see MASSIVE speedups
when I do this -- usually an order of magnitude at least.

>In fact,
>     it's quite possible that attempting to optimize it in this fashion will
>     actually make it slower since you have no control over the critical path
>     when you glom things into a switch statement.  

If the compiler generates a jump table (which you can force via an option in
many cases but which a good compiler will do on its own), all of the
paths become short. The cost is fixed: one indexed jump. Because there's
only one jump, branch prediction, speculative execution, etc. work on newer 
CPUs. The penalty is smaller on the older ones, too.
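
To make the comparison concrete, here is a minimal sketch of the two shapes
under discussion. It is not the code from the patches: the verdict names and
the policy (drop any segment with more than one of SYN, FIN and RST set) are
assumptions made up for illustration. Only the flag bit values are real.
Whether the switch actually becomes a jump table is up to the compiler.

```c
#include <stdint.h>

/* TCP header flag bits; values as in RFC 793 and <netinet/tcp.h>. */
#define TH_FIN 0x01
#define TH_SYN 0x02
#define TH_RST 0x04

/* Hypothetical verdicts, for illustration only. */
enum verdict { V_PASS, V_OPEN, V_CLOSE, V_RESET, V_DROP };

/* Chain of individual bit tests: several conditional branches in a row. */
enum verdict check_chain(uint8_t flags)
{
    if ((flags & TH_SYN) && (flags & (TH_FIN | TH_RST)))
        return V_DROP;              /* SYN combined with FIN or RST */
    if ((flags & TH_FIN) && (flags & TH_RST))
        return V_DROP;              /* FIN combined with RST */
    if (flags & TH_RST)
        return V_RESET;
    if (flags & TH_SYN)
        return V_OPEN;
    if (flags & TH_FIN)
        return V_CLOSE;
    return V_PASS;
}

/* Same policy as one dispatch on the masked word (values 0 through 7). */
enum verdict check_switch(uint8_t flags)
{
    switch (flags & (TH_FIN | TH_SYN | TH_RST)) {
    case 0:      return V_PASS;
    case TH_FIN: return V_CLOSE;
    case TH_SYN: return V_OPEN;
    case TH_RST: return V_RESET;
    default:     return V_DROP;     /* two or more of the bits set */
    }
}
```

The three bits are adjacent, so the masked value is a dense index from 0 to
7, which is the easy case for a compiler that wants to emit a table.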

Switches also make the code more readable and make it easy to handle
every case. Some of the problems we're seeing in this code have been caused
by failure to account for some combinations of the TCP option flags. The best
way to ensure code correctness -- now and for the long term -- is to use
a construct that makes it easy to be sure you cover all the bases! It's
not only good style; it's good insurance.
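
One way to get that insurance from the compiler, sketched under assumptions:
give every combination of the bits a name in an enum and switch on it with
no default. GCC's -Wswitch (enabled by -Wall) then warns if a named
combination has no case. The enum, the function name and the accept/reject
policy below are hypothetical, not taken from the patches.

```c
#include <stdint.h>

/* TCP header flag bits; values as in RFC 793 and <netinet/tcp.h>. */
#define TH_FIN 0x01
#define TH_SYN 0x02
#define TH_RST 0x04

/* One named value per combination of the three bits (hypothetical names). */
enum flag_combo {
    FC_NONE        = 0,
    FC_FIN         = TH_FIN,
    FC_SYN         = TH_SYN,
    FC_SYN_FIN     = TH_SYN | TH_FIN,
    FC_RST         = TH_RST,
    FC_RST_FIN     = TH_RST | TH_FIN,
    FC_RST_SYN     = TH_RST | TH_SYN,
    FC_RST_SYN_FIN = TH_RST | TH_SYN | TH_FIN
};

/* Returns 1 if the combination is acceptable, 0 if not (assumed policy). */
int flags_acceptable(uint8_t flags)
{
    enum flag_combo c = (enum flag_combo)(flags & (TH_FIN | TH_SYN | TH_RST));

    switch (c) {    /* no default, so a missing case draws a warning */
    case FC_NONE:        return 1;
    case FC_FIN:         return 1;
    case FC_SYN:         return 1;
    case FC_RST:         return 1;
    case FC_SYN_FIN:     return 0;
    case FC_RST_FIN:     return 0;
    case FC_RST_SYN:     return 0;
    case FC_RST_SYN_FIN: return 0;
    }
    return 0;       /* not reached: the mask limits c to the eight values */
}
```

Deleting any one case line and rebuilding with -Wall should produce a
warning naming the enumerator that is no longer handled.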

--Brett

