From owner-freebsd-security  Wed Aug 28 00:52:09 1996
Return-Path: owner-security
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.5/8.7.3) id AAA09870
          for security-outgoing; Wed, 28 Aug 1996 00:52:09 -0700 (PDT)
Received: from psychotic.communica.com.au (gw.communica.com.au [203.8.94.161])
          by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id AAA09863
          for <security@freebsd.org>; Wed, 28 Aug 1996 00:52:02 -0700 (PDT)
Received: from communica.com.au (newton@frenzy [192.82.222.1]) by psychotic.communica.com.au (8.6.12/8.6.9) with SMTP id RAA03946; Wed, 28 Aug 1996 17:18:56 +0930
Received: by communica.com.au (4.1/SMI-4.1)
	id AA19152; Wed, 28 Aug 96 17:18:47 CST
From: newton@communica.com.au (Mark Newton)
Message-Id: <9608280748.AA19152@communica.com.au>
Subject: Re: Vulnerability in the Xt library (fwd)
To: zach@blizzard.gaffaneys.com (Zach Heilig)
Date: Wed, 28 Aug 1996 17:18:46 +0930 (CST)
Cc: newton@communica.com.au, gene@starkhome.cs.sunysb.edu,
        security@freebsd.org
In-Reply-To: <87g258j8a0.fsf@freebsd.gaffaneys.com> from "Zach Heilig" at Aug 28, 96 02:08:07 am
X-Mailer: ELM [version 2.4 PL21]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8bit
Sender: owner-security@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Zach Heilig wrote:

 > > Really, strcpy isn't all such a program would need to look for.
 > > There are many C library routines which perform no bounds checking
 > > (sprintf(), gets(), strcpy() to name a few) and, even worse, there
 > > are countless home-grown memory to memory copy routines which have
 > > been written in ignorance of the possible consequences of poor range
 > > checking and the assumption that if a buffer overflows the program
 > > will crash and it's the stupid user's own fault.  Essentially, your
 > > rebadged "lint" would end up attempting to be a program which tests
 > > the "correctness" of code, and if you can write one of them then I
 > > suspect you'll end up richer than Bill Gates :-)
 > 
 > Actually, you can get away a bit cheaper than that.  The compiler
 > could simply complain if a block of memory were passed to a function
 > without first checking its length. 

Eh?  A "block" of memory?  How does one define a block?  Functions don't
usually pass "blocks" anyway, they pass references.

Also, how are you going to tell if the "block" has been length-checked?
I challenge you to come up with some code which detects the following
as a length-check operation:

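    char *p, *src;
    extern char *buf;
    unsigned int cnt = 0;    /* count must start at zero */

    p = src = getenv("USER_ENV_VAR");    /* may be NULL if unset */
    while (p != NULL && *p++ != '\0') ++cnt;    /* hand-rolled strlen() */
    if (src != NULL && cnt < MAXBUFCNT) strcpy(buf, src);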

Now, there are many ways in which I could express that same code block;
your checker would need to be able to recognise all of them to be able
to give the above code a clean bill of health.
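
As a rough illustration of how many forms the same check can take, here
are three functions that all answer "will src fit in a buffer of MAXBUFCNT
bytes?" in different ways.  The function names and the value of MAXBUFCNT
are made up for this sketch; a checker would have to recognise each of
these, and every other variation, as equivalent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXBUFCNT 16    /* hypothetical buffer size, including the '\0' */

/* Variant 1: let the library count the characters. */
static int fits_strlen(const char *src)
{
    assert(src != NULL);
    return strlen(src) < MAXBUFCNT;
}

/* Variant 2: indexed loop that gives up after MAXBUFCNT bytes. */
static int fits_index_loop(const char *src)
{
    size_t i;

    assert(src != NULL);
    for (i = 0; i < MAXBUFCNT; i++)
        if (src[i] == '\0')
            return 1;    /* terminator found inside the limit */
    return 0;
}

/* Variant 3: walk a pointer to the terminator, then subtract. */
static int fits_pointer_diff(const char *src)
{
    const char *p = src;

    assert(src != NULL);
    while (*p != '\0')
        p++;
    return (size_t)(p - src) < MAXBUFCNT;
}
```

All three return 1 exactly when strlen(src) < MAXBUFCNT, yet they share
almost no syntax.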

 > There are ways to subvert this
 > method, but a utility like that should catch most such errors.

Hmm.

 > If I can find my notes, I've come up with a way to do range checking,
 > without stepping on the programmer's toes too badly (though it would
 > have a noticeable impact on performance).  The basic idea is to keep a
 > table of all the blocks of memory in a program (the beginning and
 > ending addresses), and check to make sure that all pointers are within
 > one of these blocks whenever they are changed (pointers are usually
 > changed less often than they are dereferenced).

Isn't this what the VM system does?  Like, isn't a SIGSEGV caused by 
dereferencing a pointer to a "block" that doesn't exist?

If I write my own memory allocator (like many freely distributable 
programs do -- How many times have you seen malloc.c in a freeware 
source tree?) which works by allocating "blocks" of memory then
dividing them up with internal pointers whenever a malloc() call is
made, your range checking wouldn't work.  Since this is more or less
what the real malloc() call does anyway, it'd probably be infeasible
even without a custom-written malloc.
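
A minimal sketch of the problem, assuming the checker keeps a simple table
of [start, end) address ranges.  Everything here (the table, arena_alloc(),
the sizes) is hypothetical and only meant to show the shape of the issue:
the checker sees one big block, so a pointer that runs off the end of one
sub-allocation and into its neighbour still passes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BLOCKS 8

/* The checker's view of memory: a list of [start, end) ranges. */
static struct { uintptr_t start, end; } table[MAX_BLOCKS];
static size_t nblocks;

static void register_block(const void *start, size_t len)
{
    assert(nblocks < MAX_BLOCKS);
    table[nblocks].start = (uintptr_t)start;
    table[nblocks].end = (uintptr_t)start + len;
    nblocks++;
}

/* The test that would run each time a pointer is changed. */
static int pointer_ok(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    size_t i;

    for (i = 0; i < nblocks; i++)
        if (a >= table[i].start && a < table[i].end)
            return 1;
    return 0;
}

/* A home-grown allocator that carves chunks out of one static arena. */
static char arena[64];
static size_t arena_used;

static void *arena_alloc(size_t len)
{
    void *p;

    if (len > sizeof arena - arena_used)
        return NULL;
    p = arena + arena_used;
    arena_used += len;
    return p;
}

/* Returns 1 if a pointer one past chunk a is accepted although it is in b. */
static int overflow_goes_unnoticed(void)
{
    char *a, *b;

    register_block(arena, sizeof arena);    /* all the checker knows about */
    a = arena_alloc(8);
    b = arena_alloc(8);
    assert(a != NULL && b != NULL);
    return pointer_ok(a + 8) && a + 8 == b;
}
```

Unless the checker is taught about arena_alloc() as well, every overflow
from one chunk into the next looks like a legal pointer.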

What would you do if I called "free()"?  Doesn't that leave a pointer
dangling which points into hyperspace?  Consider the same for an 
munmap() which follows an mmap() (or basically any other deallocation
routine).  (Oh, that's right, UNIX programmers traditionally avoid
free() and munmap() calls because memory is unlimited, don't they? <grin>)
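
To sketch the free() case, assume a checker that tracks a single block and
tests addresses only when a pointer is assigned.  The names tracked_malloc(),
tracked_free() and address_ok() are invented for the example.  The address
is saved as an integer before the free() so the demonstration itself never
touches a freed pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One tracked block, [blk_start, blk_end); both zero means "none". */
static uintptr_t blk_start, blk_end;

static void *tracked_malloc(size_t len)
{
    void *p = malloc(len);

    if (p != NULL) {
        blk_start = (uintptr_t)p;
        blk_end = blk_start + len;
    }
    return p;
}

static void tracked_free(void *p)
{
    free(p);
    blk_start = blk_end = 0;    /* the block no longer exists */
}

static int address_ok(uintptr_t a)
{
    return a >= blk_start && a < blk_end;
}

/* Returns 1 if the address passed at assignment but is stale after free. */
static int dangling_demo(void)
{
    char *p = tracked_malloc(16);
    uintptr_t addr;
    int ok_when_assigned, ok_after_free;

    assert(p != NULL);
    addr = (uintptr_t)p;                  /* where p points */
    ok_when_assigned = address_ok(addr);  /* the check done on assignment */
    tracked_free(p);                      /* p is not reassigned: no check */
    ok_after_free = address_ok(addr);     /* only a later re-check notices */
    return ok_when_assigned && !ok_after_free;
}
```

The pointer was approved once and never changed again, so a scheme that
checks only on assignment has no second chance to catch it.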

In any case, your approach ignores one of the fundamental advantages
of C:  If a programmer knows what he's doing, he should be able to
do whatever he likes with the memory that has been allocated to him.
If he wants to overwrite bits of his stack with code, or have an array
filled with bytes that make up a machine code program he wants to 
execute, or whatever, then stopping him isn't going to win you any
friends.

 > This method may be even more expensive than you might think, as there

I can think pretty expensively :-)

 > would be several different blocks to test against every time a pointer
 > is changed.  You would merge blocks that were adjacent, but consider
 > the local variable blocks on the stack.  You really shouldn't include
 > the return addresses in the valid pointer list, so you have at least
 > as many blocks as there are function calls on the stack.

We'd better remove longjmp() from the C library, no? :-)

 > The major disadvantage to this method is the high up front cost of not
 > only implementing it, but also testing for and fixing every program
 > that allows user input to overrun a buffer.
 
I suspect we're on the long road of fixing them anyway.

I think your method would basically involve rewriting bits of the
compiler.  If every assignment operation would need to be checked,
you'd need to generate checking code to do it.  I'm still interested
to know how you detect that a program has legitimately allocated a 
new "block," though -- The inability to do that seriously limits the
functionality of your suggestion.

[ remember also that it isn't only the stack that's going to be a
danger here:  globals in bss can get you into trouble too.  Consider two 
strings in adjacent memory;  the first string is initialized from
user input, the second string is initialized from some "trusted" information,
and is eventually passed on to exec() or system().  If the second string
is initialized first, and the user data which initializes the first
string blows the buffer and overflows into the addresses utilized by
the second, then bogus data gets passed to exec() with obvious implications.
Could you detect that kind of thing? ]
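
A small sketch of that scenario.  A struct stands in for the two adjacent
globals so that the layout is predictable; where a real compiler and linker
place separate globals is not guaranteed.  The names, sizes and the
"/bin/date" command are all made up, and the copy goes through a pointer to
the whole struct so the sketch itself stays within one object.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for two strings that happen to be adjacent in memory. */
static struct {
    char user[8];       /* filled from user input */
    char command[16];   /* "trusted" value, later given to system() */
} g;

/* Copies input to the start of g with no regard for sizeof g.user. */
static void unchecked_copy(const char *input)
{
    assert(strlen(input) < sizeof g);    /* keep the demo inside g */
    memcpy((char *)&g, input, strlen(input) + 1);
}

/* Returns the string that would end up being handed to system(). */
static const char *run_demo(const char *input)
{
    strcpy(g.command, "/bin/date");    /* trusted string initialized first */
    unchecked_copy(input);             /* user data arrives afterwards */
    return g.command;
}
```

With an input of "mark" the command is still "/bin/date"; with eight filler
bytes followed by "/bin/sh" the command becomes "/bin/sh", and nothing on
the stack was touched.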

   - mark

---
Mark Newton                               Email: newton@communica.com.au
Systems Engineer                          Phone: +61-8-373-2523
Communica Systems                         WWW:   http://www.communica.com.au