Skip site navigation (1)Skip section navigation (2)
Date:      30 Aug 1996 05:02:16 -0500
From:      Zach Heilig <zach@blizzard.gaffaneys.com>
To:        newton@communica.com.au (Mark Newton)
Cc:        security@FreeBSD.org
Subject:   Range checking (was Re: Vulnerability in the Xt library (fwd))
Message-ID:  <87hgplur4n.fsf@freebsd.gaffaneys.com>
In-Reply-To: newton@communica.com.au's message of Wed, 28 Aug 1996 17:18:46 %2B0930 (CST)
References:  <9608280748.AA19152@communica.com.au>

next in thread | previous in thread | raw e-mail | index | archive | help
I spent the past couple days trying to figure out how to do range
checking (which would be one way to avoid the setuid problems FreeBSD
has been having) effectively, and have come to the conclusion that it
is relatively easy to do in a simple manner (see the bcc package
someone put together for gcc), but tricky to actually get right (I
knew that already, but the point has become even more clear to me...)

newton@communica.com.au (Mark Newton) writes:

> Eh?  A "block" of memory?  How does one define a block?  Functions don't
> usually pass "blocks" anyway, they pass references.

Right, references to a block of memory.  I guess I should have been a
little more specific :-).  Eventually, you do get down to dealing with
the actual block of memory though.

> Also, how are you going to tell if the "block" has been length-checked?
> I challenge you to come up with some code which detects the following
> as a lengthcheck operation:

As I said, there are some loopholes with this sort of system, but it
would error on the side of complaining.  The real question is how many
setuid programs are written using a loop to check "block" length as
opposed to using strlen()?  I don't have the disk space to expand the
source tree to look and find out.

>  > The basic idea is to keep a
>  > table of all the blocks of memory in a program (the beginning and
>  > ending addresses), and check to make sure that all pointers are within
>  > one of these blocks whenever they are changed (pointers are usually
>  > changed less often than they are dereferenced).

> Isn't this what the VM system does?  Like, isn't a SIGSEGV caused by 
> dereferencing a pointer to a "block" that doesn't exist?

The VM only cares if the pointer is within the the programs address
space.  We care if it is pointing to the correct variable (or within
an approved block of memory).  It looks to me that the "blocks" of
global/static memory allocated at compile time need a word of memory
between them that is not in a valid "block".  This would (mostly)
prevent an overrun of each individual block/array.

> If I write my own memory allocator (like many freely distributable 
> programs do -- How many times have you seen malloc.c in a freeware 
> source tree?) which works by allocating "blocks" of memory then
> dividing them up with internal pointers whenever a malloc() call is
> made, your range checking wouldn't work.  Since this is more or less
> what the real malloc() call does anyway, it'd probably be infeasible
> even without a custom-written malloc.

You would presumably link the program against a custom malloc() if you
were compiling for range checking.  malloc() (and similar functions)
would have to tell the range checking subsystem every block that was
allocated or freed.  brk(), sbrk(), or most other system calls that
receive memory direct from the kernel would not have this code.
malloc() (and close friends) would have to be careful to leave at
least a word of "invalid" memory between each block it allocated to
allow this range checking to work properly.

The mmap() system call poses a big problem though.  Perhaps the wrapper
could special case those calls that are for memory allocation (usually
those that mmap /dev/zero).

> What would you do if I called "free()"?  Doesn't that leave a pointer
> dangling which points into hyperspace?  Consider the same for an 
> munmap() which follows an mmap() (or basically any other deallocation
> routine).  Oh, that's right, UNIX programmers traditionally avoid 
> free()'s and munmap()'s because memory is unlimited, don't they <grin>)

Oops, it looks like the "valid pointer checks" would have to occur
right at each dereference, as opposed to each time the pointer is
changed... thanks to functions similar to free().

> In any case, your approach ignores one of the fundamental advantages
> of C:  If a programmer knows what he's doing, he should be able to
> do whatever he likes with the memory that has been allocated to him.
> If he wants to overwrite bits of his stack with code, or have an array
> filled with bytes that make up a machine code program he wants to 
> execute, or whatever, then stopping him isn't going to win you any
> friends.

Yes, this is true, but it is very bad practice to stomp all over
memory you did not allocate in some fashion.  This may or may not
include the return address on the stack (I think changing the return
address should be taboo).  Tricks may be ok for standard programs,
where if that program fails it doesn't affect much else, but tricks
that depend on some obscure or undocumented / unintended behavior
should not be used in security sensitive code.

Besides, this is would be a voluntary option (obviously).

>  > would be several different blocks to test against every time a pointer
>  > is changed.  You would merge blocks that were adjacent, but consider
>  > the local variable blocks on the stack.  You really shouldn't include
>  > the return addresses in the valid pointer list, so you have at least
>  > as many blocks as there are function calls on the stack.

> We'd better remove longjmp() from the C library, no? :-)

This does seem to through a pretty big monkeywrench into this scheme,
doesn't it?  The short answer is longjump() and friends would not be
subject to these tests.  Various functions that would be difficult or
impossible to write in the presence of range checking would have to be
compiled without it enabled.  The jmp_buf buffer would be safe, since
pointers would not be able to overrun any other block of memory, thus
preventing an attacker from modifying the contents of that buffer.  I
suspect current setuid programs that use this family of functions
could have a vulnerability in this area.

>  > The major disadvantage to this method is the high up front cost of not
>  > only implementing it, but also testing for and fixing every program
>  > that allows user input to overrun a buffer.

> I suspect we're on the long road of fixing them anyway.

I wonder if it would be cheaper to implement range checking or to have
a formal review of every setuid program...

> I think your method would basically involve rewriting bits of the
> compiler.  If every assignment operation would need to be checked,
> you'd need to generate checking code to do it.  I'm still interested
> to know how you detect that a program has legitimately allocated a 
> new "block," though -- The inability to do that seriously limits the
> functionality of your suggestion.

Due to functions like free() that invalidate a pointer, the check
would have to occur right before every dereference, not just when the
pointer is changed.

The compiler would set up a static structure (that would be added to
at run time) with all the compile time allocated blocks of memory.
All the functions that allocate memory (excepting the lowest level
functions) would inform the range checking subsystem of every new
block (this would include calling a function with an array or
structure as a local variable).  All the functions that "returned"
memory to the system would then inform the range checking system of
each block that was no longer valid (also, this includes returning
from a function).  All the local (non-array) variables would have to
be included in a block (just a single large block should be
sufficient), and the local arrays and structures would each be
separated from each other by an "invalid" word.  The parameters would
have to be added to the list of valid blocks as well (before the
function is called) and removed from the list after the function
returned.  The global simple variables would also be bunched together
into a block, and the global structures and arrays would be separated
by an invalid word.  I did say this would be expensive, right?

The local variables to each function would be sorted so the simple
variables would take only one block, while the arrays and structures
would be separated.  Actually, it seems that some structures could be
considered simple variables as well (if it didn't contain any element
that was an array, it's these blocks of memory that we need to watch
out for).

This system seem like it could interact badly with the variable
arguement functions, but I don't think I've seen any that pass
something other than a simple variable type.  The simple variables can
be bunched together without much (if any) loss in protection.

> [ remember also that it isn't only the stack that's going to be a
> danger here: globals in bss can get you into trouble too.  Consider
> two strings in adjacent memory; the first string is initialized from
> user input, the second string is initialized from some "trusted"
> information, and is eventually passed on to exec() or system().  If
> the second string is initialized first, and the user data which
> initializes the first string blows the buffer and overflows into the
> addresses utilized by the second, then bogus data gets passed to
> exec() with obvious implications.  Could you detect that kind of
> thing? ]

The way I originally suggested this, no, because adjacent blocks would
be merged as far as the range checking was concerned.  I think I've
cleaned up the proposal to include detecting "that kind of thing".
I'm also sure that there are issues I haven't thought of yet that need
to be addressed properly.

This is one of the options I would like in a 'debugging' compiler, and
I would be very interested in helping along these lines.  My problem
being a lack of local disk space (Ok, I could probably make room for
compiler source...).  If this project is active when I get a bigger
fixed disk (hopefully before Christmas, but probably after
Thanksgiving), I'll be happy to pitch in at that time.

-- 
Zach Heilig (zach@blizzard.gaffaneys.com) | ALL unsolicited commercial email
Support bacteria -- it's the              | is unwelcome.  I avoid dealing
only culture some people have!            | with companies that email ads.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?87hgplur4n.fsf>