Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 12 Nov 2013 22:13:46 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Konstantin Belousov <kib@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r258039 - in head/sys: kern vm
Message-ID:  <20131112215200.Y1480@besplex.bde.org>
In-Reply-To: <201311120847.rAC8lwi8053235@svn.freebsd.org>
References:  <201311120847.rAC8lwi8053235@svn.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, 12 Nov 2013, Konstantin Belousov wrote:

> Log:
>  Avoid overflow for the page counts.
>
>  Reported and tested by:	pho
>  Sponsored by:	The FreeBSD Foundation
>  MFC after:	1 week

> Modified: head/sys/kern/vfs_vnops.c
> ==============================================================================
> --- head/sys/kern/vfs_vnops.c	Tue Nov 12 08:32:10 2013	(r258038)
> +++ head/sys/kern/vfs_vnops.c	Tue Nov 12 08:47:58 2013	(r258039)
> @@ -933,8 +933,9 @@ vn_io_fault(struct file *fp, struct uio
> 	void *rl_cookie;
> 	struct mount *mp;
> 	vm_page_t *prev_td_ma;
> -	int cnt, error, save, saveheld, prev_td_ma_cnt;
> +	int error, save, saveheld, prev_td_ma_cnt;
> 	vm_offset_t addr, end;
> +	vm_size_t cnt;

int was correct for a count.  You can't possibly have the 8TB
of physical memory needed to overflow a 32-bit int page count.
It is reasonably to assume 32-bit ints.

vm mostly uses u_int for page count (starting with cnt.v_page_count
for the total number of pages in the system).  This will need to be fixed
when you have 16TB of physical memory.  This is worse than int because
it asks for sign extension bugs and for not trapping on overflow.

vm_size_t is a very bogus type for a page count.  It is the type for
(virtual only?) sizes in bytes.  Since sizes are in bytes, 32-bit
int isn't quite large enough even on 32-bit systems.

The signeness change in this asks for sign extension bugs.  In this
function, cnt is still compared with -1 after first assigning the
int result returned by vm_fault_quick_hold_pages.  vm_size_t is an
unsuitable type for holding this result, but the comparison still
works because -1 gets converted to vm_size_t and there are no sign
extension bugs in this case.

If there is an overflow error, then it is for inadequate conversion
of types in expressions like (cnt * PAGE_SIZE).  Here the only
problem seems to be in the error checking:

% 		addr = (vm_offset_t)uio_clone->uio_iov->iov_base;
% 		end = round_page(addr + len);
% 		cnt = howmany(end - trunc_page(addr), PAGE_SIZE);
% 		/*
% 		 * A perfectly misaligned address and length could cause
% 		 * both the start and the end of the chunk to use partial
% 		 * page.  +2 accounts for such a situation.
% 		 */
% 		if (cnt > io_hold_cnt + 2) {

If the parameters are untrusted, then howmany() can be almost anything,
including negative.  Assigning it to "int cnt" overflows it before it
can be checked.  I would also worry about round_page(addr + len)
overflowing.  This can overflow for even the valid range:

  	addr = base of highest page in address space
  	len = 1
  	end of page = end of address space
  	end = 0 (overflow)

io_hold_count is the constant 12, so cnt is limited to 14 if it doesn't
overflow before checking it.

There was no check for negative values.  Now there is a bogus one.  The
int could hold negative values and the range check was only from above,
so if a negative value occurred then it caused worse problems later.
Now, if howmany() is negative then the negative value is corrupted to
a large unsigned one.  This exceeds 14, so it is reduced to 12 and doesn't
cause further problems, at least from its size.  Checking for negative
values in the old version and converting them to 12 would have worked
much the same.

> 	vm_prot_t prot;
> 	size_t len, resid;
> 	ssize_t adv;
>
> Modified: head/sys/vm/vm_fault.c
> ==============================================================================
> --- head/sys/vm/vm_fault.c	Tue Nov 12 08:32:10 2013	(r258038)
> +++ head/sys/vm/vm_fault.c	Tue Nov 12 08:47:58 2013	(r258039)
> @@ -1074,7 +1074,7 @@ vm_fault_quick_hold_pages(vm_map_t map,
> {
> 	vm_offset_t end, va;
> 	vm_page_t *mp;
> -	int count;
> +	vm_size_t count;
> 	boolean_t pmap_failed;
>
> 	if (len == 0)
>

This has similar code, but more robust checking and I think it can almost
rule out bad args:

% 	if (len == 0)
% 		return (0);
% 	end = round_page(addr + len);
% 	addr = trunc_page(addr);
% 
% 	/*
% 	 * Check for illegal addresses.
% 	 */
% 	if (addr < vm_map_min(map) || addr > end || end > vm_map_max(map))
% 		return (-1);
% 
% 	count = howmany(end - addr, PAGE_SIZE);
% 	if (count > max_count)
% 		panic("vm_fault_quick_hold_pages: count > max_count");

It checks for illegal addresses after allowing round_page() to overflow.
I think trunc_page() can't overflow.  The check detects overflow in
round_page(addr + len).  So howmany() can't be negative, and the
end > vm_map_max(map) check should prevent it overflowing to 0.  It
can only be very large if we will panic anyway.  But it is technically
incorrect to assign it to "int count" before checking that it fits in
an int.  There are no problems with breaking the type of 'count', since
the above range-checks it to fit in an int, and later uses of it just
return or use it as an int.

This is easy to fix by doing some up-front check that len is not too
large (not more than max_count * PAGE_SIZE after adjusting it to cover
full pages).  We should be careful that max_count * PAGE size doesn't
overflow, and it seems better to not do that multiplication.  The
adjusted length is (end - addr).  This is a difference of vm_offset_t's
and can be represented in a size_t (like len), but can be left in the
expression's result type which is usually vm_offset_t.  Since the result
it is a multiple of PAGE_SIZE, howmany() is not needed and we can just
divide it by PAGE_SIZE.
    (vm rarely uses howmany(), and this use is just a style bug.  vm uses
    macros like btoc() and atop() to convert byte counts to page counts.
    There are a lot of style bugs and logic errors in these too.  btoc()
    converts to "clicks" and code that uses it assumes that clicks are
    pages.  The better named btop() is never used in MI vm code.  atop()
    converts addresses from bytes to pages, but is more often abused to
    convert sizes from bytes to pages.  Some code is so uncouth as to not
    even use any of these macros, but hard-codes them using PAGE_SHIFT or
    PAGE_SIZE.)
So the correct spelling of the conversion to bytes seems to be btop(),
and using this spelling we get a fairly short check with repeated code:

  	if (btop(end - addr) > max_count)
  		panic("vm_fault_quick_hold_pages: count > max_count");
  	count = btop(end - addr);

This depends on btop() being a simple macro and/or the compiler only
evaluating it once for efficiency.  It avoids overflow in the sanity
test and having to find a type for the possibly-overflowing value by
not assigning the value before it is checked.

The code could be rearranged a bit to make this clearer.  vm_mmap()
and obreak() need to be even more careful since they have to check
untrusted application args, but they are too complicated to provide
good examples.

vm_size_t is already abused for page counts in at least.  I have
previously objected using u_long for page counts.  Grepping for
long shows many more errors now.  redzone.c uses caddr_t as well
as u_long, and doesn't use any vm types.

C's type system is too weak to prevent these errors, but some could
be detected by temporarily changing the type of vm_size_t, etc., to
somthing exotic so that it cannot be used without going through the
correct conversion macro.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20131112215200.Y1480>