Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 19 Apr 2000 09:31:03 -0400 (EDT)
From:      pha@pdq.com
To:        FreeBSD-gnats-submit@freebsd.org
Cc:        pha@pdq.com
Subject:   i386/18095: critical VM mmap/madvise bug
Message-ID:  <200004191331.JAA04849@voltron.pdq.com>

next in thread | raw e-mail | index | archive | help

>Number:         18095
>Category:       i386
>Synopsis:       mmap region with non zero offset is corrupted after madvise call
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Apr 19 06:50:01 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator:     Paul H. Anderson
>Release:        FreeBSD 4.0-RELEASE i386
>Organization:
Public Data Queries, Inc.
>Environment:

Any Intel FreeBSD machine with a revamped vm subsystem (sometime after July 5th 1999
snapshot).  Also affects 5.0 snapshots (I found it there, first, then downgraded).

I have not verified whether this is also a problem on alphas, so I don't happen to
know if it is specific to the i386.

>Description:

This is a VM subsystem bug related to a sequence of calls to mmap and madvise.

It first appeared in snapshots sometime after the July 5th, 1999 Walnut Creek
FreeBSD 4.0 snapshot.

What happens is that in the following case, the memory region visible to the
running progrem is corrupted:

 region =  mmap a file with offset 16K
 madvise(region, MADV_SEQUENTIAL)
 madvise(region, MADV_WILLNEED)

 after that sequence, the region has incorrect contents.  If the madvise calls
 are commented out, the map works properly.  My workaround is to not use
 madvise on post July 5th 1999 4.0 snapshot machines.  The incorrect contents
 are actually data from the same file, but shifted by 16384 bytes, except for the
 last 16384 bytes, which are apparently correct.

 I believe that the _very first_ time madvise is used, that it will not fail.
 Any subsequent run will fail until reboot.

>How-To-Repeat:

Run the following program three times -
first time creates the test file,
second time shows operation without madvise.
third time shows operation with madvise (which fails).

See comments a few lines below for run instructions.


/*
 * test whether an object can be mapped in two places in VM.
 *
 * compile this program two ways:
 * cc -o vm vm.c
 * vm foo		(creates the file foo with block ids)
 * vm foo		(tests the file foo - should work, because we aren't using madvise)
 *
 * cc -o vm -DUSEMADVISE vm.c
 * vm foo		(now fails on FreeBSD after July 5th 1999 Walnut creek snapshot,
 *				 because madvise corrupts vm maps for second offset object)
 *
 *
 * usage: vm <filename>
 *
 * if <filename> does not exist, creates a FILESIZE file, populates
 * each unique getpagesize() block of data with a getpagesize()/4
 * array of numbers, each of which is the page index into the file.
 *
 * if <filename> does exist, the file is mapped in two sections.
 * the first section the first HEADERSIZE bytes, and is mapped where the OS
 * wants to map it.  The second section (FILESIZE - HEADERSIZE) is mapped via
 * a second mmap call.
 *
 * then both sections are checked for the correct values.
 *
 *
 * written by Paul H. Anderson (pha@pdq.com) to debug a problem with FreeBSD
 * 4.0 apparently after the vm system was re-written.
 *
 * this program is not copyrighted and may be used for any purpose whatsoever,
 * please give credit, please don't remove this notice.
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <errno.h>
#include <fcntl.h>

#define FILESIZE (1024*1024)	/* file is 1MB long for now */
#define HEADERSIZE (16*1024)	/* header is 16K long for now */

int makethefile(char *filename);
int checkthefile(char *filename);

int main(int argc, char **argv)
{
	struct stat sb;

	if(argc!=2) {
		fprintf(stderr,"usage: %s <filename>\n", argv[0]);
		exit(1);
	}

	if(stat(argv[1], &sb)) {
		if(errno==ENOENT) exit(makethefile(argv[1]));
		perror(argv[0]);
		exit(1);
	}

	/* file exists, so check it. */

	exit(checkthefile(argv[1]));
}

/*
 * basic idea here is to write a file with indentifying information
 * in each block - we just write the block #.
 */

int makethefile(char *filename)
{
	int *block = (int *) calloc(getpagesize() * sizeof(int),1);
	int fd;
	int i;

	printf("creating a %d byte file with blocks marked with their block number.\n", FILESIZE);
	fd = open(filename, O_RDWR|O_CREAT, 0666);

	for(i=0;i<FILESIZE / getpagesize(); i++) {
		block[0] = i;
		if(write(fd, block, getpagesize()) != getpagesize()) {
			fprintf(stderr,"failed to write %d bytes to file %s\n", getpagesize(), filename);
			close(fd);
			exit(1);
		}
	}
	close(fd);
}

/*
 * here we want to check the file
 *
 */
int checkthefile(char *filename)
{
	int *block1, *block2;
	int fd1, fd2;
	int i,j;
	int errors = 0;


	fd1 = open(filename, O_RDONLY);
	block1 = (int *) mmap(NULL, (size_t) HEADERSIZE, PROT_READ, MAP_PRIVATE, fd1, (off_t)0);
#if defined(USEMADVISE)
	madvise(block1, HEADERSIZE, MADV_SEQUENTIAL);
	madvise(block1, HEADERSIZE, MADV_WILLNEED);
#endif

	fd2 = open(filename, O_RDONLY);
	block2 = (int *) mmap(NULL, FILESIZE - HEADERSIZE, PROT_READ, MAP_PRIVATE, fd2, (off_t) HEADERSIZE);
#if defined(USEMADVISE)
	madvise(block2, FILESIZE - HEADERSIZE, MADV_SEQUENTIAL);
	madvise(block2, FILESIZE - HEADERSIZE, MADV_WILLNEED);
#endif

	for(i=0;i<HEADERSIZE / getpagesize(); i++) {
		int index = (i)*getpagesize() / sizeof(int);
		if(block1[index] != i) {
			fprintf(stderr,"expected page with page id %d but got id %d at address %x\n", (int)i, (int)block1[index], (int)&block1[index]);
			errors++;
		}
	}
	/* finished looking at header, now look at second mapped region */

	for(i=HEADERSIZE / getpagesize(); i<FILESIZE / getpagesize(); i++) {
		int index = (i-HEADERSIZE/getpagesize())*getpagesize() / sizeof(int);
		if(block2[index] != i) {
			fprintf(stderr,"expected page with page id %d but got id %d at address %x\n", (int)i, (int)block2[index], (int)&block2[index]);
			errors++;
		}
	}
	if(errors == 0) {
#if defined(USEMADVISE)
		printf("vm system test worked fine - you either fixed the problem, or we are running\n");
		printf("on a FreeBSD machine equal to or prior the July 5th, 1999 Walnut creek snapshot\n");
#else
		printf("madvise not being used, so the fact that this test works only tells you\n");
		printf("that mmap alone with an offset appears to work fine.\n");
#endif
	}
	return errors;
}






>Fix:

no clue.  I understand the VM system was reworked after July 5th snapshot, and I
have to believe that this test case is not common, so it slipped by...



>Release-Note:
>Audit-Trail:
>Unformatted:


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200004191331.JAA04849>