Date:      Wed, 04 Apr 2012 13:39:09 +0400
From:      Andrey Zonov <andrey@zonov.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        alc@freebsd.org, freebsd-hackers@freebsd.org
Subject:   Re: problems with mmap() and disk caching
Message-ID:  <4F7C16BD.3010703@zonov.org>
In-Reply-To: <4F7C1620.6040703@zonov.org>
References:  <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7C1620.6040703@zonov.org>

This is a multi-part message in MIME format.
--------------010408020308040305090701
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

I forgot to attach my test program.

On 04.04.2012 13:36, Andrey Zonov wrote:
> On 04.04.2012 11:17, Konstantin Belousov wrote:
>>
>> Calling madvise(MADV_RANDOM) fixes the issue, because the code to
>> deactivate/cache the pages is turned off. On the other hand, it also
>> turns off read-ahead for faulting, and the first loop becomes eternally
>> long.
>
> Now it takes 5 times longer. Anyway, thanks for the explanation.
>
>>
>> Doing MADV_WILLNEED does not fix the problem indeed, since willneed
>> reactivates the pages of the object at the time of call. To use
>> MADV_WILLNEED, you would need to call it between faults/memcpy.
>>
>
> I played with it, but no luck so far.
>
>>>
>>> I've also never seen super pages; how do I make them work?
>> They just work, at least for me. Look at the output of procstat -v
>> after enough loops finished to not cause disk activity.
>>
>
> The problem was in my test program. I fixed it and now I see super pages,
> but I'm still not satisfied. Several tests follow:
>
> 1. With madvise(MADV_RANDOM) I see almost all super pages:
> $ ./mmap /mnt/random-1024 5
> mmap: 1 pass took: 26.438535 (none: 0; res: 262144; super: 511; other: 0)
> mmap: 2 pass took: 0.187311 (none: 0; res: 262144; super: 511; other: 0)
> mmap: 3 pass took: 0.184953 (none: 0; res: 262144; super: 511; other: 0)
> mmap: 4 pass took: 0.186007 (none: 0; res: 262144; super: 511; other: 0)
> mmap: 5 pass took: 0.185790 (none: 0; res: 262144; super: 511; other: 0)
>
> Shouldn't it be 512 (1 GB / 2 MB)?
>
> 2. Without madvise(MADV_RANDOM):
> $ ./mmap /mnt/random-1024 50
> mmap: 1 pass took: 7.629745 (none: 262112; res: 32; super: 0; other: 0)
> mmap: 2 pass took: 7.301720 (none: 261202; res: 942; super: 0; other: 0)
> mmap: 3 pass took: 7.261416 (none: 260226; res: 1918; super: 1; other: 0)
> [skip]
> mmap: 49 pass took: 0.155368 (none: 0; res: 262144; super: 323; other: 0)
> mmap: 50 pass took: 0.155438 (none: 0; res: 262144; super: 323; other: 0)
>
> Only 323 super pages.
>
> 3. If I just re-run the test, I don't see super pages with any "block" size.
>
> $ ./mmap /mnt/random-1024 5 $((1<<30))
> mmap: 1 pass took: 1.013939 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 2 pass took: 0.267082 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 3 pass took: 0.270711 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 4 pass took: 0.268940 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 5 pass took: 0.269634 (none: 0; res: 262144; super: 0; other: 0)
>
> 4. If I activate madvise(MADV_WILLNEED) in the copy loop and re-run the test,
> then I see super pages only if I use a "block" greater than 2 MB.
>
> $ ./mmap /mnt/random-1024 1 $((1<<21))
> mmap: 1 pass took: 0.299722 (none: 0; res: 262144; super: 0; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<22))
> mmap: 1 pass took: 0.271828 (none: 0; res: 262144; super: 170; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<23))
> mmap: 1 pass took: 0.333188 (none: 0; res: 262144; super: 258; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<24))
> mmap: 1 pass took: 0.339250 (none: 0; res: 262144; super: 303; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<25))
> mmap: 1 pass took: 0.418812 (none: 0; res: 262144; super: 324; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<26))
> mmap: 1 pass took: 0.360892 (none: 0; res: 262144; super: 335; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<27))
> mmap: 1 pass took: 0.401122 (none: 0; res: 262144; super: 342; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<28))
> mmap: 1 pass took: 0.478764 (none: 0; res: 262144; super: 345; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<29))
> mmap: 1 pass took: 0.607266 (none: 0; res: 262144; super: 346; other: 0)
> $ ./mmap /mnt/random-1024 1 $((1<<30))
> mmap: 1 pass took: 0.901269 (none: 0; res: 262144; super: 347; other: 0)
>
> 5. If I activate madvise(MADV_WILLNEED) immediately after mmap(), then I
> see the same number of super pages as in test #2.
>
> $ ./mmap /mnt/random-1024 5
> mmap: 1 pass took: 0.178666 (none: 0; res: 262144; super: 323; other: 0)
> mmap: 2 pass took: 0.158889 (none: 0; res: 262144; super: 323; other: 0)
> mmap: 3 pass took: 0.157229 (none: 0; res: 262144; super: 323; other: 0)
> mmap: 4 pass took: 0.156895 (none: 0; res: 262144; super: 323; other: 0)
> mmap: 5 pass took: 0.162938 (none: 0; res: 262144; super: 323; other: 0)
>
> 6. If I read the file manually before the test, then I don't see super pages
> with any "block" size, and madvise(MADV_WILLNEED) doesn't help.
>
> $ ./mmap /mnt/random-1024 5 $((1<<30))
> mmap: 1 pass took: 0.996767 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 2 pass took: 0.311129 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 3 pass took: 0.317430 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 4 pass took: 0.314437 (none: 0; res: 262144; super: 0; other: 0)
> mmap: 5 pass took: 0.310757 (none: 0; res: 262144; super: 0; other: 0)
>
>
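
On the "512?" question in test 1, here is a small sketch of the arithmetic,
assuming 4 KB base pages and 2 MB super pages (the values hard-coded in the
test program). A 1 GB file is 262144 base pages, so at most 512 super pages.
The program counts base pages that carry the super flag and divides by 512
with truncation, so a result of 511 means that between 1 and 512 base pages
lacked the flag, which is consistent with one 2 MB run not being reported as
a super page. The function names and the super_flag parameter are made up
for illustration; super_flag stands in for MINCORE_SUPER.

```c
#include <assert.h>
#include <stddef.h>

/* Number of whole super pages that fit in a mapping of file_size bytes. */
size_t
possible_superpages(size_t file_size, size_t super_size)
{
	assert(super_size > 0);
	return (file_size / super_size);
}

/*
 * Count the entries of a mincore()-style vector that carry super_flag and
 * convert the count to whole super pages.  super_flag is a stand-in for
 * MINCORE_SUPER (an assumption, so that the sketch builds anywhere).
 */
size_t
flagged_superpages(const unsigned char *vec, size_t npages,
    unsigned char super_flag, size_t pages_per_super)
{
	size_t i, flagged = 0;

	assert(pages_per_super > 0);
	for (i = 0; i < npages; i++)
		if (vec[i] & super_flag)
			flagged++;
	/* Integer division truncates, the same way the test program does. */
	return (flagged / pages_per_super);
}
```

With these, possible_superpages(1 << 30, 2 << 20) gives the upper bound for
the 1 GB test file, and flagged_superpages() applied to the mincore() vector
gives the number the program prints in the "super:" field.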

-- 
Andrey Zonov

--------------010408020308040305090701
Content-Type: text/plain; charset=windows-1251;
 name="mmap.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="mmap.c"

/*-
 * Andrey Zonov (c) 2011
 */

#include <sys/mman.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	int i;
	int fd;
	int num;
	int block;
	int pagesize;
	size_t size;
	size_t none, incore, super, other;
	char *ptr;
	char *ptrp;
	char *tmp;
	char *vec;
	char *vecp;
	struct stat sb;
	struct timeval tp, tp1, tp2;

	if (argc < 2 || argc > 4)
		errx(1, "usage: mmap <filename> [num] [block]");

	fd = open(argv[1], O_RDONLY);
	if (fd == -1)
		err(1, "open()");

	num = 1;
	if (argc >= 3)
		num = atoi(argv[2]);

	pagesize = getpagesize();
	block = pagesize;
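	if (argc == 4)
		block = atoi(argv[3]);
	/* A non-positive block would keep the copy loop from advancing. */
	if (block <= 0)
		errx(1, "block must be a positive number of bytes");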

	if (fstat(fd, &sb) == -1)
		err(1, "fstat()");
	size = sb.st_size;

#if 0
	if (posix_fadvise(fd, (off_t)0, (off_t)0, POSIX_FADV_WILLNEED) == -1)
		err(1, "posix_fadvise()");
#endif

	ptr = mmap(NULL, size, PROT_READ, /*MAP_PREFAULT_READ |*/ MAP_PRIVATE, fd, (off_t)0);
	if (ptr == MAP_FAILED)
		err(1, "mmap()");

#if 0
	if (madvise(ptr, size, MADV_RANDOM) == -1)
		err(1, "madvise()");
#endif
#if 0
	/* Turn on super pages */
	if (madvise(ptr, size, MADV_WILLNEED) == -1)
		err(1, "madvise()");
#endif

	tmp = calloc(1, block);
	if (tmp == NULL)
		err(1, "calloc()");
	/* mincore() fills one byte per page, including a partial last page. */
	vec = calloc(1, (size + pagesize - 1) / pagesize);
	if (vec == NULL)
		err(1, "calloc()");
	for (i = 0; i < num; i++) {
		gettimeofday(&tp1, NULL);
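		for (ptrp = ptr; (size_t)(ptrp - ptr) < size; ptrp += block) {
			/* Clamp the last chunk so memcpy() never reads past the mapping. */
			size_t len = size - (size_t)(ptrp - ptr);

			if (len > (size_t)block)
				len = (size_t)block;
#if 0
			if (madvise(ptrp, len, MADV_WILLNEED) == -1)
				err(1, "madvise()");
#endif
			memcpy(tmp, ptrp, len);
		}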
		gettimeofday(&tp2, NULL);
		timersub(&tp2, &tp1, &tp);

		if (mincore(ptr, size, vec) == -1)
			err(1, "mincore()");

		none = incore = super = other = 0;
		for (vecp = vec; (size_t)(vecp - vec) < size / pagesize; vecp++) {
			if (*vecp == 0)
				none++;
			else if (*vecp & MINCORE_INCORE)
				incore++;
			else
				other++;
			if (*vecp & MINCORE_SUPER)
				super++;
		}
		warnx("%2d pass took: %3ld.%06ld (none: %6zu; res: %6zu; super: %6zu; other: %6zu)",
		   i + 1, (long)tp.tv_sec, (long)tp.tv_usec, none, incore,
		   super / (2048/4) /* 2 MB / 4 KB */, other);
	}
	free(vec);
	free(tmp);

	if (munmap(ptr, size) == -1)
		err(1, "munmap()");

	close(fd);

	exit(0);
}
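
For reference, the copy loop used in test 4 (madvise(MADV_WILLNEED) before
each memcpy(), as Konstantin suggested) can be pulled out into a standalone
helper. This is only a sketch and not the code above: the function name is
made up, the last chunk is clamped to the end of the mapping, and madvise()
errors are ignored instead of being fatal, since the call is only a hint.

```c
#define _DEFAULT_SOURCE		/* exposes madvise() on glibc; harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Copy `size` bytes starting at `map` through `buf` in chunks of at most
 * `block` bytes, advising the kernel about each chunk before touching it.
 * `buf` must hold at least `block` bytes.  Returns the bytes copied.
 */
size_t
copy_with_willneed(char *map, size_t size, char *buf, size_t block)
{
	size_t off, len;

	assert(block > 0);
	for (off = 0; off < size; off += len) {
		len = size - off;
		if (len > block)
			len = block;	/* never read past the mapping */
		/* Advice only: a failure (e.g. unaligned address) is ignored. */
		(void)madvise(map + off, len, MADV_WILLNEED);
		memcpy(buf, map + off, len);
	}
	return (off);
}
```

In the test program this would replace the inner for loop, called as
copy_with_willneed(ptr, size, tmp, block) between the two gettimeofday()
calls.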

--------------010408020308040305090701--


