Date:      Tue, 5 Dec 2000 23:13:08 -0800 (PST)
From:      Matt Dillon <dillon@earth.backplane.com>
To:        News History File User <newsuser@free-pr0n.netscum.dk>
Cc:        hackers@freebsd.org, usenet@tdk.net
Subject:   Re: vm_pageout_scan badness
Message-ID:  <200012060713.eB67D8I91529@earth.backplane.com>
References:  <200012011918.eB1JIol53670@earth.backplane.com> <200012020525.eB25PPQ92768@newsmangler.inet.tele.dk> <200012021904.eB2J4An63970@earth.backplane.com> <200012030700.eB370XJ22476@newsmangler.inet.tele.dk> <200012040053.eB40rnm69425@earth.backplane.com> <200012050545.eB55jL453889@crotchety.newsbastards.org> <200012060519.eB65JS910042@crotchety.newsbastards.org>


:To recap, the difference here is that by cheating, I was able to mlock
:one of the two files (the behaviour I was hoping to be able to achieve
:through first MAP_NOSYNC alone, then in combination with MADV_WILLNEED
:to keep all the pages in memory so much as possible) and achieve a much
:improved level of performance -- I'm able to catch up on backlogs from
:a full feed that had built up during the time I wasn't cheating -- by
:using memory for the history database files rather than for general
:filesystem caching.  I even have spare capacity!  Woo.
:
:The mlock man page refers to some system limit on wired pages; I get no
:error when mlock()'ing the hash file, and I'm reasonably sure I tweaked
:the INN source to treat both files identically (and on the other machines
:I have running, the timestamps of both files remains pretty much unchanged).
:I'm not sure why I'm not seeing the desired results here with both files
:(maybe some call hidden somewhere I haven't located yet), but I hope you
:can see the improvements so far.  I even let abusive readers pound on
:me.  Well, for a while 'til I got tired of 'em.

    I think you are on to something here.  It's got to be mlock().  Run
    'limit' from csh/tcsh and you will see a 'memorylocked' resource.

    Whatever that resource is set to when innd is run -- presumably however
    it is initialized for the 'news' user (see /etc/login.conf) -- is going
    to affect mlock() operation.

    mlock() will wire pages.  I think you can safely call it on your 
    two smaller history files (history.hash, history.index).  I can
    definitely see how this could result in better performance.

:I still don't know for certain if the disk updates I am seeing are
:slow because they aren't sorted well, or if they're random pages and
:not a sequential set, given that I hope I've ruled out fragmentation
:of the database files.  I still maintain that in the case of a true
:MADV_RANDOM madvise'd file, any attempts to clean out `unused' pages
:are ill-advised, or if they're needed, anything other than freeing of
:sequential pages results in excess disk activity that gains nothing,
:if it's the case that this is not how it's done, due to the nature
:of random access.

    History files are notorious for random I/O... the problem is due
    to the hash table being, well, a hash table.  The hash table 
    lookups are bad enough but this will also result in random-like
    lookups on the main history file.  You get a little better
    locality of reference on the main history file (meaning the system
    can do a better job caching it optimally), but the hash tables
    are a lost cause so mlock()ing them could be a very good thing.

:Yeah, hacking the vm source to allow me to mlock() isn't kosher, but
:I wanted to test a theory.  Doing so probably requires a few more
:tweaks in the INN source to handle expiry, so it seems, so I'd rather
:the vm subsystem do this for me automagically with the right invocation
:of the suitable mmap/madvise operations, if this is reasonable.

    At the moment madvise() MADV_WILLNEED does nothing more than activate
    the pages in question and force them into the process's pmap.
    You have to call it every so often to keep the pages 'fresh'... calling
    it once isn't going to do anything.  

    When you call madvise() MADV_WILLNEED the system has to go through
    a number of steps before the pages will be thrown away:  

	- it has to remove them from the process pmap
	- it has to deactivate them
	- it has to cache them
	- then it can free them

    You may be able to achieve an effect very similar to mlock(), but
    runnable by the 'news' user without hacking the kernel, by 
    writing a quick little C program to mmap() the two smaller history
    files and then madvise() the map using MADV_WILLNEED in a loop
    with a sleep(15).  Keeping in mind that expire may recreate those
    files, the program should unmap, close(), and re-open()/mmap/madvise the 
    descriptors every so often (say, once a minute).  You shouldn't have
    to touch the underlying pages, but doing so would have a similar
    effect.  If you do, use a volatile access so GCC doesn't optimize
    it out of the loop.  e.g.

	for (ptr = mapBase; ptr < mapEnd; ptr += pageSize) {
	    volatile char c = *ptr;
	}

    or

	for (ptr = mapBase; ptr < mapEnd; ptr += pageSize) {
	    dummyroutine(*ptr);
	}

    And my earlier suggestion above would look something like:

	for (;;) {
	    fd = open(path, O_RDONLY);
	    fstat(fd, &st);
	    mapBase = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	    for (i = 0; i < 15; ++i) {
		madvise(mapBase, st.st_size, MADV_WILLNEED);
		sleep(15);
	    }
	    munmap(mapBase, st.st_size);
	    close(fd);
	}

						-Matt


