Date:      Wed, 26 Jun 2002 00:58:25 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Terry Lambert <tlambert2@mindspring.com>
Cc:        Peter Wemm <peter@wemm.org>, Alfred Perlstein <bright@mu.org>, Patrick Thomas <root@utility.clubscholarship.com>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: tunings for many httpds...
Message-ID:  <200206260758.g5Q7wPgZ019100@apollo.backplane.com>
References:  <20020625222632.B7C7D3811@overcee.wemm.org> <3D1970E7.697D4A49@mindspring.com>

    Hmm.  I'm fairly sure that Linux does not quite do it that way.  I
    believe the 2-level page tables are copy-on-write, but that only
    gives you shareability across a fork() and then only for a little
    while.  I'm fairly certain that Linux cannot share page tables
    for post-fork modifications (like when you mmap() or get a SysV
    shared segment).  The rmap patches are roughly equivalent to our
    i386 pmap code and allow Rik to implement page queues and proper page
    aging.
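
    As an illustration of the reverse-mapping idea above, here is a
    minimal sketch in C.  It is not the Linux rmap code or the FreeBSD
    i386 pmap code; every name in it (phys_page, rmap_entry,
    PTE_REFERENCED, the aging constants) is hypothetical.  It only shows
    why a per-page chain of PTE references makes page aging possible:
    given a physical page, the pager can visit every PTE that maps it
    and test-and-clear the referenced bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical, chosen for illustration only. */
#define PTE_REFERENCED 0x1u        /* stand-in for a hardware "accessed" bit */

typedef uint32_t pte_t;

/* One node per mapping of a physical page; points back at the PTE. */
struct rmap_entry {
    pte_t *pte;
    struct rmap_entry *next;
};

/* Per-physical-page bookkeeping: head of the reverse-map chain plus an age. */
struct phys_page {
    struct rmap_entry *rmap;
    unsigned age;
};

/* Record that *pte maps this page.  The caller supplies the node storage. */
static void rmap_add(struct phys_page *pg, struct rmap_entry *e, pte_t *pte)
{
    assert(pg != NULL && e != NULL && pte != NULL);
    e->pte = pte;
    e->next = pg->rmap;
    pg->rmap = e;
}

/*
 * Walk every PTE that maps the page (a linear list traversal), clearing
 * the referenced bit wherever it is set.  Returns the number of mappings
 * that had the bit set.
 */
static unsigned rmap_test_and_clear_referenced(struct phys_page *pg)
{
    unsigned n = 0;

    for (struct rmap_entry *e = pg->rmap; e != NULL; e = e->next) {
        if (*e->pte & PTE_REFERENCED) {
            *e->pte &= ~PTE_REFERENCED;
            n++;
        }
    }
    return n;
}

/* Toy aging step: reward a referenced page, decay an idle one. */
static void page_age(struct phys_page *pg)
{
    if (rmap_test_and_clear_referenced(pg) > 0)
        pg->age += 3;
    else
        pg->age /= 2;
}
```

    Without the chain, the only way to find the PTEs for a given
    physical page is to scan every address space, which is why aging is
    hard to do well when mappings can only be followed in the forward
    direction.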

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

:Peter Wemm wrote:
:> >     Even more importantly it would be nice if we could share compatible
:> >     pmap pages, then we would have no need for 4MB pages... 50 mappings
:> >     of the same shared memory segment would wind up using the same pmap
:> >     pages as if only one mapping had been made.  Such a feature would work
:> >     for SysV shared memory and for mmap()s.  I've looked at doing this
:> >     off and on for two years but do not have a sufficient chunk of time
:> >     available yet.
:> 
:> SVR4/Solaris/Digital Unix^H^H^H^H^H^HTru64 do this by having an additional
:> layer between VM and pmap.  The equivalent of our pmap is just another one
:> of the address space handlers.  The SHM stuff etc is often implemented such
:> that it grabbed blocks of 4MB address space to manage in a way that it
:> likes.  This means it constructs its own page tables etc in such a way that
:> they are suitable for common use.  *If* I recall correctly, in SunOS/SVR4/
:> Solaris parlance this is the segment layer.  Naturally there is quite a bit
:> of variation.  It has been a long long time since I looked at this.
:
:Linux 2.4 has this with their "rmap" patch.  Alan Cox compares the
:VM system performance as "similar to what I see with FreeBSD".
:
:They are coming from a perspective of sharing all page mappings
:by pointing them at the same entries, without a reverse lookup
:mechanism (this is what the "rmap" patches add, the reverse lookup;
:linux has always shared equivalent page mappings).
:
:The reverse lookup maintains a linked list (for some reason, this
:is 12 bytes -- don't know why yet) that is a list of the PTE references
:to the mapping.  So the reverse means going backwards and doing a
:linear list traversal if the pages are shared (they usually are, for
:code pages for any program that's running more than one instance).
:
:For page waits, they use a shared hash, and then wake up processes
:unnecessarily, but they expect the contention to be minimal (they
:estimate 4-8% overhead under extreme load with quantum at 100ms).
:
:Doing this in FreeBSD would probably confuse the heck out of the
:existing page discard code's LRU determination (among other things),
:but it's probably worth it, for the cases you've mentioned.  I think
:the extra overhead in the unloaded case is in the noise, and in the
:loaded case, well worth the trade.
:
:I don't know how the PAE code you were rumored to be doing stands;
:if there were plans to put the PTE's for the process in the bank
:with the program pages that were running there, then doing this
:might prevent that from working very well, if those entries had to
:be shared with entries in another bank.
:
:-- Terry
:
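
The shared wait hash described in the quoted text can be modelled with a
small single-threaded sketch in C.  This is an assumption-laden toy, not
the Linux implementation: the table size, the hash function, and all names
(wait_table, wait_on_page, wake_page) are hypothetical.  It shows only why
hashing many pages onto a few wait queues causes unnecessary wakeups:
waking one page wakes every sleeper in the same bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WAIT_BUCKETS 4     /* deliberately tiny so collisions are easy to see */
#define MAX_WAITERS  16

/* A sleeper and the page it is waiting for (hypothetical layout). */
struct waiter {
    uintptr_t page;
    int awake;
};

struct wait_table {
    struct waiter w[MAX_WAITERS];
    size_t n;
};

/* Hash a page address to a bucket; assumes 4 KB pages for illustration. */
static unsigned wait_hash(uintptr_t page)
{
    return (unsigned)((page >> 12) % WAIT_BUCKETS);
}

/* Register a sleeper for the given page. */
static struct waiter *wait_on_page(struct wait_table *t, uintptr_t page)
{
    assert(t->n < MAX_WAITERS);
    struct waiter *w = &t->w[t->n++];
    w->page = page;
    w->awake = 0;
    return w;
}

/*
 * Wake every sleeper whose page hashes to the same bucket as `page`.
 * Returns the total number woken; *spurious counts those that were
 * actually waiting for a different page and would have to recheck
 * their condition and go back to sleep.
 */
static size_t wake_page(struct wait_table *t, uintptr_t page, size_t *spurious)
{
    unsigned bucket = wait_hash(page);
    size_t woken = 0;

    *spurious = 0;
    for (size_t i = 0; i < t->n; i++) {
        struct waiter *w = &t->w[i];
        if (w->awake || wait_hash(w->page) != bucket)
            continue;
        w->awake = 1;
        woken++;
        if (w->page != page)
            (*spurious)++;
    }
    return woken;
}
```

With four buckets, pages 0x1000 and 0x5000 collide, so waking 0x1000 also
wakes the sleeper on 0x5000, while a sleeper on 0x2000 is left alone.  A
larger table trades memory for fewer such collisions.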

