From: Terry Lambert
Date: Wed, 26 Jun 2002 00:44:40 -0700
To: Peter Wemm
Cc: Matthew Dillon, Alfred Perlstein, Patrick Thomas, freebsd-hackers@FreeBSD.ORG
Subject: Re: tunings for many httpds...
Message-ID: <3D1970E7.697D4A49@mindspring.com>
References: <20020625222632.B7C7D3811@overcee.wemm.org>

Peter Wemm wrote:
> > Even more importantly it would be nice if we could share compatible
> > pmap pages, then we would have no need for 4MB pages... 50 mappings
> > of the same shared memory segment would wind up using the same pmap
> > pages as if only one mapping had been made.  Such a feature would work
> > for SysV shared memory and for mmap()s.  I've looked at doing this
> > off and on for two years but do not have a sufficient chunk of time
> > available yet.
>
> SVR4/Solaris/Digital Unix^H^H^H^H^H^HTru64 do this by having an additional
> layer between VM and pmap.  The equivalent of our pmap is just another one
> of the address space handlers.  The SHM stuff etc is often implemented such
> that it grabbed blocks of 4MB address space to manage in a way that it
> likes.
> This means it constructs its own page tables etc in such a way that
> they are suitable for common use.  *If* I recall correctly, in SunOS/SVR4/
> Solaris parlance this is the segment layer.  Naturally there is quite a bit
> of variation.  It has been a long long time since I looked at this.

Linux 2.4 has this with their "rmap" patch.  Alan Cox describes the
VM system performance as "similar to what I see with FreeBSD".

They are coming from a perspective of sharing all page mappings by
pointing them at the same entries, without a reverse lookup mechanism
(the reverse lookup is what the "rmap" patches add; Linux has always
shared equivalent page mappings).

The reverse lookup maintains a linked list (for some reason, each
entry is 12 bytes; I don't know why yet) of the PTE references to the
mapping.  So the reverse lookup means going backwards and doing a
linear list traversal if the pages are shared (they usually are, for
the code pages of any program that's running more than one instance).

For page waits, they use a shared hash, and so wake up some processes
unnecessarily, but they expect the contention to be minimal (they
estimate 4-8% overhead under extreme load with the quantum at 100ms).

Doing this in FreeBSD would probably confuse the heck out of the
existing page discard code's LRU determination (among other things),
but it's probably worth it for the cases you've mentioned.  I think
the extra overhead in the unloaded case is in the noise, and in the
loaded case, well worth the trade.

I don't know how the PAE code you were rumored to be doing stands;
if there were plans to put the PTEs for a process in the same bank
as the program pages that were running there, then doing this might
prevent that from working very well, if those entries had to be
shared with entries in another bank.

-- Terry
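
For illustration only, here is a minimal C sketch of the kind of
reverse-lookup list described above: each physical page keeps a singly
linked list of the PTEs that reference it, and unmapping the page is a
linear walk over that list.  The names pte_t, struct pte_chain,
struct page, page_add_rmap() and page_unmap_all() are assumptions made
up for this sketch; they are not taken from the Linux "rmap" patch or
from FreeBSD, and the node layout is not meant to reproduce the
12-byte entry mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hardware page table entry. */
typedef unsigned long pte_t;

/* One list node per PTE that references the page (illustrative layout). */
struct pte_chain {
    struct pte_chain *next;  /* next reference to the same page */
    pte_t *ptep;             /* the PTE that maps the page */
};

/* Hypothetical per-page bookkeeping: just the head of the reverse list. */
struct page {
    struct pte_chain *chain;
};

/* Record that *ptep now maps pg.  Returns 0 on success, -1 if out of memory. */
int page_add_rmap(struct page *pg, pte_t *ptep)
{
    struct pte_chain *pc;

    assert(pg != NULL && ptep != NULL);
    pc = malloc(sizeof(*pc));
    if (pc == NULL)
        return -1;
    pc->ptep = ptep;
    pc->next = pg->chain;    /* push onto the front of the list */
    pg->chain = pc;
    return 0;
}

/*
 * Reverse lookup: walk the list linearly, clear every PTE that maps pg,
 * and free the list nodes.  Returns the number of mappings removed.
 * The cost is proportional to the number of sharers of the page.
 */
size_t page_unmap_all(struct page *pg)
{
    size_t n = 0;

    assert(pg != NULL);
    while (pg->chain != NULL) {
        struct pte_chain *pc = pg->chain;

        pg->chain = pc->next;
        *pc->ptep = 0;       /* invalidate this mapping */
        free(pc);
        n++;
    }
    return n;
}
```

A page shared by three processes would get three page_add_rmap()
calls, one per PTE, and a later page_unmap_all() would visit all three
entries to tear the mappings down.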