From: Chuck Swiger <cswiger@mac.com>
To: Luke Marsden, freebsd-questions@freebsd.org
Date: Tue, 06 Mar 2012 18:30:07 -0500
Subject: Re: FreeBSD 8.2 - active plus inactive memory leak!?
Message-ID: <4F569DFF.8040807@mac.com>
In-Reply-To: <1331061203.2218.38.camel@pow>

On 3/6/2012 2:13 PM, Luke Marsden wrote:
[ ... ]
> My current (probably quite simplistic) understanding of the FreeBSD
> virtual memory system is that, for each process as reported by top:
>
>      * Size corresponds to the total size of all the text pages for
>        the process (those belonging to code in the binary itself and
>        linked libraries) plus data pages (including stack and
>        malloc()'d but not-yet-written-to memory segments).

Size is the amount of the process's VM address space which has been
assigned; the things you mention are indeed the common consumers of
address space, but there are others, such as shared memory (i.e., SysV
shmem), memory-mapped hardware like a video card's VRAM buffer,
thread-local storage, etc.

>      * Resident corresponds to a subset of the pages above: those
>        pages which actually occupy physical/core memory. Notably,
>        pages may appear in Size but not in Resident -- for example,
>        read-only text pages from libraries which have not been used
>        yet, or pages which have been malloc()'d but not yet written
>        to.

Yes.
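If you want to see that Size-vs-Resident distinction directly, something
like the following should do it (a minimal, illustrative sketch, not
part of the original exchange; the 256 MB figure is arbitrary): an
anonymous mmap() enlarges the address space at once, but the pages only
become resident as they are touched.

    /*
     * Illustrative sketch: SIZE vs RES in top(1).  The anonymous
     * mapping shows up in SIZE immediately; RES only grows once the
     * pages are written to.
     */
    #include <sys/mman.h>

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define REGION  (256UL * 1024 * 1024)   /* 256 MB of address space */

    int
    main(void)
    {
            char *p = mmap(NULL, REGION, PROT_READ | PROT_WRITE,
                MAP_ANON | MAP_PRIVATE, -1, 0);

            if (p == MAP_FAILED) {
                    perror("mmap");
                    return (1);
            }

            printf("pid %d: mapped %lu MB; check SIZE vs RES in top, "
                "then press Enter\n", (int)getpid(), REGION >> 20);
            getchar();                      /* SIZE grew, RES did not */

            memset(p, 1, REGION / 2);       /* touch half of the pages */
            printf("touched %lu MB; RES should grow by about that much\n",
                (REGION / 2) >> 20);
            getchar();
            return (0);
    }

Run it and watch the process in top before and after each Enter.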
> My understanding for the values for the system as a whole (at the top
> in 'top') is as follows:
>
>      * Active / inactive memory is the same thing: resident memory
>        from processes in use. Being in the inactive as opposed to
>        active list simply indicates that the pages in question are
>        less recently used and therefore more likely to get swapped
>        out if the machine comes under memory pressure.

Well, they aren't exactly the same thing. The kernel implements a VM
working-set algorithm which periodically looks at all of the pages in
memory and notes whether a process has accessed each page recently. If
it has, the page is active; if the page has not been used for "some
time", it becomes inactive. If the system has plenty of memory, it will
not page or swap anything out. Under mild memory pressure, it will only
consider inactive or cache pages as candidates to page out. Only under
more severe memory pressure will it start looking to swap out entire
processes rather than just paging out individual pages.

[ Although the FreeBSD implementation supposedly tries to balance the
sizes of the active, inactive, and cache lists (or queues), so it does
look at the active list too -- but you don't want to page out an active
page unless you really have to, and if you have to do that, you might
as well free up the whole process and let something else have enough
room to run. ]

>      * Wired is mostly kernel memory.

It's normally all kernel memory; only a rare handful of userland
programs, such as crypto code like gnupg, ever ask for wired memory,
AFAIK.

>      * Cache is freed memory which the kernel has decided to keep in
>        case it corresponds to a useful page in future; it can be
>        cheaply evicted into the free list.

Sort of, although this description fits the "inactive" memory category
also. The major distinction is that the system is actively trying to
flush any dirty pages in the cache category, so that they are available
for reuse by something else immediately.

>      * Free memory is actually not being used for anything.

Yes, although the system likes to have at least a few pre-zeroed pages
handy in case an interrupt handler needs them.

> It seems that pages which occur in the active + inactive lists must
> occur in the resident memory of one or more processes ("or more"
> since processes can share pages in e.g. read-only shared libs or COW
> forked address space).

Everything in the active and inactive (and cache) lists is resident in
physical memory.

> Conversely, if a page *does not* occur in the resident memory of any
> process, it must not occupy any space in the active + inactive lists.

Hmm... if a process gets swapped out entirely, its pages will be moved
to the cache list, flushed, and then reused as soon as the disk I/O
completes. But there is a window where the process can be marked as
swapped out (and considered no longer resident) while it still has some
of its pages in physical memory.

> Therefore the active + inactive memory should always be less than or
> equal to the sum of the resident memory of all the processes on the
> system, right?

No. If you've got a lot of process pages shared (i.e., a webserver with
lots of httpd children, or a database pulling in a large common shmem
area), then your summed process resident sizes can be very large
compared to the system-wide active+inactive count.

> This "missing memory" is scary, because it seems to be increasing
> over time, and eventually when the system runs out of free memory,
> I'm certain it will crash in the same way described in my previous
> thread [1].

I don't have enough data to fully evaluate the interactions with ZFS;
you can easily get system panics by running out of KVA on a 32-bit
system, but that shouldn't apply to a 64-bit kernel -- and in any case
that's kernel memory, not system VM. What you've described sounds
pretty much like the classic load spiral experienced by pre-forking
webservers when the max # of children which can run isn't constrained
to something that fits in memory reasonably well without excessive
paging, much less swapping.
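If it helps while you chase the growth, the system-wide counters that
top summarizes are exported as sysctls, so you can log them over time.
Here's a minimal sketch (again mine, not from the original thread); the
vm.stats.vm.* names below are what FreeBSD 8.x uses, but double-check
them against `sysctl vm.stats.vm` on your box.

    /*
     * Sketch: print the page-queue counters that top(1) summarizes,
     * using sysctlbyname(3).
     */
    #include <sys/types.h>
    #include <sys/sysctl.h>

    #include <stdio.h>
    #include <stdlib.h>

    /* Fetch one unsigned-integer sysctl value or exit on error. */
    static unsigned int
    get_count(const char *name)
    {
            unsigned int value;
            size_t len = sizeof(value);

            if (sysctlbyname(name, &value, &len, NULL, 0) == -1) {
                    perror(name);
                    exit(1);
            }
            return (value);
    }

    int
    main(void)
    {
            const char *queues[] = {
                    "vm.stats.vm.v_active_count",
                    "vm.stats.vm.v_inactive_count",
                    "vm.stats.vm.v_cache_count",
                    "vm.stats.vm.v_wire_count",
                    "vm.stats.vm.v_free_count",
            };
            unsigned int pagesize = get_count("vm.stats.vm.v_page_size");
            size_t i;

            for (i = 0; i < sizeof(queues) / sizeof(queues[0]); i++) {
                    unsigned int count = get_count(queues[i]);

                    printf("%-30s %10u pages %8llu MB\n", queues[i],
                        count,
                        (unsigned long long)count * pagesize /
                        (1024 * 1024));
            }
            return (0);
    }

Running `sysctl vm.stats.vm` from a shell gives the same numbers
without any code, if that's easier to script.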
> Is my understanding of the virtual memory system badly broken -- in
> which case please educate me ;-) -- or is there a real problem here?
> If so, how can I dig deeper to help uncover/fix it?

You've got a pretty good understanding of VM, but the devil is in the
details.

Regards,
-- 
-Chuck