From owner-freebsd-stable@FreeBSD.ORG  Wed Mar  7 00:36:27 2012
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B7CE01065678;
	Wed,  7 Mar 2012 00:36:27 +0000 (UTC)
	(envelope-from luke@hybrid-logic.co.uk)
Received: from hybrid-sites.com (ns226322.hybrid-sites.com [176.31.229.137])
	by mx1.freebsd.org (Postfix) with ESMTP id 741958FC22;
	Wed,  7 Mar 2012 00:36:26 +0000 (UTC)
Received: from [127.0.0.1] (helo=ewes)
	by hybrid-sites.com with esmtp (Exim 4.72 (FreeBSD))
	(envelope-from <luke@hybrid-logic.co.uk>)
	id 1S54rf-000JaQ-8o; Wed, 07 Mar 2012 00:36:24 +0000
Received: from [78.105.122.99] (helo=[192.168.1.23]
	by ns226322.hybrid-sites.com
	with esmtp (Hybrid Web Cluster distributed mail proxy)
	(envelope-from <luke@hybrid-logic.co.uk>);
	Wed, 07 Mar 2012 00:36:23 -0000
From: Luke Marsden <luke@hybrid-logic.co.uk>
To: Chuck Swiger <cswiger@mac.com>
In-Reply-To: <4F569DFF.8040807@mac.com>
References: <1331061203.2218.38.camel@pow>  <4F569DFF.8040807@mac.com>
Content-Type: text/plain; charset="UTF-8"
Date: Wed, 07 Mar 2012 00:36:21 +0000
Message-ID: <1331080581.2589.28.camel@pow>
Mime-Version: 1.0
X-Mailer: Evolution 2.32.2 
Content-Transfer-Encoding: 7bit
X-Spam-bar: +
Cc: freebsd-fs@freebsd.org, team@hybrid-logic.co.uk, freebsd-stable@freebsd.org,
	freebsd-questions@freebsd.org
Subject: Re: FreeBSD 8.2 - active plus inactive memory leak!?
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 07 Mar 2012 00:36:27 -0000

Thanks for your email, Chuck.

> > Conversely, if a page *does not* occur in the resident
> > memory of any process, it must not occupy any space in the active +
> > inactive lists.
> 
> Hmm...if a process gets swapped out entirely, the pages for it will be moved 
> to the cache list, flushed, and then reused as soon as the disk I/O completes. 
>   But there is a window where the process can be marked as swapped out (and 
> considered no longer resident), but still has some of its pages in physical 
> memory.

There's no swapping happening on these machines (intentionally so,
because as soon as we hit swap everything goes tits up), so this window
doesn't concern me.

I'm trying to confirm that, on a system with no pages swapped out, the
following is a true statement:

        a page is accounted for in active + inactive if and only if it
        corresponds to one or more of the pages accounted for in the
        resident memory lists of all the processes on the system (as per
        the output of 'top' and 'ps')

> > Therefore the active + inactive memory should always be less than or
> > equal to the sum of the resident memory of all the processes on the
> > system, right?
> 
> No.  If you've got a lot of process pages shared (i.e., a webserver with lots of 
> httpd children, or a database pulling in a large common shmem area), then your 
> process resident sizes can be very large compared to the system-wide 
> active+inactive count.

But that's what I'm saying...

        sum(process resident sizes) >= active + inactive
        
Or as I said it above, equivalently:
        
        active + inactive <= sum(process resident sizes)
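To illustrate why shared pages push the inequality in this direction, here is a minimal toy model. It assumes each process's resident set can be treated as a set of physical page numbers; the process names and page numbers are made up for illustration and do not come from the affected machines.

```python
# Toy model: each process's resident set is a set of physical page numbers.
# Names and page numbers are hypothetical, chosen only to show the counting.
resident = {
    "httpd-1": {1, 2, 3, 10, 11},   # pages 1-3 are shared with httpd-2
    "httpd-2": {1, 2, 3, 20},
    "postgres": {30, 31},
}


def sum_of_rss(resident_sets):
    """Sum of per-process resident sizes; shared pages are counted once per process."""
    return sum(len(pages) for pages in resident_sets.values())


def unique_resident_pages(resident_sets):
    """Number of distinct physical pages resident in at least one process."""
    return len(set().union(*resident_sets.values()))


if __name__ == "__main__":
    print("sum of RSS:  ", sum_of_rss(resident))
    print("unique pages:", unique_resident_pages(resident))
```

Under this model the sum of resident sizes can only overcount, never undercount, the distinct pages that are resident in some process.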

The data I've got from this system, and what's killing us, shows the
opposite: active + inactive > sum(process resident sizes), by nearly 5GB
now and growing, which is what keeps causing these machines to crash.

In particular:
Mem: 13G Active, 1129M Inact, 7543M Wired, 120M Cache, 1553M Free

But the total sum of resident memories is 9457M (according to summing
the output from ps or top).

        13G + 1129M = 13312M + 1129M = 14441M (active + inact) > 9457M (sum of res)

That's 4984M out, and that's almost enough to push us over the edge.
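
The arithmetic above can be reproduced with a short sketch. It assumes 1G = 1024M, and the helper name `to_mib` is hypothetical. Note that top prints "13G" at whole-gigabyte precision, so the real figure may differ from this by several hundred MB.

```python
def to_mib(size):
    """Convert a top-style size such as '13G' or '1129M' to MiB (1G = 1024M assumed)."""
    units = {"M": 1, "G": 1024}
    number, unit = size[:-1], size[-1].upper()
    return int(number) * units[unit]


active = to_mib("13G")        # from the 'Mem:' line above
inactive = to_mib("1129M")    # from the 'Mem:' line above
sum_of_res = 9457             # MiB, summed from ps/top as described above

discrepancy = active + inactive - sum_of_res

if __name__ == "__main__":
    print("active + inact =", active + inactive, "M")
    print("discrepancy    =", discrepancy, "M")
```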

If my understanding of VM is correct, I don't see how this can happen.
But it's happening, and it's causing real trouble here because our free
memory keeps hitting zero and then we swap-spiral.

What can I do to investigate this discrepancy?  Are there some tools
that I can use to debug the memory allocated in "active" to find out
where it's going, if not to resident process memory?
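
For tracking the discrepancy over time, a minimal measurement sketch follows. It does not say where the memory is going; it only computes the two sides of the comparison from text captured on the machine. It assumes the input comes from `sysctl hw.pagesize vm.stats.vm.v_active_count vm.stats.vm.v_inactive_count` and from `ps -axo rss=` (RSS in 1024-byte units); the function names are hypothetical, and the two captures are not taken atomically, so small differences are expected.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines, as printed by sysctl, into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = int(value.strip())
    return values


def active_plus_inactive_mib(sysctl_text):
    """Active + inactive memory in MiB, computed from page counts and page size."""
    v = parse_sysctl(sysctl_text)
    pages = v["vm.stats.vm.v_active_count"] + v["vm.stats.vm.v_inactive_count"]
    return pages * v["hw.pagesize"] // (1024 * 1024)


def sum_rss_mib(ps_text):
    """Sum of per-process RSS in MiB; expects one KiB value per line, no header."""
    return sum(int(tok) for tok in ps_text.split()) // 1024


if __name__ == "__main__":
    # Sample values only, chosen to match the figures quoted above.
    sample_sysctl = (
        "hw.pagesize: 4096\n"
        "vm.stats.vm.v_active_count: 3407872\n"
        "vm.stats.vm.v_inactive_count: 289024\n"
    )
    sample_ps = "1024\n2048\n 512\n"
    print(active_plus_inactive_mib(sample_sysctl), "M active + inactive")
    print(sum_rss_mib(sample_ps), "M summed RSS")
```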

Thanks,
Luke

-- 
CTO, Hybrid Logic
+447791750420  |  +1-415-449-1165  | www.hybrid-cluster.com