From owner-freebsd-stable@FreeBSD.ORG  Thu Aug 21 07:08:41 2014
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id B4367146
 for <freebsd-stable@freebsd.org>; Thu, 21 Aug 2014 07:08:41 +0000 (UTC)
Received: from pp2.rice.edu (proofpoint2.mail.rice.edu [128.42.201.101])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6CCE43ED9
 for <freebsd-stable@freebsd.org>; Thu, 21 Aug 2014 07:08:40 +0000 (UTC)
Received: from pps.filterd (pp2.rice.edu [127.0.0.1])
 by pp2.rice.edu (8.14.5/8.14.5) with SMTP id s7L78dva005500;
 Thu, 21 Aug 2014 02:08:39 -0500
Received: from mh1.mail.rice.edu (mh1.mail.rice.edu [128.42.201.20])
 by pp2.rice.edu with ESMTP id 1nw8mur108-1;
 Thu, 21 Aug 2014 02:08:38 -0500
X-Virus-Scanned: by amavis-2.7.0 at mh1.mail.rice.edu, auth channel
Received: from 108-254-203-201.lightspeed.hstntx.sbcglobal.net
 (108-254-203-201.lightspeed.hstntx.sbcglobal.net [108.254.203.201])
 (using TLSv1 with cipher RC4-MD5 (128/128 bits))
 (No client certificate requested) (Authenticated sender: alc)
 by mh1.mail.rice.edu (Postfix) with ESMTPSA id 94AC246027C;
 Thu, 21 Aug 2014 02:08:38 -0500 (CDT)
Message-ID: <53F59AF6.6080204@rice.edu>
Date: Thu, 21 Aug 2014 02:08:38 -0500
From: Alan Cox <alc@rice.edu>
User-Agent: Mozilla/5.0 (X11; FreeBSD i386;
 rv:24.0) Gecko/20100101 Thunderbird/24.2.0
MIME-Version: 1.0
To: "Polyack, Steve" <Steve.Polyack@intermedix.com>,
 "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject: Re: vmdaemon CPU usage and poor performance in 10.0-RELEASE
References: <4D557EC7CC2A544AA7C1A3B9CBA2B36726098846B4@exchange03.epbs.com>
 <20140813152522.GI9400@home.opsec.eu>
 <4D557EC7CC2A544AA7C1A3B9CBA2B36726098847AF@exchange03.epbs.com>
 <CAJUyCcNoTJ3xqkC_Prz3N+ApEqYy3Mi2gA+uDo33dczaTMONrA@mail.gmail.com>
 <4D557EC7CC2A544AA7C1A3B9CBA2B3672609BBA3C4@exchange03.epbs.com>
 <53F24E5B.1010809@rice.edu>
 <4D557EC7CC2A544AA7C1A3B9CBA2B3672609BBA64F@exchange03.epbs.com>
 <53F2790C.20703@rice.edu>
 <4D557EC7CC2A544AA7C1A3B9CBA2B3672609CF28E5@exchange03.epbs.com>
 <4D557EC7CC2A544AA7C1A3B9CBA2B3672609CF2F8F@exchange03.epbs.com>
 <4D557EC7CC2A544AA7C1A3B9CBA2B3672609CF31F0@exchange03.epbs.com>
 <53F4C4C2.1030109@rice.edu>
 <4D557EC7CC2A544AA7C1A3B9CBA2B3672609CF335D@exchange03.epbs.com>
 <53F4C82E.5000900@rice.edu>
 <4D557EC7CC2A544AA7C1A3B9CBA2B3672609CF33F6@exchange03.epbs.com>
In-Reply-To: <4D557EC7CC2A544AA7C1A3B9CBA2B3672609CF33F6@exchange03.epbs.com>
X-Enigmail-Version: 1.6
Content-Type: multipart/mixed; boundary="------------000308090204000308010002"
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0
 kscore.is_bulkscore=0
 kscore.compositescore=0 circleOfTrustscore=0
 compositescore=0.353842562264287 urlsuspect_oldscore=0.353842562264287
 suspectscore=11 recipient_domain_to_sender_totalscore=0 phishscore=0
 bulkscore=0 kscore.is_spamscore=0 recipient_to_sender_totalscore=0
 recipient_domain_to_sender_domain_totalscore=0 rbsscore=0.353842562264287
 spamscore=0 recipient_to_sender_domain_totalscore=0 urlsuspectscore=0.9
 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=7.0.1-1402240000 definitions=main-1408210095
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-stable>,
 <mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable/>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
 <mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 21 Aug 2014 07:08:41 -0000

This is a multi-part message in MIME format.
--------------000308090204000308010002
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

On 08/20/2014 11:22, Polyack, Steve wrote:
>> -----Original Message-----
>> From: Alan Cox [mailto:alc@rice.edu]
>> Sent: Wednesday, August 20, 2014 12:09 PM
>> To: Polyack, Steve; freebsd-stable@freebsd.org
>> Subject: Re: vmdaemon CPU usage and poor performance in 10.0-RELEASE
>>
>> On 08/20/2014 10:56, Polyack, Steve wrote:
>>>> -----Original Message-----
>>>> From: Alan Cox [mailto:alc@rice.edu]
>>>> Sent: Wednesday, August 20, 2014 11:55 AM
>>>> To: Polyack, Steve; freebsd-stable@freebsd.org
>>>> Subject: Re: vmdaemon CPU usage and poor performance in 10.0-RELEASE
>>>>
>>>> On 08/20/2014 09:55, Polyack, Steve wrote:
>>>>>> -----Original Message-----
>>>>>> From: Polyack, Steve
>>>>>> Sent: Wednesday, August 20, 2014 9:14 AM
>>>>>> To: Polyack, Steve; Alan Cox; freebsd-stable@freebsd.org
>>>>>> Subject: RE: vmdaemon CPU usage and poor performance in 10.0-
>> RELEASE
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: owner-freebsd-stable@freebsd.org [mailto:owner-freebsd-
>>>>>>> stable@freebsd.org] On Behalf Of Polyack, Steve
>>>>>>> Sent: Tuesday, August 19, 2014 12:37 PM
>>>>>>> To: Alan Cox; freebsd-stable@freebsd.org
>>>>>>> Subject: RE: vmdaemon CPU usage and poor performance in 10.0-
>>>> RELEASE
>>>>>>>> -----Original Message-----
>>>>>>>> From: owner-freebsd-stable@freebsd.org [mailto:owner-freebsd-
>>>>>>>> stable@freebsd.org] On Behalf Of Alan Cox
>>>>>>>> Sent: Monday, August 18, 2014 6:07 PM
>>>>>>>> To: freebsd-stable@freebsd.org
>>>>>>>> Subject: Re: vmdaemon CPU usage and poor performance in 10.0-
>>>>>> RELEASE
>>>>>>>> On 08/18/2014 16:29, Polyack, Steve wrote:
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: owner-freebsd-stable@freebsd.org [mailto:owner-freebsd-
>>>>>>>>>> stable@freebsd.org] On Behalf Of Alan Cox
>>>>>>>>>> Sent: Monday, August 18, 2014 3:05 PM
>>>>>>>>>> To: freebsd-stable@freebsd.org
>>>>>>>>>> Subject: Re: vmdaemon CPU usage and poor performance in 10.0-
>>>>>>> RELEASE
>>>>>>>>>> On 08/18/2014 13:42, Polyack, Steve wrote:
>>>>>>>>>>> Excuse my poorly formatted reply at the moment, but this seems
>> to
>>>>>>>> have
>>>>>>>>>> fixed our problems.  I'm going to update the bug report with a
>> note.
>>>>>>>>>>> Thanks Alan!
>>>>>>>>>> You're welcome.  And, thanks for letting me know of the outcome.
>>>>>>>>>>
>>>>>>>>> Actually, I may have spoken too soon, as it looks like we're seeing
>>>>>>>> vmdaemon tying up the system again:
>>>>>>>>> root                6  100.0  0.0        0       16  -  DL   Wed04PM      4:37.95
>>>>>>> [vmdaemon]
>>>>>>>>> Is there anything I can check to help narrow down what may be the
>>>>>>>> problem?  KTrace/truss on the "process" doesn't give any
>> information, I
>>>>>>>> suppose because it's actually a kernel thread.
>>>>>>>>
>>>>>>>> Can you provide the full output of top?  Is there anything unusual
>>>> about
>>>>>>>> the hardware or software configuration?
>>>>>>> This may have just been a fluke (maybe NFS caching the old
>>>> vm_pageout.c
>>>>>>> during the first source build).  We've rebuilt and are monitoring it
>> now.
>>>>>>> The hardware consists of a few Dell PowerEdge R720xd servers with
>>>> 256GB
>>>>>>> of RAM and array of SSDs (no ZFS).  64GB is dedicated to postgres
>>>>>>> shared_buffers right now. FreeBSD 10, PostgreSQL 9.3, Slony-I v2.2.2,
>>>> and
>>>>>>> redis-2.8.11 are all in use here.  I can't say that anything is unusual
>> about
>>>>>> the
>>>>>>> configuration.
>>>>>>>
>>>>>> We are still seeing the issue.  It seems to manifest once the "Free"
>>>> memory
>>>>>> gets under 10GB (of 256GB on the system), even though ~200GB of this
>> is
>>>>>> classified as Inactive.  For us, this was about 7 hours of database
>> activity
>>>>>> (initial replication w/ slony).  Right now vmdaemon is consuming 100%
>>>> CPU
>>>>>> and shows 671:34 CPU time when it showed 0:00 up until the problem
>>>>>> manifested.  The full top output (that fits on my screen) is below:
>>>>>>
>>>>>> last pid: 62309;  load averages:  4.05,  4.24,  4.10
>>>>>> up 0+22:34:31  09:08:43
>>>>>> 159 processes: 8 running, 145 sleeping, 1 waiting, 5 lock
>>>>>> CPU: 14.5% user,  0.0% nice,  4.9% system,  0.0% interrupt, 80.5% idle
>>>>>> Mem: 26G Active, 216G Inact, 4122M Wired, 1178M Cache, 1632M Buf,
>>>> 2136M
>>>>>> Free
>>>>>> Swap: 32G Total, 32G Free
>>>>>>
>>>>>>   PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU
>>>>>> COMMAND
>>>>>>    11 root            32 155 ki31     0K   512K CPU31  31 669.6H 2934.23% idle
>>>>>>     6 root             1 -16    -     0K    16K CPU19  19 678:57 100.00% vmdaemon
>>>>>>  1963 pgsql            1  45    0 67538M   208M CPU0    0 121:46  17.38%
>> postgres
>>>>>>  2037 pgsql            1  77    0 67536M  2200K *vm ob 14   6:24  15.97%
>> postgres
>>>>>>  1864 pgsql            1  31    0 67536M  1290M semwai  4 174:41  15.19%
>>>> postgres
>>>>>>  1996 pgsql            1  38    0 67538M   202M semwai 16 120:27  15.09%
>>>> postgres
>>>>>>  1959 pgsql            1  39    0 67538M   204M CPU27  27 117:30  15.09%
>> postgres
>>>>>>  1849 pgsql            1  32    0 67536M  1272M semwai 23 126:22  13.96%
>>>> postgres
>>>>>>  1997 pgsql            1  31    0 67538M   206M CPU30  30 122:26  11.77%
>> postgres
>>>>>>  2002 pgsql            1  34    0 67538M   182M sbwait 11  55:20  11.28%
>> postgres
>>>>>>  1961 pgsql            1  32    0 67538M   206M CPU12  12 121:47  10.99%
>> postgres
>>>>>>  1964 pgsql            1  30    0 67538M   206M semwai 28 122:08   9.86%
>> postgres
>>>>>>  1962 pgsql            1  29    0 67538M  1286M sbwait  2  45:49   7.18%
>> postgres
>>>>>>  1752 root             1  22    0 78356K  8688K CPU2    2 175:46   6.88% snmpd
>>>>>>  1965 pgsql            1  25    0 67538M   207M semwai  9 120:55   6.59%
>> postgres
>>>>>>  1960 pgsql            1  23    0 67538M   177M semwai  6  52:42   4.88%
>> postgres
>>>>>>  1863 pgsql            1  25    0 67542M   388M semwai 25   9:12   2.20%
>> postgres
>>>>>>  1859 pgsql            1  22    0 67538M  1453M *vm ob 20   6:13   2.10%
>> postgres
>>>>>>  1860 pgsql            1  22    0 67538M  1454M sbwait  8   6:08   1.95% postgres
>>>>>>  1848 pgsql            1  21    0 67586M 66676M *vm ob 30 517:07   1.66%
>>>> postgres
>>>>>>  1856 pgsql            1  22    0 67538M   290M *vm ob 15   5:39   1.66%
>> postgres
>>>>>>  1846 pgsql            1  21    0 67538M   163M sbwait 15   5:46   1.46% postgres
>>>>>>  1853 pgsql            1  21    0 67538M   110M sbwait 30   8:54   1.17% postgres
>>>>>>  1989 pgsql            1  23    0 67536M  5180K sbwait 18   1:41   0.98% postgres
>>>>>>     5 root             1 -16    -     0K    16K psleep  6   9:33   0.78% pagedaemon
>>>>>>  1854 pgsql            1  20    0 67538M   338M sbwait 22   5:38   0.78% postgres
>>>>>>  1861 pgsql            1  20    0 67538M   286M sbwait 15   6:13   0.68% postgres
>>>>>>  1857 pgsql            1  20    0 67538M  1454M semwai 10   6:19   0.49%
>> postgres
>>>>>>  1999 pgsql            1  36    0 67538M   156M *vm ob 28 120:56   0.39%
>> postgres
>>>>>>  1851 pgsql            1  20    0 67538M   136M sbwait 22   5:48   0.39% postgres
>>>>>>  1975 pgsql            1  20    0 67536M  5688K sbwait 25   1:40   0.29% postgres
>>>>>>  1858 pgsql            1  20    0 67538M   417M sbwait  3   5:55   0.20% postgres
>>>>>>  2031 pgsql            1  20    0 67536M  5664K sbwait  5   3:26   0.10% postgres
>>>>>>  1834 root            12  20    0 71892K 12848K select 20  34:05   0.00% slon
>>>>>>    12 root            78 -76    -     0K  1248K WAIT    0  25:47   0.00% intr
>>>>>>  2041 pgsql            1  20    0 67536M  5932K sbwait 14  12:50   0.00%
>> postgres
>>>>>>  2039 pgsql            1  20    0 67536M  5960K sbwait 17   9:59   0.00% postgres
>>>>>>  2038 pgsql            1  20    0 67536M  5956K sbwait  6   8:21   0.00% postgres
>>>>>>  2040 pgsql            1  20    0 67536M  5996K sbwait  7   8:20   0.00% postgres
>>>>>>  2032 pgsql            1  20    0 67536M  5800K sbwait 22   7:03   0.00% postgres
>>>>>>  2036 pgsql            1  20    0 67536M  5748K sbwait 23   6:38   0.00% postgres
>>>>>>  1812 pgsql            1  20    0 67538M 59185M select  1   5:46   0.00% postgres
>>>>>>  2005 pgsql            1  20    0 67536M  5788K sbwait 23   5:14   0.00% postgres
>>>>>>  2035 pgsql            1  20    0 67536M  4892K sbwait 18   4:52   0.00%
>> <postgres>
>>>>>>  1852 pgsql            1  21    0 67536M  1230M semwai  7   4:47   0.00%
>> postgres
>>>>>>    13 root             3  -8    -     0K    48K -      28   4:46   0.00% geom
>>>>>>
>>>>>>
>>>>> Another thing I've noticed is that this sysctl vm.stats counter is
>> increasing
>>>> fairly rapidly:
>>>>> # sysctl vm.stats.vm.v_pdpages && sleep 1 && sysctl
>>>> vm.stats.vm.v_pdpages
>>>>> vm.stats.vm.v_pdpages: 3455264541
>>>>> vm.stats.vm.v_pdpages: 3662158383
>>>> I'm not sure what that tells us, because both the page daemon and the
>> vm
>>>> ("swap") daemon increment this counter.
>>>>
>>>>> Also, to demonstrate what kind of problems this seems to cause:
>>>>> # time sleep 1
>>>>>
>>>>> real	0m18.288s
>>>>> user	0m0.001s
>>>>> sys	0m0.004s
>>>> If you change the sysctl vm.swap_enabled to 0, how does your system
>>>> behave?
>>>>
>>> Setting vm.swap_enabled to 0 made the problem clear up almost
>> instantly.  vmdaemon is back to 0.00% CPU usage and the system is
>> responsive once again.
>>>
>> I doubt that you need whole process swapping.  The page daemon is
>> probably sufficient.  See how things go for a few days and let me know.
>>
>> There is still a bug here that needs diagnosing and fixing.  So, I will
>> likely send you a debugging patch in the near future, and ask you to
>> reenable swapping under that patch.
>>
> If it helps at all - setting vm.swap_enabled=0 seems to fix the problem even without the aforementioned patch to vm_pageout.c.
>


I have a couple hypotheses for what is causing your problem.  The
attached patch addresses one of them.  Please apply this patch and then
reset vm._swap_enabled back to 1.


--------------000308090204000308010002
Content-Type: text/plain; charset=ISO-8859-15;
 name="page_shortage1.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="page_shortage1.patch"

Index: vm/vm_pageout.c
===================================================================
--- vm/vm_pageout.c	(revision 270258)
+++ vm/vm_pageout.c	(working copy)
@@ -1309,6 +1309,20 @@ relock_queues:
 	vm_pagequeue_unlock(pq);
 
 	/*
+	 * If we didn't get enough free pages, and we have skipped a vnode
+	 * in a writeable object, wakeup the sync daemon.  And kick swapout
+	 * if we did not get enough free pages.
+	 */
+	if (page_shortage > 0) {
+		if (vnodes_skipped && vm_page_count_min())
+			(void) speedup_syncer();
+#if !defined(NO_SWAPPING)
+		if (vm_swap_enabled)
+			vm_req_vmdaemon(VM_SWAP_NORMAL);
+#endif
+	}
+
+	/*
 	 * Compute the number of pages we want to try to move from the
 	 * active queue to the inactive queue.
 	 */
@@ -1418,20 +1432,6 @@ relock_queues:
 		}
 	}
 #endif
-		
-	/*
-	 * If we didn't get enough free pages, and we have skipped a vnode
-	 * in a writeable object, wakeup the sync daemon.  And kick swapout
-	 * if we did not get enough free pages.
-	 */
-	if (vm_paging_target() > 0) {
-		if (vnodes_skipped && vm_page_count_min())
-			(void) speedup_syncer();
-#if !defined(NO_SWAPPING)
-		if (vm_swap_enabled && vm_page_count_target())
-			vm_req_vmdaemon(VM_SWAP_NORMAL);
-#endif
-	}
 
 	/*
 	 * If we are critically low on one of RAM or swap and low on

--------------000308090204000308010002--