Date:      Sun, 6 Apr 2014 12:37:57 +0400
From:      Dmitry Sivachenko <trtrmitya@gmail.com>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: madvise() vs posix_fadvise()
Message-ID:  <00B9699B-80D2-40E6-AA51-7B15191A4BDE@gmail.com>
In-Reply-To: <8DAE3175-FE32-4D17-A386-063DDB6C45F7@gmail.com>
References:  <D6BD48AF-9522-495D-8D54-37854E53C272@gmail.com> <201404031102.38598.jhb@freebsd.org> <EF134BCA-1E92-4C98-8763-9A31EA96839A@gmail.com> <201404041612.35889.jhb@freebsd.org> <5426E303-E35B-4D4A-AB62-3571228A5A2C@gmail.com> <8DAE3175-FE32-4D17-A386-063DDB6C45F7@gmail.com>


On 06 Apr 2014, at 0:11, Dmitry Sivachenko <trtrmitya@gmail.com> wrote:

>
> On 05 Apr 2014, at 1:02, Dmitry Sivachenko <trtrmitya@gmail.com> wrote:
>
>> On 05 Apr 2014, at 0:12, John Baldwin <jhb@FreeBSD.org> wrote:
>>
>>>
>>> MADV_WILLNEED is not going to give you what you want.  OTOH, if you haven't
>>> tried FreeBSD 10 yet, I would suggest trying that.  There have been changes
>>> to pagedaemon that might make it do a better job of kicking out the pages
>>> of the log files automatically.
>>>
>>
>>
>> I did. My situation became worse after I moved from stable/9 to stable/10.
>> My feeling is that stable/10 pushes rarely used mmaped pages out of RAM
>> more aggressively than stable/9 did.
>>
>> For now, the only solution I have found is calling msync(MS_INVALIDATE) on
>> log files after gzipping them and after backing them up via rsync.
>> This moves the corresponding memory pages from Inactive to Free and
>> prevents the system from filling all free memory with cached log files and
>> purging mmaped data from RAM to accommodate more disk cache.
>>
>> What I would love to see is the ability to tell the OS not to release
>> mmaped data unless "really needed" (disk cache is not an excuse).
>
>
> One more observation, as it seems to be related.
> If my program allocates RAM via malloc() rather than mmap(), I see that the
> VM swaps rarely used parts of the malloced data out as the disk is being used
> (more and more memory goes to Inactive with cached file contents).
>
> This is also different from stable/9 and does not seem good.  Why keep the
> cached contents of files forever? (There seems to be no timeout for keeping
> cached file contents in the Inactive state.)  So after a few days of uptime,
> all available RAM is either in the Active state, with frequently used pages
> of running processes, or in the Inactive state, with cached file data.
> Rarely used parts of process memory go to swap.
>
>


Look at this (top output is sorted by size):

last pid:  2945;  load averages:  8.94,  8.88,  9.23   up 25+20:18:46  12:33:26
94 processes:  6 running, 86 sleeping, 2 zombie
CPU: 22.2% user,  0.0% nice,  0.6% system,  0.0% interrupt, 77.2% idle
Mem: 76G Active, 161G Inact, 7485M Wired, 3504M Cache, 1937M Buf, 1906M Free
Swap: 24G Total, 1435M Used, 23G Free, 5% Inuse, 12K In, 196K Out

  PID USERNAME      THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
 2330 mitya           1  27    0 24611M 24626M piperd 12  10:10  10.25% gsort
99508 mitya           1 103    0 15502M 12382M CPU15  15 652:49 100.00% mkcls
79062 mitya           1  52    0 11396M 10721M swread 22  69.2H  87.26% aliw
80062 mitya           1  52    0 11282M 10666M swread 27  67.0H  80.18% aliw
 1832 mitya           1 103    0  8940M  8707M CPU28  28 232:09 100.00% aliw
 1871 mitya           1 103    0  8326M  8258M CPU11  11 219:13 100.00% aliw
 2329 mitya           1  52    0  5335M  5043M getblk 12 109:49  86.57% phraset
 2002 mitya           1  52    0  3810M  3232M wswbuf  3 186:33  98.39% phraset
 2035 mitya           1 102    0  3810M  3232M CPU16  16 179:33  98.68% phraset
 2555 mitya           1 103    0  2416M  2196M CPU20  20  81:34 100.00% aliw
 2038 mitya           1  23    0   150M  4808K piperd 29   0:00   0.00% nbest
 2005 mitya           1  22    0   150M  4808K piperd  3   0:00   0.00% nbest
 1381 root            2  20    0   106M 23684K select 18   0:57   0.00% ruby19
64642 mitya           1  20    0 96608K  1792K select 22   0:37   0.00% sshd
 2864 root            1  20    0 92512K  5392K select  6   0:00   0.00% sshd
 2866 mitya           1  20    0 92512K  5384K select 18   0:00   0.00% sshd
98119 mitya           1  20    0 92512K  2096K select 23   0:07   0.00% sshd


This machine has 256GB of RAM, and all running processes together use less
than 100GB.
But now that the Free memory has moved to the Inactive state, which greedily
holds cached file data, we see processes swapping.

This strategy could be beneficial for file servers, but not for other use
cases.
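
For reference, the msync(MS_INVALIDATE) workaround mentioned in the quoted text can be sketched roughly as below. This is only a minimal illustration, not the exact code I run: drop_file_cache() is a hypothetical helper name, mapping the whole file read-only is an assumption, and MS_SYNC is added for portability. Whether the pages actually move from Inactive to Free depends on the kernel; that effect is what I observe, not something the call guarantees.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * drop_file_cache() is a hypothetical helper, not part of any library.
 * It maps the whole file read-only and asks the kernel to invalidate the
 * cached pages with msync(MS_INVALIDATE).
 * Returns 0 on success, -1 on error.
 */
int
drop_file_cache(const char *path)
{
	struct stat st;
	void *addr;
	int fd, rc;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (-1);

	if (fstat(fd, &st) == -1) {
		close(fd);
		return (-1);
	}

	/* mmap() rejects zero-length mappings; nothing to invalidate anyway. */
	if (st.st_size == 0) {
		close(fd);
		return (0);
	}

	addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED) {
		close(fd);
		return (-1);
	}

	/*
	 * MS_SYNC is added here for POSIX portability (an assumption); the
	 * workaround described above mentions only MS_INVALIDATE.
	 */
	rc = msync(addr, (size_t)st.st_size, MS_SYNC | MS_INVALIDATE);

	munmap(addr, (size_t)st.st_size);
	close(fd);
	return (rc);
}
```

A log rotation or backup script could call this helper on each log file once gzip and rsync have finished with it.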


