Date:      Tue, 3 Apr 2018 18:00:51 -0700 (PDT)
From:      Don Lewis <truckman@FreeBSD.org>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        Bryan Drewery <bdrewery@FreeBSD.org>, Peter Jeremy <peter@rulingia.com>, Jeff Roberson <jroberson@jroberson.net>, FreeBSD current <freebsd-current@FreeBSD.org>
Subject:   Re: Strange ARC/Swap/CPU on yesterday's -CURRENT
Message-ID:  <tkrat.11e402b8455bd0fa@FreeBSD.org>
In-Reply-To: <tkrat.9bab32187c0e8d01@FreeBSD.org>
References:  <20180306173455.oacyqlbib4sbafqd@ler-imac.lerctr.org> <201803061816.w26IGaW5050053@pdx.rh.CN85.dnsmgr.net> <20180306193645.vv3ogqrhauivf2tr@ler-imac.lerctr.org> <20180306221554.uyshbzbboai62rdf@dx240.localdomain> <20180307103911.GA72239@kloomba> <20180311004737.3441dbf9@thor.intern.walstatt.dynvpn.de> <alpine.BSF.2.21.1803111038460.1232@desktop> <20180320070745.GA12880@server.rulingia.com> <2b3db2af-03c7-65ff-25e7-425cfd8815b6@FreeBSD.org> <1fd2b47b-b559-69f8-7e39-665f0f599c8f@FreeBSD.org> <tkrat.9bab32187c0e8d01@FreeBSD.org>

On  1 Apr, Don Lewis wrote:
> On 27 Mar, Andriy Gapon wrote:
>> On 24/03/2018 01:21, Bryan Drewery wrote:
>>> On 3/20/2018 12:07 AM, Peter Jeremy wrote:
>>>>
>>>> On 2018-Mar-11 10:43:58 -1000, Jeff Roberson <jroberson@jroberson.net> wrote:
>>>>> Also, if you could try going back to r328953 or r326346 and let me know if 
>>>>> the problem exists in either.  That would be very helpful.  If anyone is 
>>>>> willing to debug this with me, contact me directly and I will send some
>>>>> test patches or debugging info after you have done the above steps.
>>>>
>>>> I ran into this on 11-stable and tracked it to r326619 (MFC of r325851).
>>>> I initially got around the problem by reverting that commit but either
>>>> it or something very similar is still present in 11-stable r331053.
>>>>
>>>> I've seen it on my main server (32GB RAM) but haven't managed to reproduce
>>>> it in smaller VBox guests - one difficulty I faced was artificially filling
>>>> ARC.
>> 
>> First, it looks like maybe several different issues are being discussed and
>> possibly conflated in this thread.  I see reports related to ZFS, I see reports
>> where ZFS is not used at all.  Some people report problems that appeared very
>> recently while others chime in with "yes, yes, I've always had this problem".
>> This does not help to differentiate between the problems or to analyze them.
>> 
>>> Looking at the ARC change you referred to from r325851
>>> https://reviews.freebsd.org/D12163, I am convinced that ARC backpressure
>>> is completely broken.
>> 
>> Does your being convinced come from the code review or from experiments?
>> If the former, could you please share your analysis?
>> 
>>> On my 78GB RAM system with ARC limited to 40GB and
>>> doing a poudriere build of all LLVM and GCC packages at once in tmpfs I
>>> can get swap up near 50GB and yet the ARC remains at 40GB through it
>>> all.  It's always been slow to give up memory for package builds but it
>>> really seems broken right now.
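
As an aside, an ARC cap like Bryan's 40 GB one is normally set via the
vfs.zfs.arc_max loader tunable, along these lines (a sketch; the value
is in bytes):

  # /boot/loader.conf -- cap the ARC at 40 GB (40 * 2^30 bytes)
  vfs.zfs.arc_max="42949672960"
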
>> 
>> Well, there are multiple angles.  Maybe it's ARC that does not react properly,
>> or maybe it's not being asked properly.
>> 
>> It would be useful to monitor the system during its transition to the state that
>> you reported.  There are some interesting DTrace probes in ARC, specifically
>> arc-available_memory and arc-needfree are the first that come to mind.  Their
>> parameters and how frequently they are called are of interest.  VM parameters
>> could be of interest as well.
>> 
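For anyone who wants to watch those probes, a small D script along
these lines should do it (a sketch; my reading of arc.c is that arg0
of arc-available_memory is the lowest free-memory estimate in bytes
and arg1 a reason code, so treat that as an assumption):

  /* arcmon.d -- run with: dtrace -s arcmon.d (as root) */
  sdt:::arc-available_memory
  {
          /* track the lowest reported free memory per reason code */
          @avail[arg1] = min(arg0);
  }

  sdt:::arc-needfree
  {
          @needfree = count();
  }

  tick-10s
  {
          /* dump and reset the aggregations every 10 seconds */
          printa(@avail);
          printa(@needfree);
          trunc(@avail);
          clear(@needfree);
  }
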
>> A rant.
>> 
>> Basically, posting some numbers and jumping to conclusions does not help at all.
>> Monitoring, graphing, etc. does help.
>> ARC is a complex dynamic system.
>> VM (pagedaemon, UMA caches) is a complex dynamic system.
>> They interact in complex, dynamic ways.
>> Sometimes a change in ARC is incorrect and requires a fix.
>> Sometimes a change in VM is incorrect and requires a fix.
>> Sometimes a change in VM requires a change in ARC.
>> These three kinds of problems can all appear as a "problem with ARC".
>> 
>> For instance, when vm.lowmem_period was introduced you wouldn't find any mention
>> of ZFS/ARC.  But it does affect the period between arc_lowmem() calls.
>> 
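The knob Andriy mentions is an ordinary sysctl, so it is easy to
inspect or experiment with (the default of 10 seconds is from memory,
so verify locally):

  # current period, in seconds, between VM_LOWMEM events
  sysctl vm.lowmem_period
  # fire low-memory handlers (and hence arc_lowmem()) more often
  sysctl vm.lowmem_period=5
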
>> Also, pin-pointing a specific commit requires proper bisecting and proper
>> testing to correctly attribute systemic behavior changes to code changes.
> 
> I just upgraded my main package build box (12.0-CURRENT, 8 cores, 32 GB
> RAM) from r327616 to r331716.  I was seeing higher swap usage and larger
> ARC sizes before the upgrade than I remember from the distant past, but
> ARC was still at least somewhat responsive to memory pressure and I
> didn't notice any performance issues.
> 
> After the upgrade, ARC size seems to be pretty unresponsive to memory
> demand.  Currently the machine is near the end of a poudriere run to
> build my usual set of ~1800 ports.  The only currently running build is
> chromium and the machine is paging heavily.  Settings of interest are:
>   USE_TMPFS="wrkdir data localbase"
>   ALLOW_MAKE_JOBS=yes
> 
> last pid: 96239;  load averages:  1.86,  1.76,  1.83    up 3+14:47:00  12:38:11
> 108 processes: 3 running, 105 sleeping
> CPU: 18.6% user,  0.0% nice,  2.4% system,  0.0% interrupt, 79.0% idle
> Mem: 129M Active, 865M Inact, 61M Laundry, 29G Wired, 1553K Buf, 888M Free
> ARC: 23G Total, 8466M MFU, 10G MRU, 5728K Anon, 611M Header, 3886M Other
>      17G Compressed, 32G Uncompressed, 1.88:1 Ratio
> Swap: 40G Total, 17G Used, 23G Free, 42% Inuse, 4756K In
> 
>   PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAN
> 96239 nobody           1  76    0   140M 93636K CPU5    5   0:01  82.90% clang-
> 96238 nobody           1  75    0   140M 92608K CPU7    7   0:01  80.81% clang-
>  5148 nobody           1  20    0   590M   113M swread  0   0:31   0.29% clang-
> 57290 root             1  20    0 12128K  2608K zio->i  7   8:11   0.28% find
> 78958 nobody           1  20    0   838M   299M swread  0   0:23   0.19% clang-
> 97840 nobody           1  20    0   698M   140M swread  4   0:27   0.13% clang-
> 96066 nobody           1  20    0   463M   104M swread  1   0:32   0.12% clang-
> 11050 nobody           1  20    0   892M   154M swread  2   0:39   0.12% clang-
> 
> Pre-upgrade I was running r327616, which is newer than either of the
> commits that Jeff mentioned above.  It seems like there has been a
> regression since then.
> 
> I also don't recall seeing this problem on my Ryzen box, though it has
> 2x the core count and 2x the RAM.  The last testing that I did on it was
> with r329844.

I reconfigured my Ryzen box to more closely match my default package
builder by disabling SMT and half of the RAM, limiting it to 8 cores
and 32 GB, and then started bisecting to track down the problem.
For each test, I first filled ARC by tarring /usr/ports/distfiles to
/dev/null.  The commit range that I was searching was r329844 to
r331716.  I narrowed the range to r329844 to r329904.  With r329904
and newer, ARC is totally unresponsive to memory pressure and the
machine pages heavily.  I see ARC sizes of 28-29GB and 30GB of wired
RAM, so there is not much left over for getting useful work done.  Active
memory and free memory both hover under 1GB each.  Looking at the
commit logs over this range, the most likely culprit is:

r329882 | jeff | 2018-02-23 14:51:51 -0800 (Fri, 23 Feb 2018) | 13 lines

Add a generic Proportional Integral Derivative (PID) controller algorithm and
use it to regulate page daemon output.

This provides much smoother and more responsive page daemon output, anticipating
demand and avoiding pageout stalls by increasing the number of pages to match
the workload.  This is a reimplementation of work done by myself and mlaier at
Isilon.
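
For reference, each step of the bisection boiled down to roughly the
following (a sketch; r329874 is just the midpoint of the final
r329844..r329904 range, and I am assuming a kernel-only rebuild was
enough for this range of changes):

  # update the source tree to the revision under test
  svn update -r 329874 /usr/src
  # "kernel" = buildkernel + installkernel
  cd /usr/src && make -j8 kernel
  shutdown -r now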


It is quite possible that the recent fixes to the PID controller will
take care of this problem.  Not that r329844 was trouble-free ... I
left tar running over lunchtime to fill ARC, and the OOM killer nuked
top, tar, ntpd, both of my ssh sessions into the machine, and multiple
instances of getty while I was away.  I was able to log in again and
successfully run poudriere, and ARC did respond to the memory pressure,
cranking itself down to about 5 GB by the end of the run.  I did not
see the same problem with tar when I repeated the test with r329904.
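
For reference, the ARC-filling step in each test was just a whole-tree
read along these lines (bsdtar actually reads the file data here; GNU
tar would short-circuit on a /dev/null archive):

  # read every distfile once so ZFS pulls the data into ARC;
  # the archive itself is thrown away
  tar -cf /dev/null /usr/ports/distfiles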



