Date:      Mon, 20 Aug 2018 08:59:41 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Mark Millard <marklmi@yahoo.com>
Cc:        bob prohaska <fbsd@www.zefox.net>, freebsd-arm <freebsd-arm@freebsd.org>,  Mark Johnston <markj@freebsd.org>
Subject:   Re: RPI3 swap experiments ["was killed: out of swap space" with: "v_free_count: 5439, v_inactive_count: 1"]
Message-ID:  <CANCZdfroR72T_9Omo0bAoWLs9JsTPvJ4xnYrqabSgUuX-Pe7qg@mail.gmail.com>
In-Reply-To: <BD92A3B6-BCD2-4FA5-A9C6-0B950D3AD378@yahoo.com>
References:  <20180808153800.GF26133@www.zefox.net> <20180808204841.GA19379@raichu> <2DC1A479-92A0-48E6-9245-3FF5CFD89DEF@yahoo.com> <20180809033735.GJ30738@phouka1.phouka.net> <20180809175802.GA32974@www.zefox.net> <20180812173248.GA81324@phouka1.phouka.net> <20180812224021.GA46372@www.zefox.net> <B81E53A9-459E-4489-883B-24175B87D049@yahoo.com> <20180813021226.GA46750@www.zefox.net> <0D8B9A29-DD95-4FA3-8F7D-4B85A3BB54D7@yahoo.com> <20180815221728.GA59074@www.zefox.net> <9EA5D75D-A03F-4B25-B65E-03E93DE30130@yahoo.com> <BD92A3B6-BCD2-4FA5-A9C6-0B950D3AD378@yahoo.com>

On Mon, Aug 20, 2018 at 8:00 AM, Mark Millard via freebsd-arm <
freebsd-arm@freebsd.org> wrote:

> There is a way to explore Mark Johnston's swap information
> reports (from his patches for such).
>
> Taking a Pine64+ 2GB as an example (4 cores with 1
> HW-thread per core, 2 GiBytes of RAM, USB device for
> root file system and swap partition):
>
> In one login:
> # nice -20 gstat -pd
>
> In another login:
> # stress -d 2 -m 4 --vm-keep --vm-bytes 536870912
>
> That "4" (vm workers) times "536870912" (bytes per worker)
> totals the 2 GiBytes of RAM, so swapping is induced for the
> context in question. (Scale --vm-bytes appropriately to the
> context.)
>
> [stress is from sysutils/stress .]
>
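For a board with a different amount of RAM, a minimal sizing sketch in
sh (it assumes the same 4 vm workers as above; hw.physmem is the stock
FreeBSD sysctl reporting installed RAM in bytes):

  physmem=$(sysctl -n hw.physmem)      # total RAM in bytes
  workers=4                            # the -m value passed to stress
  echo "use --vm-bytes $((physmem / workers)) with -m ${workers}"
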
> gstat tends to show things such as:
>
> dT: 1.006s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| mmcsd0
>    56    312      0      0    0.0    312  19985  142.6      0      0    0.0   99.6| da0
>
> where the ms/w and kBps are fairly stable but the length of
> the queue (L(q)) is widely variable. For the above, an
> estimate of 56 writes queued * 142.6 ms/write (mean) gives a
> rather large total time for the last of the queued writes to
> complete. (If I understand how to interpret the above.)
>

No. 142.6 ms/write is the average time to complete, taken over the
operations that completed during the polling interval. There's no
estimating here.

So, at 6 or 7 operations completing per second, coupled with a
parallelism factor of 1 (typical for low-end junk flash), we wind up
with the 56 operations in the queue taking 8-10s to complete.
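
Spelled out with the numbers from the gstat sample above (a
back-of-envelope check that assumes strictly serial completion):

  1 write / 142.6 ms                 ->  roughly 7 completions per second
  56 queued writes * 142.6 ms/write  ->  roughly 8 s for the queue to drain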


> It appears to me that, compared to an observed capacity of
> roughly 20 MiBytes/sec for writes, large amounts of data are
> being queued for writing in a short time, so it just takes a
> while for the backlog to be finished.
>

Yes. That matches my expectation as well. On other devices, I've found
that I needed to rate-limit writes to more like 50-75% of the max value
to keep variance in performance low. It's the whole reason I wrote the
CAM I/O scheduler.
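
As a rough sizing sketch in sh (the ceiling figure is simply the kBps
da0 sustained in the gstat sample above; treat the resulting band as a
starting point, not a universal recommendation):

  max_kbps=19985                  # kBps da0 sustained while saturated
  low=$((max_kbps / 2))           # 50% of the observed ceiling
  high=$((max_kbps * 3 / 4))      # 75% of the observed ceiling
  echo "aim a write rate limit at roughly ${low}-${high} kBps"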


> The following is from multiple such runs, several stopped
> manually but some killed because of sustained low free
> memory. I had left vm.pageout_oom_seq=12 in place for this,
> making the kills easier to get than the 120 figure would.
> Generally it does not take very long for some sort of
> message to show up.
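
For reference, the larger setting mentioned above can be tried directly
(vm.pageout_oom_seq is a stock FreeBSD sysctl; 120 is the figure from
earlier in the thread):

  # sysctl vm.pageout_oom_seq=120

and it can be made persistent by adding vm.pageout_oom_seq=120 to
/etc/sysctl.conf.
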
>
>
>
> waited 9s for async swap write
> waited 9s for swap buffer
> waited 9s for async swap write
> waited 9s for async swap write
> waited 9s for async swap write
> v_free_count: 1357, v_inactive_count: 1
> Aug 20 06:04:27 pine64 kernel: pid 1010 (stress), uid 0, was killed: out
> of swap space
> waited 5s for async swap write
> waited 5s for swap buffer
> waited 5s for async swap write
> waited 5s for async swap write
> waited 5s for async swap write
> waited 13s for async swap write
> waited 12s for swap buffer
> waited 13s for async swap write
> waited 12s for async swap write
> waited 12s for async swap write
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161177, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164766, size: 65536
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164064, size: 12288
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161630, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161177, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164766, size: 65536
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161630, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164064, size: 12288
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161177, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164766, size: 65536
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161630, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164064, size: 12288
> waited 65s for async swap write
> waited 65s for swap buffer
> waited 65s for async swap write
> waited 65s for async swap write
> waited 65s for async swap write
> v_free_count: 955, v_inactive_count: 1
> Aug 20 06:11:49 pine64 kernel: pid 1047 (stress), uid 0, was killed: out
> of swap space
> waited 5s for async swap write
> waited 5s for swap buffer
> waited 5s for async swap write
> waited 5s for async swap write
> waited 5s for async swap write
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314021, size: 12288
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314084, size: 32768
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314856, size: 32768
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314638, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 312518, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 312416, size: 16384
> waited 39s for async swap write
> waited 39s for swap buffer
> waited 39s for async swap write
> waited 39s for async swap write
> waited 39s for async swap write
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314802, size: 24576
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314934, size: 40960
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 315617, size: 32768
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 312484, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 328616, size: 131072
> waited 31s for async swap write
> waited 31s for swap buffer
> waited 31s for async swap write
> waited 31s for async swap write
> waited 31s for async swap write
> waited 5s for async swap write
> waited 5s for swap buffer
> waited 5s for async swap write
> waited 5s for async swap write
> waited 5s for async swap write


These numbers are consistent with the theory that the swap device
becomes overwhelmed, spiking latency and causing crappy downstream
effects. You can use the I/O scheduler to limit the write rates at the
low end. You might also be able to schedule a lower write queue depth
at the top end as well, but I've not seen good ways to do that.
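
Back-of-envelope, again using the gstat sample (the 1 s target is only
an example figure):

  worst-case wait  ~  queue depth * per-operation time
  56 * 142.6 ms    ~  8 s  (what the sample showed)
  1 s / 142.6 ms   ~  7, so the queue would need to stay around 7 deep
                       to keep worst-case waits near one second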

Warner


