Date:      Tue, 31 Jul 2018 13:49:57 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        bob prohaska <fbsd@www.zefox.net>
Cc:        "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net>, freebsd-arm@freebsd.org
Subject:   Re: RPI3 swap experiments
Message-ID:  <23793AAA-A339-4DEC-981F-21C7CC4FE440@yahoo.com>
In-Reply-To: <20180731191016.GD94742@www.zefox.net>
References:  <20180731153531.GA94742@www.zefox.net> <201807311602.w6VG2xcN072497@pdx.rh.CN85.dnsmgr.net> <20180731191016.GD94742@www.zefox.net>

On 2018-Jul-31, at 12:10 PM, bob prohaska <fbsd at www.zefox.net> wrote:

> On Tue, Jul 31, 2018 at 09:02:59AM -0700, Rodney W. Grimes wrote:
>>
>> An easy way of triggering OOM that I ran across the other day is simply:
>> truncate -s 4G foo
>> grep Anything foo
>>
>> grep(1) will gladly grow up to 4G trying to create a "line" of text
>> to search for the string "Anything".  On a system with less than
>> 4G of free memory this triggers an OOM and starts killing processes.
>>
>
> If grep actually uses 4GB before OOMA kills it then maybe OOMA is working
> correctly. Naive reasoning wonders why it would take that much memory to find
> an eight-character string, but that's not germane here. Unless, of course,
> buildworld is using grep in a similar way.
>
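For what it's worth, a likely reason for the growth: a file created with
truncate(1) is sparse, reads of the hole return NUL bytes, and NUL is not a
newline, so grep keeps extending its line buffer toward the full 4G while
hunting for a line end. A minimal sketch of the setup (the file name and the
inspection commands are illustrative, not from the original report):

```shell
#!/bin/sh
# Create a sparse 4G file: the apparent size is 4G, but it occupies
# almost no blocks on disk, and every read of the hole returns NUL.
truncate -s 4G foo

# Apparent size in bytes (4 * 1024^3 = 4294967296):
wc -c < foo

# A run of NUL bytes contains no newline, so grep must keep growing
# its line buffer toward the full apparent size before it can decide
# whether the "line" matches.
rm foo
```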
> What I'm seeing is OOMA kills that happen when swap usage is low, swap
> partitions don't seem overwhelmingly busy, and read/write delays are far less
> than maximums observed for the session.
>
> One possibility is that my gstat logs are not accurate measurements of
> storage activity. The logging script is
>=20
> #!/bin/sh
> while true
> do  gstat -abd -I 10s ; date ; swapinfo ; tail -n 2 /var/log/messages
> done
>=20
> Does the script contain errors or omissions?

A sudden change is easily possible (sub-second by far).

The scripting technique is not well suited to providing real-time
information that is fairly detailed over the interval from just
before the OOMA kills begin until they have begun.

I'm not aware of a good way to get such information over
such an interval.

The best I'm aware of is to change the initiation of the
OOMA kills to dump out the information that leads to
the OOMA-kill-needed classification. (This might not tell
how things progressed to that point, which might also be
needed.) ["Change" here might just be enabling some
existing debug mode, for all I know.]
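On the logging side, one partial workaround is to sample more often than
gstat's 10 s batches and to timestamp every sample, so the log can at least
be lined up with the OOMA kill messages after the fact. A sketch of that
idea; SAMPLE_CMD, SAMPLES, and the log path are illustrative placeholders
(on FreeBSD something like sysctl vm.stats.vm.v_free_count would be one
natural choice of statistic):

```shell
#!/bin/sh
# Finer-grained sampler sketch: one timestamped sample per second,
# appended to a log that can be correlated with OOMA kill messages.
# SAMPLE_CMD is a placeholder; the default here is portable so the
# sketch runs anywhere, but on FreeBSD something like
#   sysctl vm.stats.vm.v_free_count
# would be more useful.
SAMPLE_CMD="${SAMPLE_CMD:-uptime}"
SAMPLES="${SAMPLES:-3}"
LOG="${LOG:-/tmp/oom_sampler.log}"

: > "$LOG"                      # start a fresh log
i=0
while [ "$i" -lt "$SAMPLES" ]; do
    {
        # Prefix each sample with a UTC timestamp on the same line.
        printf '%s ' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')"
        $SAMPLE_CMD
    } >> "$LOG"
    i=$((i + 1))
    sleep 1
done
```

With the defaults above, the log ends up with one line per sample (three
lines), each beginning with its timestamp.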

> In the most recent case the worst delays were 15 seconds for read and 38
> seconds for write, OOMA didn't act until two hours later, when swap usage
> was only about 130 MB and write delay to swap was 135 ms. That's what tempts
> me to think the kills are an artifact of some other behavior.
>
> This is why I'd like to see what happens if OOMA could simply be turned off
> or its trigger level adjusted. On a Pi, bogging down for a few minutes during
> a buildworld session is perfectly ok. I appreciate that an e-commerce server
> or cloud computing system is a different kettle of fish entirely.

At this point we have no clue just what internal tracking
leads to the initiation of OOMA kills: no clue just what
would be involved/appropriate.

> Some weeks (months?) ago there was a thread about swap being broken. Was
> that in any way related to what I'm seeing?

There was some ZFS context stuff that seemed to be independent of
UFS stuff relative to memory use. I continue to see reports tied
to ZFS contexts.

But I'm not sure if this is in any way related to what you are
calling "swap being broken". I do not remember anything about
swap being directly broken for swap partitions. (Swap files are
a different issue and are problematical.)


===
Mark Millard
marklmi at yahoo.com
(dsl-only.net went away in early 2018-Mar)