From owner-freebsd-arm@freebsd.org  Thu Aug  9 15:43:01 2018
In-Reply-To: <20180809153710.GC30347@www.zefox.net>
References: <6BFE7B77-A0E2-4FAF-9C68-81951D2F6627@yahoo.com>
 <20180802002841.GB99523@www.zefox.net> <20180802015135.GC99523@www.zefox.net>
 <EC74A5A6-0DF4-48EB-88DA-543FD70FEA07@yahoo.com>
 <20180806155837.GA6277@raichu>
 <20180808153800.GF26133@www.zefox.net> <20180808204841.GA19379@raichu>
 <20180809065648.GB30347@www.zefox.net> <20180809152152.GC68459@raichu>
 <CANCZdfpKOTBrxiNhaeHHRp-2iw5a4eXt+md_1LTD-c0+AE6qxg@mail.gmail.com>
 <20180809153710.GC30347@www.zefox.net>
From: Warner Losh <imp@bsdimp.com>
Date: Thu, 9 Aug 2018 09:42:59 -0600
Message-ID: <CANCZdfrC0s8X-LxJmrDmkxmz+GUMNsHSMpBEQmp1S5ahcvptpg@mail.gmail.com>
Subject: Re: RPI3 swap experiments ["was killed: out of swap space" with:
 "v_free_count: 5439, v_inactive_count: 1"]
To: bob prohaska <fbsd@www.zefox.net>
Cc: Mark Johnston <markj@freebsd.org>,
 "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>
Content-Type: text/plain; charset="UTF-8"

On Thu, Aug 9, 2018 at 9:37 AM, bob prohaska <fbsd@www.zefox.net> wrote:

> On Thu, Aug 09, 2018 at 09:28:09AM -0600, Warner Losh wrote:
> > On Thu, Aug 9, 2018 at 9:21 AM, Mark Johnston <markj@freebsd.org> wrote:
> >
> > > On Wed, Aug 08, 2018 at 11:56:48PM -0700, bob prohaska wrote:
> > > > On Wed, Aug 08, 2018 at 04:48:41PM -0400, Mark Johnston wrote:
> > > > > On Wed, Aug 08, 2018 at 08:38:00AM -0700, bob prohaska wrote:
> > > > > > The patched kernel ran longer than default but OOMA still halted
> > > > > > buildworld around 13 MB. That's considerably farther than a
> > > > > > default build world have run but less than observed when setting
> > > > > > vm.pageout_oom_seq=120 alone. Log files are at
> > > > > > http://www.zefox.net/~fbsd/rpi3/swaptests/r337226M/1gbsdflash_1gbusbflash/batchqueue/
> > > > > >
> > > > > > Both changes are now in place and -j4 buildworld has been
> > > > > > restarted.
> > > > >
> > > > > Looking through the gstat output, I'm seeing some pretty abysmal
> > > > > average write latencies for da0, the flash drive.  I also realized
> > > > > that my reference to r329882 lowering the pagedaemon sleep period
> > > > > was wrong - things have been this way for much longer than that.
> > > > > Moreover, as you pointed out, bumping oom_seq to a much larger value
> > > > > wasn't quite sufficient.
> > > > >
> > > > > I'm curious as to what the worst case swap I/O latencies are in your
> > > > > test, since the average latencies reported in your logs are high
> > > > > enough to trigger OOM kills even with the increased oom_seq value.
> > > > > When the current test finishes, could you try repeating it with this
> > > > > patch applied on top?
> > > > > https://people.freebsd.org/~markj/patches/slow_swap.diff
> > > > > That is, keep the non-default oom_seq setting and modification to
> > > > > VM_BATCHQUEUE_SIZE, and apply this patch on top.  It'll cause the
> > > > > kernel to print messages to the console under certain conditions, so
> > > > > a log of console output will be interesting.
> > > >
> > > > The run finished with a panic, I've collected the logs and terminal
> > > > output at
> > > > http://www.zefox.net/~fbsd/rpi3/swaptests/r337226M/1gbsdflash_1gbusbflash/batchqueue/pageout120/slow_swap/
> > > >
> > > > There seems to be a considerable discrepancy between the wait times
> > > > reported by the patch and the wait times reported by gstat in the
> > > > first couple of occurrences. The fun begins at timestamp
> > > > Wed Aug  8 21:26:03 PDT 2018 in swapscript.log.
> > >
> > > The reports of "waited for swap buffer" are especially bad: during
> > > those periods, the laundry thread is blocked waiting for in-flight swap
> > > writes to finish before sending any more.  Because the system is
> > > generally quite starved for clean pages that it can reuse, it's relying
> > > on swap I/O to clean more.  If that fails, the system eventually has no
> > > choice but to start killing processes (where the time period
> > > corresponding to "eventually" is determined by vm.pageout_oom_seq).
> > >
> > > Based on these latencies, I think the system is behaving more or less as
> > > expected from the VM's perspective.  I do think the default oom_seq
> > > value is too low and will get that addressed in 12.0.
> >
> >
> > Yea. I think we need to take a more active role in managing latencies on
> > some cards. Properly managed, they won't climb that high. Since there's
> > no tagged queueing to these devices, there's an I/O depth of one. The
> > default policy is to do them in order (since it's flash) which means that
> > processes that machine-gun down requests swamp everybody else and do
> > back-to-back-to-back writes which, at least for the few drives I have
> > looked at in detail tends to induce pathological behavior.
> >
>
> There's a kernel building now with
> options         CAM_IOSCHED_DYNAMIC
> in the config file. Is it still worth trying? Anything else to try?
>

It won't be a cure-all out of the box, I don't think. However, the
read-biasing code may help sneak a few reads in between writes, which may
help keep the drive away from the pathological behavior. Or not; it's hard
to say. I've not looked at swapping to super-crappy NAND (I mean thumb
drives) in as much detail as the drives we use for work.
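
To make "sneak a few reads in between writes" concrete, here is a toy
model in Python. It is only a sketch and not the CAM scheduler code: the
function name schedule, the read_bias parameter, and the request labels
are all made up for illustration, and the device is assumed to take one
request at a time (I/O depth of one).

```python
from collections import deque


def schedule(reads, writes, read_bias):
    """Toy model of a depth-one device queue with read biasing.

    One request is issued at a time. While both reads and writes are
    pending, up to `read_bias` reads go ahead of each write. This is an
    illustrative assumption, not the kernel's implementation.
    """
    reads, writes = deque(reads), deque(writes)
    order = []
    credit = read_bias  # reads still allowed before the next write
    while reads or writes:
        if reads and (not writes or credit > 0):
            order.append(reads.popleft())
            if writes:
                credit -= 1  # a read jumped ahead of a pending write
        else:
            order.append(writes.popleft())
            credit = read_bias  # refill the allowance after each write
    return order


if __name__ == "__main__":
    pending_reads = ["r1", "r2"]
    pending_writes = ["w1", "w2", "w3"]
    print(schedule(pending_reads, pending_writes, read_bias=0))
    print(schedule(pending_reads, pending_writes, read_bias=1))
```

With read_bias=0 the toy drains every pending write before any read,
which stands in for the back-to-back writes case. With read_bias=1 the
reads are interleaved, so the device is not handed an unbroken run of
writes. Whether that actually helps on a given thumb drive is a separate
question.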

Warner