From owner-freebsd-current@FreeBSD.ORG  Tue Mar  5 16:01:42 2013
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@FreeBSD.org
Subject: Re: access to hard drives is "blocked" by writes to a flash drive
From: Ian Lepore <ian@FreeBSD.org>
To: Don Lewis <truckman@FreeBSD.org>
In-Reply-To: <201303050527.r255R0Gd012437@gw.catspoiler.org>
References: <201303050527.r255R0Gd012437@gw.catspoiler.org>
Content-Type: text/plain; charset="us-ascii"
Date: Tue, 05 Mar 2013 09:01:36 -0700
Message-ID: <1362499296.1291.6.camel@revolution.hippie.lan>
Mime-Version: 1.0
X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port 
Content-Transfer-Encoding: 7bit
Cc: deeptech71@gmail.com, phk@phk.freebsd.dk, freebsd-current@FreeBSD.org,
 peter@rulingia.com

On Mon, 2013-03-04 at 21:27 -0800, Don Lewis wrote:
> On  4 Mar, Ian Lepore wrote:
> > On Sun, 2013-03-03 at 19:01 -0800, Don Lewis wrote:
> >> On  3 Mar, Poul-Henning Kamp wrote:
> >> 
> >> > For various reasons (see: Lemming-syncer) FreeBSD will block all I/O
> >> > traffic to other disks too, when these pileups gets too bad.
> >> 
> >> The Lemming-syncer problem should have mostly been fixed by 231160 in
> >> head (231952 in stable/9 and 231967 in stable/8) a little over a year
> >> ago. The exceptions are atime updates, mmaped files with dirty pages,
> >> and quotas. Under certain workloads I still notice periodic bursts of
> >> seek noise. After thinking about it for a bit, I suspect that it could
> >> be atime updates, but I haven't tried to confirm that.
> >> 
> >> When using TCQ or NCQ, perhaps we should limit the number of outstanding
> >> writes per device to leave some slots open for reads.  We should
> >> probably also prioritize reads over writes unless we are under memory
> >> pressure.
> >>  
> > 
> > Then either those changes didn't have the intended effect, or the
> > problem we're seeing with lack of system responsiveness when there's a
> > large backlog of writes to a slow device is not the lemming-syncer
> > problem.  It's also not a lack of TCQ/NCQ slots, given that no such
> > thing exists with SD card IO.
> > 
> > When this is going on, the process driving the massive output spends
> > almost all its time in a wdrain wait, and if you try to launch an app
> > that isn't already in cache, a siginfo generally shows it to be in a
> > getblk wait.
> 
> If your only drive is a single SD card, then you're pretty much hosed
> when I/O is blocked because the SD card is doing an erase.  It can only
> handle one command at a time, and if a write blocks, there's nothing
> that we can do to get it to execute a read until it is done with the
> write command that it is hung up on.  I'm not familiar with the lower
> layers, but things might be less bad if read ops can jump ahead and get
> sent to the drive before any queued writes.
> 

Yes, but an erase-block operation on NAND flash takes on the order of
500us, not the 8-10 seconds of interactive non-responsiveness you
experience in these situations.  SD cards by their very nature handle
one operation at a time, so no internal operation queueing is in play
to explain the long (apparent) hangs.
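
To put the mismatch in numbers, here is a rough back-of-the-envelope
sketch.  It assumes the ~500us erase figure above is representative;
the function name is made up for illustration, and real cards vary.

```python
ERASE_US = 500  # assumed per-erase time in microseconds (order of magnitude only)

def erases_in_stall(stall_seconds, erase_us=ERASE_US):
    """Count how many back-to-back erase-block operations fit in a stall."""
    return (stall_seconds * 1_000_000) // erase_us

for stall in (8, 10):
    print(stall, "second stall ~", erases_in_stall(stall), "erase operations")
```

An 8-second hang works out to about 16000 consecutive erases and a
10-second hang to about 20000, so one slow erase cannot account for it.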

I've debated playing with the bio work loop in mmcsd to see whether
moving reads ahead of writes would help, but that seems like a dangerous
path to go down without some mitigation strategy to ensure that writes
eventually go through.  That seems especially important when you
consider that writes may be necessary to free up memory to un-wedge
other things that are waiting.  (Yeah, people don't often use SD cards
as swap storage, but I've done so in a pinch.)  All in all, I've never
pursued it because it feels like the wrong layer at which to address the
problem.
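
For illustration only, the kind of mitigation meant here could look
like the toy model below: reads jump ahead of queued writes, but a
pending write can be bypassed only a bounded number of times before it
is forced through.  This is a user-space sketch, not mmcsd code; the
class name, method names, and the max_bypass knob are all hypothetical.

```python
from collections import deque

class ReadPreferringQueue:
    """Toy I/O queue: prefer reads, but never starve writes indefinitely."""

    def __init__(self, max_bypass=4):
        self.reads = deque()
        self.writes = deque()
        self.max_bypass = max_bypass  # hypothetical tunable, not a real sysctl
        self.bypassed = 0             # reads dispatched while a write was waiting

    def enqueue(self, op, block):
        if op == "read":
            self.reads.append(block)
        elif op == "write":
            self.writes.append(block)
        else:
            raise ValueError("op must be 'read' or 'write'")

    def dispatch(self):
        """Return the next (op, block) to send to the device, or None if idle."""
        write_starving = bool(self.writes) and self.bypassed >= self.max_bypass
        if self.reads and not write_starving:
            if self.writes:
                self.bypassed += 1    # this read jumped ahead of a waiting write
            return ("read", self.reads.popleft())
        if self.writes:
            self.bypassed = 0         # a write got through; reset the budget
            return ("write", self.writes.popleft())
        return None
```

With max_bypass=2, one queued write followed by three queued reads
dispatches as read, read, write, read.  The bound caps how long a write
can wait, but it does nothing about memory pressure, which is part of
why this still feels like the wrong layer.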

-- Ian