From owner-svn-src-head@freebsd.org Thu Sep 3 19:17:45 2015 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7948E9C9EA0 for ; Thu, 3 Sep 2015 19:17:45 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-qg0-f45.google.com (mail-qg0-f45.google.com [209.85.192.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3B974E2F for ; Thu, 3 Sep 2015 19:17:45 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: by qgez77 with SMTP id z77so37130394qge.1 for ; Thu, 03 Sep 2015 12:17:44 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=OaeZaBT43pI/Itvb87bEUZPS885MW34lLX4OMfaa/+0=; b=D/GUgZymRrM/A/E3KYk51YSYHyFSQZYQz4VXQbbFTK4RYUiP/81DKZCV5Z0chSBd7k U5K0zLbkFuh6aAGzn6+xkAPTwpNhS4IC2trYfdYh8/adRK+qR1YzRA1CZN8WZWnAnF7S BPqH11o7e5D20BJztF8oab27yBwdmrCP9VKYMyL7IQLZAphIn2XlwGnpgWqwbcmD0aLR 6ji4kIos+19pxpdRP8SOgT3BdOZaQyxVV2x8zbnwWiN/HxN6dAYvdtEnLOR8pxqnU8un SjDVNhJvTfMfmch3D/B9YiR462vRJP/JGsDc6X3TgSvHtye+rTqRv7DoOkBPoGGROQ8C T7Ew== X-Gm-Message-State: ALoCoQlA8LuvqEGvw3lEDSB5qd0tKSjHxcaWkf/uZyuvoLH5QHsFTBr1hvU3qZKVv7mlOvDK7w37 MIME-Version: 1.0 X-Received: by 10.140.164.7 with SMTP id k7mr72283505qhk.40.1441307864242; Thu, 03 Sep 2015 12:17:44 -0700 (PDT) Sender: wlosh@bsdimp.com Received: by 10.140.80.164 with HTTP; Thu, 3 Sep 2015 12:17:44 -0700 (PDT) X-Originating-IP: [69.53.245.39] In-Reply-To: <20150903120606.GB75381@brick.home> References: <201509021729.t82HTULW036119@repo.freebsd.org> <20150903120606.GB75381@brick.home> Date: Thu, 3 Sep 2015 13:17:44 -0600 X-Google-Sender-Auth: PqRpKJljyluadjVRCLuG_xYAg_0 Message-ID: Subject: Re: svn commit: r287405 - head/sys/geom From: Warner Losh To: Warner Losh , src-committers , "svn-src-all@freebsd.org" , "svn-src-head@freebsd.org" Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 Sep 2015 19:17:45 -0000 On Thu, Sep 3, 2015 at 6:06 AM, Edward Tomasz Napiera=C5=82a wrote: > On 0902T1729, Warner Losh wrote: > > Author: imp > > Date: Wed Sep 2 17:29:30 2015 > > New Revision: 287405 > > URL: https://svnweb.freebsd.org/changeset/base/287405 > > > > Log: > > After the introduction of direct dispatch, the pacing code in g_down(= ) > > broke in two ways. One, the pacing variable was accessed in multiple > > threads in an unsafe way. Two, since large numbers of I/O could come > > down from the buf layer at one time, large numbers of allocation > > failures could happen all at once, resulting in a huge pace value tha= t > > would limit I/Os to 10 IOPS for minutes (or even hours) at a > > time. While a real solution to these problems requires substantial > > work (to go to a no-allocation after the first model, or to have some > > way to wait for more memory with some kind of reserve for pager and > > swapper requests), it is relatively easy to make this simplistic > > pacing less pathological. > > Shouldn't we emit some warning the first time this happens, to aid > in debugging strange IO performance degradation? Something like... No. We shouldn't. We should fix the damn underlying problem instead of polishing this turd of a rate limiter. It is really horrible by design and my patches make it less horrible only by degree. I know it is tempting, but this is no different than dropping a packet or many of the other places where we return ENOMEM and do retries or not based on the local policy. Well, apart from the horrible policy we implement.... > > > @@ -688,7 +699,7 @@ g_io_deliver(struct bio *bp, int error) > > bp->bio_driver2 =3D NULL; > > bp->bio_pflags =3D 0; > > g_io_request(bp, cp); > > if (warned_about_pace =3D=3D 0) { > printf("WARNING: GEOM io allocation failed; expect reduce= d > IO performance\n"); > warned_about_pace =3D 1; > } > We have no good facility to pace these messages out, and failures are typically very transient. The old code would explode a shortage that was < 1s into reduced I/O performance for minutes or hours (yes, hours). The new code typically will see shortages in the millisecond to tens of millisecond duration. Adding this would be just noise. We don't whine when we drop packets in the network layer. We just provide a count of dropped packets. This also causes a performance degradation. There's adequate statistical reporting in uma to see if this is happening. Anyway, I will spend exactly 0 more energy on polishing the POS pace turd. I'll spend somewhat more time and energy on exploring real fixes to this issue. Warner > - pace++; > > + pace =3D 1; > > return; > > } > >