From owner-svn-src-all@FreeBSD.ORG Thu Dec 27 18:01:04 2012 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E661D388; Thu, 27 Dec 2012 18:01:04 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-la0-f44.google.com (mail-la0-f44.google.com [209.85.215.44]) by mx1.freebsd.org (Postfix) with ESMTP id C0B158FC0A; Thu, 27 Dec 2012 18:01:03 +0000 (UTC) Received: by mail-la0-f44.google.com with SMTP id d3so12668916lah.3 for ; Thu, 27 Dec 2012 10:01:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:reply-to:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=4ZwKJvHisjZgHQJMc3cH/xx2/crJMZaaw+AQE0LZQEA=; b=QHSd5ILF747odBTZylOYAH6ywWjeMM2rC8yDphLFqcSJy5r/q+F0jyxXay+C2Xe8o1 O1YZH+JlPSlPePPZJ82DglkzydPKEEXCBxA18coHmSslNnAtNjBXev9Ny+xzfpQuO/2G uFk0bjKxPP8dp/REHdGUeYGnw61CIy777FJUYqtqoqR2T1vDuZDGjvkZEkSaizoEhDi1 lrKUvND9Km5Bw+CHgcmPMlIEXUh6mHNJfK1PHHdE8azbuLuM+yy4hYAJw58Hfhmvp9ro 5ZdCTFGZuOCmXWm78G3xNmgBooPOizxyEtRrDCYR9NZdlaZLqrrEmXc2qrzvOO2KpGeE h2XQ== MIME-Version: 1.0 Received: by 10.152.122.133 with SMTP id ls5mr29590888lab.9.1356631261657; Thu, 27 Dec 2012 10:01:01 -0800 (PST) Sender: asmrookie@gmail.com Received: by 10.112.84.193 with HTTP; Thu, 27 Dec 2012 10:01:01 -0800 (PST) In-Reply-To: References: <201212271236.qBRCawuU078203@svn.freebsd.org> <20121227124657.GX80310@FreeBSD.org> <20121227132507.GY80310@FreeBSD.org> Date: Thu, 27 Dec 2012 10:01:01 -0800 X-Google-Sender-Auth: LS7ZS69pGL6KCPGo0_wNBhBT4qo Message-ID: Subject: Re: svn commit: r244732 - head/sys/sys From: Attilio Rao To: Gleb Smirnoff Content-Type: text/plain; charset=UTF-8 Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: attilio@FreeBSD.org List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Dec 2012 18:01:05 -0000 On Thu, Dec 27, 2012 at 8:37 AM, Attilio Rao wrote: > On Thu, Dec 27, 2012 at 7:26 AM, Attilio Rao wrote: >> On Thu, Dec 27, 2012 at 5:53 AM, Attilio Rao wrote: >>> On Thu, Dec 27, 2012 at 5:25 AM, Gleb Smirnoff wrote: >>>> On Thu, Dec 27, 2012 at 04:55:22AM -0800, Attilio Rao wrote: >>>> A> On Thu, Dec 27, 2012 at 4:46 AM, Gleb Smirnoff wrote: >>>> A> > Attilio, >>>> A> > >>>> A> > On Thu, Dec 27, 2012 at 12:36:58PM +0000, Attilio Rao wrote: >>>> A> > A> Author: attilio >>>> A> > A> Date: Thu Dec 27 12:36:58 2012 >>>> A> > A> New Revision: 244732 >>>> A> > A> URL: http://svnweb.freebsd.org/changeset/base/244732 >>>> A> > A> >>>> A> > A> Log: >>>> A> > A> br_prod_tail and br_cons_tail members are used as barrier to >>>> A> > A> signal bug_ring ownership. However, instructions can be reordered >>>> A> > A> around members write leading to stale values for ie. br_prod_bufs. >>>> A> > A> >>>> A> > A> Use correct memory barriers to ensure proper ordering of the >>>> A> > A> ownership tokens updates. >>>> A> > A> >>>> A> > A> Sponsored by: EMC / Isilon storage division >>>> A> > A> MFC after: 2 weeks >>>> A> > >>>> A> > Have you profiled this? >>>> A> > >>>> A> > After this change the buf_ring actually gains its own hand-rolled >>>> A> > mutex: >>>> A> > >>>> A> > while (atomic_load_acq_32(&br->br_prod_tail) != prod_head) >>>> A> > cpu_spinwait(); >>>> A> > >>>> A> > The only difference with mutex(9) is that this one isn't monitored >>>> A> > by WITNESS. >>>> A> >>>> A> I think you are not correct. It doesn't disable interrupts (as >>>> A> spinlock do) and it doesn't sleep. >>>> A> So your analogy is completely off. >>>> A> >>>> A> Also, on x86 atomic_store_rel_*() is a simple write. The only thing >>>> A> that really changes is the atomic_load_acq_*() that introduces a >>>> A> locked instruction. >>>> >>>> This only thing, the locked instruction, affects performance a lot. I >>>> suspect strong forwarding degradation after your change. Can you please >>>> profile that? >>> >>> Yes but it is a matter of incorrect code vs. slower instruction. >>> Also, you are not considering that there are much heavier-weight >>> instructions already (wmb(), rmb(), which I'm going to change soon >>> into actual barriers btw). I highly doubt you can measure the latency >>> introduced by atomic_load_acq_*() when mfence and stuff is in place. >>> The pessimization should only account for a small fraction of the >>> overall performance. >>> >>>> A> > The idea behind buf_ring was lockless storing and lockless fetching >>>> A> > from a ring and now this vanished. >>>> A> > >>>> A> > What does your change actually fixes except precise accounting of >>>> A> > br_prod_bufs that are actually unused and should be better garbage >>>> A> > collected rather than fixed? >>>> A> >>>> A> The write of br_prod_tail must happens as very last thing, also after >>>> A> the whole buf setup. The only way you can enforce this is with a >>>> A> memory barrier. I can double-check if we can garbage collect >>>> A> br_prod_bufs but this should not be enough yet. >>>> >>>> Do you have a core file that illustrates that a ring can get into >>>> inconsistent state? >>> >>> I don't I got it by code inspection. The br_prod_tail update must >>> happen as very last thing because it means the buf is "ready-to-go" >>> and it will be owned. >>> >>> However, the prior wmb() may be helpful in this case, at least for one >>> case. I will do a follow up soon. For a longer discussion, I plan to >>> move this into a real atomic_store_rel_*() soon. >> >> Speaking of which, as you are here, I just found out that r241037 >> breaks the alignment of the structure. >> Infact the padding member is not updated accordingly. >> >> We don't have a param to control L2 caches, but I think that we can >> safely align them to the L1 cacheline for sure. >> Also, note that this padding is completely broken for MI requirements >> (it just assumes blindly 128 bytes L2 cachelines, which not always >> true even on i386). > > More specifically this patch: > http://www.freebsd.org/~attilio/bufring_pad.patch > > Of course I don't think the optimization is important in the > DEBUG_BUFRING on case, so the patch should be fine. And this further patch (so that you can see incremental changes) improves the membar situation, IMHO: http://www.freebsd.org/~attilio/bufring_pad_membar.patch It basically reverts latest introduced changes and uses full membar to ensure consistency. Comments should be self-explanatory. Please review. Attilio -- Peace can only be achieved by understanding - A. Einstein