Date: Mon, 24 Aug 2015 19:47:08 -0400
From: Jason Unovitch
To: freebsd-fs@freebsd.org
Subject: Re: solaris assert: avl_is_empty(&dn -> dn_dbufs) panic
Message-ID:
 <20150824234708.GA9687@Silverstone.nc-us.unovitch.com>

For reference I opened https://bugs.FreeBSD.org/202607 before I came across
this discussion.

> Hi,
>
> I'll need a little time to fully reload the context for these changes. However, reintroducing a blocking loop is not the right fix - it was a hack in the original code. :-) My hunch is that removing the assert is safe, but it would be nice to have a core dump to better understand why the list isn't empty.
>
> --
> Justin
>

Justin, I have the contents of my /var/crash available. I also have a beadm
boot environment of a known bad (r287028) as well as the known good (r286204)
that I am currently running on. I should be able to replicate this as needed
and provide some assistance.

> > On Aug 21, 2015, at 12:38 PM, Xin Li wrote:
> >
> > Hi,
> >
> > A quick glance at the changes suggests that Justin's changeset may be
> > related. The reasoning is here:
> >
> > https://reviews.csiden.org/r/131/
> >
> > Related Illumos ticket:
> >
> > https://www.illumos.org/issues/5056
> >
> > In dnode_evict_dbufs(), remove multiple passes over dn->dn_dbufs.
> > This is possible now that objset eviction is asynchronously
> > completed in a different context once dbuf eviction completes.
> >
> > In the case of objset eviction, any dbufs held by children will
> > be evicted via dbuf_rele_and_unlock() once their refcounts go
> > to zero.
> > Even when objset eviction is not active, the ordering
> > of the avl tree guarantees that children will be released before
> > parents, allowing the parent's refcounts to naturally drop to
> > zero before they are inspected in this single loop.
> >
> > ====
> >
> > So, upon return from dnode_evict_dbufs(), there could be some
> > DB_EVICTING buffers on the AVL pending release and thus breaks the
> > invariant.
> >
> > Should we restore the loop where we yield briefly with the lock
> > released, then reacquire and recheck?
> >
> > Cheers,

Jason
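
For readers following the thread, here is a rough user-space model of the two behaviours under discussion. It is not the ZFS code and makes no claim about how dnode_evict_dbufs() is actually written. Everything in it is a hypothetical stand-in: a Python set plays the role of dn->dn_dbufs, a threading.Lock plays the dnode lock, a second set marks buffers that some other context is already evicting (the DB_EVICTING case), and the names Dnode, evict_single_pass, finish_pending and wait_until_empty are invented for illustration. It shows why a single pass can return with entries still linked, and what a "drop the lock, yield, reacquire, recheck" loop would look like.

```python
import threading
import time


class Dnode:
    """Toy stand-in for a dnode (hypothetical, not the kernel structure)."""

    def __init__(self, dbuf_ids, evicting_ids):
        self.lock = threading.Lock()       # models the lock guarding dn_dbufs
        self.dbufs = set(dbuf_ids)         # models the dn_dbufs tree
        self.evicting = set(evicting_ids)  # buffers another context is tearing down


def evict_single_pass(dn):
    """One pass: unlink idle buffers, skip those already being evicted.

    Returns how many buffers are still linked afterwards.
    """
    with dn.lock:
        for db in sorted(dn.dbufs):
            if db not in dn.evicting:
                dn.dbufs.discard(db)
        return len(dn.dbufs)


def finish_pending(dn, delay=0.01):
    """Assumed 'other context' that completes each pending eviction later."""
    for db in sorted(dn.evicting):
        time.sleep(delay)
        with dn.lock:
            dn.dbufs.discard(db)
            dn.evicting.discard(db)


def wait_until_empty(dn, poll=0.001):
    """Recheck loop: look under the lock, release it, yield briefly, retry."""
    rechecks = 0
    while True:
        with dn.lock:
            if not dn.dbufs:
                return rechecks
        rechecks += 1
        time.sleep(poll)  # lock is not held while yielding


if __name__ == "__main__":
    dn = Dnode(dbuf_ids=range(5), evicting_ids=[1, 3])
    print("left after single pass:", evict_single_pass(dn))
    worker = threading.Thread(target=finish_pending, args=(dn,))
    worker.start()
    rechecks = wait_until_empty(dn)
    worker.join()
    print("empty after", rechecks, "rechecks:", not dn.dbufs)
```

In this model an "is empty" assertion placed right after evict_single_pass() would fire whenever evictions are still pending, and would hold only after wait_until_empty() returns. Whether a blocking loop of this kind is appropriate in the real code is exactly what the quoted messages disagree on.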