From owner-freebsd-current@FreeBSD.ORG Thu May 10 10:39:03 2012
Date: Thu, 10 May 2012 12:39:00 +0200
From: Peter Holm
To: Mateusz Guzik
Cc: Doug Barton, Sergey Kandaurov, freebsd-current, mckusick@freebsd.org
Subject: Re: panic, seems related to r234386
Message-ID: <20120510103900.GA77554@x2.osted.lan>
In-Reply-To: <20120510102118.GA26472@dft-labs.eu>
References: <4FA6F324.4080107@FreeBSD.org> <4FA82269.6080406@FreeBSD.org> <20120507201153.GA19942@dft-labs.eu> <20120508194514.GA10688@x2.osted.lan> <20120510102118.GA26472@dft-labs.eu>

On Thu, May 10, 2012 at 12:21:18PM +0200, Mateusz Guzik wrote:
> On Tue, May 08, 2012 at 09:45:14PM +0200, Peter Holm wrote:
> > On Mon, May 07, 2012 at 10:11:53PM +0200, Mateusz Guzik wrote:
> > > On Mon, May 07, 2012 at 12:28:41PM -0700, Doug Barton wrote:
> > > > On 05/06/2012 15:19, Sergey Kandaurov wrote:
> > > > > On 7 May 2012 01:54, Doug Barton wrote:
> > > > >> I got this with today's current, previous (working) kernel is r232719.
> > > > >>
> > > > >> panic: _mtx_lock_sleep: recursed on non-recursive mutex struct mount mtx
> > > > >> @ /frontier/svn/head/sys/kern/vfs_subr.c:4595
> > > >
> > > > ...
> > > >
> > > > > Please try this patch.
> > > > >
> > > > > Index: fs/ext2fs/ext2_vfsops.c
> > > > > ===================================================================
> > > > > --- fs/ext2fs/ext2_vfsops.c (revision 235108)
> > > > > +++ fs/ext2fs/ext2_vfsops.c (working copy)
> > > > > @@ -830,7 +830,6 @@
> > > > >      /*
> > > > >       * Write back each (modified) inode.
> > > > >       */
> > > > > -    MNT_ILOCK(mp);
> > > > >  loop:
> > > > >      MNT_VNODE_FOREACH_ALL(vp, mp, mvp) {
> > > > >          if (vp->v_type == VNON) {
> > > > >
> > > >
> > > > Didn't help, sorry. I put 234385 through some pretty heavy load
> > > > yesterday, and everything was fine. As soon as I move up to 234386, the
> > > > panic triggered again. So I cleaned everything up, applied your patch,
> > > > built a kernel from scratch, and rebooted. It was Ok for a few seconds
> > > > after boot, then panic'ed again, I think in a different place, but I'm
> > > > not sure because subsequent attempts to fsck the file systems caused new
> > > > panics which overwrote the old ones before they could be saved.
> > > >
> > >
> > > Another MNT_ILOCK was hiding a few lines below, try this patch:
> > >
> > > http://student.agh.edu.pl/~mjguzik/patches/ext2fs-ilock.patch
> > >
> > > I've tested this a bit and I believe this fixes your problem.
> > >
> >
> > Gave this a spin and found what looks like a deadlock:
> >
> > http://people.freebsd.org/~pho/stress/log/ext2fs.txt
> >
> > Not a new problem, it would seem. Same issue with 8.3-PRERELEASE r232656M.
> >
>
> pid 2680 (fts) holds lock for vnode cb4be414 and tries to lock cc0ac15c
> pid 2581 (openat) holds lock for vnode cc0ac15c and tries to lock cb4be414
>
> openat calls rmdir foo/bar and ext2_rmdir unlocks and tries to lock
> again foo's vnode.
>
> This is fairly easily reproducible with concurrently running mkdir and fts
> testcase programs that are provided by stress2.
>
> I'll try to come up with a patch by the end of the week.
>

Great. Thank you for looking at this.

- Peter
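
For readers following the thread, the lock order reversal quoted above (one process holds vnode A and wants B while the other holds B and wants A) can be sketched in userland C. This is only an analogy under stated assumptions, not the ext2fs fix discussed in the thread: pthread mutexes stand in for vnode locks, and the names parent_lock, child_lock, lock_pair, unlock_pair and run_demo are hypothetical. The sketch shows one common way to avoid such a reversal, which is to acquire both locks in a single global order (here, lowest address first).

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two vnode locks (foo and foo/bar). */
static pthread_mutex_t parent_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t child_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;	/* protected by holding both locks */

/* Acquire both locks in one global order: lowest address first. */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	if ((uintptr_t)a > (uintptr_t)b) {
		pthread_mutex_t *tmp = a;
		a = b;
		b = tmp;
	}
	pthread_mutex_lock(a);
	pthread_mutex_lock(b);
}

/* Release order does not matter for deadlock avoidance. */
static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	pthread_mutex_unlock(a);
	pthread_mutex_unlock(b);
}

struct worker_args {
	pthread_mutex_t *first;		/* lock the caller asks for first */
	pthread_mutex_t *second;	/* lock the caller asks for second */
	int iterations;
};

static void *worker(void *p)
{
	struct worker_args *w = p;

	for (int i = 0; i < w->iterations; i++) {
		lock_pair(w->first, w->second);
		shared_counter++;
		unlock_pair(w->first, w->second);
	}
	return NULL;
}

/*
 * Run two threads that request the locks in opposite order, as the
 * directory walker and the rmdir caller do in the report above.
 * Returns the final counter value.
 */
long run_demo(int iterations)
{
	struct worker_args walker = { &parent_lock, &child_lock, iterations };
	struct worker_args remover = { &child_lock, &parent_lock, iterations };
	pthread_t t1, t2;
	int rc;

	shared_counter = 0;
	rc = pthread_create(&t1, NULL, worker, &walker);
	assert(rc == 0);
	rc = pthread_create(&t2, NULL, worker, &remover);
	assert(rc == 0);
	(void)rc;
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return shared_counter;
}
```

The two workers ask for the locks in opposite order, yet run_demo(1000) returns 2000 instead of hanging, because lock_pair normalizes the acquisition order. If each worker instead locked first and then second directly, the two threads could block each other in the same way as the two pids in the report. Build with -pthread.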