Date: Thu, 13 Nov 2008 07:53:01 +0100
From: Wilko Bulte
To: Jeremy Chadwick
Cc: Kostik Belousov, Tim Bishop, freebsd-stable@freebsd.org
Subject: Re: System deadlock when using mksnap_ffs
Message-ID: <20081113065300.GA1276@freebie.xs4all.nl>
In-Reply-To: <20081113044200.GA10419@icarus.home.lan>
List-Id: Production branch of FreeBSD source code

Quoting Jeremy Chadwick, who wrote on Wed, Nov 12, 2008 at 08:42:00PM -0800 ..
> On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote:
> > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote:
> > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
> > > > I've been playing around with snapshots lately but I've got a problem on
> > > > one of my servers running 7-STABLE amd64:
> > > >
> > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64
> > > >
> > > > I run the mksnap_ffs command to take the snapshot and some time later
> > > > the system completely freezes up:
> > > >
> > > > paladin# cd /u2/.snap/
> > > > paladin# mksnap_ffs /u2 test.1
> > > >
> > > > It only happens on this one filesystem, though, which might be to do
> > > > with its size. It's not over the 2TB marker, but it's pretty close. It's
> > > > also backed by a hardware RAID system, although a smaller filesystem on
> > > > the same RAID has no issues.
> > > >
> > > > Filesystem   1K-blocks      Used     Avail Capacity  Mounted on
> > > > /dev/da0s1a 2078881084 921821396 990749202    48%    /u2
> > > >
> > > > To clarify "completely freezes up": unresponsive to all services over
> > > > the network, except ping. On the console I can switch between the ttys,
> > > > but none of them respond. The only way out is to hit the reset button.
> > >
> > > You need to provide information described in the
> > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
> > > and especially
> > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
> >
> > Ok, I've done that, and removed the patch that seemed to fix things.
> >
> > The first thing I notice after doing this on the console is that I can
> > still ctrl+t the process:
> >
> > load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k
> >
> > But the top and ps I left running on other ttys have all stopped
> > responding.
>
> Then in my book, the patch didn't fix anything. :-) The system is
> still "deadlocking"; snapshot generation **should not** wedge the system
> hard like this.
>
> Also, during my own testing, I am always able to use Ctrl-T to get
> SIGINFO from the running process (mksnap_ffs). That behaviour does not
> change for me.
>
> The rest of the below information is good -- but I'm confused about
> something: is there anyone out there who can use mksnap_ffs on a
> filesystem (/usr is a good test source) and NOT experience this
> deadlocking problem? Literally *every* FreeBSD box I have root access
> to suffers from this problem, so I'm a little baffled why we end-users
> need to keep providing debugging output when it should be easy as pie
> for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch
> their system wedge.

dump -L on my RELENG_7 machine does not wedge it. So there must be
multiple factors influencing whether snapshot creation causes problems
or not.

Wilko