Date:      Thu, 13 Nov 2008 12:26:42 +0200
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Jeremy Chadwick <koitsu@freebsd.org>
Cc:        Tim Bishop <tim@bishnet.net>, freebsd-stable@freebsd.org
Subject:   Re: System deadlock when using mksnap_ffs
Message-ID:  <20081113102642.GQ47073@deviant.kiev.zoral.com.ua>
In-Reply-To: <20081113044200.GA10419@icarus.home.lan>
References:  <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan>

On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote:
> On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote:
> > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote:
> > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
> > > > I've been playing around with snapshots lately but I've got a problem on
> > > > one of my servers running 7-STABLE amd64:
> > > >
> > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN  amd64
> > > >
> > > > I run the mksnap_ffs command to take the snapshot and some time later
> > > > the system completely freezes up:
> > > >
> > > > paladin# cd /u2/.snap/
> > > > paladin# mksnap_ffs /u2 test.1
> > > >
> > > > It only happens on this one filesystem, though, which might be to do
> > > > with its size. It's not over the 2TB marker, but it's pretty close. It's
> > > > also backed by a hardware RAID system, although a smaller filesystem on
> > > > the same RAID has no issues.
> > > >
> > > > Filesystem  1K-blocks       Used     Avail Capacity  Mounted on
> > > > /dev/da0s1a 2078881084 921821396 990749202    48%    /u2
> > > >
> > > > To clarify "completely freezes up": unresponsive to all services over
> > > > the network, except ping. On the console I can switch between the ttys,
> > > > but none of them respond. The only way out is to hit the reset button.
> > >
> > > You need to provide information described in the
> > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
> > > and especially
> > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
> >
> > Ok, I've done that, and removed the patch that seemed to fix things.
> >
> > The first thing I notice after doing this on the console is that I can
> > still ctrl+t the process:
> >
> > load: 0.14  cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k
> >
> > But the top and ps I left running on other ttys have all stopped
> > responding.
>
> Then in my book, the patch didn't fix anything.  :-)  The system is
> still "deadlocking"; snapshot generation **should not** wedge the system
> hard like this.
You are systematically mixing up two completely different issues:
- the first is the _deadlock_ experienced by Tim;
- the second is the slowdown during snapshot creation.
In fact, I would count a third, where dump itself hangs as a usermode process
while the kernel still operates normally.

The posted patch should fix or paper over the first issue for practical purposes.
The third issue is most likely fixed by the subr_sleepqueue race fix.
