Date:      Fri, 5 May 2006 15:42:39 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        David Kirchner <dpk@dpk.net>
Cc:        stable@freebsd.org, Robert Watson <rwatson@freebsd.org>, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: quota deadlock on 6.1-RC1
Message-ID:  <20060505124239.GG35756@deviant.kiev.zoral.com.ua>
In-Reply-To: <35c231bf0605041659m2d90e50y9026f18af592f9f5@mail.gmail.com>
References:  <44579EE1.6010300@rogers.com> <20060502180557.GA91762@xor.obsecurity.org> <4457A02C.9040408@rogers.com> <20060502182302.GA92027@xor.obsecurity.org> <20060503110503.O58458@fledge.watson.org> <35c231bf0605031821s582b6d03j3ee9d434a596f62a@mail.gmail.com> <20060504014241.GA38346@xor.obsecurity.org> <35c231bf0605032005n4fe38769v9637a9393efb791a@mail.gmail.com> <20060504100110.P17611@fledge.watson.org> <35c231bf0605041659m2d90e50y9026f18af592f9f5@mail.gmail.com>


--ewQ5hdP4CtoTt3oD
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, May 04, 2006 at 04:59:33PM -0700, David Kirchner wrote:
> Here's how to reproduce the snapshot deadlock I'm seeing, with 6.1-RC2
> cvsup'd as of 5 or 6 hours ago:
>
> 1) dd if=/dev/zero of=/usr/bigfile bs=1024 seek=209715200 count=0
> 2) mdconfig -a -t vnode -f /usr/bigfile
> 3) bsdlabel -w md0 auto
> 4) newfs -U md0a
> 5) fsck -v /dev/md0a # ^C this after a second or so, this makes the FS dirty
> 6) mount /dev/md0a /mnt
> 7) fsck -v -B /dev/md0a
>
> in another window:
> 8) while true; do ls -al /mnt/.snap;sleep 1;done
>
> It locks up every time for me, with no further disk activity.
> Unfortunately, for some reason, my server console became unaccessable,
> so I'm not able to get to the kdb prompt. If I can get to it later,
> what should I run other than "show lockedvnodes" and "show threads"?
> Also, can anyone else try these steps and verify if they cause the
> same problem for you?
I repeated your recipe on CURRENT.
What I got was a completely unresponsive system
that was _not_ deadlocked. It slowly made progress. The slowness is
surely related to the hole in the file backing the fsck'ed (and snapshotted)
filesystem. Snapshotting progressed slowly, with a lot of disk
activity. After it had finished, the system resumed normal operation.
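
To see what "hole" means here, a minimal sketch follows. It assumes a
filesystem that supports sparse files; the temporary path and the 100 MiB
size are illustrative and not taken from the report. It uses the same dd
idiom as step 1 and compares the apparent size with the space actually
allocated.

```shell
#!/bin/sh
# Sketch: create a small sparse file and compare its apparent size with
# the space actually allocated. Path and size are illustrative only.
f=$(mktemp /tmp/sparse.XXXXXX)

# Seek past 102400 blocks of 1 KiB and write nothing (count=0); dd
# truncates the output file at that offset, leaving one large hole.
dd if=/dev/zero of="$f" bs=1024 seek=102400 count=0 2>/dev/null

apparent=$(wc -c < "$f" | tr -d ' ')    # logical size in bytes
allocated_kb=$(du -k "$f" | cut -f1)     # space actually allocated, in KiB

echo "apparent bytes: $apparent"
echo "allocated KiB:  $allocated_kb"

rm -f "$f"
```

On a filesystem with sparse-file support the allocated figure should be far
below the apparent size. Blocks are allocated in the backing file only as
they are first written.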

Tor Egge committed several fixes to CURRENT that certainly
help in this situation.
>
> In my initial tests, filed in a PR, steps #1 and #2 were unnecessary
> as I was working with real disks. The result is the same here. Still,
> I am curious if anyone else can get the same result with a real disk
> >=200GB in size. I am unable to duplicate it with a 20GB partition,
> and I am not sure why.
>
> --
> David 'dpk' Kirchner

--ewQ5hdP4CtoTt3oD--


