Date:      Tue, 9 Mar 2010 13:58:15 +0100
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Borja Marcos <borjam@sarenet.es>
Cc:        FreeBSD Stable <freebsd-stable@freebsd.org>, Stefan Bethke <stb@lassitu.de>
Subject:   Re: Many processes stuck in zfs
Message-ID:  <20100309125815.GF3155@garage.freebsd.pl>
In-Reply-To: <EC9BC6B4-8D0E-4FE3-852F-0E3A24569D33@sarenet.es>
References:  <864468D4-DCE9-493B-9280-00E5FAB2A05C@lassitu.de> <20100309122954.GE3155@garage.freebsd.pl> <EC9BC6B4-8D0E-4FE3-852F-0E3A24569D33@sarenet.es>

On Tue, Mar 09, 2010 at 01:57:07PM +0100, Borja Marcos wrote:
>
> On Mar 9, 2010, at 1:29 PM, Pawel Jakub Dawidek wrote:
>
> > On Tue, Mar 09, 2010 at 10:15:53AM +0100, Stefan Bethke wrote:
> >> Over the past couple of months, I've more or less regularly observed machines having more and more processes stuck in the zfs wchan.  The processes never recover from that, and trying to reboot only gets the entire system stuck, without any console messages.  I can enter the debugger, and I have saved a couple of dumps.
> >>
> >> The situation seems to be triggered by zfs receive'ing snapshots from the sister machine (both synchronize their active ZFS filesystems to each other, using zfs send and zfs receive).  It appears it's the receiving causing trouble.
> >>
> >> Both machines run 8-stable from mid-February, with a single-disk ZFS pool, with ARC limited to 512M, prefetch and ZIL disabled via loader.conf.
> >>
> >> What should I be looking at to further diagnose?
> >
> > What kind of hardware do you have there? There is a 3-way deadlock I have a
> > fix for, which would be hard to trigger on single- or dual-core machines.
> >
> > Feel free to try the fix:
> >
> > 	http://people.freebsd.org/~pjd/patches/zfs_3way_deadlock.patch
>
> Maybe related to the deadlock I reported when I was receiving an incremental snapshot while the target dataset was being read?

Could be. This deadlock is in general related to zfs recv functionality.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!



