Date:      Fri, 16 Mar 2007 12:32:45 +0200
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Ulrich Spoerlein <uspoerlein@gmail.com>
Cc:        stable@freebsd.org
Subject:   Re: Snapshot deadlock while dumping
Message-ID:  <20070316103245.GI80993@deviant.kiev.zoral.com.ua>
In-Reply-To: <7ad7ddd90703160121u6e5b208fqcbc4221a0cbdd03f@mail.gmail.com>
References:  <7ad7ddd90703160121u6e5b208fqcbc4221a0cbdd03f@mail.gmail.com>

On Fri, Mar 16, 2007 at 09:21:13AM +0100, Ulrich Spoerlein wrote:
> Hi,
>
> One of our fileservers deadlocked, again. It is running RELENG_6 from
> 2006-11-14 and was running dump(8) -L on an 11% filled 400GB UFS2
> volume. It has been hanging for 3 hours now, and there is no disk activity.
>
> # ps axl | grep snap
>    0    46     0   1  -4  0     0     8 snaplk DL    ??   98:58.88 [bufdaemon]
>    0    48     0   0  -4  0     0     8 snaplk DL    ??   68:22.58 [syncer]
>    0 15179 11192   5   8  0  1708  1044 wait   I+    p1    0:00.00 sh -c /sbin/mksnap_ffs /export/
>    0 18738 15179   0  -8  0  2776  1756 getbuf D+    p1    0:04.07 /sbin/mksnap_ffs /export/homes
>
> Quotas are enabled in the server, but the filesystems are currently
> mounted without quota support (they were once mounted with userquota,
> though).
>
> Thanks,
> Uli
And what is the question? You know what is needed to debug the hang.
In addition to DDB, "options DEBUG_LOCKS" and "options DEBUG_VFS_LOCKS"
would be very helpful.

From the wait channel for proc 18738, I suspect that the problem might
be the LOR between the cg buffer lock and snaplk. The fix was committed to
CURRENT some time ago, and I'm waiting for re@'s decision on whether the
change can be MFCed.

In the meantime, if you can reproduce the problem systematically, I would
recommend that, in addition to providing a proper deadlock report, you try
the following patch (it was heavily reviewed and tested before being
committed to CURRENT):

http://people.freebsd.org/~kib/misc/bdwrite.8.patch

(just ignore the xfs chunk).

>
> PS: I can't break to DDB, as it is not configured for this server.
> What are the recommended DDB settings for _production_ servers? I want
> them to reboot on panic, but be able to grab the panic string via
> serial console. Is something like this gonna do the trick? Is there
> some kind of performance impact?
>
> options KDB
> options DDB
> options KDB_UNATTENDED
> options ALT_BREAK_TO_DEBUGGER
>
> It should *NOT* enter the debugger, if I plug/pull an RS232 cable. I
> read somewhere, that some controllers do send a break if the cable
> gets pulled, IIRC.
That seems to be a reasonable set of options (see above regarding
DEBUG_VFS_LOCKS, which would have some impact on performance).
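
A kernel configuration fragment combining the options discussed in this
thread might look like the sketch below. It is only an illustration
assembled from the options named above, not a tested configuration for
this server. The comments summarize the usual meaning of each option as
described in sys/conf/NOTES, and the two DEBUG_* options are assumed to
be kept only while chasing the hang.

```
# Debugger support
options KDB                     # kernel debugger framework
options DDB                     # interactive in-kernel debugger backend
options KDB_UNATTENDED          # do not enter the debugger on panic
options ALT_BREAK_TO_DEBUGGER   # enter the debugger on CR ~ Ctrl-B

# Lock debugging for the snapshot hang (expect some performance cost)
options DEBUG_LOCKS
options DEBUG_VFS_LOCKS
```

Since ALT_BREAK_TO_DEBUGGER reacts to a character sequence rather than
to a serial BREAK condition, pulling the cable should not by itself drop
the machine into DDB. That would be a concern with BREAK_TO_DEBUGGER,
which is deliberately left out here.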
