Date:      Wed, 26 Apr 2006 20:33:19 +0300
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Pawel Jakub Dawidek <pjd@freebsd.org>
Cc:        Kostik Belousov <kostikbel@gmail.com>, freebsd-stable@freebsd.org, Dmitry Morozovsky <marck@rinet.ru>, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: fsck_ufs locked in snaplk
Message-ID:  <20060426173319.GH1446@deviant.kiev.zoral.com.ua>
In-Reply-To: <20060426164228.GB17000@garage.freebsd.pl>
References:  <20060425133532.GD1446@deviant.kiev.zoral.com.ua> <20060425095610.ibv24kg1kw00s040@www.wolves.k12.mo.us> <20060425185741.C71240@woozle.rinet.ru> <20060425153909.GE1446@deviant.kiev.zoral.com.ua> <20060425162252.GA54244@xor.obsecurity.org> <444EC8EA.8050305@asd.aplus.net> <20060426011703.GA61794@xor.obsecurity.org> <20060426134217.C93749@woozle.rinet.ru> <20060426133617.GG1446@deviant.kiev.zoral.com.ua> <20060426164228.GB17000@garage.freebsd.pl>

On Wed, Apr 26, 2006 at 06:42:28PM +0200, Pawel Jakub Dawidek wrote:
> On Wed, Apr 26, 2006 at 04:36:17PM +0300, Kostik Belousov wrote:
> +> On Wed, Apr 26, 2006 at 01:43:42PM +0400, Dmitry Morozovsky wrote:
> +> > On Tue, 25 Apr 2006, Kris Kennaway wrote:
> +> >
> +> > KK> What people are seeing now must be some other problem that I wasn't
> +> > KK> able to reproduce.
> +> > KK>
> +> > KK> Once I hear back from someone who can reproduce it with debugging
> +> > KK> enabled (I'm also trying) we can try to fix it.
> +> >
> +> > Please try to simulate user who is over soft quota and is out of grace period.
> +> > I'm trying to do so as well, but currently quite busy with other tasks :(
> +>
> +> I'm not sure whether the following is the issue you met, but:
> +>
> +> dqsync from sys/ufs/ufs/ufs_quota.c calls vn_start_secondary_write()
> +> unconditionally. As a result, the mp->mnt_secondary_accwrites counter
> +> from the struct mount will always increase after the entry to dqsync.
> +> ffs_snapshot calls ffs_sync, which calls qsync, which
> +> iterates over vnodes and calls dqsync on them.
> +> And, after the qsync, ffs_sync checks whether mp->mnt_secondary_accwrites
> +> changed by calling softdep_check_suspend (see line 1221 of ffs_vfsops.c).
> +> If it changed, ffs_sync would restart the syncing loop, which never finishes.
> +>
> +> This is very strange, since if true, it basically means that snapshots
> +> and quotas shall lead to immediate deadlock ...
> +>
> +> The following patch moves call to vn_start_secondary_write after
> +> check for DQ_MOD. Please, try it.
>
> Your patch must not be against HEAD, because in HEAD we have:
>
> 	if ((dq->dq_flags & DQ_MOD) == 0)
> 		return (0);
> 	if ((dqvp = dq->dq_ump->um_quotas[dq->dq_type]) == NULLVP)
> 		panic("dqsync: file");
> 	(void) vn_start_secondary_write(dqvp, &mp, V_WAIT);
> 	if (vp != dqvp)
> 		vn_lock(dqvp, LK_EXCLUSIVE | LK_RETRY, td);
>
> As you can see DQ_MOD is checked before vn_start_secondary_write().
Aha, I overlooked this first check; it explains why the deadlock is rare.
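
To make the effect of that first check concrete, here is a small user-space
model of the restart loop described in the quoted text. It is only a sketch,
not kernel code: mount_model, dquot_model, dqsync_model, sync_loop_model and
the max_passes cap are all hypothetical names made up for illustration, and
the DQ_MOD value is a placeholder. The model assumes only what the thread
states: the sync loop restarts whenever the secondary-write counter changed
during a pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DQ_MOD 0x04	/* placeholder value, for illustration only */

/* Hypothetical stand-ins for struct mount and struct dquot. */
struct mount_model { unsigned secondary_accwrites; };
struct dquot_model { int dq_flags; };

/*
 * Model of dqsync. With check_first == false the counter is bumped for
 * every dquot, dirty or not (the behaviour the first analysis assumed).
 * With check_first == true clean dquots return before the bump.
 */
static void
dqsync_model(struct mount_model *mp, struct dquot_model *dq, bool check_first)
{
	if (check_first && (dq->dq_flags & DQ_MOD) == 0)
		return;
	mp->secondary_accwrites++;	/* models vn_start_secondary_write() */
	dq->dq_flags &= ~DQ_MOD;	/* models writing the dquot out */
}

/*
 * Model of the restart loop: repeat the pass until the counter did not
 * change. Returns the number of passes needed, or -1 if the loop had
 * not settled after max_passes passes.
 */
static int
sync_loop_model(struct mount_model *mp, struct dquot_model *dqs, size_t n,
    bool check_first, int max_passes)
{
	assert(max_passes > 0);
	for (int pass = 1; pass <= max_passes; pass++) {
		unsigned before = mp->secondary_accwrites;

		for (size_t i = 0; i < n; i++)
			dqsync_model(mp, &dqs[i], check_first);
		if (mp->secondary_accwrites == before)
			return (pass);
	}
	return (-1);
}
```

In this model, a set of dquots with one dirty entry settles on the second
pass when the early check is present, while the unconditional variant never
settles, no matter how many passes are allowed.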

Look: _after_ this check and the vn_start_secondary_write call there is a
sleep point, and after that (correctly) there is another check for DQ_MOD.
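
A rough user-space sketch of that check / sleep / re-check sequence follows.
It is one possible reading of the remark above, not the actual dqsync code:
mount_state, dquot_state, sleep_hook and the two hook functions are
hypothetical names, the DQ_MOD value is a placeholder, and the message does
not say which call sleeps, so a callback stands in for whatever may happen
while the thread is asleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DQ_MOD 0x04	/* placeholder value, for illustration only */

/* Hypothetical stand-ins for struct mount and struct dquot. */
struct mount_state { unsigned secondary_accwrites; };
struct dquot_state { int dq_flags; };

/* Models what may happen to the dquot while the caller sleeps. */
typedef void (*sleep_hook)(struct dquot_state *);

/* Another thread writes the dquot out during the sleep. */
static void
other_thread_syncs(struct dquot_state *dq)
{
	dq->dq_flags &= ~DQ_MOD;
}

/* Nobody touches the dquot during the sleep. */
static void
nothing_happens(struct dquot_state *dq)
{
	(void)dq;
}

/* Returns true if this call wrote the dquot out. */
static bool
dqsync_recheck_model(struct mount_state *mp, struct dquot_state *dq,
    sleep_hook while_asleep)
{
	assert(while_asleep != NULL);
	if ((dq->dq_flags & DQ_MOD) == 0)	/* first check */
		return (false);
	mp->secondary_accwrites++;	/* models vn_start_secondary_write() */
	while_asleep(dq);		/* the sleep point */
	if ((dq->dq_flags & DQ_MOD) == 0)	/* second check */
		return (false);		/* counter was bumped anyway */
	dq->dq_flags &= ~DQ_MOD;	/* models writing the dquot out */
	return (true);
}
```

With the other_thread_syncs hook the function returns without writing
anything, yet the counter has already grown by one. With nothing_happens
the dquot is written and the counter grows once.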
