Date:      Thu, 1 May 2014 11:20:57 -0700
From:      David Wolfskill <david@catwhisker.org>
To:        Kirk McKusick <mckusick@mckusick.com>
Cc:        fs@freebsd.org
Subject:   Re: SU+J: 185 processes in state "suspfs" for >8 hrs. ... not good, right?
Message-ID:  <20140501182057.GJ1120@albert.catwhisker.org>
In-Reply-To: <201405011651.s41GphgX089174@chez.mckusick.com>
References:  <20140501161856.GH1120@albert.catwhisker.org> <201405011651.s41GphgX089174@chez.mckusick.com>

On Thu, May 01, 2014 at 09:51:43AM -0700, Kirk McKusick wrote:
> ...
>
> The following fix for related problems was made to head and MFC'ed
> to stable/10 but not stable/9.
>
> *** stable/9/sys/ufs/ffs/ffs_vnops.c	2014-03-05 08:51:48.000000000 -0800
> --- stable/9/sys/ufs/ffs/ffs_vnops.c	2014-05-01 09:41:35.000000000 -0700
> ***************
> *** 258,266 ****
>   			continue;
>   		if (bp->b_lblkno > lbn)
>   			panic("ffs_syncvnode: syncing truncated data.");
> ! 		if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL))
>   			continue;
> - 		BO_UNLOCK(bo);
>   		if ((bp->b_flags & B_DELWRI) == 0)
>   			panic("ffs_fsync: not dirty");
>   		/*
> --- 258,274 ----
>   			continue;
>   		if (bp->b_lblkno > lbn)
>   			panic("ffs_syncvnode: syncing truncated data.");
> ! 		if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) == 0) {
> ! 			BO_UNLOCK(bo);
> ! 		} else if (wait != 0) {
> ! 			if (BUF_LOCK(bp,
> ! 			    LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK,
> ! 			    BO_LOCKPTR(bo)) != 0) {
> ! 				bp->b_vflags &= ~BV_SCANNED;
> ! 				goto next;
> ! 			}
> ! 		} else
>   			continue;
>   		if ((bp->b_flags & B_DELWRI) == 0)
>   			panic("ffs_fsync: not dirty");
>   		/*
>
> The associated comment is:
>
>     If we fail to do a non-blocking acquire of a buf lock while doing a
>     waiting sync pass we need to do a blocking acquire and restart.
>     Another thread, typically the buf daemon, may have this buf locked and
>     if we don't wait we can fail to sync the file.  This led to a great
>     variety of softdep panics and deadlocks because we rely on all
>     dependencies being flushed before proceeding in several cases.
>
> Let me know if it helps your problem. If it does, I will MFC it to 9.
> There have been several other fixes made to SU+J that are more likely
> to be the cause of your problem, but they are not easily back-ported
> to stable/9. So if this does not fix your problem my only suggestions
> are to turn off journaling or move to running on stable/10.
> ...

Hrrrmmm...  Looks as if the above reflects stable/10's r251171 (in
particular, "Convert the bufobj lock to rwlock.") -- stable/9 doesn't
seem to know about BO_LOCKPTR(), and gcc makes some assumptions.  That
doesn't turn out well.

I think that migrating to stable/10 might make more sense than figuring
out how to fix this, especially if there are other causes of the
observed failure that are fixed in stable/10.
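
For anyone following along, the locking pattern the quoted patch describes
(try a non-blocking acquire; on a waiting pass, block for the lock and then
rescan) can be sketched in userland C.  This is only an analogy under stated
assumptions, not the kernel code: struct fake_buf, sync_one() and sync_all()
are hypothetical names, a pthread mutex stands in for the buf lock, and the
"block, release, report restart" step is my approximation of what
LK_SLEEPFAIL does (the request fails once it has had to sleep).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel buf: a lock plus a dirty flag. */
struct fake_buf {
	pthread_mutex_t lock;
	bool dirty;
};

enum sync_result { SYNCED, SKIPPED, RESTART };

/*
 * Try the lock without blocking.  If that fails:
 *  - on a non-waiting pass, skip the buffer (old behaviour for all passes);
 *  - on a waiting pass, block until the holder lets go, release the lock
 *    again and tell the caller to rescan, since the buffer's state may
 *    have changed while we slept.
 */
static enum sync_result
sync_one(struct fake_buf *bp, bool wait)
{
	assert(bp != NULL);
	if (pthread_mutex_trylock(&bp->lock) != 0) {
		if (!wait)
			return (SKIPPED);
		pthread_mutex_lock(&bp->lock);
		pthread_mutex_unlock(&bp->lock);
		return (RESTART);
	}
	bp->dirty = false;		/* pretend the buffer was written */
	pthread_mutex_unlock(&bp->lock);
	return (SYNCED);
}

/*
 * One sync pass over n buffers.  Returns how many dirty buffers were
 * skipped; a waiting pass restarts the scan instead of skipping, so it
 * only returns once nothing dirty was left behind.
 */
static int
sync_all(struct fake_buf *bufs, int n, bool wait)
{
	int i, skipped;

restart:
	skipped = 0;
	for (i = 0; i < n; i++) {
		if (!bufs[i].dirty)
			continue;
		switch (sync_one(&bufs[i], wait)) {
		case RESTART:
			goto restart;
		case SKIPPED:
			skipped++;
			break;
		case SYNCED:
			break;
		}
	}
	return (skipped);
}
```

The point of the sketch is the difference between the two passes: a
non-waiting pass may quietly leave a dirty buffer behind when someone else
holds its lock, which is harmless there, while a waiting pass must not,
because the caller relies on everything having been flushed.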

Thanks....

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Taliban: Evil cowards with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
