Date:      Sun, 18 Aug 2013 18:04:26 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        freebsd-fs@FreeBSD.org
Subject:   Re: Deadlock in nullfs/zfs somewhere
Message-ID:  <20130818150426.GR4972@kib.kiev.ua>
In-Reply-To: <520209B0.1030402@FreeBSD.org>
References:  <20130718185215.GE5991@kib.kiev.ua> <51E91277.3070309@FreeBSD.org> <20130719103025.GJ5991@kib.kiev.ua> <51E95CDD.7030702@FreeBSD.org> <20130719184243.GM5991@kib.kiev.ua> <51E99477.1030308@FreeBSD.org> <20130721071124.GY5991@kib.kiev.ua> <51EBABAB.5040808@FreeBSD.org> <20130721161854.GC5991@kib.kiev.ua> <520209B0.1030402@FreeBSD.org>


On Wed, Aug 07, 2013 at 11:47:44AM +0300, Andriy Gapon wrote:
>
> Kostik,
>
> thank you for being patient with me and explaining details of the contract and
> inner workings of VFS suspend.
>
> As we discussed out-of-band, unfortunately, it seems that it is impossible to
> implement the same contract for ZFS.  The reason is that ZFS filesystems appear
> as many independent filesystems, but in reality they share a common pool.  So
> suspending a single filesystem does not suspend the pool and that is contrary to
> current VFS suspend concept.
>
> Additionally, ZFS needs a "full" suspend mechanism that would prevent both read
> and write access from VFS layer.  The current VFS suspend mechanism suspend
> writes / modifications only.
>
> I am not sure how to reconcile the differences...
> Here is a number of rough ideas.  I will highly appreciate your opinion and
> suggestions.
>
> Idea #1.
> Add a new suspend type to VFS layer that would correspond to the needs of ZFS.
> This is quite laborious as it would require adding vn_start_read calls in many
> places.  Also, making two kinds of VFS suspend play nice with each other could
> be non-trivial.
If you mean a 'full suspend' mechanism which is to be added, as opposed
to the existing 'write suspend', then yes, this is a correct approach,
which would probably be useful outside ZFS as well. Its immediate
application is e.g. for unmounts.

It is indeed very laborious and probably quite non-trivial, since the
suspend lock should come before any filesystem-level blocking primitives,
probably including vfs_busy().
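
To make the ordering constraint concrete, here is a minimal userland
sketch of a lock-order check.  It is not kernel code: the rank names,
struct held_locks, order_acquire() and order_release() are all
hypothetical, and the only rule it encodes is the one stated above, that
the suspend lock ranks before any filesystem-level blocking primitive.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks: a lower rank must be acquired first. */
enum lk_rank {
	RANK_SUSPEND = 1,	/* the proposed full-suspend lock */
	RANK_FS = 2		/* any filesystem-level blocking primitive */
};

#define	MAX_HELD	8

/* Per-thread record of the ranks currently held (illustrative only). */
struct held_locks {
	int depth;
	enum lk_rank rank[MAX_HELD];
};

/* Record an acquisition; return false on a lock-order reversal. */
static bool
order_acquire(struct held_locks *h, enum lk_rank r)
{
	assert(h->depth < MAX_HELD);
	for (int i = 0; i < h->depth; i++) {
		if (h->rank[i] > r)
			return (false);	/* a later-ranked lock is held */
	}
	h->rank[h->depth++] = r;
	return (true);
}

/* Forget the most recently recorded acquisition. */
static void
order_release(struct held_locks *h)
{
	assert(h->depth > 0);
	h->depth--;
}
```

Taking RANK_SUSPEND and then RANK_FS is accepted; asking for RANK_SUSPEND
while RANK_FS is already held is reported as a reversal, which is the
case a full suspend would have to rule out at every entry point.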

>
> Idea #2.
> This is perhaps an ugly approach, but I already have it implemented locally.
> The idea is to re-use / abuse vnode locking as a ZFS suspend barrier.
> (This can be considered to be analogous to putting vn_start_op() / vn_end_op()
> into vop_lock / vop_unlock).
> That is, ZFS would override VOP_LOCK/VOP_UNLOCK to check for internal
> suspension.  The necessary care would be taken to respect all locking flags
> including LK_NOWAIT.  Recursive entry would have to be supported too.
Please note that nandfs used a somewhat similar approach, which caused
obvious bugs last time I looked. At least, lookups were knowingly broken
with regard to lock order.

Devfs uses an internal lock to protect the mount point, which is after
the vnode locks.  Correcting the operation of the dm_lock required quite
an effort; look at DEVFS_DMP_DROP etc. in the devfs code.

If this is constrained to zfs without any effect on VFS, I do not care.
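
For reference, the barrier described in Idea #2 might look roughly like
the userland model below.  This is a sketch under assumptions, not the
local ZFS patch: struct zbarrier, the zb_*() functions and SK_NOWAIT (a
stand-in for LK_NOWAIT) are hypothetical, pthreads replace kernel
primitives, and a single barrier is assumed so that one thread-local
counter can track recursion.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

#define	SK_NOWAIT	0x1	/* hypothetical stand-in for LK_NOWAIT */

struct zbarrier {
	pthread_mutex_t	mtx;
	pthread_cond_t	cv;
	int	holders;	/* threads currently past the barrier */
	bool	suspending;	/* a suspend is pending or in effect */
};

/* Per-thread recursion depth; assumes one barrier in the process. */
static _Thread_local int zb_depth;

static void
zb_init(struct zbarrier *b)
{
	pthread_mutex_init(&b->mtx, NULL);
	pthread_cond_init(&b->cv, NULL);
	b->holders = 0;
	b->suspending = false;
}

/* Called where the overridden lock operation would check suspension. */
static int
zb_lock(struct zbarrier *b, int flags)
{
	if (zb_depth > 0) {	/* recursive entry must never block */
		zb_depth++;
		return (0);
	}
	pthread_mutex_lock(&b->mtx);
	while (b->suspending) {
		if (flags & SK_NOWAIT) {
			pthread_mutex_unlock(&b->mtx);
			return (EBUSY);	/* caller asked not to sleep */
		}
		pthread_cond_wait(&b->cv, &b->mtx);
	}
	b->holders++;
	pthread_mutex_unlock(&b->mtx);
	zb_depth = 1;
	return (0);	/* the real vnode lock would be taken after this */
}

static void
zb_unlock(struct zbarrier *b)
{
	assert(zb_depth > 0);
	if (--zb_depth > 0)
		return;
	pthread_mutex_lock(&b->mtx);
	if (--b->holders == 0)
		pthread_cond_broadcast(&b->cv);
	pthread_mutex_unlock(&b->mtx);
}

/* Stop new entries, then wait for current holders to drain. */
static void
zb_suspend(struct zbarrier *b)
{
	pthread_mutex_lock(&b->mtx);
	b->suspending = true;
	while (b->holders > 0)
		pthread_cond_wait(&b->cv, &b->mtx);
	pthread_mutex_unlock(&b->mtx);
}

static void
zb_resume(struct zbarrier *b)
{
	pthread_mutex_lock(&b->mtx);
	b->suspending = false;
	pthread_cond_broadcast(&b->cv);
	pthread_mutex_unlock(&b->mtx);
}
```

The recursion check comes first on purpose: a thread that is already
past the barrier and re-enters while a suspend is pending would
otherwise sleep forever, because the suspender is waiting for that same
thread to drain.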
>
> Idea #3.
> Provide some other mechanism to expose ZFS suspension state to VFS.  And then
> use that mechanism to avoid blocking on calls to ZFS in the strategic /
> sensitive places like vlrureclaim(), vtryrecycle(), etc.
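
One way to picture Idea #3 is a flag that a reclaim-style pass consults
before calling into a filesystem.  The model below is only an
illustration: struct mount_model, mount_is_suspended() and
reclaim_pass() are hypothetical names, and the loop merely stands in for
code such as vlrureclaim().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-mount state exposed by the filesystem. */
struct mount_model {
	atomic_bool	suspended;	/* set by the fs while suspended */
	int		reclaimable;	/* vnodes a pass could reclaim */
};

static bool
mount_is_suspended(struct mount_model *mp)
{
	return (atomic_load(&mp->suspended));
}

/*
 * Model of a reclaim pass: skip suspended mounts rather than calling
 * into them and blocking.  Returns the number of vnodes reclaimed.
 */
static int
reclaim_pass(struct mount_model *mounts, size_t n)
{
	int done = 0;

	for (size_t i = 0; i < n; i++) {
		if (mount_is_suspended(&mounts[i]))
			continue;	/* try again on a later pass */
		done += mounts[i].reclaimable;
		mounts[i].reclaimable = 0;
	}
	return (done);
}
```

A bare flag check like this is racy, since the filesystem can become
suspended right after the test, so it can only reduce blocking in such
places, not exclude it.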



