Date:      Sun, 6 Jul 2014 21:12:26 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Steve Wills <swills@freebsd.org>
Cc:        virtualization@freebsd.org, Ryan Stone <rysto32@gmail.com>, FreeBSD Current <current@freebsd.org>
Subject:   Re: tmpfs panic
Message-ID:  <20140706181226.GE93733@kib.kiev.ua>
In-Reply-To: <20140706172511.GA84461@mouf.net>
References:  <20140706135333.GA80856@mouf.net> <20140706154621.GA81830@mouf.net> <CAFMmRNzTFOVBSoU+CMnnEJ_rUooLC4v742hetMtXWMu_RmPzYw@mail.gmail.com> <20140706172511.GA84461@mouf.net>


On Sun, Jul 06, 2014 at 05:25:12PM +0000, Steve Wills wrote:
> On Sun, Jul 06, 2014 at 12:28:07PM -0400, Ryan Stone wrote:
> > On Sun, Jul 6, 2014 at 11:46 AM, Steve Wills <swills@freebsd.org> wrote:
> > > I should have noted this system is running in bhyve. Also I'm told this panic
> > > may be related to the fact that the system is running in bhyve.
> > >
> > > Looking at it a little more closely:
> > >
> > > (kgdb) list *__mtx_lock_sleep+0xb1
> > > 0xffffffff809638d1 is in __mtx_lock_sleep (/usr/src/sys/kern/kern_mutex.c:431).
> > > 426                      * owner stops running or the state of the lock changes.
> > > 427                      */
> > > 428                     v = m->mtx_lock;
> > > 429                     if (v != MTX_UNOWNED) {
> > > 430                             owner = (struct thread *)(v & ~MTX_FLAGMASK);
> > > 431                             if (TD_IS_RUNNING(owner)) {
> > > 432                                     if (LOCK_LOG_TEST(&m->lock_object, 0))
> > > 433                                             CTR3(KTR_LOCK,
> > > 434                                                 "%s: spinning on %p held by %p",
> > > 435                                                 __func__, m, owner);
> > > (kgdb)
> > >
> > > I'm told that MTX_CONTESTED was set on the unlocked mtx and that MTX_CONTESTED
> > > is spuriously left behind, and to ask how lock prefix is handled in bhyve. Any
> > > of that make sense to anyone?
> >
> > The mutex has both MTX_CONTESTED and MTX_UNOWNED set on it?  That is a
> > special sentinel value that is set on a mutex when it is destroyed
> > (see MTX_DESTROYED in sys/mutex.h).  If that is the case it looks like
> > you've stumbled upon some kind of use-after-free in tmpfs.  I doubt
> > that bhyve is responsible (other than perhaps changing the timing
> > around making the panic more likely to happen).
>
> Given the first thing seen was:
>
> Freed UMA keg (TMPFS node) was not empty (16 items).  Lost 1 pages of memory.
>
> this sounds reasonable to me.
>
> What can I do to help find and eliminate the source of the error?

The most worrying fact here is that, judging from the backtrace, the
mutex causing trouble cannot be anything other than allnode_lock.  For
this mutex to be destroyed, the unmount of the corresponding mount point
must run to completion.

In particular, it must get past the vflush(9) call in tmpfs_unmount().
This call reclaims all vnodes belonging to the unmounted mount point.
New vnodes cannot be instantiated in the meantime, since insmntque(9) is
blocked by the MNTK_UNMOUNT flag.

That said, the backtrace indicates that we have a live vnode, which is
being reclaimed, and also a mutex which is in the destroyed (?) state.
My basic claim is that the two events cannot co-exist; at least, this
code path has been heavily exercised and most issues were fixed over
several years.

I cannot exclude the possibility of tmpfs/VFS screwing things up,
but given the above reasoning, and the fact that this is the first
appearance of the MTX_DESTROYED problem for the tmpfs unmounting code,
which has not been changed for a long time, I would at least ask some
things about bhyve.  I.e., I would rather look first at the locked
prefix emulation than at tmpfs.
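
To make the lock-word arithmetic discussed in this thread concrete, here is
a minimal userland sketch.  It is not kernel code.  The flag values are
assumptions written from memory of sys/mutex.h and should be checked
against the actual header; lock_word_owner() and
spin_path_sees_null_owner() are hypothetical helpers that only mimic
lines 428-431 of the kgdb listing quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values; verify against sys/mutex.h before relying on them. */
#define MTX_RECURSED	0x00000001u
#define MTX_CONTESTED	0x00000002u
#define MTX_UNOWNED	0x00000004u
#define MTX_FLAGMASK	(MTX_RECURSED | MTX_CONTESTED | MTX_UNOWNED)
#define MTX_DESTROYED	(MTX_CONTESTED | MTX_UNOWNED)

/* Under the assumed values the sentinel is CONTESTED|UNOWNED == 0x6. */
static_assert(MTX_DESTROYED == 0x6u, "unexpected sentinel value");

/* Hypothetical helper: strip the flag bits, leaving the owner pointer bits. */
static uintptr_t
lock_word_owner(uintptr_t v)
{
	return (v & ~(uintptr_t)MTX_FLAGMASK);
}

/*
 * Hypothetical helper: true when the quoted code would pass the
 * "v != MTX_UNOWNED" test and then compute a NULL owner, which it
 * would go on to hand to TD_IS_RUNNING().
 */
static bool
spin_path_sees_null_owner(uintptr_t v)
{
	return (v != MTX_UNOWNED && lock_word_owner(v) == 0);
}
```

Under these assumptions a lock word equal to MTX_DESTROYED, or equally an
unowned word with a stray MTX_CONTESTED bit, takes the branch at line 429
and yields a NULL owner, which would be consistent with a fault at line
431.  The sketch cannot tell a genuinely destroyed mutex apart from a bit
left behind by a faulty locked operation, and that is exactly the
ambiguity at issue.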



