Date:      Wed, 23 Dec 2015 10:58:37 +0100
From:      Fabian Keil <freebsd-listen@fabiankeil.de>
To:        freebsd-fs@freebsd.org
Subject:   Re: ZFS:dmu_objset_find_dp_impl() - panic: vm_fault: fault on nofault entry, addr: fffffe0094653000
Message-ID:  <20151223105837.53b2c1ae@fabiankeil.de>
In-Reply-To: <20151222161200.19ab1832@fabiankeil.de>
References:  <20151222161200.19ab1832@fabiankeil.de>

Fabian Keil <freebsd-listen@fabiankeil.de> wrote:

> Using a kernel based on r292334, I got this panic while importing
> a ZFS pool with vfs.zfs.spa_load_verify_data and
> vfs.zfs.spa_load_verify_metadata set to 0.
>
> I've not been able to reproduce it yet and the changed sysctls above
> may not actually matter (but I usually use the defaults).

I unintentionally reproduced it yesterday with the same kernel
using the default values for the sysctls above.
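
For anyone trying to reproduce this, the two import configurations can be sketched roughly as below. This is only an illustration: `set_verify` is a hypothetical helper, `tank` is a placeholder pool name, the script only prints the commands unless `RUN` is set to an empty string, and I am assuming the default value of both sysctls is 1.

```shell
#!/bin/sh
# Dry run by default: with RUN=echo each command is printed, not executed.
# Assumption: set RUN="" and run as root on FreeBSD to apply them for real.
RUN=${RUN-echo}

# set_verify is a hypothetical helper, not part of any FreeBSD tool.
set_verify() {
    $RUN sysctl "vfs.zfs.spa_load_verify_data=$1"
    $RUN sysctl "vfs.zfs.spa_load_verify_metadata=$1"
}

set_verify 0            # settings used when the panic first occurred
# set_verify 1          # assumed defaults, as in the second occurrence
$RUN zpool import tank  # "tank" is a placeholder pool name
```

Since the panic occurred with both settings, the sysctls are probably only useful here for ruling them out.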

> The pool has a single leaf vdev that is backed by ggatec which transfers the
> data over a slow and easily saturated connection (< ~120 kB/s up). Graph:
> https://www.fabiankeil.de/talks/versteckter-block-speicher/mgp00030.html
>
> fk@r500 /usr/crash $kgdb /usr/lib/debug/boot/kernel/kernel.debug vmcore.2
> [...]
> Unread portion of the kernel message buffer:
> [11912] panic: vm_fault: fault on nofault entry, addr: fffffe0094653000
> [11912] cpuid = 0
> [11912] KDB: stack backtrace:
> [...]
> #0  doadump (textdump=0) at pcpu.h:221
> 221	pcpu.h: No such file or directory.
> 	in pcpu.h
> (kgdb) where
> #0  doadump (textdump=0) at pcpu.h:221
> #1  0xffffffff8031752b in db_dump (dummy=<value optimized out>, dummy2=false, dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:533
> #2  0xffffffff8031731e in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440
> #3  0xffffffff803170b4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493
> #4  0xffffffff80319bbb in db_trap (type=<value optimized out>, code=0) at /usr/src/sys/ddb/db_main.c:251
> #5  0xffffffff805e2dc3 in kdb_trap (type=3, code=0, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
> #6  0xffffffff8087f207 in trap (frame=0xfffffe0094f8f220) at /usr/src/sys/amd64/amd64/trap.c:549
> #7  0xffffffff808641b7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:234
> #8  0xffffffff805e24ab in kdb_enter (why=0xffffffff8097216b "panic", msg=0x32 <Address 0x32 out of bounds>) at cpufunc.h:63
> #9  0xffffffff8059ea4f in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:750
> #10 0xffffffff8059e8a3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:688
> #11 0xffffffff80835650 in vm_fault_hold (map=<value optimized out>, vaddr=<value optimized out>, fault_type=<value optimized out>, fault_flags=<value optimized out>, m_hold=<value optimized out>)
>     at /usr/src/sys/vm/vm_fault.c:332
> #12 0xffffffff808332f8 in vm_fault (map=0xfffff80002000000, vaddr=<value optimized out>, fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:277
> #13 0xffffffff8087f97a in trap_pfault (frame=0xfffffe0094f8f8d0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:734
> #14 0xffffffff8087f21e in trap (frame=0xfffffe0094f8f8d0) at /usr/src/sys/amd64/amd64/trap.c:435
> #15 0xffffffff808641b7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:234
> #16 0xffffffff81900c9a in dmu_objset_find_dp_impl (dcp=0xfffff80078cb0200) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1630
> #17 0xffffffff81901189 in dmu_objset_find_dp_cb (arg=0xfffff80078cb0200) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1746
[...]
> Given the location of the trap, this could be a regression caused
> by the import of illumos #5269 (zpool import slow) in r286686:
> https://svnweb.freebsd.org/base?view=revision&revision=r286686

On the other hand, I've never seen the issue with previous kernels,
but twice with the one based on r292334.

I've updated to a kernel based on r292616 to see if it makes a difference
(there were quite a few vm changes).

Fabian