FreeBSD Mail Archives

Date:      Thu, 29 Oct 2015 13:35:41 +0100
From:      Fabian Keil <freebsd-listen@fabiankeil.de>
To:        freebsd-current@freebsd.org
Cc:        "Steven Hartland" <killing@multiplay.co.uk>, Xin Li <delphij@delphij.net>, "Alexander Motin" <mav@ixsystems.com>
Subject:   Re: ZFS-related panic: "possible" spa->spa_errlog_lock deadlock
Message-ID:  <20151029133541.2bb55af9@fabiankeil.de>
In-Reply-To: <20151028135821.0d375ec5@fabiankeil.de>
References:  <492dbacb.5942cc9b@fabiankeil.de> <540C66AC.8070809@delphij.net> <4fa875ba.3cc970d7@fabiankeil.de> <540C8039.7010309@delphij.net> <20151028135821.0d375ec5@fabiankeil.de>

--Sig_/4TEXJLc2m1/KqUySJtNSQbe
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

Fabian Keil <freebsd-listen@fabiankeil.de> wrote:

> Xin Li <delphij@delphij.net> wrote:
>=20
> > On 9/7/14 11:23 PM, Fabian Keil wrote: =20
> > > Xin Li <delphij@delphij.net> wrote:
> > >    =20
> > >> On 9/7/14 9:02 PM, Fabian Keil wrote:   =20
> > >>> Using a kernel built from FreeBSD 11.0-CURRENT r271182 I got
> > >>> the following panic yesterday:
> > >>>=20
> > >>> [...] Unread portion of the kernel message buffer: [6880]
> > >>> panic: deadlkres: possible deadlock detected for
> > >>> 0xfffff80015289490, blocked for 1800503 ticks   =20
> > >>=20
> > >> Any chance to get all backtraces (e.g. thread apply all bt full
> > >> 16)? I think a different thread that held the lock have been
> > >> blocked, probably related to your disconnected vdev.   =20
> > >=20
> > > Output of "thread apply all bt full 16" is available at:=20
> > > http://www.fabiankeil.de/tmp/freebsd/kgdb-output-spa_errlog_lock-dead=
lock.txt
> > >
> > >  A lot of the backtraces prematurely end with "Cannot access memory
> > > at address", therefore I also added "thread apply all bt" output.
> > >=20
> > > Apparently there are at least two additional threads blocking below
> > > spa_get_stats(): =20
> [...]
> > Yes, thread 1182 owned the lock and is waiting for the zio be done.
> > Other threads that wanted the lock would have to wait.
> >=20
> > I don't have much clue why the system entered this state, however, as
> > the operations should have errored out (the GELI device is gone on
> > 21:44:56 based on your log, which suggests all references were closed)
> > instead of waiting. =20

> I finally found the time to analyse the problem which seems
> to be that spa_sync() requires at least one writeable vdev to
> complete, but holds the lock(s) required to remove or bring back
> vdevs.
>=20
> Letting spa_sync() drop the lock and wait for at least one vdev
> to become writeable again seems to make the problem unreproducible
> for me, but probably merely shrinks the race window and thus is not
> a complete solution.
>=20
> For details see:
> https://www.fabiankeil.de/sourcecode/electrobsd/ZFS-Optionally-let-spa_sy=
nc-wait-for-writable-vdev.diff

Patch updated to unbreak the userspace build and to note that the
deadlock can still be reproduced with an artifical test case like:

Shell 1:
  mdconfig -u 0 -f /dpool/scratch/test-vdev.img
  zpool create test /dev/md0
  while sleep 1; do
        mdconfig -d -u 0 -o force &&
        mdconfig -f /dpool/scratch/test-vdev.img &&
        zpool clear test;
  done
Shell 2:
  # Cause writes to the pool from another shell, for example
  # by creating datasets.

Log excerpt (from test begin to deadlock):
Oct 29 12:34:28 kendra ZFS: vdev state changed, pool_guid=3D160393537382368=
08887 vdev_guid=3D3080051161477470469
[...]
Oct 29 12:46:57 kendra ZFS: vdev state changed, pool_guid=3D160393537382368=
08887 vdev_guid=3D3080051161477470469
Oct 29 12:46:59 kendra ZFS: vdev is removed, pool_guid=3D160393537382368088=
87 vdev_guid=3D3080051161477470469
Oct 29 12:46:59 kendra ZFS: vdev state changed, pool_guid=3D160393537382368=
08887 vdev_guid=3D3080051161477470469
Oct 29 12:47:00 kendra kernel: g_dev_taste: make_dev_p() failed (gp->name=
=3Dmd0, error=3D17)

With the deadman enabled, this will also cause:
 panic: I/O to pool 'test' appears to be hung on vdev guid 3080051161477470=
469 at '/dev/md0'.
 cpuid =3D 0
 KDB: stack backtrace:
 db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe01136a=
f870
 vpanic() at vpanic+0x182/frame 0xfffffe01136af8f0
 panic() at panic+0x43/frame 0xfffffe01136af950
 vdev_deadman() at vdev_deadman+0x127/frame 0xfffffe01136af9a0
 vdev_deadman() at vdev_deadman+0x40/frame 0xfffffe01136af9f0
 spa_deadman() at spa_deadman+0x86/frame 0xfffffe01136afa20
 softclock_call_cc() at softclock_call_cc+0x1a3/frame 0xfffffe01136afaf0
 softclock() at softclock+0x94/frame 0xfffffe01136afb20
 intr_event_execute_handlers() at intr_event_execute_handlers+0x1b6/frame 0=
xfffffe01136afb60
 ithread_loop() at ithread_loop+0xa6/frame 0xfffffe01136afbb0
 fork_exit() at fork_exit+0x9c/frame 0xfffffe01136afbf0
 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe01136afbf0

With test's txg_sync_thread being the offender:
 (kgdb) tid 101874
 [Switching to thread 819 (Thread 101874)]#0  sched_switch (td=3D0xfffff800=
513649a0, newtd=3D<value optimized out>, flags=3D<value optimized out>) at =
/usr/src/sys/kern/sched_ule.c:1969
 1969                    cpuid =3D PCPU_GET(cpuid);
 (kgdb) where
 #0  sched_switch (td=3D0xfffff800513649a0, newtd=3D<value optimized out>, =
flags=3D<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1969
 #1  0xffffffff805a3a18 in mi_switch (flags=3D260, newtd=3D0x0) at /usr/src=
/sys/kern/kern_synch.c:470
 #2  0xffffffff805ea15a in sleepq_wait (wchan=3D0x0, pri=3D0) at /usr/src/s=
ys/kern/subr_sleepqueue.c:631
 #3  0xffffffff80530509 in _cv_wait (cvp=3D0xfffff8002678ea98, lock=3D0xfff=
ff8002678ea78) at /usr/src/sys/kern/kern_condvar.c:139
 #4  0xffffffff81930bbb in zio_wait (zio=3D<value optimized out>) at /usr/s=
rc/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1535
 #5  0xffffffff818e4871 in dsl_pool_sync (dp=3D0xfffff80047dfd000, txg=3D76=
) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:540
 #6  0xffffffff81903653 in spa_sync (spa=3D0xfffff8009dfe2000, txg=3D76) at=
 /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:6734
 #7  0xffffffff8190ccfa in txg_sync_thread (arg=3D0xfffff80047dfd000) at /u=
sr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:517
 #8  0xffffffff80556edc in fork_exit (callout=3D0xffffffff8190c970 <txg_syn=
c_thread>, arg=3D0xfffff80047dfd000, frame=3D0xfffffe011c27bc00) at /usr/sr=
c/sys/kern/kern_fork.c:1011
 #9  0xffffffff8085b91e in fork_trampoline () at /usr/src/sys/amd64/amd64/e=
xception.S:609
 #10 0x0000000000000000 in ?? ()
 (kgdb) f 6
 #6  0xffffffff81903653 in spa_sync (spa=3D0xfffff8009dfe2000, txg=3D76) at=
 /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:6734
 (kgdb) p spa->spa_name
 $3 =3D 0xfffff8009dfe2000 "test"

Fabian

--Sig_/4TEXJLc2m1/KqUySJtNSQbe
Content-Type: application/pgp-signature
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iEYEARECAAYFAlYyEp4ACgkQBYqIVf93VJ0sxgCfYP3pBHRJnWy1IZIxJiNERYDN
NoAAn2SGahuiiTpA4II8Q3kF+1815Pvc
=83oi
-----END PGP SIGNATURE-----

--Sig_/4TEXJLc2m1/KqUySJtNSQbe--

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20151029133541.2bb55af9>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation