Date:      Sun, 21 Sep 2014 16:10:31 +0200
From:      Fabian Keil <freebsd-listen@fabiankeil.de>
To:        <freebsd-fs@freebsd.org>
Subject:   panic: solaris assert: bpobj_iterate(&spa->spa_deferred_bpobj, spa_free_sync_cb, zio, tx) == 0 (0x6 == 0x0), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c, line: 6156
Message-ID:  <4c86b205.1f11cf29@fabiankeil.de>

Two days ago a power outage took out a zpool but not the laptop
it was attached to. This resulted in:

Sep 19 22:50:58 r500 kernel: [41317] ugen1.2: <Intenso> at usbus1 (disconnected)
Sep 19 22:50:58 r500 kernel: [41317] umass0: at uhub1, port 2, addr 2 (disconnected)
Sep 19 22:50:58 r500 kernel: [41317] da0 at umass-sim0 bus 0 scbus2 target 0 lun 0
Sep 19 22:50:58 r500 kernel: [41317] da0: <  > detached
Sep 19 22:50:58 r500 kernel: [41317] pass2 at umass-sim0 bus 0 scbus2 target 0 lun 0
Sep 19 22:50:58 r500 kernel: [41317] pass2: <  > detached
Sep 19 22:50:58 r500 kernel: [41317] (pass2:umass-sim0:0:0:0): Periph destroyed
Sep 19 22:50:58 r500 kernel: [41317] GEOM_ELI: Device label/intenso1.eli destroyed.
Sep 19 22:50:58 r500 kernel: [41317] GEOM_ELI: Detached label/intenso1.eli on last close.
Sep 19 22:50:58 r500 kernel: [41317] (da0:umass-sim0:0:0:0): Periph destroyed
Sep 19 22:50:58 r500 ZFS: vdev is removed, pool_guid=13312956307733420090 vdev_guid=11021414854688829035
[...]
Sep 19 22:50:58 r500 kernel: [41318] system power profile changed to 'economy'
Sep 19 22:50:58 r500 kernel: [41318] acpi_acad0: Off Line
Sep 19 22:50:59 r500 power_profile: changed to 'economy'

Followed by a panic:

(kgdb) where
#0  doadump (textdump=0) at pcpu.h:219
#1  0xffffffff8030eeae in db_dump (dummy=<value optimized out>, dummy2=0, dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:543
#2  0xffffffff8030e98d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:449
#3  0xffffffff8030e704 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502
#4  0xffffffff80311160 in db_trap (type=<value optimized out>, code=0) at /usr/src/sys/ddb/db_main.c:231
#5  0xffffffff805d7bc1 in kdb_trap (type=3, code=0, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
#6  0xffffffff8085ab67 in trap (frame=0xfffffe00955ed850) at /usr/src/sys/amd64/amd64/trap.c:542
#7  0xffffffff8083eef2 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#8  0xffffffff805d72be in kdb_enter (why=0xffffffff8095b0cd "panic", msg=<value optimized out>) at cpufunc.h:63
#9  0xffffffff80597d01 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:739
#10 0xffffffff8133d22f in assfail3 (a=<value optimized out>, lv=<value optimized out>, op=<value optimized out>, rv=<value optimized out>, f=<value optimized out>, l=<value optimized out>)
    at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:91
#11 0xffffffff811477f8 in spa_sync (spa=0xfffff8005b727000, txg=69362) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:6155
#12 0xffffffff81150ed6 in txg_sync_thread (arg=0xfffff8000291a000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:517
#13 0xffffffff8055e4fa in fork_exit (callout=0xffffffff81150b30 <txg_sync_thread>, arg=0xfffff8000291a000, frame=0xfffffe00955edc00) at /usr/src/sys/kern/kern_fork.c:977
#14 0xffffffff8083f42e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:605
#15 0x0000000000000000 in ?? ()

The kernel is based on FreeBSD 11.0-CURRENT r271788.

Later on another power outage took out the pool again,
but this time it was just faulted as expected.

The pool is:

fk@r500 ~ $zpool status intenso1
  pool: intenso1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Fri Sep 19 23:07:49 2014
        400G scanned out of 941G at 29.3M/s, 5h15m to go
        0 repaired, 42.48% done
config:

	NAME                  STATE     READ WRITE CKSUM
	intenso1              ONLINE       0     0     0
	  label/intenso1.eli  ONLINE       0     0     0

errors: 8 data errors, use '-v' for a list

Once the scrub is complete, I expect the "data errors" to be gone
as they are merely the result of temporary read errors after the
second outage. Apparently those aren't properly handled with the
given pool layout, but that's another issue.
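
If the errors do turn out to be transient, the leftover bookkeeping can be
checked and reset with the usual zpool commands. A minimal sketch; the
pool_is_clean helper is illustrative (not a real utility), and the
zpool invocations assume this pool name and root privileges:

```shell
#!/bin/sh
# Hypothetical helper: reads `zpool status` output on stdin and
# succeeds only if ZFS reports no known data errors.
pool_is_clean() {
    grep -q '^errors: No known data errors'
}

# Intended use once the scrub has finished (needs root):
#   zpool status -v intenso1 | pool_is_clean && zpool clear intenso1
```

zpool clear only resets the error counters; it is not a substitute for
verifying with 'zpool status -v' that the scrub finished without
finding permanent errors.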

Fabian
