Date:      Mon, 13 Oct 2014 12:13:58 -0700
From:      "K. Macy" <kmacy@freebsd.org>
To:        Steven Hartland <killing@multiplay.co.uk>
Cc:        "freebsd-fs@FreeBSD.org" <freebsd-fs@freebsd.org>, FreeBSD Stable <freebsd-stable@freebsd.org>
Subject:   Re: zfs pool import hangs on [tx->tx_sync_done_cv]
Message-ID:  <CAHM0Q_Oeka25-kdSDRC2evS1R8wuQ0_XgbcdZCjS09aXJ9_WWQ@mail.gmail.com>
In-Reply-To: <E2E24A91B8B04C2DBBBC7E029A12BD05@multiplay.co.uk>
References:  <54372173.1010100@ijs.si> <644FA8299BF848E599B82D2C2C298EA7@multiplay.co.uk> <54372EBA.1000908@ijs.si> <DE7DD7A94E9B4F1FBB3AFF57EDB47C67@multiplay.co.uk> <543731F3.8090701@ijs.si> <543AE740.7000808@ijs.si> <A5BA41116A7F4B23A9C9E469C4146B99@multiplay.co.uk> <CAHM0Q_N%2BC=3qgUnyDkEugOFcL=J8gBjbTg8v45Vz3uT=e=Fn2g@mail.gmail.com> <6E01BBEDA9984CCDA14F290D26A8E14D@multiplay.co.uk> <CAHM0Q_OpV2sAQQAH6Cj_=yJWAOt8pTPWQ-m45JSiXDpBwT6WTA@mail.gmail.com> <E2E24A91B8B04C2DBBBC7E029A12BD05@multiplay.co.uk>

>>> Yeah, I would have got the zio details, but typically it's "optimised out"
>>> by the compiler, so it will need some effort to track that down,
>>> unfortunately :(
>>>
>>
>> Well, let me know if you can. Re-creating a new 10.x VM is taking a while
>> as it's taking me forever to checkout the sources.
>>
>> Things like that need to somehow continue to be accessible.
>
>
> I believe there's some pool corruption here somewhere, as every once in a
> while I trip an ASSERT panic:
> panic: solaris assert: size >= SPA_MINBLOCKSIZE ||
> range_tree_space(msp->ms_tree) == 0, file:
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c,
> line: 1636
>


<... snip>

You are correct.

(kgdb) p ((zio_t *)$r14)->io_reexecute
$32 = 2 '\002'
(kgdb) p ((zio_t *)$r14)->io_flags
$33 = 0
(kgdb) p ((zio_t *)$r14)->io_spa->spa_suspended
$34 = 1 '\001'

io_reexecute is 2 (ZIO_REEXECUTE_SUSPEND) and spa_suspended is set, which
means zio_suspend() has been called from zio_done():
	else if (zio->io_reexecute & ZIO_REEXECUTE_SUSPEND) {
		/*
		 * We'd fail again if we reexecuted now, so suspend
		 * until conditions improve (e.g. device comes online).
		 */
		zio_suspend(spa, zio);
	}
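
To make the reading of the kgdb values explicit, here is a minimal sketch of
how the io_reexecute bits select a branch. It is not the real zio_done():
it assumes a root zio with no parent to notify, the SK_ names and the sk_zio
struct are hypothetical, and the two flag values are my recollection of
sys/zio.h rather than something copied from the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror ZIO_REEXECUTE_NOW / ZIO_REEXECUTE_SUSPEND in sys/zio.h. */
#define	SK_ZIO_REEXECUTE_NOW		0x01
#define	SK_ZIO_REEXECUTE_SUSPEND	0x02

/* Hypothetical outcome labels; these are not ZFS identifiers. */
enum sk_done_action {
	SK_DONE_COMPLETE,	/* nothing to redo */
	SK_DONE_REEXECUTE,	/* retry the I/O right away */
	SK_DONE_SUSPEND		/* suspend the pool until conditions improve */
};

/* Hypothetical stand-in for the one zio_t field inspected above. */
struct sk_zio {
	uint8_t	io_reexecute;
};

/*
 * Classify a root zio by its reexecute bits. The suspend bit is tested
 * before falling through to a plain retry, as in the excerpt above.
 */
static enum sk_done_action
sk_classify(const struct sk_zio *zio)
{
	assert(zio != NULL);

	if (zio->io_reexecute == 0)
		return (SK_DONE_COMPLETE);
	if (zio->io_reexecute & SK_ZIO_REEXECUTE_SUSPEND)
		return (SK_DONE_SUSPEND);
	return (SK_DONE_REEXECUTE);
}
```

Feeding it io_reexecute = 2, the value printed by kgdb, selects the suspend
branch, which is consistent with spa_suspended being 1.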

If the failure mode were panic, we would have panicked when attempting the import:
void
zio_suspend(spa_t *spa, zio_t *zio)
{
	if (spa_get_failmode(spa) == ZIO_FAILURE_MODE_PANIC)
		fm_panic("Pool '%s' has encountered an uncorrectable I/O "
		    "failure and the failure mode property for this pool "
		    "is set to panic.", spa_name(spa));

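Since the import hangs rather than panics, the pool's failmode is presumably
wait or continue. A rough sketch of that inference follows; the property
values are the ones documented in zpool(8), while every sk_ name is
hypothetical and the model covers only the panic check quoted above, not the
rest of zio_suspend().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the documented failmode property values. */
enum sk_failmode {
	SK_FAILMODE_WAIT,
	SK_FAILMODE_CONTINUE,
	SK_FAILMODE_PANIC
};

/* Hypothetical labels for what an observer sees when the pool suspends. */
enum sk_suspend_result {
	SK_SUSPEND_BLOCKS,	/* the waiting thread sits on a cv, as in this hang */
	SK_SUSPEND_PANICS	/* the system panics instead */
};

/* Map a failmode property string to a label; returns -1 if unknown. */
static int
sk_parse_failmode(const char *s, enum sk_failmode *out)
{
	assert(s != NULL && out != NULL);

	if (strcmp(s, "wait") == 0)
		*out = SK_FAILMODE_WAIT;
	else if (strcmp(s, "continue") == 0)
		*out = SK_FAILMODE_CONTINUE;
	else if (strcmp(s, "panic") == 0)
		*out = SK_FAILMODE_PANIC;
	else
		return (-1);
	return (0);
}

/* Only failmode=panic turns a suspend into a panic in the excerpt above. */
static enum sk_suspend_result
sk_suspend_outcome(enum sk_failmode mode)
{
	return (mode == SK_FAILMODE_PANIC ?
	    SK_SUSPEND_PANICS : SK_SUSPEND_BLOCKS);
}
```

Under this model both "wait" and "continue" lead to a blocked import, so the
hang alone does not tell the two apart.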

