Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 13 Oct 2014 20:02:12 +0100
From:      "Steven Hartland" <killing@multiplay.co.uk>
To:        "K. Macy" <kmacy@freebsd.org>
Cc:        Mark Martinec <Mark.Martinec+freebsd@ijs.si>, "freebsd-fs@FreeBSD.org" <freebsd-fs@freebsd.org>, FreeBSD Stable <freebsd-stable@freebsd.org>
Subject:   Re: zfs pool import hangs on [tx->tx_sync_done_cv]
Message-ID:  <E2E24A91B8B04C2DBBBC7E029A12BD05@multiplay.co.uk>
References:  <54372173.1010100@ijs.si> <644FA8299BF848E599B82D2C2C298EA7@multiplay.co.uk> <54372EBA.1000908@ijs.si> <DE7DD7A94E9B4F1FBB3AFF57EDB47C67@multiplay.co.uk> <543731F3.8090701@ijs.si> <543AE740.7000808@ijs.si> <A5BA41116A7F4B23A9C9E469C4146B99@multiplay.co.uk> <CAHM0Q_N%2BC=3qgUnyDkEugOFcL=J8gBjbTg8v45Vz3uT=e=Fn2g@mail.gmail.com> <6E01BBEDA9984CCDA14F290D26A8E14D@multiplay.co.uk> <CAHM0Q_OpV2sAQQAH6Cj_=yJWAOt8pTPWQ-m45JSiXDpBwT6WTA@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help

----- Original Message ----- 
From: "K. Macy" <kmacy@freebsd.org>
> On Mon, Oct 13, 2014 at 1:06 AM, Steven Hartland <killing@multiplay.co.uk>
> wrote:
>
>> ----- Original Message ----- From: "K. Macy" <kmacy@freebsd.org>
>>
>>  A recent quick read of the code would lead me to believe that zio_wait not
>>> returning there means that the zio never reached the zio_done stage.
>>> Parent
>>> zios seem to yield in a couple of stages in the pipeline if they have
>>> incomplete children. They determine this by calling zio_wait_for_children
>>> with zio child types and their corresponding wait type. In so doing they
>>> set the io_stall to the count of the number of waiters of the first
>>> non-zero check. This parent I/O will be resumed by the last child zio of
>>> that type and wait state in zio_notify_parent. I'm sure you know all this
>>> -
>>> but I wrote it to preface asking for the following fields of the zio being
>>> waited on in dsl_pool_sync_mos: io_stall (i.e, which field in io_children
>>> is pointed to) *io_stall,  io_children[*][*], io_child_list (at a first
>>> glance just the addresses). The other alternative is that it reexecuting
>>> has gotten in to a bad place in the state machine so io_reexecute.
>>>
>>
>> Yer I would have got the zio details but typically its "optimised out" by
>> the
>> compiler, so will need some effort to track that down unfortunately :(
>>
>
> Well, let me know if you can. Re-creating a new 10.x VM is taking a while
> as it's taking me forever to checkout the sources.
>
> Things like that need to somehow continue to be accessible.

I believe there's some pool corruption here somewhere as every once in a while
I trip and ASSERT panic:
panic: solaris assert: size >= SPA_MINBLOCKSIZE || range_tree_space(msp->ms_tree) == 0, file: 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c, line: 1636


#3  0xffffffff80607ed3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:688
#4  0xffffffff8179321d in assfail (a=<value optimized out>, f=<value optimized out>, l=<value optimized out>) at 
/usr/src/sys/modules/opensolaris/../../cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:81
#5  0xffffffff81501ecf in metaslab_passivate (msp=0xfffff800091e5800, size=0) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:1636
#6  0xffffffff81501ca7 in metaslab_group_alloc (mg=0xfffff80044ef7400, psize=512, asize=512, txg=11733518, min_distance=536281088, 
dva=<value optimized out>, d=-1) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:2206
#7  0xffffffff81500c9d in metaslab_alloc_dva (spa=0xfffffe00022bb000, mc=0xfffff800045d0c00, psize=512, dva=0xfffffe000e409640, d=0, 
hintdva=<value optimized out>, flags=<value optimized out>)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:2381
#8  0xffffffff81500709 in metaslab_alloc (spa=0xfffffe00022bb000, mc=0xfffff800045d0c00, psize=512, bp=0xfffffe000e409640, ndvas=3, 
txg=11733518, hintbp=0x0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:2586
#9  0xffffffff8154ca8a in zio_dva_allocate (zio=0xfffff80044a57398) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:2444
#10 0xffffffff81548d54 in zio_execute (zio=0xfffff80044a57398) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1407
#11 0xffffffff8154d25f in zio_ready (zio=0xfffff800487d2730) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3031
#12 0xffffffff81548d54 in zio_execute (zio=0xfffff800487d2730) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1407
#13 0xffffffff80651410 in taskqueue_run_locked (queue=0xfffff80054291000) at /usr/src/sys/kern/subr_taskqueue.c:342
#14 0xffffffff80651dcb in taskqueue_thread_loop (arg=<value optimized out>) at /usr/src/sys/kern/subr_taskqueue.c:563


(kgdb) frame 5
#5  0xffffffff81501ecf in metaslab_passivate (msp=0xfffff800091e5800, size=0) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:1636
1636            ASSERT(size >= SPA_MINBLOCKSIZE || range_tree_space(msp->ms_tree) == 0);
(kgdb) print size
$5 = 0
(kgdb) print msp->ms_tree
$6 = (range_tree_t *) 0xfffff800091e5400
(kgdb) print *msp->ms_tree
$7 = {
  rt_root = {
    avl_root = 0xfffff80048fab840,
    avl_compar = 0xffffffff81502850 <range_tree_seg_compare>,
    avl_offset = 0,
    avl_numnodes = 1,
    avl_size = 64
  },
  rt_space = 5632,
  rt_ops = 0xffffffff8160b110,
  rt_arg = 0xfffff800091e5800,
  rt_histogram = {0 <repeats 12 times>, 1, 0 <repeats 51 times>},
  rt_lock = 0xfffff800091e5800
}
(kgdb) print *msp
$8 = {
  ms_lock = {
    lock_object = {
      lo_name = 0xffffffff815eb310 "msp->ms_lock",
      lo_flags = 40960000,
      lo_data = 0,
      lo_witness = 0x0
    },
    sx_lock = 18446735279027798016
  },
  ms_load_cv = {
    cv_description = 0xffffffff815eb31e "msp->ms_load_cv",
    cv_waiters = 0
  },
  ms_sm = 0xfffff8000924bb80,
  ms_ops = 0xffffffff8160b100,
  ms_id = 119,
  ms_start = 3992977408,
  ms_size = 33554432,
  ms_fragmentation = 90,
  ms_alloctree = {0xfffff800091e5000, 0xfffff800091ec000, 0xfffff800091e3c00, 0xfffff800091e3400},
  ms_freetree = {0xfffff800091e4c00, 0xfffff800091e4000, 0xfffff800091e4400, 0xfffff800091e3000},
  ms_defertree = {0xfffff800091e3800, 0xfffff800091df400},
  ms_tree = 0xfffff800091e5400,
  ms_condensing = 0,
  ms_condense_wanted = 0,
  ms_loaded = 1,
  ms_loading = 0,
  ms_deferspace = 0,
  ms_weight = 13835058055282163712,
  ms_access_txg = 11733526,
  ms_size_tree = {
    avl_root = 0xfffff80048fab858,
    avl_compar = 0xffffffff81502360 <metaslab_rangesize_compare>,
    avl_offset = 24,
    avl_numnodes = 1,
    avl_size = 64
  },
  ms_lbas = {0, 0, 0, 0, 0, 0, 0, 0, 0, 4026524672, 4026530816, 0 <repeats 53 times>},
  ms_group = 0xfffff80044ef7400,
  ms_group_node = {
    avl_child = {0x0, 0x0},
    avl_pcb = 18446735278773074705
  },
  ms_txg_node = {
    tn_next = {0x0, 0xfffff800091d7b28, 0xfffff800091e2328, 0x0},
    tn_member = "\000\001\001"
  }
}

Also when it stalls I've seen this in the zfs debug:-
 13  39363                     :zfs-dprintf dnode.c - dnode_free_range:1655: ds=mos obj=31 blkid=0 nblks=1125899906842624 
txg=3078494
 13  39363                     :zfs-dprintf dbuf.c - dbuf_free_range:816: ds=mos obj=31 start=0 end=0
 13  39363                     :zfs-dprintf dbuf.c - dbuf_dirty:1133: ds=mos obj=40 lvl=0 blkid=-1 size=140
 13  39363                     :zfs-dprintf dnode.c - dnode_setdirty:1286: ds=mos obj=40 txg=3078494

nblks looks suspisiously large and I'm not sure blkid should be -1?

For reference to see this I'm using:
sysctl debug.zfs_flags=64
dtrace  -n 'zfs-dprintf {printf("%s - %s:%d: %s", stringof(arg0), stringof(arg1), arg2, stringof(arg3))}'

With all that said the following looks like it might indicate the issue:
zdb -e -m sys1boot

Metaslabs:
        vdev          0
        metaslabs   127   offset                spacemap          free
        ---------------   -------------------   ---------------   -------------
        metaslab      0   offset            0   spacemap    417   free        0
        metaslab      1   offset      2000000   spacemap     24   free        0
        metaslab      2   offset      4000000   spacemap     64   free        0
        metaslab      3   offset      6000000   spacemap   1094   free        0
        metaslab      4   offset      8000000   spacemap   1091   free        0
        metaslab      5   offset      a000000   spacemap   1093   free        0
        metaslab      6   offset      c000000   spacemap   1095   free        0
        metaslab      7   offset      e000000   spacemap   1096   free        0
        metaslab      8   offset     10000000   spacemap   1098   free        0
        metaslab      9   offset     12000000   spacemap   1097   free        0
        metaslab     10   offset     14000000   spacemap   1099   free        0
        metaslab     11   offset     16000000   spacemap   1102   free        0
        metaslab     12   offset     18000000   spacemap    113   free        0
        metaslab     13   offset     1a000000   spacemap    110   free        0
        metaslab     14   offset     1c000000   spacemap    115   free        0
        metaslab     15   offset     1e000000   spacemap   1103   free        0
        metaslab     16   offset     20000000   spacemap    416   free        0
        metaslab     17   offset     22000000   spacemap     81   free        0
        metaslab     18   offset     24000000   spacemap     57   free        0
        metaslab     19   offset     26000000   spacemap    102   free        0
        metaslab     20   offset     28000000   spacemap    100   free        0
        metaslab     21   offset     2a000000   spacemap    104   free        0
        metaslab     22   offset     2c000000   spacemap    125   free        0
        metaslab     23   offset     2e000000   spacemap    105   free        0
        metaslab     24   offset     30000000   spacemap    108   free        0
        metaslab     25   offset     32000000   spacemap    126   free        0
        metaslab     26   offset     34000000   spacemap    248   free        0
        metaslab     27   offset     36000000   spacemap     55   free        0
        metaslab     28   offset     38000000   spacemap     82   free        0
        metaslab     29   offset     3a000000   spacemap    149   free        0
        metaslab     30   offset     3c000000   spacemap    152   free        0
        metaslab     31   offset     3e000000   spacemap    155   free        0
        metaslab     32   offset     40000000   spacemap    156   free        0
        metaslab     33   offset     42000000   spacemap     72   free        0
        metaslab     34   offset     44000000   spacemap     96   free        0
        metaslab     35   offset     46000000   spacemap    159   free        0
        metaslab     36   offset     48000000   spacemap    158   free        0
        metaslab     37   offset     4a000000   spacemap    160   free        0
        metaslab     38   offset     4c000000   spacemap   1271   free        0
        metaslab     39   offset     4e000000   spacemap    161   free        0
        metaslab     40   offset     50000000   spacemap   1273   free        0
        metaslab     41   offset     52000000   spacemap   1277   free        0
        metaslab     42   offset     54000000   spacemap     35   free        0
        metaslab     43   offset     56000000   spacemap     25   free        0
        metaslab     44   offset     58000000   spacemap     32   free        0
        metaslab     45   offset     5a000000   spacemap    150   free        0
        metaslab     46   offset     5c000000   spacemap    151   free        0
        metaslab     47   offset     5e000000   spacemap    163   free        0
        metaslab     48   offset     60000000   spacemap    415   free        0
        metaslab     49   offset     62000000   spacemap     98   free        0
        metaslab     50   offset     64000000   spacemap    109   free        0
        metaslab     51   offset     66000000   spacemap    171   free        0
        metaslab     52   offset     68000000   spacemap     80   free        0
        metaslab     53   offset     6a000000   spacemap    134   free        0
        metaslab     54   offset     6c000000   spacemap    135   free        0
        metaslab     55   offset     6e000000   spacemap    154   free        0
        metaslab     56   offset     70000000   spacemap    140   free        0
        metaslab     57   offset     72000000   spacemap    141   free        0
        metaslab     58   offset     74000000   spacemap   1272   free        0
        metaslab     59   offset     76000000   spacemap    138   free        0
        metaslab     60   offset     78000000   spacemap    139   free        0
        metaslab     61   offset     7a000000   spacemap     18   free        0
        metaslab     62   offset     7c000000   spacemap    148   free        0
        metaslab     63   offset     7e000000   spacemap   1270   free        0
        metaslab     64   offset     80000000   spacemap    114   free        0
        metaslab     65   offset     82000000   spacemap    112   free        0
        metaslab     66   offset     84000000   spacemap    116   free        0
        metaslab     67   offset     86000000   spacemap    164   free        0
        metaslab     68   offset     88000000   spacemap    243   free        0
        metaslab     69   offset     8a000000   spacemap    128   free        0
        metaslab     70   offset     8c000000   spacemap   1101   free        0
        metaslab     71   offset     8e000000   spacemap    153   free        0
        metaslab     72   offset     90000000   spacemap    120   free        0
        metaslab     73   offset     92000000   spacemap     62   free        0
        metaslab     74   offset     94000000   spacemap    122   free        0
        metaslab     75   offset     96000000   spacemap    131   free        0
        metaslab     76   offset     98000000   spacemap    129   free        0
        metaslab     77   offset     9a000000   spacemap    157   free        0
        metaslab     78   offset     9c000000   spacemap    133   free        0
        metaslab     79   offset     9e000000   spacemap    137   free        0
        metaslab     80   offset     a0000000   spacemap    165   free        0
        metaslab     81   offset     a2000000   spacemap    168   free        0
        metaslab     82   offset     a4000000   spacemap    170   free        0
        metaslab     83   offset     a6000000   spacemap    127   free        0
        metaslab     84   offset     a8000000   spacemap    180   free        0
        metaslab     85   offset     aa000000   spacemap    162   free        0
        metaslab     86   offset     ac000000   spacemap   1100   free        0
        metaslab     87   offset     ae000000   spacemap    107   free        0
        metaslab     88   offset     b0000000   spacemap    119   free        0
        metaslab     89   offset     b2000000   spacemap     61   free        0
        metaslab     90   offset     b4000000   spacemap     60   free        0
        metaslab     91   offset     b6000000   spacemap    123   free        0
        metaslab     92   offset     b8000000   spacemap    130   free        0
        metaslab     93   offset     ba000000   spacemap     59   free        0
        metaslab     94   offset     bc000000   spacemap    167   free        0
        metaslab     95   offset     be000000   spacemap    136   free        0
        metaslab     96   offset     c0000000   spacemap    144   free        0
        metaslab     97   offset     c2000000   spacemap    166   free        0
        metaslab     98   offset     c4000000   spacemap    169   free        0
        metaslab     99   offset     c6000000   spacemap     58   free        0
        metaslab    100   offset     c8000000   spacemap     56   free        0
        metaslab    101   offset     ca000000   spacemap     54   free        0
        metaslab    102   offset     cc000000   spacemap     53   free        0
        metaslab    103   offset     ce000000   spacemap     52   free        0
        metaslab    104   offset     d0000000   spacemap    106   free        0
        metaslab    105   offset     d2000000   spacemap     51   free        0
        metaslab    106   offset     d4000000   spacemap     50   free        0
        metaslab    107   offset     d6000000   spacemap    121   free        0
        metaslab    108   offset     d8000000   spacemap    124   free        0
        metaslab    109   offset     da000000   spacemap     49   free        0
        metaslab    110   offset     dc000000   spacemap    142   free        0
        metaslab    111   offset     de000000   spacemap    132   free        0
        metaslab    112   offset     e0000000   spacemap     48   free        0
        metaslab    113   offset     e2000000   spacemap     47   free        0
        metaslab    114   offset     e4000000   spacemap     46   free        0
        metaslab    115   offset     e6000000   spacemap     45   free        0
        metaslab    116   offset     e8000000   spacemap    143   free        0
        metaslab    117   offset     ea000000   spacemap     44   free        0
        metaslab    118   offset     ec000000   spacemap     43   free     177K
        metaslab    119   offset     ee000000   spacemap     42   free    13.0K
        metaslab    120   offset     f0000000   spacemap     41   free        0
        metaslab    121   offset     f2000000   spacemap    179   free        0
        metaslab    122   offset     f4000000   spacemap     40   free        0
        metaslab    123   offset     f6000000   spacemap     39   free        0
        metaslab    124   offset     f8000000   spacemap     38   free        0
        metaslab    125   offset     fa000000   spacemap     37   free        0
        metaslab    126   offset     fc000000   spacemap     36   free        0

Only two metaslabs with any free?

zfs list -r sys1boot
NAME            USED  AVAIL  REFER  MOUNTPOINT
sys1boot       1.76G  2.08G    11K  /sys1boot
sys1boot/ROOT  1.72G  2.08G  1.20G  /sys1boot/ROOT


NAME       SIZE  ALLOC   FREE   FRAG  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
sys1boot  3.97G  3.97G   190K     0%         -    99%  1.00x  ONLINE  -

So basically the zpool says its out of space even though zfs says its under
half full?

Given this I'm guessing that on import when it goes to write an update its
hitting a brick wall and can't, with the zio being flagged as can't fail
it seems to get stuck.

    Regards
    Steve 




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?E2E24A91B8B04C2DBBBC7E029A12BD05>