Date:      Sat, 3 Apr 2010 05:08:41 GMT
From:      Alex Bakhtin <Alex.Bakhtin@gmail.com>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/145339: [zfs] deadlock after detaching block device from raidz pool
Message-ID:  <201004030508.o3358fhm097037@www.freebsd.org>
Resent-Message-ID: <201004030747.o337l8ak097875@freefall.freebsd.org>

>Number:         145339
>Category:       kern
>Synopsis:       [zfs] deadlock after detaching block device from raidz pool
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Apr 03 07:47:07 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Alex Bakhtin
>Release:        8.0-STABLE
>Organization:
>Environment:
FreeBSD tarzan-new.private.flydrag.ru 8.0-STABLE FreeBSD 8.0-STABLE #1: Sat Apr  3 04:54:06 UTC 2010     bakhtin@tarzan-new.private.flydrag.ru:/mnt/obj/usr/src.old/sys/DEBUG  amd64

>Description:
Physically detaching a block device while the pool is under intensive write load causes a deadlock. Tested on 8.0-STABLE/amd64, csup'd on 02 Apr 2010. gmirror on the same system handles a device detach properly. Detaching a device while ZFS is idle does not cause any problem.


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x48
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff805815f9
stack pointer           = 0x28:0xffffff8000065b80
frame pointer           = 0x28:0xffffff8000065bb0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 3 (g_up)
exclusive spin mutex uart_hwmtx (uart_hwmtx) r = 0 (0xffffff0002a62838) locked @ /usr/src.old/sys/dev/uart/uart_cpu.h:92
exclusive lockmgr zfs (zfs) r = 0 (0xffffff0123079098) locked @ /usr/src.old/sys/kern/vfs_vnops.c:607
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xffffff000c3728f0) locked @ /usr/src.old/sys/kern/uipc_sockbuf.c:148
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xffffff000c35a8f0) locked @ /usr/src.old/sys/kern/uipc_sockbuf.c:148
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xffffff0121a31648) locked @ /usr/src.old/sys/kern/uipc_sockbuf.c:148

0xffffff0123079000: tag zfs, type VREG
    usecount 1, writecount 1, refcount 1 mountedhere 0
    flags ()
    v_object 0xffffff0126114e58 ref 0 pages 0
    lock type zfs: EXCL by thread 0xffffff000c2d7740 (pid 2134)
#0 0xffffffff80579d27 at __lockmgr_args+0x777
#1 0xffffffff80613339 at vop_stdlock+0x39
#2 0xffffffff808d020b at VOP_LOCK1_APV+0x9b
#3 0xffffffff806300d7 at _vn_lock+0x57
#4 0xffffffff806316d8 at vn_write+0x218
#5 0xffffffff805d71e5 at dofilewrite+0x85
#6 0xffffffff805d89e0 at kern_writev+0x60
#7 0xffffffff805d8ae5 at write+0x55
#8 0xffffffff8087b488 at syscall+0x118
#9 0xffffffff80861611 at Xfast_syscall+0xe1


db:0:kdb.enter.default>  bt
Tracing pid 3 tid 100010 td 0xffffff0002899740
_mtx_lock_flags() at _mtx_lock_flags+0x39
vdev_geom_io_intr() at vdev_geom_io_intr+0x62
g_io_schedule_up() at g_io_schedule_up+0xed
g_up_procbody() at g_up_procbody+0x6f
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff8000065d30, rbp = 0 ---



tarzan-new# zdb -vvv
storage
    version=14
    name='storage'
    state=0
    txg=578
    pool_guid=3309800284037274155
    hostid=4266611921
    hostname='tarzan-new.private.flydrag.ru'
    vdev_tree
        type='root'
        id=0
        guid=3309800284037274155
        children[0]
                type='raidz'
                id=0
                guid=11076638880661644944
                nparity=1
                metaslab_array=23
                metaslab_shift=36
                ashift=9
                asize=10001970626560
                is_log=0
                children[0]
                        type='disk'
                        id=0
                        guid=134064330288565023
                        path='/dev/ad10'
                        whole_disk=0
                        DTL=33
                children[1]
                        type='disk'
                        id=1
                        guid=6567589632071309972
                        path='/dev/ad12'
                        whole_disk=0
                        DTL=32
                children[2]
                        type='disk'
                        id=2
                        guid=6024702546194706986
                        path='/dev/ad14'
                        whole_disk=0
                        DTL=27
                children[3]
                        type='disk'
                        id=3
                        guid=10837092740689261565
                        path='/dev/ad16'
                        whole_disk=0
                        DTL=31
                children[4]
                        type='disk'
                        id=4
                        guid=4165337351109841378
                        path='/dev/ad18'
                        whole_disk=0
                        DTL=30
tarzan-new#

Core:
http://flydrag.dyndns.org:9090/freebsd/zfs-deadlock/core.txt.9


>How-To-Repeat:
Install 8.0-STABLE and create a raidz pool.
Run the command dd if=/dev/zero of=/zfs/test bs=10m
Physically detach one hard disk while the write is in progress.
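
The steps above can be sketched as a small script. This is only an illustration, not the exact commands that were run: the pool name and disk names are taken from the zdb output in this report, while the /zfs mountpoint (inferred from the dd target), the DRY_RUN switch and the run/repro helper names are assumptions introduced here. With DRY_RUN=1 (the default) the script only prints the commands.

```shell
#!/bin/sh
# Hedged sketch of the reproduction steps; not the reporter's exact commands.
POOL=${POOL:-storage}                        # pool name from the zdb output
DISKS=${DISKS:-"ad10 ad12 ad14 ad16 ad18"}   # member disks from the zdb output
MNT=${MNT:-/zfs}                             # assumed mountpoint, matching of=/zfs/test
DRY_RUN=${DRY_RUN:-1}                        # hypothetical safety switch: 1 = print only

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro() {
    # Create a single-parity raidz pool from the five disks.
    # $DISKS is intentionally unquoted so that it splits into one argument per disk.
    run zpool create -m "$MNT" "$POOL" raidz $DISKS
    echo "Physically detach one disk while dd is still writing."
    # Sustained write load; dd runs until the pool fills or it is interrupted.
    run dd if=/dev/zero of="$MNT/test" bs=10m
}

repro
```

Setting DRY_RUN=0 executes the commands for real, which creates a pool on the listed disks and destroys whatever data they hold, so it should only be done on a scratch machine.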


>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:


