Date:      Wed, 05 Aug 2009 09:33:06 -0400
From:      Boris Kochergin <spawk@acm.poly.edu>
To:        Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs
Message-ID:  <4A798A12.4070408@acm.poly.edu>
In-Reply-To: <20090805115621.GG1784@garage.freebsd.pl>
References:  <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl>

Pawel Jakub Dawidek wrote:
> On Tue, Aug 04, 2009 at 06:01:22PM -0400, Boris Kochergin wrote:
>   
>> In a subsequent attempt at "zfs mount -a", the following panic happened:
>>
>> Fatal trap 12: page fault while in kernel mode
>>     
> [...]
>
> Could you try to mount file systems one by one? For example you have:
>
> 	tank
> 	tank/foo
> 	tank/foo/bar
> 	tank/baz
>
> And you do:
>
> 	# mount -t zfs tank /tank
> 	# mount -t zfs tank/foo /tank/foo
> 	# mount -t zfs tank/foo/bar /tank/foo/bar
> 	# mount -t zfs tank/baz /tank/baz
>
>   
There is only one file system (home). Running "mount -t zfs home
/usr/home" did work while the problem disk (ad13) was disconnected from
the system. I then started copying the data off to a new geom_raid3
array, and the machine panicked shortly afterward:

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xffffffffffffffe9
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8103a9e7
stack pointer           = 0x28:0xffffff8077f26430
frame pointer           = 0x28:0xffffff8077f26500
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 972 (cp)
panic: from debugger
Uptime: 4m28s
Physical memory: 4082 MB
Dumping 2532 MB: 2517 2501 2485 2469 2453 2437 2421 2405 2389 2373 2357 
2341 2325 2309 2293 2277 2261 2245 2229 2213 2197 2181 2165 2149 2133 
2117 2101 2085 2069 2053 2037 2021 2005 1989 1973 1957 1941 1925 1909 
1893 1877 1861 1845 1829 1813 1797 1781 1765 1749 1733 1717 1701 1685 
1669 1653 1637 1621 1605 1589 1573 1557 1541 1525 1509 1493 1477 1461 
1445 1429 1413 1397 1381 1365 1349 1333 1317 1301 1285 1269 1253 1237 
1221 1205 1189 1173 1157 1141 1125 1109 1093 1077 1061 1045 1029 1013 
997 981 965 949 933 917 901 885 869 853 837 821 805 789 773 757 741 725 
709 693 677 661 645 629 613 597 581 565 549 533 517 501 485 469 453 437 
421 405 389 373 357 341 325 309 293 277 261 245 229 213 197 181 165 149 
133 117 101 85 69 53 37 21 5

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from 
/boot/kernel/zfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from 
/boot/kernel/opensolaris.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from 
/usr/src/sys/modules/geom/geom_raid3/geom_raid3.ko...done.
Loaded symbols for /usr/src/sys/modules/geom/geom_raid3/geom_raid3.ko
#0  doadump () at pcpu.h:223
223     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) where
#0  doadump () at pcpu.h:223
#1  0xffffffff8058d881 in boot (howto=260) at 
/usr/src/sys/kern/kern_shutdown.c:419
#2  0xffffffff8058dc5b in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:575
#3  0xffffffff801d9767 in db_panic (addr=Variable "addr" is not available.
) at /usr/src/sys/ddb/db_command.c:478
#4  0xffffffff801d9b71 in db_command (last_cmdp=0xffffffff80bd2120, 
cmd_table=Variable "cmd_table" is not available.
) at /usr/src/sys/ddb/db_command.c:445
#5  0xffffffff801d9dc0 in db_command_loop () at 
/usr/src/sys/ddb/db_command.c:498
#6  0xffffffff801dbd49 in db_trap (type=Variable "type" is not available.
) at /usr/src/sys/ddb/db_main.c:229
#7  0xffffffff805b9704 in kdb_trap (type=12, code=0, tf=Variable "tf" is 
not available.
) at /usr/src/sys/kern/subr_kdb.c:534
#8  0xffffffff8086b5cd in trap_fatal (frame=0xffffff8077f26380, 
eva=18446744073709551593) at /usr/src/sys/amd64/amd64/trap.c:847
#9  0xffffffff8086b994 in trap_pfault (frame=0xffffff8077f26380, 
usermode=0) at /usr/src/sys/amd64/amd64/trap.c:768
#10 0xffffffff8086c16b in trap (frame=0xffffff8077f26380) at 
/usr/src/sys/amd64/amd64/trap.c:494
#11 0xffffffff80854d73 in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:224
#12 0xffffffff8103a9e7 in arc_evict (state=Variable "state" is not 
available.
) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1489
#13 0xffffffff8103b049 in arc_get_data_buf (buf=0xffffff00873d23f0) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2170
#14 0xffffffff8103b46e in arc_buf_alloc (spa=0xffffff0003536000, 
size=16384, tag=Variable "tag" is not available.
) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1156
#15 0xffffffff8103c6a0 in arc_read_nolock (pio=0xffffff00039a92d0, 
spa=0xffffff0003536000, bp=0xffffff800947a380, done=0xffffffff8103f360 
<dbuf_read_done>, private=0xffffff008740dc40, priority=0, zio_flags=1,
    arc_flags=0xffffff8077f266ec, zb=0xffffff8077f266c0) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2607
#16 0xffffffff8103cd6c in arc_read (pio=0xffffff00039a92d0, 
spa=0xffffff0003536000, bp=0xffffff800947a380, pbuf=0xffffff002d89f5a0, 
done=0xffffffff8103f360 <dbuf_read_done>, private=0xffffff008740dc40, 
priority=0, zio_flags=1,
    arc_flags=0xffffff8077f266ec, zb=0xffffff8077f266c0) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2508
#17 0xffffffff8103f7e9 in dbuf_read (db=0xffffff008740dc40, 
zio=0xffffff00039a92d0, flags=14) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:521
#18 0xffffffff8103fd56 in dbuf_findbp (dn=Variable "dn" is not available.
) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1381
#19 0xffffffff8103fe62 in dbuf_hold_impl (dn=0xffffff002d526300, 
level=Variable "level" is not available.
) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1617
#20 0xffffffff81040ddb in dbuf_hold (dn=Variable "dn" is not available.
) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1689
#21 0xffffffff81042f4d in dmu_buf_hold_array_by_dnode 
(dn=0xffffff002d526300, offset=Variable "offset" is not available.
) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:223
#22 0xffffffff810433e2 in dmu_buf_hold_array (os=Variable "os" is not 
available.
) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:284
#23 0xffffffff8104357f in dmu_read_uio (os=Variable "os" is not available.
) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:649
#24 0xffffffff810a21b1 in zfs_freebsd_read (ap=Variable "ap" is not 
available.
) at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:591
#25 0xffffffff806244c0 in vn_read (fp=0xffffff0003435460, 
uio=0xffffff8077f26b00, active_cred=0xffffff002d376900, flags=Variable 
"flags" is not available.
) at vnode_if.h:384
#26 0xffffffff805c93a1 in dofileread (td=0xffffff0003780ab0, fd=3, 
fp=0xffffff0003435460, auio=0xffffff8077f26b00, offset=Variable "offset" 
is not available.
) at file.h:227
#27 0xffffffff805c9720 in kern_readv (td=0xffffff0003780ab0, fd=3, 
auio=0xffffff8077f26b00) at /usr/src/sys/kern/sys_generic.c:237
#28 0xffffffff805c9815 in read (td=Variable "td" is not available.
) at /usr/src/sys/kern/sys_generic.c:153
#29 0xffffffff8086bbff in syscall (frame=0xffffff8077f26c80) at 
/usr/src/sys/amd64/amd64/trap.c:984
#30 0xffffffff80855051 in Xfast_syscall () at 
/usr/src/sys/amd64/amd64/exception.S:373
#31 0x0000000800737d6c in ?? ()
Previous frame inner to this frame (corrupt stack?)
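
As a side note on reading the trap above: the "fault virtual address" in
the trap message and the eva= argument of trap_fatal in frame #8 are the
same number, printed once in hex and once in decimal. The short Python
sketch below is only an illustration and not part of the kernel output;
the helper name as_signed64 is made up for it. It checks that the two
values match and shows the address as a signed 64-bit value. A small
negative value like this may point to a member access through a NULL or
near-NULL pointer in arc_evict, but that is an assumption; nothing in
the dump confirms it.

```python
# Values copied from the trap message and from frame #8 of the backtrace.
FAULT_VA = 0xffffffffffffffe9      # "fault virtual address"
EVA = 18446744073709551593         # eva= argument of trap_fatal


def as_signed64(value):
    """Interpret a 64-bit unsigned value as a two's-complement signed one."""
    value &= (1 << 64) - 1         # keep only the low 64 bits
    if value & (1 << 63):          # sign bit set: value is negative
        return value - (1 << 64)
    return value


if __name__ == "__main__":
    # Same address in both places?
    print("addresses match:", FAULT_VA == EVA)
    # The address viewed as a signed offset from zero.
    print("signed value:", as_signed64(FAULT_VA))
```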

I then reconnected the bad disk and tried "mount -t zfs home /usr/home"
again, but the command has not returned (it has been running for a few
minutes as of this writing). The machine has not panicked or locked up,
however.

Thank you for your help.

-Boris


