From owner-freebsd-current@FreeBSD.ORG  Sat Jun 20 07:11:49 2009
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 980B81065670
	for <freebsd-current@freebsd.org>; Sat, 20 Jun 2009 07:11:49 +0000 (UTC)
	(envelope-from serenity@exscape.org)
Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net
	[80.76.149.212])
	by mx1.freebsd.org (Postfix) with ESMTP id 2AB568FC0C
	for <freebsd-current@freebsd.org>; Sat, 20 Jun 2009 07:11:48 +0000 (UTC)
	(envelope-from serenity@exscape.org)
Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:44889
	helo=mx.exscape.org)
	by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.69)
	(envelope-from <serenity@exscape.org>) id 1MHujg-0006XO-5a
	for freebsd-current@freebsd.org; Sat, 20 Jun 2009 09:11:38 +0200
Received: from [192.168.1.5] (macbookpro [192.168.1.5])
	(using TLSv1 with cipher AES128-SHA (128/128 bits))
	(No client certificate requested)
	by mx.exscape.org (Postfix) with ESMTPSA id 495896A98E
	for <freebsd-current@freebsd.org>;
	Sat, 20 Jun 2009 09:11:36 +0200 (CEST)
Message-Id: <72163521-40BF-4764-8B74-5446A88DFBF8@exscape.org>
From: Thomas Backman <serenity@exscape.org>
To: FreeBSD current <freebsd-current@freebsd.org>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v935.3)
Date: Sat, 20 Jun 2009 09:11:34 +0200
X-Mailer: Apple Mail (2.935.3)
X-Originating-IP: 83.253.252.234
X-Scan-Result: No virus found in message 1MHujg-0006XO-5a.
X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MHujg-0006XO-5a
	6a313ac0145f1ada6e227df48fcaf444
Subject: "New" ZFS crash on FS (pool?) unmount/export
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
X-List-Received-Date: Sat, 20 Jun 2009 07:11:49 -0000

I ran into this tonight. I'm not sure exactly what triggered it: the
box stopped responding to pings at 02:07 AM, and it has a cron backup
job using zfs send/recv that starts at 02:00, so I'm guessing the two
are related, even though the backup should probably have finished
before then. Hmm. Anyway.

This is on r194478.

kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x288
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff805a4989
stack pointer           = 0x28:0xffffff803e8b57e0
frame pointer           = 0x28:0xffffff803e8b5840
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 57514 (zpool)
panic: from debugger
cpuid = 0
Uptime: 10h22m13s
Physical memory: 2027 MB

(kgdb) bt
#0  doadump () at pcpu.h:223
#1  0xffffffff8059c409 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:419
#2  0xffffffff8059c85c in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:575
#3  0xffffffff801f1377 in db_panic (addr=Variable "addr" is not available.
) at /usr/src/sys/ddb/db_command.c:478
#4  0xffffffff801f1781 in db_command (last_cmdp=0xffffffff80c38620, cmd_table=Variable "cmd_table" is not available.
) at /usr/src/sys/ddb/db_command.c:445
#5  0xffffffff801f19d0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
#6  0xffffffff801f3969 in db_trap (type=Variable "type" is not available.
) at /usr/src/sys/ddb/db_main.c:229
#7  0xffffffff805ce465 in kdb_trap (type=12, code=0, tf=0xffffff803e8b5730) at /usr/src/sys/kern/subr_kdb.c:534
#8  0xffffffff8088715d in trap_fatal (frame=0xffffff803e8b5730, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:847
#9  0xffffffff80887fb2 in trap (frame=0xffffff803e8b5730) at /usr/src/sys/amd64/amd64/trap.c:345
#10 0xffffffff8086e007 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:223
#11 0xffffffff805a4989 in _sx_xlock_hard (sx=0xffffff0043557d50, tid=18446742975830720512, opts=Variable "opts" is not available.
)
     at /usr/src/sys/kern/kern_sx.c:575
#12 0xffffffff805a52fe in _sx_xlock (sx=Variable "sx" is not available.
) at sx.h:155
#13 0xffffffff80fe2995 in zfs_freebsd_reclaim () from /boot/kernel/zfs.ko
#14 0xffffffff808cefca in VOP_RECLAIM_APV (vop=0xffffff0043557d38, a=0xffffff0043557d50) at vnode_if.c:1926
#15 0xffffffff80626f6e in vgonel (vp=0xffffff00437a7938) at vnode_if.h:830
#16 0xffffffff8062b528 in vflush (mp=0xffffff0060f2a000, rootrefs=0, flags=0, td=0xffffff0061528000)
     at /usr/src/sys/kern/vfs_subr.c:2450
#17 0xffffffff80fdd3a8 in zfs_umount () from /boot/kernel/zfs.ko
#18 0xffffffff8062420a in dounmount (mp=0xffffff0060f2a000, flags=1626513408, td=Variable "td" is not available.
)
     at /usr/src/sys/kern/vfs_mount.c:1287
#19 0xffffffff80624975 in unmount (td=0xffffff0061528000, uap=0xffffff803e8b5c00)
     at /usr/src/sys/kern/vfs_mount.c:1172
#20 0xffffffff8088783f in syscall (frame=0xffffff803e8b5c90) at /usr/src/sys/amd64/amd64/trap.c:984
#21 0xffffffff8086e290 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:364
#22 0x000000080104e49c in ?? ()
Previous frame inner to this frame (corrupt stack?)
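
A note for reading the trace: kgdb prints the tid argument in frame #11 in decimal, which makes it hard to compare with the pointers elsewhere. The small Python sketch below (not part of the crash data; the constant and function names are only illustrative) converts it to hex, and it comes out identical to the td pointer shown in frames #16 and #19, i.e. the thread doing the unmount.

```python
# Values copied from the backtrace above.
TID_DECIMAL = 18446742975830720512  # tid= in frame #11 (_sx_xlock_hard)
TD_POINTER = 0xffffff0061528000     # td= in frames #16 and #19


def as_kernel_pointer(value: int) -> str:
    """Format an unsigned 64-bit value as a zero-padded hex pointer."""
    return f"0x{value & 0xFFFFFFFFFFFFFFFF:016x}"


print(as_kernel_pointer(TID_DECIMAL))  # 0xffffff0061528000
print(TID_DECIMAL == TD_POINTER)       # True
```

So the tid value looks sane; it is simply the current thread pointer printed as an unsigned integer.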

BTW, I got exactly one "force unmount is experimental" message on the
console. On a regular shutdown I usually seem to get one per filesystem
(at least 10), and this pool should contain exactly as many filesystems
as the root pool, since it's a copy of it. When I ran the backup script
manually after the crash, though, I didn't get any.

It's also worth noting that I was running DTrace all night to test its
stability. I'm pretty sure the script was:
dtrace -n 'syscall::open:entry { @a[copyinstr(arg0)] = count(); }'

No swap was in use, and 277700 pages (~1084 MB, or roughly 50%) of RAM
were free, according to the core.txt.
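
For anyone checking the numbers, here is a minimal sketch of the page arithmetic. It assumes the standard 4096-byte base page size on amd64 (an assumption on my part; the core.txt excerpt above doesn't state it), and the names are only illustrative.

```python
PAGE_SIZE = 4096      # bytes; assumed amd64 base page size
FREE_PAGES = 277700   # free page count quoted from core.txt
PHYSICAL_MB = 2027    # "Physical memory" line in the panic output


def pages_to_mb(pages: int, page_size: int = PAGE_SIZE) -> float:
    """Convert a page count to megabytes (1 MB = 1024 * 1024 bytes)."""
    return pages * page_size / (1024 * 1024)


free_mb = pages_to_mb(FREE_PAGES)
free_pct = 100 * free_mb / PHYSICAL_MB
print(f"{free_mb:.1f} MB free, {free_pct:.1f}% of {PHYSICAL_MB} MB")
```

Under that assumption the free memory works out to a little over half of the 2027 MB reported, so memory pressure seems an unlikely cause.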

Regards,
Thomas