Date:      Thu, 13 Jun 2013 19:52:17 -0500
From:      The BSD Dreamer <beastie@tardisi.com>
To:        freebsd-fs@freebsd.org
Subject:   Re: ZFS triggered 9-STABLE r246646 panic "vdrop: holdcnt 0"
Message-ID:  <51BA6941.7040909@tardisi.com>
In-Reply-To: <513E8E95.6010802@freebsd.org>
References:  <513E8E95.6010802@freebsd.org>


On 03/11/2013 21:10, Lawrence Stewart wrote:
> Hi all,
> 
> I got this panic yesterday. I haven't seen it before (or since), but I
> have the crashdump and kernel here if there's additional information I
> can provide that would be useful in finding the cause.
> 
> The machine runs ZFS exclusively and was under quite heavy CPU and IO
> load at the time of the crash as I was compiling in a VirtualBox VM and
> on the host itself, as well as running a full KDE desktop environment.
> I'm fairly certain the machine was not swapping at the time of the crash.
> 
> lstewart@lstewart> uname -a
> FreeBSD lstewart 9.1-STABLE FreeBSD 9.1-STABLE #8 r246646M: Mon Feb 11
> 14:57:13 EST 2013
> root@lstewart:/usr/obj/usr/src/sys/LSTEWART-DESKTOP  amd64
> 
> lstewart@lstewart> sudo kgdb /boot/kernel/kernel /var/crash/vmcore.0
> 
> [...]
> 
> (kgdb) bt
> #0  doadump (textdump=<value optimized out>) at pcpu.h:229
> #1  0xffffffff808e5824 in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:448
> #2  0xffffffff808e5d27 in panic (fmt=0x1 <Address 0x1 out of bounds>) at
> /usr/src/sys/kern/kern_shutdown.c:636
> #3  0xffffffff8097a71e in vdropl (vp=<value optimized out>) at
> /usr/src/sys/kern/vfs_subr.c:2465
> #4  0xffffffff80b4da2b in vm_page_alloc (object=0xffffffff8132c000,
> pindex=143696, req=32) at /usr/src/sys/vm/vm_page.c:1569
> #5  0xffffffff80b3f312 in kmem_back (map=0xfffffe00020000e8,
> addr=18446743524542296064, size=131072, flags=705200752)
>     at /usr/src/sys/vm/vm_kern.c:361

I just came home to find that my system had panicked (around
11:30am), and this thread was the only report of a FreeBSD 9
'panic: vdrop: holdcnt 0' that I could find.

My machine runs ZFS exclusively as well.  The CPU would have been busy,
since I run BOINC and distributed.net (go Team FreeBSD :)  And IO load
would have been high from BackupPC_nightly running.  Out of the box this
job starts at 1am, but I had moved it to 11am so that it doesn't collide
with all the things that cron schedules around that time, or with all
the backups that I'm running, and so that it is out of the way when I'm
checking email and such first thing in the morning over coffee before
heading into work.  It takes a few hours to grind through the 7.2TB
zpool.

It's possible that this was also happening when the job was set to 1am,
but I never had a crash dump from those occasions and no indication that
a panic was the cause.  I did later find out that recollindex cleans up
after itself when something goes wrong by sending TERM to its pgid, and
running recollindex as root from cron during this time means it's
sending TERM to init.  Not running it anymore seems to have solved
that, and there didn't seem to be any reason to move BackupPC_nightly
back.

Plus, that other problem would have me wake up to find the machine at
the console in single user mode.  This time, I came home to the gnome
login screen.

So, my system is:

lchen@zen:~ 102> uname -a
FreeBSD zen.lhaven.homeip.net 9.1-RELEASE-p3 FreeBSD 9.1-RELEASE-p3 #0:
Mon Apr 29 18:27:25 UTC 2013
root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

but, when I try to look at the dump:


lchen@zen:~ 103> sudo kgdb /boot/kernel/kernel /var/crash/vmcore.0
Password:
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging
symbols found)...
Attempt to extract a component of a value that is not a structure pointer.
Attempt to extract a component of a value that is not a structure pointer.
#0  0xffffffff808e9ecb in doadump ()
(kgdb)

There's no kernel.symbols either.  The only one I have belongs to the
backup of my 9.0 kernel.  Is that because I've been using freebsd-update
to update?

Here's the info.0 file....

lchen@zen:~ 104> sudo cat /var/crash/info.0
Dump header from device /dev/gpt/swap0
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 9172926464B (8747 MB)
  Blocksize: 512
  Dumptime: Thu Jun 13 11:31:10 2013
  Hostname: zen.lhaven.homeip.net
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 9.1-RELEASE-p3 #0: Mon Apr 29 18:27:25 UTC 2013
    root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
  Panic String: vdrop: holdcnt 0
  Dump Parity: 4285100545
  Bounds: 0
  Dump Status: good

So, just to see if anything meaningful might result, I moved my
/etc/make.conf aside, did a "make buildkernel", and tried

kgdb /usr/obj/usr/src/sys/generic/kernel.debug /var/crash/vmcore.0

which gets me this...

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: vdrop: holdcnt 0
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff809208d6 at kdb_backtrace+0x66
#1 0xffffffff808ea8ee at panic+0x1ce
#2 0xffffffff8097fa86 at vdropl+0x366
#3 0xffffffff80b522ab at vm_page_alloc+0x28b
#4 0xffffffff80bd9096 at uma_small_alloc+0x66
#5 0xffffffff80b3b5fa at keg_alloc_slab+0x9a
#6 0xffffffff80b3bb72 at keg_fetch_slab+0xb2
#7 0xffffffff80b3bede at zone_fetch_slab+0x3e
#8 0xffffffff80b3b229 at zone_alloc_item+0x59
#9 0xffffffff80b3b431 at uma_large_malloc+0x31
#10 0xffffffff808d5a99 at malloc+0xd9
#11 0xffffffff815b28ee at zio_write_bp_init+0x1fe
#12 0xffffffff815b2063 at zio_execute+0xc3
#13 0xffffffff815b3fad at zio_ready+0x17d
#14 0xffffffff815b2063 at zio_execute+0xc3
#15 0xffffffff8092cf85 at taskqueue_run_locked+0x85
#16 0xffffffff8092df06 at taskqueue_thread_loop+0x46
#17 0xffffffff808bba1f at fork_exit+0x11f
Uptime: 15d13h35m36s
Dumping 8747 out of 16308
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/nullfs.ko...done.
Loaded symbols for /boot/kernel/nullfs.ko
Reading symbols from /boot/kernel/zfs.ko...done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/if_tap.ko...done.
Loaded symbols for /boot/kernel/if_tap.ko
Reading symbols from /boot/kernel/aio.ko...done.
Loaded symbols for /boot/kernel/aio.ko
Reading symbols from /boot/kernel/accf_data.ko...done.
Loaded symbols for /boot/kernel/accf_data.ko
Reading symbols from /boot/kernel/accf_http.ko...done.
Loaded symbols for /boot/kernel/accf_http.ko
Reading symbols from /boot/kernel/coretemp.ko...done.
Loaded symbols for /boot/kernel/coretemp.ko
Reading symbols from /boot/kernel/cpuctl.ko...done.
Loaded symbols for /boot/kernel/cpuctl.ko
Reading symbols from /boot/kernel/sem.ko...done.
Loaded symbols for /boot/kernel/sem.ko
Reading symbols from /boot/modules/cuse4bsd.ko...done.
Loaded symbols for /boot/modules/cuse4bsd.ko
Reading symbols from /boot/modules/vboxdrv.ko...done.
Loaded symbols for /boot/modules/vboxdrv.ko
Reading symbols from /boot/modules/nvidia.ko...done.
Loaded symbols for /boot/modules/nvidia.ko
Reading symbols from /boot/kernel/linux.ko...done.
Loaded symbols for /boot/kernel/linux.ko
Reading symbols from /boot/kernel/libiconv.ko...done.
Loaded symbols for /boot/kernel/libiconv.ko
Reading symbols from /boot/kernel/libmchain.ko...done.
Loaded symbols for /boot/kernel/libmchain.ko
Reading symbols from /boot/kernel/cd9660_iconv.ko...done.
Loaded symbols for /boot/kernel/cd9660_iconv.ko
Reading symbols from /boot/kernel/msdosfs_iconv.ko...done.
Loaded symbols for /boot/kernel/msdosfs_iconv.ko
Reading symbols from /boot/kernel/ichwd.ko...done.
Loaded symbols for /boot/kernel/ichwd.ko
Reading symbols from /boot/kernel/fdescfs.ko...done.
Loaded symbols for /boot/kernel/fdescfs.ko
Reading symbols from /boot/kernel/ipl.ko...done.
Loaded symbols for /boot/kernel/ipl.ko
Reading symbols from /boot/modules/vboxnetflt.ko...done.
Loaded symbols for /boot/modules/vboxnetflt.ko
Reading symbols from /boot/kernel/netgraph.ko...done.
Loaded symbols for /boot/kernel/netgraph.ko
Reading symbols from /boot/kernel/ng_ether.ko...done.
Loaded symbols for /boot/kernel/ng_ether.ko
Reading symbols from /boot/modules/vboxnetadp.ko...done.
Loaded symbols for /boot/modules/vboxnetadp.ko
Reading symbols from /usr/local/modules/fuse.ko...done.
Loaded symbols for /usr/local/modules/fuse.ko
Reading symbols from /boot/kernel/linprocfs.ko...done.
Loaded symbols for /boot/kernel/linprocfs.ko
Reading symbols from /boot/kernel/linsysfs.ko...done.
Loaded symbols for /boot/kernel/linsysfs.ko
Reading symbols from /usr/local/libexec/linux_adobe/linux_adobe.ko...done.
Loaded symbols for /usr/local/libexec/linux_adobe/linux_adobe.ko
Reading symbols from /usr/local/modules/rtc.ko...done.
Loaded symbols for /usr/local/modules/rtc.ko
#0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
224		__asm("movq %%gs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
#1  0xffffffff808ea3d1 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:448
#2  0xffffffff808ea8c7 in panic (fmt=0x1 <Address 0x1 out of bounds>)
    at /usr/src/sys/kern/kern_shutdown.c:636
#3  0xffffffff8097fa86 in vdropl (vp=Variable "vp" is not available.
) at /usr/src/sys/kern/vfs_subr.c:2400
#4  0xffffffff80b522ab in vm_page_alloc (object=0x0, pindex=0, req=32)
    at /usr/src/sys/vm/vm_page.c:1537
#5  0xffffffff80bd9096 in uma_small_alloc (zone=Variable "zone" is not
available.
)
    at /usr/src/sys/amd64/amd64/uma_machdep.c:58
#6  0xffffffff80b3b5fa in keg_alloc_slab (keg=0xfffffe043ffef0e0,
    zone=0xfffffe043ffee000, wait=258) at /usr/src/sys/vm/uma_core.c:844
#7  0xffffffff80b3bb72 in keg_fetch_slab (keg=0xfffffe043ffef0e0,
    zone=0xfffffe043ffee000, flags=2) at /usr/src/sys/vm/uma_core.c:2173
#8  0xffffffff80b3bede in zone_fetch_slab (zone=0xfffffe043ffee000,
    keg=0xfffffe043ffef0e0, flags=2) at /usr/src/sys/vm/uma_core.c:2233
#9  0xffffffff80b3b229 in zone_alloc_item (zone=0xfffffe043ffee000,
udata=0x0,
    flags=2) at /usr/src/sys/vm/uma_core.c:2490
#10 0xffffffff80b3b431 in uma_large_malloc (size=16384, wait=2)
    at /usr/src/sys/vm/uma_core.c:3064
#11 0xffffffff808d5a99 in malloc (size=16384, mtp=0xffffffff81734c20,
flags=2)
    at /usr/src/sys/kern/kern_malloc.c:492
#12 0xffffffff815b28ee in zio_write_bp_init () from /boot/kernel/zfs.ko
---Type <return> to continue, or q <return> to quit---
#13 0x0000000000000010 in ?? ()
#14 0xfffffe022b9726e0 in ?? ()
#15 0xfffffe03c81a2a50 in ?? ()
#16 0xffffff801b78e880 in ?? ()
#17 0xfffffe000e99e000 in ?? ()
#18 0xffffff8471d93ae0 in ?? ()
#19 0xffffffff815b2063 in zio_execute () from /boot/kernel/zfs.ko
#20 0x0000000000000000 in ?? ()
#21 0x0000000000000000 in ?? ()
#22 0xfffffe03c81a2a50 in ?? ()
#23 0xffffff801b78e880 in ?? ()
#24 0xfffffe000e99e000 in ?? ()
#25 0xffffff8471d93b10 in ?? ()
#26 0xffffffff815b3fad in zio_ready () from /boot/kernel/zfs.ko
#27 0xfffffe03c81a2a50 in ?? ()
#28 0x0000000000000006 in ?? ()
#29 0x0000000000000006 in ?? ()
#30 0xffffff8471d93b50 in ?? ()
#31 0xffffffff815b2063 in zio_execute () from /boot/kernel/zfs.ko
#32 0xfffffe0013c79800 in ?? ()
#33 0xfffffe03c81a2d90 in ?? ()
#34 0xfffffe0013c70000 in ?? ()
#35 0x0000000000000001 in ?? ()
---Type <return> to continue, or q <return> to quit---
#36 0xfffffe0013c70000 in ?? ()
#37 0xffffff8471d93bc0 in ?? ()
#38 0xffffffff8092cf85 in taskqueue_run_locked (queue=0xffffff800904e380)
    at /usr/src/sys/kern/subr_taskqueue.c:308
Previous frame inner to this frame (corrupt stack?)
(kgdb) l *0xffffffff8097fa86
0xffffffff8097fa86 is at /usr/src/sys/kern/vfs_subr.c:2400.
2395		int active;
2396	
2397		ASSERT_VI_LOCKED(vp, "vdropl");
2398		CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
2399		if (vp->v_holdcnt <= 0)
2400			panic("vdrop: holdcnt %d", vp->v_holdcnt);
2401		vp->v_holdcnt--;
2402		if (vp->v_holdcnt > 0) {
2403			VI_UNLOCK(vp);
2404			return;

So it seems to work, but beyond the fact that the code panics when
vp->v_holdcnt is <= 0, I don't know how to find out why this variable
had dropped to 0 when the code expects it not to.
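
To make the check at vfs_subr.c:2399 concrete, here is a minimal
userspace sketch of the same hold/drop accounting.  It is not the kernel
code: toy_vnode, toy_vhold, toy_vdrop and toy_unbalanced_drop are
made-up names for illustration, and the only assumption is that every
drop is supposed to be paired with an earlier hold.  It shows just the
general pattern that trips such a check (one drop too many), not what
actually happened on my machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct vnode; only the hold count is modeled. */
struct toy_vnode {
	int holdcnt;
};

/* Take a hold on the toy vnode. */
static void
toy_vhold(struct toy_vnode *vp)
{
	vp->holdcnt++;
}

/*
 * Drop a hold, using the same "<= 0" test as the listing above.
 * Returns false at the point where the kernel would panic.
 */
static bool
toy_vdrop(struct toy_vnode *vp)
{
	if (vp->holdcnt <= 0) {
		fprintf(stderr, "vdrop: holdcnt %d\n", vp->holdcnt);
		return (false);
	}
	vp->holdcnt--;
	return (true);
}

/*
 * One hold followed by two drops: the first drop is balanced, the
 * second finds the count already at 0.  Returns true when exactly
 * that sequence is observed.
 */
static bool
toy_unbalanced_drop(void)
{
	struct toy_vnode vn = { .holdcnt = 0 };
	bool first, second;

	toy_vhold(&vn);
	first = toy_vdrop(&vn);
	second = toy_vdrop(&vn);
	return (first && !second);
}
```

Calling toy_unbalanced_drop() from a small main() prints the
"vdrop: holdcnt 0" line to stderr on the second drop.  In the real
kernel the interesting question is which code path issued the extra
drop (or skipped a hold), and the sketch says nothing about that.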

I have periodic (about twice a year) scrubs enabled on my system, and
the zpool for backuppc was last scrubbed on May 24th (it took 47h57m -
repaired 0 with 0 errors.)

-- 
  Name: Lawrence "The Dreamer" Chen      Email: beastie@tardisi.com
 Snail: 1530 College Ave, A5              Blog: http://lawrencechen.net
        Manhattan, KS 66502-2768         Phone: 785-789-4132


