Date:      Sat, 22 Aug 2009 13:00:43 +0200
From:      Thomas Backman <serenity@exscape.org>
To:        Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc:        freebsd-fs@freebsd.org, FreeBSD current <freebsd-current@freebsd.org>
Subject:   Re: Yet another ZFS recv panic; old but rarely seen
Message-ID:  <F1014768-BE19-4AC4-9E0B-52C8DF9B5ADD@exscape.org>
In-Reply-To: <20090821110031.GB1962@garage.freebsd.pl>
References:  <7F161876-8DA7-4617-98B6-7CD54C691BC6@exscape.org> <306284EA-C89C-433C-9D33-E6CF44305800@exscape.org> <20090821110031.GB1962@garage.freebsd.pl>

On Aug 21, 2009, at 13:00, Pawel Jakub Dawidek wrote:
>
> Right, the bug is already fixed in OpenSolaris. If you can reproduce the
> problem, you might try this patch:
>
> 	http://people.freebsd.org/~pjd/patches/dirtying_dbuf.patch
I tried hard to reproduce it (~750 incremental send/recvs), but no
"luck". I've only hit it twice AFAIK, and that's since May.
However, during the stress test I got a Solaris assert panic (I still
have -DDEBUG=1 set) after a couple of hours:

Unread portion of the kernel message buffer:
panic: solaris assert: (int64_t)(arc_stats.arcstat_p.value.ui64) >= 0, file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c, line: 2044
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
panic() at panic+0x182
arc_get_data_buf() at arc_get_data_buf+0x2a0
arc_buf_alloc() at arc_buf_alloc+0xe6
arc_read_nolock() at arc_read_nolock+0xf7
arc_read() at arc_read+0xaf
dbuf_read() at dbuf_read+0x62b
dmu_buf_hold() at dmu_buf_hold+0xcc
zap_lockdir() at zap_lockdir+0x68
zap_lookup_norm() at zap_lookup_norm+0x45
zap_lookup() at zap_lookup+0x2e
dsl_prop_changed_notify() at dsl_prop_changed_notify+0x1c9
dsl_prop_changed_notify() at dsl_prop_changed_notify+0x157
dsl_prop_set_sync() at dsl_prop_set_sync+0x2ab
dsl_sync_task_group_sync() at dsl_sync_task_group_sync+0x173
dsl_pool_sync() at dsl_pool_sync+0x122
spa_sync() at spa_sync+0x35e
txg_sync_thread() at txg_sync_thread+0x2d7
fork_exit() at fork_exit+0x118
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff803e8cdd30, rbp = 0 ---
KDB: enter: panic
panic: from debugger
cpuid = 0
Uptime: 1d3h17m21s
Physical memory: 2029 MB
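
For readers unfamiliar with this kind of check: the assert casts an unsigned 64-bit statistic to int64_t and requires the result to be non-negative. One generic way such a check can trip is an unsigned value being decremented past zero, so that it wraps to a huge number that reads as negative once cast. Whether that is what happened here is not established. The sketch below is NOT the arc.c code; every function name in it is hypothetical and only illustrates the wrap-around.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: shrink an unsigned target size without any guard.
 * Unsigned subtraction wraps modulo 2^64 when delta > target. */
uint64_t shrink_unchecked(uint64_t target, uint64_t delta)
{
    return target - delta;
}

/* Hypothetical guarded variant: clamp at zero instead of wrapping. */
uint64_t shrink_clamped(uint64_t target, uint64_t delta)
{
    return (delta > target) ? 0 : target - delta;
}

/* Same shape as the failed check: true when the value, viewed as a
 * signed 64-bit integer, is >= 0. Converting an out-of-range value to
 * int64_t is implementation-defined before C23; on common two's
 * complement compilers it yields a negative number. */
int looks_sane(uint64_t value)
{
    return (int64_t)value >= 0;
}

/* Small demonstration: the clamped result passes the check, the
 * wrapped one does not. */
void demo(void)
{
    assert(looks_sane(shrink_clamped(100, 200)));
    assert(!looks_sane(shrink_unchecked(100, 200)));
}
```

Calling demo() from a main() runs both checks. The point is only that a wrapped value fails a test of the form "(int64_t)x >= 0".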

The GDB backtrace is the same up to spa_sync(), at which point (frame #26)
it turns into ??'s until
#61 0xffffffff80b75447 in txg_sync_thread () from /boot/kernel/zfs.ko
Previous frame inner to this frame (corrupt stack?)

core.txt vmstat -s:
     29040 pages active
     28905 pages inactive
       143 pages in VM cache
    231106 pages wired down (903 MiB out of ~2048 MiB)
    214771 pages free
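
The MiB figure next to the wired count can be rechecked from the page counts. A minimal sketch, assuming the usual 4096-byte page size on amd64 (vmstat -s reports the actual page size elsewhere in its output, which is not quoted here); the helper name is made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, the normal base page size on amd64. */
#define ASSUMED_PAGE_SIZE 4096u

/* Hypothetical helper: convert a page count to MiB, rounded to nearest. */
uint64_t pages_to_mib(uint64_t pages, uint64_t page_size)
{
    const uint64_t mib = 1024u * 1024u;

    assert(page_size > 0);
    return (pages * page_size + mib / 2) / mib;
}
```

With these assumptions, pages_to_mib(231106, ASSUMED_PAGE_SIZE) gives 903, matching the figure above, and the 214771 free pages come out to about 839 MiB.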

2GB RAM, amd64.

Regards,
Thomas


