Date:      Fri, 07 Aug 2009 09:45:38 -0400
From:      Boris Kochergin <spawk@acm.poly.edu>
To:        Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs
Message-ID:  <4A7C3002.8000003@acm.poly.edu>
In-Reply-To: <20090807074400.GB1607@garage.freebsd.pl>
References:  <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl> <4A798A12.4070408@acm.poly.edu> <20090807073738.GA1607@garage.freebsd.pl> <20090807074400.GB1607@garage.freebsd.pl>

Pawel Jakub Dawidek wrote:
> On Fri, Aug 07, 2009 at 09:37:38AM +0200, Pawel Jakub Dawidek wrote:
>   
>> On Wed, Aug 05, 2009 at 09:33:06AM -0400, Boris Kochergin wrote:
>>     
>>> Fatal trap 12: page fault while in kernel mode
>>> fault virtual address   = 0xffffffffffffffe9
>>> fault code              = supervisor read data, page not present
>>> instruction pointer     = 0x20:0xffffffff8103a9e7
>>> stack pointer           = 0x28:0xffffff8077f26430
>>> frame pointer           = 0x28:0xffffff8077f26500
>>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>>                        = DPL 0, pres 1, long 1, def32 0, gran 1
>>> processor eflags        = interrupt enabled, resume, IOPL = 0
>>> current process         = 972 (cp)
>>>       
>> [...]
>>     
>>> /usr/src/sys/amd64/amd64/trap.c:494
>>> #11 0xffffffff80854d73 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
>>> #12 0xffffffff8103a9e7 in arc_evict (state=Variable "state" is not available.
>>> ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1489
>>>       
>> Could you tell me what do you have at this line in your source? I don't
>> think you use HEAD... What exact FreeBSD version are you using?
>>     
>
> You already gave version number in your first mail, sorry about that.
> 8.0-BETA2 should be very close to HEAD (or it actually was HEAD), so I
> guess this is the code we are looking at:
>
> 1488:		/* "lookahead" for better eviction candidate */
> 1489:		if (recycle && ab->b_size != bytes &&
> 1490:		    ab_prev && ab_prev->b_size == bytes)
> 1491:			continue;
>
> If 'ab' is corrupted it should panic earlier, so it seems ab_prev is
> corrupted, can you see what it points to in gdb?
>
>   
Yeah, that's what the code looks like. For convenience, I've put the 
source tree the system was built from up at:

http://acm.poly.edu/~spawk/src/

Maybe my kgdb chops aren't up to par, but I can't seem to see what 
ab_prev points to:

(kgdb) up
#12 0xffffffff8103a9e7 in arc_evict (state=Variable "state" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1489
1489                    if (recycle && ab->b_size != bytes &&
Current language:  auto; currently c
(kgdb) list
1484                        LBOLT - ab->b_arc_access < arc_min_prefetch_lifespan)) {
1485                            skipped++;
1486                            continue;
1487                    }
1488                    /* "lookahead" for better eviction candidate */
1489                    if (recycle && ab->b_size != bytes &&
1490                        ab_prev && ab_prev->b_size == bytes)
1491                            continue;
1492                    hash_lock = HDR_LOCK(ab);
1493                    have_lock = MUTEX_HELD(hash_lock);
(kgdb) print ab
$13 = (arc_buf_hdr_t *) 0xffffff0003ebc410
(kgdb) print ab->b_size
$14 = 1
(kgdb) print bytes
$15 = 16384
(kgdb) print ab_prev
No symbol "ab_prev" in current context.
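
For reference, the values printed above can be run through a small user-space sketch of the condition at line 1489. It is only an illustration, not the kernel code: "struct hdr" is a hypothetical stand-in for arc_buf_hdr_t with a single field, both function names are made up, and "recycle" is assumed to be nonzero because it was not printed in the session. Under those assumptions, ab->b_size (1) differs from bytes (16384), so short-circuit evaluation would go on to read ab_prev->b_size. The second helper just reinterprets the fault virtual address as a signed offset, which shows it is nowhere near the address of ab.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for arc_buf_hdr_t; only the field used here. */
struct hdr {
	uint64_t b_size;
};

/*
 * Model of the short-circuit evaluation of
 *   recycle && ab->b_size != bytes && ab_prev && ab_prev->b_size == bytes
 * Returns how far evaluation gets:
 *   1 = stops at recycle, 2 = stops at the ab->b_size comparison,
 *   3 = stops at the ab_prev NULL check, 4 = ab_prev is dereferenced.
 */
int
operands_evaluated(int recycle, const struct hdr *ab,
    const struct hdr *ab_prev, uint64_t bytes)
{
	uint64_t prev_size;

	assert(ab != NULL);
	if (!recycle)
		return (1);
	if (ab->b_size == bytes)
		return (2);
	if (ab_prev == NULL)
		return (3);
	/* In the kernel, this is the read that would go through ab_prev. */
	prev_size = ab_prev->b_size;
	(void)prev_size;
	return (4);
}

/* Reinterpret a 64-bit address as a signed two's-complement value. */
int64_t
fault_address_signed(uint64_t addr)
{
	if (addr > (uint64_t)INT64_MAX)
		return (-(int64_t)(~addr) - 1);
	return ((int64_t)addr);
}
```

Calling operands_evaluated(1, &ab, &prev, 16384) with ab.b_size set to 1 and any valid prev returns 4, and fault_address_signed(0xffffffffffffffe9ULL) returns -23. A non-NULL but bogus ab_prev would pass the NULL check and fault on the dereference, which would fit the suggestion that ab_prev is the corrupted pointer, though this sketch does not prove that.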

-Boris


