From owner-freebsd-hackers@FreeBSD.ORG  Thu Oct 11 20:43:42 2007
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Mime-Version: 1.0 (Apple Message framework v752.3)
Content-Transfer-Encoding: 7bit
Message-Id: <EFB19341-E461-47A8-800E-E91942041CA3@your.org>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
To: freebsd-hackers@freebsd.org
From: "Kevin - Your.Org" <kevin@your.org>
Date: Thu, 11 Oct 2007 15:20:58 -0500
X-Mailer: Apple Mail (2.752.3)
Subject: Debugging kernel assembly calls - no stack frame
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>


I've got a kernel crash that I'm trying to debug. The output below is
from a test system running 6.1-RELEASE, but the same crash happens
nearly identically under 6.2. The meat of it:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x10
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc0850e64
stack pointer           = 0x28:0xe901dba0
frame pointer           = 0x28:0xe901dc08
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 22 (irq16: sdla0 twe0)
trap number             = 12
panic: page fault

(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc064e3f5 in boot (howto=260) at ../../../kern/kern_shutdown.c:402
#2  0xc064e68c in panic (fmt=0xc089f053 "%s") at ../../../kern/kern_shutdown.c:558
#3  0xc0852d54 in trap_fatal (frame=0xe901db60, eva=16) at ../../../i386/i386/trap.c:836
#4  0xc0852536 in trap (frame=
       {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = 16, tf_esi = -978130928,
        tf_ebp = -385754104, tf_isp = -385754228, tf_ebx = 59, tf_edx = 16,
        tf_ecx = 14, tf_eax = 16, tf_trapno = 12, tf_err = 2,
        tf_eip = -1065021852, tf_cs = 32, tf_eflags = 589843,
        tf_esp = -985082112, tf_ss = -985044480})
     at ../../../i386/i386/trap.c:269
#5  0xc084257a in calltrap () at ../../../i386/i386/exception.s:139
#6  0xc0850e64 in memcpy () at ../../../i386/i386/support.s:681
Previous frame inner to this frame (corrupt stack?)
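
kgdb prints the trap frame fields as signed 32-bit integers, which makes
them awkward to compare with the addresses in the panic message. A
minimal sketch of the conversion, assuming a 32-bit i386 kernel (the
helper name to_hex32 is my own, not something kgdb provides):

```python
def to_hex32(value):
    """Reinterpret a signed 32-bit integer as an unsigned hex string."""
    return "0x%08x" % (value & 0xFFFFFFFF)

# Values copied from the trap frame shown in frame #4 above.
trap_frame = {
    "tf_esi": -978130928,
    "tf_ebp": -385754104,
    "tf_isp": -385754228,
    "tf_eip": -1065021852,
    "tf_esp": -985082112,
    "tf_ss": -985044480,
}

for name, value in trap_frame.items():
    print("%-7s = %s" % (name, to_hex32(value)))
```

With this, tf_eip comes out as 0xc0850e64 and tf_ebp as 0xe901dc08,
matching the instruction pointer and frame pointer in the panic message.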


It looks like memcpy is being called with bad pointers. Inside the
memcpy frame I see that %edi is 0x10, which can't be right. However,
I want to know who is calling memcpy this way. My understanding is
that the routines written in assembly (*.s) don't set up a stack
frame, so gdb is getting lost here. I'm trying to trace things back
myself, but not having much luck. My reasoning so far:


673     ENTRY(memcpy)
674             pushl   %edi
675             pushl   %esi
676             movl    12(%esp),%edi
677             movl    16(%esp),%esi
678             movl    20(%esp),%ecx
679             movl    %edi,%eax
680             shrl    $2,%ecx                         /* copy by 32-bit words */
681             cld                                     /* nope, copy forwards */
682             rep
683             movsl
684             movl    20(%esp),%ecx
685             andl    $3,%ecx                         /* any bytes left? */
686             rep
687             movsb
688             popl    %esi
689             popl    %edi
690             ret

memcpy immediately pushes two registers (%edi, then %esi) onto the
stack. Above those should be the return address, followed by the
destination address, the source address, and the count.
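
The layout I'm expecting, as a small sketch. It assumes 4-byte stack
slots and the argument order implied by the 12/16/20(%esp) loads in the
listing; the function name and the example %esp value are made up for
illustration only:

```python
WORD = 4  # bytes per stack slot on i386

def memcpy_stack_slots(esp):
    """Map each expected slot to its address, given %esp after the two pushes."""
    # %edi is pushed first and %esi second, so %esi ends up at the lowest address.
    names = ["saved %esi", "saved %edi", "return address", "dst", "src", "len"]
    return {name: esp + i * WORD for i, name in enumerate(names)}

# Hypothetical %esp value, not taken from the dump.
for name, addr in memcpy_stack_slots(0x1000).items():
    print("%-14s at 0x%08x" % (name, addr))
```

The dst, src and len slots land at offsets 12, 16 and 20, which is what
lines 676-678 of the listing read.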

(kgdb) up 6
#6  0xc0850e64 in memcpy () at ../../../i386/i386/support.s:681
681             cld                                     /* nope, copy forwards */
(kgdb) info registers esp
esp            0xc548d700       0xc548d700

0xc548d700 should be the stack pointer at the time of the crash,
although I notice the panic message above reports 0x28:0xe901dba0
instead.

(kgdb) x/12xw 0xc548d700
0xc548d700:     0x00000000      0x00000000      0xc5b2e800      0x00000010
0xc548d710:     0x00000013      0x00000001      0x00000000      0x00000010
0xc548d720:     0x00000000      0x00000000      0x00000000      0x00000000

The two 0x00000000 words at the beginning would be the two registers
memcpy pushes on entry. Following that is 0xc5b2e800, which is where
the ret would return to. After that, the destination (0x00000010),
source (0x00000013) and count (0x00000001) obviously aren't right.
Even so, if I look at the registers:

(kgdb) info registers
eax            0x10     16
ecx            0xe      14
edx            0x10     16
ebx            0x3b     59
esp            0xc548d700       0xc548d700
ebp            0xe901dc08       0xe901dc08
esi            0xc5b2e810       -978130928
edi            0x10     16
eip            0xc0850e64       0xc0850e64
eflags         0x90013  589843
cs             0x20     32
ss             0xc5496a00       -985044480
ds             0x28     40
es             0x28     40
fs             0x8      8
gs             0x0      0
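
To compare these registers against the stack contents, here is a rough
sketch of what the memcpy prologue should leave in each register just
before rep movsl starts. It assumes the fault hit on the very first word
written (the fault address 0x10 equals %edi), so the copy had not yet
advanced anything; the function name is mine:

```python
def regs_before_copy(dst, src, length):
    """Register values memcpy should hold just before 'rep movsl' starts."""
    return {
        "edi": dst,          # movl 12(%esp),%edi
        "esi": src,          # movl 16(%esp),%esi
        "eax": dst,          # movl %edi,%eax
        "ecx": length >> 2,  # shrl $2,%ecx
    }

# Arguments as read from the stack dump at 0xc548d700.
expected = regs_before_copy(dst=0x10, src=0x13, length=0x1)
# Values from 'info registers' in frame #6.
actual = {"edi": 0x10, "esi": 0xC5B2E810, "eax": 0x10, "ecx": 0xE}

for reg in ("edi", "esi", "eax", "ecx"):
    status = "ok" if expected[reg] == actual[reg] else "MISMATCH"
    print("%s: expected 0x%x, actual 0x%x, %s"
          % (reg, expected[reg], actual[reg], status))
```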

edi (0x10) matches the destination address on the stack, but esi
(0xc5b2e810) and ecx (0xe) don't match the source (0x13) and count
(0x1) I'd expect from the stack. Ignoring that and looking at the
return address, I see this:

(kgdb) x/20i 0xc5b2e800
0xc5b2e800:     add    %ah,0xffffffa8(%eax)
0xc5b2e803:     sub    %eax,%esi
0xc5b2e805:     add    %eax,(%eax)
0xc5b2e807:     pusha
0xc5b2e808:     add    (%eax),%al
0xc5b2e80a:     add    %al,(%eax)
0xc5b2e80c:     cmp    (%eax),%eax
0xc5b2e80e:     add    %al,(%eax)
0xc5b2e810:     strl   (%eax)
0xc5b2e813:     add    %al,0x0(%ebp)
0xc5b2e816:     add    %dh,(%esp,%eax,2)
0xc5b2e819:     ds
0xc5b2e81a:     inc    %eax
0xc5b2e81b:     add    %bh,(%esi,%eax,1)
0xc5b2e81e:     mov    $0xb7,%dh
0xc5b2e820:     push   %ebx
0xc5b2e821:     xchg   %eax,%ebp
0xc5b2e822:     dec    %eax
0xc5b2e823:     adc    0x1f(%ebp),%al
0xc5b2e826:     arpl   %cx,(%eax)

That... doesn't look like it makes any sense.


Is the stack getting trashed after memcpy is called, or is this dump
corrupted somehow? If any of you were debugging this, how would you
proceed?

-- Kevin