From: "Kevin - Your.Org"
To: freebsd-hackers@freebsd.org
Date: Thu, 11 Oct 2007 15:20:58 -0500
Subject: Debugging kernel assembly calls - no stack frame

I've got a kernel crash that I'm trying to debug. The below is on a test system running 6.1-RELEASE, but it's also happening under 6.2 nearly identically.
The meat of it:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x10
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc0850e64
stack pointer           = 0x28:0xe901dba0
frame pointer           = 0x28:0xe901dc08
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 22 (irq16: sdla0 twe0)
trap number             = 12
panic: page fault

(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc064e3f5 in boot (howto=260) at ../../../kern/kern_shutdown.c:402
#2  0xc064e68c in panic (fmt=0xc089f053 "%s") at ../../../kern/kern_shutdown.c:558
#3  0xc0852d54 in trap_fatal (frame=0xe901db60, eva=16) at ../../../i386/i386/trap.c:836
#4  0xc0852536 in trap (frame=
      {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = 16, tf_esi = -978130928,
       tf_ebp = -385754104, tf_isp = -385754228, tf_ebx = 59, tf_edx = 16,
       tf_ecx = 14, tf_eax = 16, tf_trapno = 12, tf_err = 2,
       tf_eip = -1065021852, tf_cs = 32, tf_eflags = 589843,
       tf_esp = -985082112, tf_ss = -985044480})
    at ../../../i386/i386/trap.c:269
#5  0xc084257a in calltrap () at ../../../i386/i386/exception.s:139
#6  0xc0850e64 in memcpy () at ../../../i386/i386/support.s:681
Previous frame inner to this frame (corrupt stack?)

It looks like memcpy is being called with bad pointers. Inside the memcpy frame I see that %edi is 0x10, which can't be right. However, I want to know who is calling memcpy this way.

My understanding is that the *.s function calls don't set up a frame, so gdb is getting lost here. I'm trying to trace things back myself, but not having much luck.

My thought processes so far:

673 ENTRY(memcpy)
674         pushl   %edi
675         pushl   %esi
676         movl    12(%esp),%edi
677         movl    16(%esp),%esi
678         movl    20(%esp),%ecx
679         movl    %edi,%eax
680         shrl    $2,%ecx                 /* copy by 32-bit words */
681         cld                             /* nope, copy forwards */
682         rep
683         movsl
684         movl    20(%esp),%ecx
685         andl    $3,%ecx                 /* any bytes left? */
686         rep
687         movsb
688         popl    %esi
689         popl    %edi
690         ret

Memcpy pushes two registers onto the stack immediately. Following that should be the return pointer. Following that should be the destination address, the source address, and the count.

(kgdb) up 6
#6  0xc0850e64 in memcpy () at ../../../i386/i386/support.s:681
681             cld                     /* nope, copy forwards */
(kgdb) info registers esp
esp            0xc548d700       0xc548d700

0xc548d700 should be the stack pointer at the time of the crash.

(kgdb) x/12xw 0xc548d700
0xc548d700:     0x00000000      0x00000000      0xc5b2e800      0x00000010
0xc548d710:     0x00000013      0x00000001      0x00000000      0x00000010
0xc548d720:     0x00000000      0x00000000      0x00000000      0x00000000

memcpy pushes twice on the stack when it starts up, which are the two 0x00000000s at the beginning. Following that is 0xc5b2e800, which is where the ret will go at the end. After that, it's obvious the destination and source aren't right.

Even then though, if I look at the registers:

(kgdb) info registers
eax            0x10             16
ecx            0xe              14
edx            0x10             16
ebx            0x3b             59
esp            0xc548d700       0xc548d700
ebp            0xe901dc08       0xe901dc08
esi            0xc5b2e810       -978130928
edi            0x10             16
eip            0xc0850e64       0xc0850e64
eflags         0x90013          589843
cs             0x20             32
ss             0xc5496a00       -985044480
ds             0x28             40
es             0x28             40
fs             0x8              8
gs             0x0              0

edi looks like it got the destination address loaded correctly, but esi and ecx don't match what I'd expect from seeing the stack.

Ignoring that and looking at the return address, I see this:

(kgdb) x/20i 0xc5b2e800
0xc5b2e800:     add    %ah,0xffffffa8(%eax)
0xc5b2e803:     sub    %eax,%esi
0xc5b2e805:     add    %eax,(%eax)
0xc5b2e807:     pusha
0xc5b2e808:     add    (%eax),%al
0xc5b2e80a:     add    %al,(%eax)
0xc5b2e80c:     cmp    (%eax),%eax
0xc5b2e80e:     add    %al,(%eax)
0xc5b2e810:     strl   (%eax)
0xc5b2e813:     add    %al,0x0(%ebp)
0xc5b2e816:     add    %dh,(%esp,%eax,2)
0xc5b2e819:     ds
0xc5b2e81a:     inc    %eax
0xc5b2e81b:     add    %bh,(%esi,%eax,1)
0xc5b2e81e:     mov    $0xb7,%dh
0xc5b2e820:     push   %ebx
0xc5b2e821:     xchg   %eax,%ebp
0xc5b2e822:     dec    %eax
0xc5b2e823:     adc    0x1f(%ebp),%al
0xc5b2e826:     arpl   %cx,(%eax)

That... doesn't look like it makes any sense. Am I trashing the stack after memcpy is getting called, or is this dump corrupted somehow?

If any of you were debugging this, how would you proceed?

-- Kevin
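
P.S. For anyone who wants to check my arithmetic, here is a rough Python sketch of the bookkeeping above. It is only an illustration: the helper names (to_u32, label_memcpy_stack) are made up for this mail and come from neither the kernel tree nor gdb. It assumes the usual i386 convention that arguments sit just above the return address, and it takes esp = 0xc548d700 at face value. It converts the signed decimals gdb prints for the trap frame into hex, then labels the first six stack words the way memcpy would see them after its two pushes.

```python
# Hypothetical helpers for reading the dump above; the names are invented
# for illustration and are not part of FreeBSD or gdb.

def to_u32(value):
    """Reinterpret a signed 32-bit integer as an unsigned one."""
    return value & 0xFFFFFFFF

# Signed decimals copied from the trap frame printed in frame #4.
TRAP_FRAME = {
    "tf_eip": -1065021852,
    "tf_esp": -985082112,
    "tf_esi": -978130928,
    "tf_ebp": -385754104,
}

# Slots starting at %esp once memcpy has done its two pushes, 4 bytes each.
# pushl %edi runs first, so the saved %esi ends up at the lower address.
# Assumption: arguments follow the return address (dst, src, len).
MEMCPY_SLOTS = ["saved esi", "saved edi", "return address", "dst", "src", "len"]

def label_memcpy_stack(words):
    """Pair the first six stack words with the slot each should hold."""
    return dict(zip(MEMCPY_SLOTS, words))

# First six words from the x/12xw 0xc548d700 output.
STACK_WORDS = [0x00000000, 0x00000000, 0xc5b2e800,
               0x00000010, 0x00000013, 0x00000001]

if __name__ == "__main__":
    for name, value in TRAP_FRAME.items():
        print(f"{name:7s} = {to_u32(value):#010x}")
    for slot, word in label_memcpy_stack(STACK_WORDS).items():
        print(f"{slot:15s} = {word:#010x}")
```

Run as a script, it prints the trap frame values in hex so they can be compared with the info registers output, followed by the labelled stack words. Under the assumed layout the dump reads as dst 0x10, src 0x13, len 1, which is the mismatch with esi and ecx that I described.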