From owner-freebsd-current@FreeBSD.ORG Thu May 21 16:54:10 2009 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A2CDC106564A for ; Thu, 21 May 2009 16:54:10 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 0C6518FC15 for ; Thu, 21 May 2009 16:54:10 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:44057 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.69) (envelope-from ) id 1M7BWq-00007M-4N for freebsd-current@freebsd.org; Thu, 21 May 2009 18:54:02 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id DB7353477D for ; Thu, 21 May 2009 18:53:56 +0200 (CEST) Message-Id: <0C235698-3ED2-4AE9-A7D1-5DC56D8324A4@exscape.org> From: Thomas Backman To: freebsd-current@freebsd.org In-Reply-To: <949B5884-5303-4EFF-AC7D-293640FFA012@exscape.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Thu, 21 May 2009 18:53:56 +0200 References: <949B5884-5303-4EFF-AC7D-293640FFA012@exscape.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1M7BWq-00007M-4N. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1M7BWq-00007M-4N c7bedfb053bd26a8e55269b312bbed48 Subject: Re: DTrace panic while probing syscall::open (and possibly many others) X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 May 2009 16:54:11 -0000 On May 13, 2009, at 03:19 PM, Thomas Backman wrote: > OK, so I first posted a thread on the forums about this in 7.2- > RELEASE: > http://forums.freebsd.org/showthread.php?t=3834 > Then filed a PR, kern/134408: > http://www.freebsd.org/cgi/query-pr.cgi?pr=134408 > > The very same bug remains in 8-CURRENT/amd64 as of May 13, ~10(am) > GMT+2. > > Steps to reproduce: > 1) Build DTrace capable kernel (I followed the wiki DTrace > instructions) > 2) Reboot; kldload dtraceall > 3) dtrace -n 'syscall::open:entry { self->path = arg0; } > syscall::open:return { printf("%s\n", copyinstr(self->path)); }' > 4) Crash. > > Backtrace: > [root@vmware /usr/obj/usr/src/sys/DTRACE]# kgdb kernel.debug /var/ > crash/vmcore.3 > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and > you are > welcome to change it and/or distribute copies of it under certain > conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for > details. > This GDB was configured as "amd64-marcel-freebsd"... > > Unread portion of the kernel message buffer: > panic: from debugger > cpuid = 0 > Uptime: 3m10s > Physical memory: 368 MB > Dumping 81 MB: 66 50 34 18 2 > > Reading symbols from /boot/kernel/dtraceall.ko...Reading symbols > from /boot/kernel/dtraceall.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/dtraceall.ko > Reading symbols from /boot/kernel/profile.ko...Reading symbols from / > boot/kernel/profile.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/profile.ko > Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols > from /boot/kernel/opensolaris.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/opensolaris.ko > Reading symbols from /boot/kernel/cyclic.ko...Reading symbols from / > boot/kernel/cyclic.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/cyclic.ko > Reading symbols from /boot/kernel/dtrace.ko...Reading symbols from / > boot/kernel/dtrace.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/dtrace.ko > Reading symbols from /boot/kernel/systrace.ko...Reading symbols > from /boot/kernel/systrace.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/systrace.ko > Reading symbols from /boot/kernel/sdt.ko...Reading symbols from / > boot/kernel/sdt.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/sdt.ko > Reading symbols from /boot/kernel/fbt.ko...Reading symbols from / > boot/kernel/fbt.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/fbt.ko > Reading symbols from /boot/kernel/dtnfsclient.ko...Reading symbols > from /boot/kernel/dtnfsclient.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/dtnfsclient.ko > Reading symbols from /boot/kernel/dtmalloc.ko...Reading symbols > from /boot/kernel/dtmalloc.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/dtmalloc.ko > #0 doadump () at pcpu.h:223 > 223 __asm __volatile("movq %%gs:0,%0" : "=r" (td)); > (kgdb) bt > #0 doadump () at pcpu.h:223 > #1 0xffffffff80566b23 in boot (howto=260) at /usr/src/sys/kern/ > kern_shutdown.c:420 > #2 0xffffffff80566fac in panic (fmt=Variable "fmt" is not available. > ) at /usr/src/sys/kern/kern_shutdown.c:576 > #3 0xffffffff801d3ef7 in db_panic (addr=Variable "addr" is not > available. > ) at /usr/src/sys/ddb/db_command.c:478 > #4 0xffffffff801d43a1 in db_command (last_cmdp=0xffffffff80bd3620, > cmd_table=Variable "cmd_table" is not available. > ) at /usr/src/sys/ddb/db_command.c:445 > #5 0xffffffff801d45f0 in db_command_loop () at /usr/src/sys/ddb/ > db_command.c:498 > #6 0xffffffff801d6599 in db_trap (type=Variable "type" is not > available. > ) at /usr/src/sys/ddb/db_main.c:229 > #7 0xffffffff80597135 in kdb_trap (type=10, code=0, > tf=0xfffffffe4e64e450) at /usr/src/sys/kern/subr_kdb.c:534 > #8 0xffffffff80843f81 in trap (frame=0xfffffffe4e64e450) at /usr/ > src/sys/amd64/amd64/trap.c:606 > #9 0xffffffff8081edc7 in calltrap () at /usr/src/sys/amd64/amd64/ > exception.S:223 > #10 0xffffffff8123c128 in dtrace_panic (format=Variable "format" is > not available. > ) > at /usr/src/sys/modules/dtrace/dtrace/../../../cddl/contrib/ > opensolaris/uts/common/dtrace/dtrace.c:601 > #11 0xffffffff8123c200 in dtrace_copycheck > (uaddr=18446744071581326184, kaddr=Variable "kaddr" is not available. > ) at dtrace_isa.c:527 > #12 0xffffffff8123c2bc in dtrace_copyinstr (uaddr=34365395808, > kaddr=18446744066201920856, size=256, > flags=0xffffffff8122f120) at dtrace_isa.c:558 > #13 0xffffffff81249e84 in dtrace_dif_emulate > (difo=0xffffff00026a2d80, mstate=0xfffffffe4e64ea00, > vstate=0xffffff0002548838, state=0xffffff0002548800) > at /usr/src/sys/modules/dtrace/dtrace/../../../cddl/contrib/ > opensolaris/uts/common/dtrace/dtrace.c:3446 > #14 0xffffffff8124b20a in dtrace_probe (id=Variable "id" is not > available. > ) > at /usr/src/sys/modules/dtrace/dtrace/../../../cddl/contrib/ > opensolaris/uts/common/dtrace/dtrace.c:6220 > #15 0xffffffff8137b155 in systrace_probe () from /boot/kernel/ > systrace.ko > #16 0xffffffff80843c4d in syscall (frame=0xfffffffe4e64ec90) at /usr/ > src/sys/amd64/amd64/trap.c:990 > #17 0xffffffff8081f050 in Xfast_syscall () at /usr/src/sys/amd64/ > amd64/exception.S:364 > #18 0x00000008005411fc in ?? () > Previous frame inner to this frame (corrupt stack?) > > Hope this helps to fix this bug - I assume syscall::open isn't the > only probe > affected as it's simply the very first one I tried. > > Same panic on two computers (a "real" one, A64 3200+, nForce4, 2GB > RAM; > and a Macbook Pro C2D running VMware Fusion). Same panic in 7.2 and > 8.0. OK, a follow-up on this one... Let me preface it: WTF? OK, now that's out of my system. Well, not really. So, I'm not a programmer. I program mostly high-level languages (such as Objective-C/Cocoa for OS X, some python, perl etc) and have NO clue what I'm doing in kernel modules.... BUT... Being curious, I tried to see if I could add some printf statements to dtrace_copycheck() to see what it'd print before crashing. What happens? Well, this is what I changed (the types may well be screwed up - as I said, I'm not a competent C programmer! Heck, I can see right now that they are): In /sys/cddl/dev/dtrace/amd64/dtrace_isa.c: static int dtrace_copycheck(uintptr_t uaddr, uintptr_t kaddr, size_t size) { printf("in dtrace_copycheck(), pre-ASSERT:\n"); printf("kaddr = %u, kernelbase = %u, size = %d, kaddr+size = %u\n \n", (unsigned int)kaddr, (unsigned int)kernelbase, (unsigned int)size, (unsigned int)(kaddr+size)); ASSERT(kaddr >= kernelbase && kaddr + size >= kaddr); if (uaddr + size >= kernelbase || uaddr + size < uaddr) { ... So, I added two printf statements. What happens? IT BECOMES STABLE. How the hell does this happen? I figured "oh, someone must have fixed the bug!", commented my lines, unloaded the modules, recompiled, re- loaded "my" modules without printf... PANIC! I tried it again with the printf, and... it works. This deserves a big "WTF" in my opinion. Any clues as to what's going on here? In the meantime, I added printf("%s", ""); above the ASSERT line, which fixes the bug! I'm as confused as one can get here, could I get some insights as to why this works?! :-) Does having something above the ASSERT line make the assertion not execute? Regards, Thomas