From owner-freebsd-current@FreeBSD.ORG Fri Jun 19 17:41:52 2009 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 862AD106566C; Fri, 19 Jun 2009 17:41:52 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 0BB2D8FC0C; Fri, 19 Jun 2009 17:41:52 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:42331 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.69) (envelope-from ) id 1MHi5l-0003tE-5b; Fri, 19 Jun 2009 19:41:35 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 3CB2B6AD67; Fri, 19 Jun 2009 19:41:32 +0200 (CEST) Message-Id: From: Thomas Backman To: FreeBSD current In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Fri, 19 Jun 2009 19:41:29 +0200 References: <949B5884-5303-4EFF-AC7D-293640FFA012@exscape.org> <0C235698-3ED2-4AE9-A7D1-5DC56D8324A4@exscape.org> <200905212129.47892.mel.flynn+fbsd.current@mailing.thruhere.net> <44F486FA-E798-448D-BE31-F7A51EF1F612@exscape.org> <60173AF0-7E54-4BDD-8927-0DADA9DAD1B4@exscape.org> <20090522200306.GE2630@atarininja.org> <20090617225849.GB28509@atarininja.org> <4A3A1D27.4010802@icyb.net.ua> <4A3BBF5A.6060702@icyb.net.ua> <4A3BC481.1010600@cs.rice.edu> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MHi5l-0003tE-5b. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MHi5l-0003tE-5b 1df895e374f1b4af289ce37002d95280 Cc: Alan Cox , John Birrell , Alan Cox , Andriy Gapon Subject: Re: DTrace panic while probing syscall::open (and possibly many others) X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Jun 2009 17:41:52 -0000 On Jun 19, 2009, at 07:32 PM, Thomas Backman wrote: > > On Jun 19, 2009, at 07:01 PM, Alan Cox wrote: > >> Andriy Gapon wrote: >>> on 18/06/2009 14:42 Thomas Backman said the following: >>> >>>> On Jun 18, 2009, at 12:55 PM, Andriy Gapon wrote: >>>> >>>> >>>>> on 18/06/2009 12:43 Thomas Backman said the following: >>>>> >>>>>> at dtrace_isa.c:527 >>>>>> #14 0xffffffff816b31fc in dtrace_copyinstr (uaddr=34365163021, >>>>>> kaddr=18446743524025463312, size=256, flags=0xffffffff8146e0c0) >>>>>> at dtrace_isa.c:558 >>>>>> >>>>> kaddr=18446743524025463312 == FFFFFF8004467210 >>>>> I think kernelbase on amd64 is 0xFFFFFFFF80000000. >>>>> FFFFFF8004467210 kaddr >>>>> is smaller than >>>>> FFFFFFFF80000000 kernelbase >>>>> >>>>> The numbers do look suspiciously similar, so I am not sure if >>>>> you are >>>>> seeing a >>>>> race or a real bug somewhere. >>>>> -- >>>>> Andriy Gapon >>>>> >>>> Hmmm... >>>> Looking around a bit for these numbers, I found, in >>>> /sys/amd64/include/vmparam.h: >>>> >>>> /* >>>> * Virtual addresses of things. Derived from the page directory and >>>> * page table indexes from pmap.h for precision. >>>> * >>>> * 0x0000000000000000 - 0x00007fffffffffff user map >>>> * 0x0000800000000000 - 0xffff7fffffffffff does not exist (hole) >>>> * 0xffff800000000000 - 0xffff804020100fff recursive page table >>>> (512GB >>>> slot) >>>> * 0xffff804020101000 - 0xfffffeffffffffff unused >>>> * 0xffffff0000000000 - 0xffffff7fffffffff 512GB direct map >>>> mappings >>>> * 0xffffff8000000000 - 0xffffffffffffffff 512GB kernel map >>>> * >>>> * Within the kernel map: >>>> * >>>> * 0xffffffff80000000 KERNBASE >>>> */ >>>> >>>> So, kaddr is inside the "kernel map", but not KERNBASE. What this >>>> means, >>>> I have no clue whatsoever. (I'm not a kernel developer and I >>>> don't know >>>> too much about (virtual) memory either!) >>>> >>> >>> Thomas, >>> >>> I think that you were correct that one needs to be somewhat of a >>> VM expert here. >>> It seems that amd64 is the only[?] platform where KERNBASE != >>> VM_MIN_KERNEL_ADDRESS (0xffffffff80000000 and 0xffffff8000000000 >>> correspondingly). >>> That makes the assert in sys/cddl/dev/dtrace/amd64/dtrace_isa.c >>> bogus in my opinion: >>> static int >>> dtrace_copycheck(uintptr_t uaddr, uintptr_t kaddr, size_t size) >>> { >>> ASSERT(kaddr >= kernelbase && kaddr + size >= kaddr); >>> >>> If the purpose of the assert is to ensure that [kaddr:kaddr+size) >>> is within kernel >>> memory, then it should use VM_MIN_KERNEL_ADDRESS instead of >>> KERNBASE. Or maybe >>> even use something like the macro in sys/amd64/include/stack.h: >>> #define INKERNEL(va) (((va) >= DMAP_MIN_ADDRESS && (va) < >>> DMAP_MAX_ADDRESS) \ >>> || ((va) >= VM_MIN_KERNEL_ADDRESS && (va) < >>> VM_MAX_KERNEL_ADDRESS)) >>> >>> >> >> Yes. Your analysis is correct. >> >> Alan > Very interesting. > I replaced the ASSERT line temporarily: > > --- ../src_r194478-UNTOUCHED/sys/cddl/dev/dtrace/amd64/ > dtrace_isa.c 2009-06-19 13:10:05.661079736 +0200 > +++ sys/cddl/dev/dtrace/amd64/dtrace_isa.c 2009-06-19 > 19:24:42.362125129 +0200 > @@ -524,7 +524,7 @@ > static int > dtrace_copycheck(uintptr_t uaddr, uintptr_t kaddr, size_t size) > { > - ASSERT(kaddr >= kernelbase && kaddr + size >= kaddr); > + ASSERT(kaddr >= 0xffffff8000000000 && kaddr + size >= kaddr); > > if (uaddr + size >= kernelbase || uaddr + size < uaddr) { > DTRACE_CPUFLAG_SET(CPU_DTRACE_BADADDR); > > ... and it works! I obviously haven't tried it for extended periods > or anything, but at least it's working so far. > Should the ASSERT simply use this (as a #define somewhere) or the > INKERNEL macro, though? BTW... Should "kernelbase" in the line following the ASSERT also be replaced, or not? As far as I can understand (not too far in these contexts ;) it (should) check/s to see whether the userspace data, to be copied, is inside the kernel *map*(?)... which at the moment, I guess it doesn't. Correct? Regards, Thomas