From: Andrew Gallatin
Date: Tue, 21 Jun 2005 16:00:21 -0400 (EDT)
To: John Baldwin
Cc: amd64@FreeBSD.org, current@FreeBSD.org, Kris Kennaway
Subject: Re: Fatal trap 12 in exec_copyout_strings()
Message-ID: <17080.29141.918333.170950@grasshopper.cs.duke.edu>
In-Reply-To: <200506171434.49008.jhb@FreeBSD.org>
References: <20050510223636.GA49927@xor.obsecurity.org> <20050529175056.GA99318@xor.obsecurity.org> <200506171434.49008.jhb@FreeBSD.org>

John Baldwin writes:
> On Sunday 29 May 2005 01:50 pm, Kris Kennaway wrote:
> > On Tue, May 10, 2005 at 03:36:36PM -0700, Kris Kennaway wrote:
> > > Got this on a dual amd64 with 8GB RAM running 6.0 from last week:
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 1; apic id = 01
> > > fault virtual address = 0xffffffffa9cdc000
> > > fault code = supervisor read, page not present
> > > instruction pointer = 0x8:0xffffffff8037759f
> > > stack pointer = 0x10:0xffffffffba1637d0
> > > frame pointer = 0x10:0xffffffffba163820
> > > code segment = base 0x0, limit 0xfffff, type 0x1b
> > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > processor eflags = interrupt enabled, resume, IOPL = 0
> > > current process = 52247 (sh)
> > > [thread pid 52247 tid 100149 ]
> > > Stopped at exec_copyout_strings+0x12f:
> > > db> wh
> > > Tracing pid 52247 tid 100149 td 0xffffff016e5724c0
> > > exec_copyout_strings() at exec_copyout_strings+0x12f
> > > do_execve() at do_execve+0x39a
> > > kern_execve() at kern_execve+0xab
> > > execve() at execve+0x49
> > > syscall() at syscall+0x382
> > > Xfast_syscall() at Xfast_syscall+0xa8
> > > --- syscall (59, FreeBSD ELF64, execve), rip = 0x80090622c, rsp =
> > > 0x7fffffffe058, rbp = 0xffffffff --- db>
> >
> > I've got this panic twice more since.
>
> Do you have a kernel.debug? Can you do 'list *exec_copyout_strings+0x12f'? I
> think I've seen reports of the linux32_exec_copyout_strings() having a
> similar fault as well on amd64.

I just got this on my freshly installed UP, 512MB athlon64. For me, it's
100% reproducible when running a cross-compiler built on FreeBSD-4.

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0xffffffff90ba3000
fault code = supervisor read, page not present
instruction pointer = 0x8:0xffffffff80412c20
stack pointer = 0x10:0xffffffff9666b730
frame pointer = 0x10:0xffffffff9666b770
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 2095 (lanai-gcc)
[thread pid 2095 tid 100077 ]
Stopped at ia32_copyout_strings+0x110:

Even after compiling with -O -pipe -g, GDB's stack trace is not so great:

#0  doadump () at pcpu.h:172
#1  0xffffffff801877a3 in db_fncall (dummy1=0, dummy2=0, dummy3=0, dummy4=0x0)
    at ../../../ddb/db_command.c:531
#2  0xffffffff801874f0 in db_command (last_cmdp=0xffffffff805caee8,
    cmd_table=0x0, aux_cmd_tablep=0xffffffff80475ab0,
    aux_cmd_tablep_end=0xffffffff80475ab8) at ../../../ddb/db_command.c:349
#3  0xffffffff80187617 in db_command_loop () at ../../../ddb/db_command.c:455
#4  0xffffffff801897a5 in db_trap (type=-1771653920, code=0)
    at ../../../ddb/db_main.c:221
#5  0xffffffff802a87fd in kdb_trap (type=12, code=0, tf=0xffffffff9666b680)
    at ../../../kern/subr_kdb.c:471
#6  0xffffffff803df595 in trap_fatal (frame=0xffffffff9666b680,
    eva=18446744071842705408) at ../../../amd64/amd64/trap.c:650
#7  0xffffffff803df232 in trap_pfault (frame=0xffffffff9666b680, usermode=0)
    at ../../../amd64/amd64/trap.c:578
#8  0xffffffff803dee6a in trap (frame=
      {tf_rdi = 4294956412, tf_rsi = 4294956816, tf_rdx = -1771651824,
       tf_rcx = -1771651824, tf_r8 = 64, tf_r9 = 0, tf_rax = 0,
       tf_rbx = -1866846208, tf_rbp = -1771653264, tf_r10 = -1099156726496,
       tf_r11 = 264314796, tf_r12 = 4294956816, tf_r13 = 4294956416,
       tf_r14 = 23, tf_r15 = 46, tf_trapno = 12, tf_addr = -1866846208,
       tf_flags = -2143426697, tf_err = 0, tf_rip = -2143212512, tf_cs = 8,
       tf_rflags = 66118, tf_rsp = -1771653312, tf_ss = 16})
    at ../../../amd64/amd64/trap.c:357
#9  0xffffffff803cdb8b in calltrap () at ../../../amd64/amd64/exception.S:172
#10 0x00000000ffffd57c in ?? ()
#11 0x00000000ffffd710 in ?? ()
#12 0xffffffff9666bd10 in ?? ()
#13 0xffffffff9666bd10 in ?? ()
#14 0x0000000000000040 in ?? ()
#15 0x0000000000000000 in ?? ()
#16 0x0000000000000000 in ?? ()
#17 0xffffffff90ba3000 in ?? ()
#18 0xffffffff9666b770 in ?? ()
#19 0xffffff0015275d20 in ?? ()
#20 0x000000000fc11fac in ?? ()
#21 0x00000000ffffd710 in ?? ()
#22 0x00000000ffffd580 in ?? ()
#23 0x0000000000000017 in ?? ()
#24 0x000000000000002e in ?? ()
#25 0x000000000000000c in ?? ()
#26 0xffffffff90ba3000 in ?? ()
#27 0xffffffff803de777 in suword32 () at ../../../amd64/amd64/support.S:452
#28 0x0000000000000000 in ?? ()
#29 0xffffffff80412c20 in ia32_copyout_strings (imgp=0xffffffff90ba3000)
    at ../../../compat/ia32/ia32_sysvec.c:245
#30 0xffffffff8026b9c2 in do_execve (td=0xffffff00186ca980, args=0x0,
    mac_p=0x0) at ../../../kern/kern_exec.c:452
#31 0xffffffff8026b52e in kern_execve (td=0xffffff00186ca980,
    args=0xffffffff9666bb10, mac_p=0x0) at ../../../kern/kern_exec.c:250
#32 0xffffffff80411859 in freebsd32_execve (td=0xffffff00186ca980, uap=0x0)
    at ../../../compat/freebsd32/freebsd32_misc.c:321
#33 0xffffffff80411047 in ia32_syscall (frame=
      {tf_rdi = 0, tf_rsi = 0, tf_rdx = 1, tf_rcx = 134563369, tf_r8 = 0,
       tf_r9 = 0, tf_rax = 59, tf_rbx = 672188772, tf_rbp = 4294955352,
       tf_r10 = 0, tf_r11 = 0, tf_r12 = 0, tf_r13 = 0, tf_r14 = 0,
       tf_r15 = 0, tf_trapno = 0, tf_addr = 0, tf_flags = 12, tf_err = 2,
       tf_rip = 671876628, tf_cs = 27, tf_rflags = 647,
       tf_rsp = 4294955308, tf_ss = 35})
    at ../../../amd64/ia32/ia32_syscall.c:186
#34 0xffffffff803cddfd in Xint0x80_syscall () at ia32_exception.S:64

The line number gdb says the crash happened on does not correspond to
what ddb says.

(kgdb) frame 29
#29 0xffffffff80412c20 in ia32_copyout_strings (imgp=0xffffffff90ba3000)
    at ../../../compat/ia32/ia32_sysvec.c:245
245                     suword32(vectp++, (u_int32_t)(intptr_t)destp);

vs:

(kgdb) l *0xffffffff80412c20
0xffffffff80412c20 is in ia32_copyout_strings (../../../compat/ia32/ia32_sysvec.c:247).
242              * Fill in argument portion of vector table.
243              */
244             for (; argc > 0; --argc) {
245                     suword32(vectp++, (u_int32_t)(intptr_t)destp);
246                     while (*stringp++ != 0)
247                             destp++;
248                     destp++;
249             }
250
251             /* a null vector table pointer separates the argp's from the envp's */

The deref of stringp makes sense, as it corresponds to the faulting
address reported in ddb:

(kgdb) p stringp
$27 = 0xffffffff90ba3000
(kgdb) p *imgp->args
$33 = {
  buf = 0xffffffff90ba3000,
  begin_argv = 0xffffffff90ba3000,
  begin_envv = 0xffffffff90ba313d,
  endp = 0xffffffff90ba389f,
  fname = 0xffffffff90be3000 "/home/gallatin/lanaitools/intel_FreeBSD/lib/gcc-lib/lanai/2.95.2..1.6/cc1",
  stringspace = 259937,
  argc = 23,
  envc = 46
}

I'm puzzled. fname seems to be buf+ARG_MAX, so it's not like something
randomly scribbled on this memory. In the debugger, the memory just
below buf+ARG_MAX seems to be unmapped. But we've done copyins in
freebsd32_exec_copyin_args(), otherwise endp would not have been
advanced. So we've written to this memory. It is almost like somebody
freed buf through buf + 262144.

Drew
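
The numbers above can be cross-checked mechanically. What follows is only a
minimal sketch, not kernel code: it takes the values printed by ddb and kgdb
in this message, reinterprets the signed decimals from the trap frame as
64-bit addresses, and checks that the args bookkeeping adds up to the 262144
bytes between buf and fname. The names as_address(), check(), check_report()
and REGION_SIZE are made up for this illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the kgdb output in this message. */
#define BUF         UINT64_C(0xffffffff90ba3000)
#define ENDP        UINT64_C(0xffffffff90ba389f)
#define FNAME       UINT64_C(0xffffffff90be3000)
#define STRINGSPACE UINT64_C(259937)
/* Hypothetical name for the "buf + 262144" distance mentioned in the text. */
#define REGION_SIZE UINT64_C(262144)

/* Reinterpret a signed decimal, as gdb prints trap-frame fields, as an address. */
uint64_t
as_address(int64_t v)
{
	return (uint64_t)v;
}

/* Print one comparison; return 1 on mismatch, 0 on match. */
int
check(const char *what, uint64_t got, uint64_t want)
{
	printf("%-28s 0x%016" PRIx64 " %s\n", what, got,
	    got == want ? "ok" : "MISMATCH");
	return got != want;
}

/* Run every comparison and return the number of mismatches. */
int
check_report(void)
{
	int bad = 0;

	/* Fault address: tf_rbx, tf_addr and eva should all equal buf/stringp. */
	bad += check("tf_rbx / tf_addr", as_address(INT64_C(-1866846208)), BUF);
	bad += check("eva", UINT64_C(18446744071842705408), BUF);
	/* tf_rip should be the instruction pointer ddb reported. */
	bad += check("tf_rip", as_address(INT64_C(-2143212512)),
	    UINT64_C(0xffffffff80412c20));
	/* fname sits exactly one region past buf. */
	bad += check("fname - buf", FNAME - BUF, REGION_SIZE);
	/* Bytes consumed so far plus bytes still free should fill the region. */
	bad += check("(endp - buf) + stringspace",
	    (ENDP - BUF) + STRINGSPACE, REGION_SIZE);
	return bad;
}
```

Calling check_report() from a small main() prints one line per comparison and
returns the number of mismatches. A result of 0 would mean the fault address
is exactly the start of buf and that endp and stringspace are consistent with
each other, which fits the reading that the strings were copied in and the
mapping went away afterwards.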