From: Rasmus Skaarup
To: John Baldwin
Cc: freebsd-gnats-submit@freebsd.org, freebsd-amd64@freebsd.org
Subject: Re: amd64/175091: Crash: Fatal trap 12: page fault while in kernel mode
Date: Mon, 7 Jan 2013 19:11:48 +0100
In-Reply-To: <201301070932.37097.jhb@freebsd.org>
References: <201301070805.r0785IeP031201@red.freebsd.org> <201301070932.37097.jhb@freebsd.org>
List-Id: Porting FreeBSD to the AMD64 platform

Thank you for the quick response. I enabled the setting in rc.conf as you
mentioned, and the machine has crashed twice since.
The two dumps are uploaded here:

http://gal.dk/crash0.tar.gz
http://gal.dk/crash1.tar.gz

gdb output for the original error:

(gdb) l *vm_fault_hold+0x1b13
0xffffffff80b41133 is in vm_fault_hold (/usr/src/sys/vm/vm_fault.c:936).
931              * because pmap_enter() may sleep.  We don't put the page
932              * back on the active queue until later so that the pageout daemon
933              * won't find it (yet).
934              */
935             pmap_enter(fs.map->pmap, vaddr, fault_type, fs.m, prot, wired);
936             if ((fault_flags & VM_FAULT_CHANGE_WIRING) == 0 && wired == 0)
937                     vm_fault_prefault(fs.map->pmap, vaddr, fs.entry);
938             VM_OBJECT_LOCK(fs.object);
939             vm_page_lock(fs.m);
940
(gdb)

The two other crashes had different excuses. Here is the first:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 3; apic id = 03
instruction pointer     = 0x20:0xffffffff81612ace
stack pointer           = 0x28:0xffffff816230a4c0
frame pointer           = 0x28:0xffffff816230a4e0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 18778 (imapd)
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 18778 (imapd)
trap number             = 9
panic: general protection fault
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff809208a6 at kdb_backtrace+0x66
#1 0xffffffff808ea8be at panic+0x1ce
#2 0xffffffff80bd8240 at trap_fatal+0x290
#3 0xffffffff80bd88d5 at trap+0x105
#4 0xffffffff80bc315f at calltrap+0x8
#5 0xffffffff8164915d at dnode_free_range+0x29d
#6 0xffffffff81639d5f at dmu_free_long_range_impl+0x13f
#7 0xffffffff81639f9c at dmu_free_long_range+0x4c
#8 0xffffffff816a7839 at zfs_rmnode+0x69
#9 0xffffffff816be9b6 at zfs_inactive+0x66
#10 0xffffffff816beb7a at zfs_freebsd_inactive+0x1a
#11 0xffffffff8097f61d at vinactive+0x8d
#12 0xffffffff80982de8 at vputx+0x2d8
#13 0xffffffff80986f4f at kern_unlinkat+0x1df
#14 0xffffffff80bd7ae6 at amd64_syscall+0x546
#15 0xffffffff80bc3447 at Xfast_syscall+0xf7
Uptime: 1h17m3s

(gdb) l *trap_fatal+0x290
0xffffffff80bd8240 is in trap_fatal (/usr/src/sys/amd64/amd64/trap.c:852).
847                     printf("Idle\n");
848             }
849
850     #ifdef KDB
851             if (debugger_on_panic || kdb_active)
852                     if (kdb_trap(type, 0, frame))
853                             return;
854     #endif
855             printf("trap number             = %d\n", type);
856             if (type <= MAX_TRAP_MSG)
(gdb)

The other new crash:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80bcf1fb
stack pointer           = 0x28:0xffffff8161fef950
frame pointer           = 0x28:0xffffff8161fef990
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 75787 (httpd)
trap number             = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff809208a6 at kdb_backtrace+0x66
#1 0xffffffff808ea8be at panic+0x1ce
#2 0xffffffff80bd8240 at trap_fatal+0x290
#3 0xffffffff80bd857d at trap_pfault+0x1ed
#4 0xffffffff80bd8b9e at trap+0x3ce
#5 0xffffffff80bc315f at calltrap+0x8
#6 0xffffffff80bcf290 at pmap_is_modified+0x40
#7 0xffffffff80b52f7e at vm_page_dontneed+0x17e
#8 0xffffffff80b4f0cd at vm_object_madvise+0x4dd
#9 0xffffffff80b49beb at vm_map_madvise+0x1bb
#10 0xffffffff80b4bff1 at sys_madvise+0x91
#11 0xffffffff80bd7ae6 at amd64_syscall+0x546
#12 0xffffffff80bc3447 at Xfast_syscall+0xf7
Uptime: 5h5m23s

(gdb) l *pmap_is_modified+0x40
0xffffffff80bcf290 is in pmap_is_modified (/usr/src/sys/amd64/amd64/pmap.c:4264).
4259            VM_OBJECT_LOCK_ASSERT(m->object, MA_OWNED);
4260            if ((m->oflags & VPO_BUSY) == 0 &&
4261                (m->aflags & PGA_WRITEABLE) == 0)
4262                    return (FALSE);
4263            rw_wlock(&pvh_global_lock);
4264            rv = pmap_is_modified_pvh(&m->md) ||
4265                ((m->flags & PG_FICTITIOUS) == 0 &&
4266                pmap_is_modified_pvh(pa_to_pvh(VM_PAGE_TO_PHYS(m))));
4267            rw_wunlock(&pvh_global_lock);
4268            return (rv);
(gdb) l *trap_pfault+0x1ed
0xffffffff80bd857d is in trap_pfault (/usr/src/sys/amd64/amd64/trap.c:773).
768                     if (td->td_intr_nesting_level == 0 &&
769                         PCPU_GET(curpcb)->pcb_onfault != NULL) {
770                             frame->tf_rip = (long)PCPU_GET(curpcb)->pcb_onfault;
771                             return (0);
772                     }
773                     trap_fatal(frame, eva);
774                     return (-1);
775             }
776
777             return((rv == KERN_PROTECTION_FAILURE) ? SIGBUS : SIGSEGV);
(gdb)

(Not sure I'm gdb'ing what you need, but let me know.)

I am beginning to suspect the hardware, but the strange thing is that the
host (CentOS 6.3) and the other virtual machine work completely fine. And
the other virtual machine has plenty of users on it.

Best regards,
Rasmus Skaarup

On 07/01/2013, at 15.32, John Baldwin wrote:

> On Monday, January 07, 2013 03:05:18 AM Rasmus Skaarup wrote:
>>> Number:         175091
>>> Category:       amd64
>>> Synopsis:       Crash: Fatal trap 12: page fault while in kernel mode
>>> Confidential:   no
>>> Severity:       non-critical
>>> Priority:       low
>>> Responsible:    freebsd-amd64
>>> State:          open
>>> Quarter:
>>> Keywords:
>>> Date-Required:
>>> Class:          sw-bug
>>> Submitter-Id:   current-users
>>> Arrival-Date:   Mon Jan 07 08:10:01 UTC 2013
>>> Closed-Date:
>>> Last-Modified:
>>> Originator:     Rasmus Skaarup
>>> Release:        9.1-RELEASE
>>> Organization:
>>
>>> Environment:
>> FreeBSD thirdhost 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec  4
>> 09:23:10 UTC 2012
>> root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
>>
>>> Description:
>> One of my virtualized FreeBSD machines has been panic'ing two times within
>> the last two weeks. After the first panic I ran freebsd-update and
>> upgraded to 9.1-RELEASE successfully. Today the machine panic'ed again.
>>
>> I have another virtualized FreeBSD machine running on the same host, and it
>> does not exhibit this behaviour.
>>
>> Here is the output from dmesg, after reboot:
>>
>> ****
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 2; apic id = 02
>> fault virtual address   = 0x48
>> fault code              = supervisor read data, page not present
>> instruction pointer     = 0x20:0xffffffff80bd5139
>> stack pointer           = 0x28:0xffffff81625536c0
>> frame pointer           = 0x28:0xffffff8162553750
>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags        = interrupt enabled, resume, IOPL = 0
>> current process         = 62083 (httpd)
>> trap number             = 12
>> panic: page fault
>> cpuid = 2
>> KDB: stack backtrace:
>> #0 0xffffffff809208a6 at kdb_backtrace+0x66
>> #1 0xffffffff808ea8be at panic+0x1ce
>> #2 0xffffffff80bd8240 at trap_fatal+0x290
>> #3 0xffffffff80bd857d at trap_pfault+0x1ed
>> #4 0xffffffff80bd8b9e at trap+0x3ce
>> #5 0xffffffff80bc315f at calltrap+0x8
>> #6 0xffffffff80b41133 at vm_fault_hold+0x1b13
>> #7 0xffffffff80b41cc3 at vm_fault+0x73
>> #8 0xffffffff80bd84b4 at trap_pfault+0x124
>> #9 0xffffffff80bd8c6c at trap+0x49c
>> #10 0xffffffff80bc315f at calltrap+0x8
>> Uptime: 13h6m22s
>> *********
>
> Can you enable crashdumps by setting 'dumpdev="AUTO"' in /etc/rc.conf?
>
> Also, can you run 'gdb /boot/kernel/kernel' and then at the prompt run
> 'l *vm_fault_hold+0x1b13' and reply with the output?
>
> --
> John Baldwin
>
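
For anyone following the steps John asks for above, a minimal sketch of the
rc.conf fragment might look like this. Only dumpdev="AUTO" comes from his
message; the dumpdir value and the vmcore.0 file name are assumptions based
on FreeBSD defaults and are not stated anywhere in this thread.

```sh
# /etc/rc.conf fragment (rc.conf is read by sh, so these are plain assignments)
dumpdev="AUTO"        # from John's message: let the system pick a dump device
dumpdir="/var/crash"  # assumption: default directory where savecore(8) stores dumps

# After the next panic and reboot, the saved dump can be opened with kgdb(1).
# Assumption: this is the first dump on the machine, so the file is vmcore.0.
#   kgdb /boot/kernel/kernel /var/crash/vmcore.0
#   (kgdb) bt
#   (kgdb) l *vm_fault_hold+0x1b13
```

Opening the dump itself, rather than only the kernel image, should give a
full backtrace with arguments and locals, which may be more useful than the
source listings alone.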