From owner-freebsd-amd64@FreeBSD.ORG  Mon Jan  7 18:18:34 2013
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 2ADFC7AC
 for <freebsd-amd64@freebsd.org>; Mon,  7 Jan 2013 18:18:34 +0000 (UTC)
 (envelope-from freebsd@gal.dk)
Received: from denene.dvconsulting.dk (denene.dvconsulting.dk
 [195.234.155.141]) by mx1.freebsd.org (Postfix) with SMTP id 66D39B0A
 for <freebsd-amd64@freebsd.org>; Mon,  7 Jan 2013 18:18:32 +0000 (UTC)
Received: (qmail 92073 invoked from network); 7 Jan 2013 19:11:50 +0100
Received: from localhost (HELO denene.dvconsulting.dk) (127.0.0.1)
 by denene.dvconsulting.dk (qpsmtpd/0.84) with SMTP;
 Mon, 07 Jan 2013 19:11:50 +0100
Received: (qmail 92066 invoked by uid 1114); 7 Jan 2013 19:11:50 +0100
Received: from [85.233.252.158] (HELO [10.0.1.96]) (85.233.252.158)
 (smtp-auth username mail@dvconsulting.dk, mechanism cram-md5)
 by denene.dvconsulting.dk (qpsmtpd/0.84) with (AES128-SHA encrypted) ESMTPSA;
 Mon, 07 Jan 2013 19:11:50 +0100
Content-Type: text/plain; charset=iso-8859-1
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
Subject: Re: amd64/175091: Crash: Fatal trap 12: page fault while in kernel
 mode
From: Rasmus Skaarup <freebsd@gal.dk>
In-Reply-To: <201301070932.37097.jhb@freebsd.org>
Date: Mon, 7 Jan 2013 19:11:48 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <AB7C0328-3764-480E-ACB0-FEAECAD9E200@gal.dk>
References: <201301070805.r0785IeP031201@red.freebsd.org>
 <201301070932.37097.jhb@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
X-Mailer: Apple Mail (2.1499)
X-Mailman-Approved-At: Mon, 07 Jan 2013 19:16:51 +0000
Cc: freebsd-gnats-submit@freebsd.org, freebsd-amd64@freebsd.org
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-amd64>,
 <mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
 <mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 Jan 2013 18:18:34 -0000


Thank you for the quick response. I enabled the setting in rc.conf as
you mentioned, and the machine has crashed twice since. The two dumps
are uploaded here:

http://gal.dk/crash0.tar.gz
http://gal.dk/crash1.tar.gz

gdb output for the original error:

(gdb) l *vm_fault_hold+0x1b13
0xffffffff80b41133 is in vm_fault_hold (/usr/src/sys/vm/vm_fault.c:936).
931		 * because pmap_enter() may sleep.  We don't put the page
932		 * back on the active queue until later so that the pageout daemon
933		 * won't find it (yet).
934		 */
935		pmap_enter(fs.map->pmap, vaddr, fault_type, fs.m, prot, wired);
936		if ((fault_flags & VM_FAULT_CHANGE_WIRING) == 0 && wired == 0)
937			vm_fault_prefault(fs.map->pmap, vaddr, fs.entry);
938		VM_OBJECT_LOCK(fs.object);
939		vm_page_lock(fs.m);
940
(gdb)


The two new crashes had different symptoms. Here is the first:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 3; apic id = 03
instruction pointer     = 0x20:0xffffffff81612ace
stack pointer           = 0x28:0xffffff816230a4c0
frame pointer           = 0x28:0xffffff816230a4e0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 18778 (imapd)
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 18778 (imapd)
trap number             = 9
panic: general protection fault
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff809208a6 at kdb_backtrace+0x66
#1 0xffffffff808ea8be at panic+0x1ce
#2 0xffffffff80bd8240 at trap_fatal+0x290
#3 0xffffffff80bd88d5 at trap+0x105
#4 0xffffffff80bc315f at calltrap+0x8
#5 0xffffffff8164915d at dnode_free_range+0x29d
#6 0xffffffff81639d5f at dmu_free_long_range_impl+0x13f
#7 0xffffffff81639f9c at dmu_free_long_range+0x4c
#8 0xffffffff816a7839 at zfs_rmnode+0x69
#9 0xffffffff816be9b6 at zfs_inactive+0x66
#10 0xffffffff816beb7a at zfs_freebsd_inactive+0x1a
#11 0xffffffff8097f61d at vinactive+0x8d
#12 0xffffffff80982de8 at vputx+0x2d8
#13 0xffffffff80986f4f at kern_unlinkat+0x1df
#14 0xffffffff80bd7ae6 at amd64_syscall+0x546
#15 0xffffffff80bc3447 at Xfast_syscall+0xf7
Uptime: 1h17m3s

(gdb) l *trap_fatal+0x290
0xffffffff80bd8240 is in trap_fatal (/usr/src/sys/amd64/amd64/trap.c:852).
847			printf("Idle\n");
848		}
849
850	#ifdef KDB
851		if (debugger_on_panic || kdb_active)
852			if (kdb_trap(type, 0, frame))
853				return;
854	#endif
855		printf("trap number		= %d\n", type);
856		if (type <= MAX_TRAP_MSG)
(gdb)


The other new crash:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80bcf1fb
stack pointer           = 0x28:0xffffff8161fef950
frame pointer           = 0x28:0xffffff8161fef990
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 75787 (httpd)
trap number             = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff809208a6 at kdb_backtrace+0x66
#1 0xffffffff808ea8be at panic+0x1ce
#2 0xffffffff80bd8240 at trap_fatal+0x290
#3 0xffffffff80bd857d at trap_pfault+0x1ed
#4 0xffffffff80bd8b9e at trap+0x3ce
#5 0xffffffff80bc315f at calltrap+0x8
#6 0xffffffff80bcf290 at pmap_is_modified+0x40
#7 0xffffffff80b52f7e at vm_page_dontneed+0x17e
#8 0xffffffff80b4f0cd at vm_object_madvise+0x4dd
#9 0xffffffff80b49beb at vm_map_madvise+0x1bb
#10 0xffffffff80b4bff1 at sys_madvise+0x91
#11 0xffffffff80bd7ae6 at amd64_syscall+0x546
#12 0xffffffff80bc3447 at Xfast_syscall+0xf7
Uptime: 5h5m23s

(gdb) l *pmap_is_modified+0x40
0xffffffff80bcf290 is in pmap_is_modified (/usr/src/sys/amd64/amd64/pmap.c:4264).
4259		VM_OBJECT_LOCK_ASSERT(m->object, MA_OWNED);
4260		if ((m->oflags & VPO_BUSY) == 0 &&
4261		    (m->aflags & PGA_WRITEABLE) == 0)
4262			return (FALSE);
4263		rw_wlock(&pvh_global_lock);
4264		rv = pmap_is_modified_pvh(&m->md) ||
4265		    ((m->flags & PG_FICTITIOUS) == 0 &&
4266		    pmap_is_modified_pvh(pa_to_pvh(VM_PAGE_TO_PHYS(m))));
4267		rw_wunlock(&pvh_global_lock);
4268		return (rv);
(gdb) l *trap_pfault+0x1ed
0xffffffff80bd857d is in trap_pfault (/usr/src/sys/amd64/amd64/trap.c:773).
768			if (td->td_intr_nesting_level == 0 &&
769			    PCPU_GET(curpcb)->pcb_onfault != NULL) {
770				frame->tf_rip = (long)PCPU_GET(curpcb)->pcb_onfault;
771				return (0);
772			}
773			trap_fatal(frame, eva);
774			return (-1);
775		}
776
777		return((rv == KERN_PROTECTION_FAILURE) ? SIGBUS : SIGSEGV);
(gdb)

(not sure I'm gdb'ing what you need, but let me know).
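
For reference, the 'l *symbol+offset' lookups above can be generated
mechanically from a "KDB: stack backtrace" listing. The following is only
a minimal sketch in Python; the function name gdb_list_commands and the
default list of skipped frames are assumptions made for illustration and
are not part of gdb or any FreeBSD tool.

```python
import re

# One frame of a FreeBSD "KDB: stack backtrace", for example:
#   #6 0xffffffff80b41133 at vm_fault_hold+0x1b13
# Leading mail quote markers ("> " or ">> ") are tolerated.
FRAME_RE = re.compile(
    r"^[>\s]*#\d+\s+0x[0-9a-fA-F]+\s+at\s+(\w+)\+0x([0-9a-fA-F]+)\s*$"
)


def gdb_list_commands(backtrace, skip=("kdb_backtrace", "panic")):
    """Return one gdb 'l *symbol+0xoffset' command per backtrace frame.

    Frames whose symbol is in `skip` are left out, since they belong to
    the panic reporting path rather than to the fault itself.
    """
    commands = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match is None:
            continue  # header, "Uptime:" line, blank line, etc.
        symbol, offset = match.group(1), match.group(2)
        if symbol in skip:
            continue
        commands.append(f"l *{symbol}+0x{offset}")
    return commands


if __name__ == "__main__":
    sample = """\
KDB: stack backtrace:
#0 0xffffffff809208a6 at kdb_backtrace+0x66
#1 0xffffffff808ea8be at panic+0x1ce
#5 0xffffffff80bc315f at calltrap+0x8
#6 0xffffffff80bcf290 at pmap_is_modified+0x40
#7 0xffffffff80b52f7e at vm_page_dontneed+0x17e
Uptime: 5h5m23s
"""
    for command in gdb_list_commands(sample):
        print(command)
```

The resulting commands can be pasted at the (gdb) prompt after starting
'gdb /boot/kernel/kernel', one frame at a time.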


I am beginning to suspect the hardware, but the strange thing is that
the host (CentOS 6.3) and the other virtual machine work completely
fine. And the other virtual machine has plenty of users on it.


Best regards,
Rasmus Skaarup


On 07/01/2013, at 15.32, John Baldwin <jhb@freebsd.org> wrote:

> On Monday, January 07, 2013 03:05:18 AM Rasmus Skaarup wrote:
>>> Number:         175091
>>> Category:       amd64
>>> Synopsis:       Crash: Fatal trap 12: page fault while in kernel mode
>>> Confidential:   no
>>> Severity:       non-critical
>>> Priority:       low
>>> Responsible:    freebsd-amd64
>>> State:          open
>>> Quarter:
>>> Keywords:
>>> Date-Required:
>>> Class:          sw-bug
>>> Submitter-Id:   current-users
>>> Arrival-Date:   Mon Jan 07 08:10:01 UTC 2013
>>> Closed-Date:
>>> Last-Modified:
>>> Originator:     Rasmus Skaarup
>>> Release:        9.1-RELEASE
>>> Organization:
>>
>>> Environment:
>> FreeBSD thirdhost 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec  4
>> 09:23:10 UTC 2012
>> root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
>>
>>> Description:
>> One of my virtualized FreeBSD machines has been panic'ing two times within
>> the last two weeks. After the first panic I ran freebsd-update and
>> upgraded to 9.1-RELEASE successfully. Today the machine panic'ed again.
>>
>> I have another virtualized FreeBSD machine running on the same host, and it
>> does not exhibit this behaviour.
>>
>> Here is the output from dmesg, after reboot:
>>
>> ****
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 2; apic id = 02
>> fault virtual address   = 0x48
>> fault code              = supervisor read data, page not present
>> instruction pointer     = 0x20:0xffffffff80bd5139
>> stack pointer           = 0x28:0xffffff81625536c0
>> frame pointer           = 0x28:0xffffff8162553750
>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>                        = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags        = interrupt enabled, resume, IOPL = 0
>> current process         = 62083 (httpd)
>> trap number             = 12
>> panic: page fault
>> cpuid = 2
>> KDB: stack backtrace:
>> #0 0xffffffff809208a6 at kdb_backtrace+0x66
>> #1 0xffffffff808ea8be at panic+0x1ce
>> #2 0xffffffff80bd8240 at trap_fatal+0x290
>> #3 0xffffffff80bd857d at trap_pfault+0x1ed
>> #4 0xffffffff80bd8b9e at trap+0x3ce
>> #5 0xffffffff80bc315f at calltrap+0x8
>> #6 0xffffffff80b41133 at vm_fault_hold+0x1b13
>> #7 0xffffffff80b41cc3 at vm_fault+0x73
>> #8 0xffffffff80bd84b4 at trap_pfault+0x124
>> #9 0xffffffff80bd8c6c at trap+0x49c
>> #10 0xffffffff80bc315f at calltrap+0x8
>> Uptime: 13h6m22s
>> *********
>
> Can you enable crashdumps by setting 'dumpdev="AUTO"' in /etc/rc.conf?
>
> Also, can you run 'gdb /boot/kernel/kernel' and then at the prompt run
> 'l *vm_fault_hold+0x1b13' and reply with the output?
>
> -- 
> John Baldwin
>