Date: Tue, 13 Feb 2007 14:10:30 -0500
From: Kris Kennaway
To: Kostik Belousov
Cc: amd64@freebsd.org, current@freebsd.org, Kris Kennaway
Subject: Re: Page fault in amd64 pmap_qremove from vm_thread_new()
Message-ID: <20070213191030.GA68059@xor.obsecurity.org>
References: <20070213185312.GF67616@xor.obsecurity.org> <20070213190222.GE25802@deviant.kiev.zoral.com.ua>
In-Reply-To: <20070213190222.GE25802@deviant.kiev.zoral.com.ua>
List-Id: Discussions about the use of FreeBSD-current

On Tue, Feb 13, 2007 at 09:02:23PM +0200, Kostik Belousov wrote:
> On Tue, Feb 13, 2007 at 01:53:12PM -0500, Kris Kennaway wrote:
> > I get this frequently when
> > running stress2 on an 8-core amd64 system:
> >
> > Fatal trap 12: page fault while in kernel mode
> > Fatal trap 12: page fault while in kernel mode
> >
> >
> > cpuid = 2;
> >
> >
> > apic id = 02
> >
> > Fatal trap 12: page fault while in kernel mode
> >
> > cpuid = 5; fault virtual address = 0xffff807ffffff040
> > Fatal trap 12: page fault while in kernel mode
> > Fatal trap 12: page fault while in kernel mode
> >
> > cpuid = 4; apic id = 05
> > apic id = 04
> > fault virtual address = 0xffff807ffffff0e0
> > fault virtual address = 0xffff807ffffff0b8
> > cpuid = 0; fault code = supervisor write data, page not present
> >
> > instruction pointer = 0x8:0xffffffff803deedd
> > cpuid = 3; stack pointer = 0x10:0xffffffffc7647720
> > fault code = supervisor write data, page not present
> >
> > instruction pointer = 0x8:0xffffffff803deedd
> > apic id = 00
> > stack pointer = 0x10:0xffffffffcfd7e720
> > fault code = supervisor write data, page not present
> > frame pointer = 0x10:0xffffffffc7647730
> > frame pointer = 0x10:0xffffffffcfd7e730
> > Fatal trap 12: page fault while in kernel mode
> >
> > cpuid = 6;
> > instruction pointer = 0x8:0xffffffff803deedd
> >
> > stack pointer = 0x10:0xffffffffb2b93720
> >
> > frame pointer = 0x10:0xffffffffb2b93730
> >
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> >
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> >
> > processor eflags =
> > interrupt enabled,
> > resume, Fatal trap 12: page fault while in kernel mode
> > apic id = 06
> > cpuid = 7; fault virtual address = 0xffff807ffffff108
> > apic id = 07
> > fault code = supervisor write data, page not present
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > apic id = 03
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > fault virtual address = 0xffff807ffffff068
> > IOPL = 0
> > fault code = supervisor write data, page not present
> > fault virtual address = 0xffff807ffffff018
> > instruction pointer = 0x8:0xffffffff803deedd
> > instruction pointer = 0x8:0xffffffff803deedd
> > Fatal trap 12: page fault while in kernel mode
> > stack pointer = 0x10:0xffffffffbf901720
> > cpuid = 4; stack pointer = 0x10:0xffffffffb1c11720
> > processor eflags = frame pointer = 0x10:0xffffffffb1c11730
> > interrupt enabled, resume, fault code = supervisor write data, page not present
> > IOPL = 0
> > instruction pointer = 0x8:0xffffffff803deedd
> > current process = stack pointer = 0x10:0xffffffffd5b25720
> > frame pointer = 0x10:0xffffffffbf901730
> > frame pointer = 0x10:0xffffffffd5b25730
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > current process = = DPL 0, pres 1, long 1, def32 0, gran 1
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > 18747 (thr2)
> > [thread pid 18747 tid 142909 ]
> > Stopped at pmap_qremove+0x2d: movq $0,(%rcx,%rax,8)
> > db> wh
> > Tracing pid 18747 tid 142909 td 0xffffff0095710cd0
> > pmap_qremove() at pmap_qremove+0x2d
> > vm_thread_new() at vm_thread_new+0x8d
> > thread_init() at thread_init+0x16
> > slab_zalloc() at slab_zalloc+0x282
> > uma_zone_slab() at uma_zone_slab+0x1ae
> > uma_zalloc_bucket() at uma_zalloc_bucket+0x19d
> > uma_zalloc_arg() at uma_zalloc_arg+0x3a3
> > thread_alloc() at thread_alloc+0x1f
> > create_thread() at create_thread+0xc5
> > kern_thr_new() at kern_thr_new+0x75
> > thr_new() at thr_new+0x62
> > syscall() at syscall+0x310
> > Xfast_syscall() at Xfast_syscall+0xab
> > --- syscall (455, FreeBSD ELF64, thr_new), rip = 0x8007a1cac, rsp = 0x7fffffffdef8, rbp = 0 ---
> > db> show allpcpu
> > Current CPU: 2
> >
> > cpuid = 0
> > curthread = 0xffffff00717e8290: pid 18944 "thr2"
> > curpcb = 0xffffffffe2e33d50
> > fpcurthread = none
> > idlethread = 0xffffff00b9aa6520: pid 17 "idle: cpu0"
> > spin locks held:
> >
> > cpuid = 1
> > curthread = 0xffffff0015e9d7b0: pid 18736 "thr2"
> > curpcb = 0xffffffffbceefd50
> > fpcurthread = none
> > idlethread = 0xffffff00b9aa6290: pid 16 "idle: cpu1"
> > spin locks held:
> > exclusive spin mutex sio r = 0 (0xffffffff806bf3c0) locked @ dev/sio/sio.c:1390
> >
> > cpuid = 2
> > curthread = 0xffffff0095710cd0: pid 18747 "thr2"
> > curpcb = 0xffffffffcfd7ed50
> > fpcurthread = none
> > idlethread = 0xffffff00b9aa6000: pid 15 "idle: cpu2"
> > spin locks held:
> >
> > cpuid = 3
> > curthread = 0xffffff00ad485290: pid 18743 "thr2"
> > curpcb = 0xffffffffd5b25d50
> > fpcurthread = none
> > idlethread = 0xffffff00b9a63cd0: pid 14 "idle: cpu3"
> > spin locks held:
> >
> > cpuid = 4
> > curthread = 0xffffff0098fc7000: pid 18942 "thr2"
> > curpcb = 0xffffffffc77fad50
> > fpcurthread = none
> > idlethread = 0xffffff00b9a63000: pid 13 "idle: cpu4"
> > spin locks held:
> > exclusive spin mutex turnstile chain r = 0 (0xffffffff80613ed8) locked @ kern/subr_turnstile.c:489
> >
> > cpuid = 5
> > curthread = 0xffffff00215b8cd0: pid 18708 "thr2"
> > curpcb = 0xffffffffb2b93d50
> > fpcurthread = none
> > idlethread = 0xffffff00b9a8fcd0: pid 12 "idle: cpu5"
> > spin locks held:
> >
> > cpuid = 6
> > curthread = 0xffffff005b72d520: pid 18718 "thr2"
> > curpcb = 0xffffffffb1c11d50
> > fpcurthread = none
> > idlethread = 0xffffff00b9a8fa40: pid 11 "idle: cpu6"
> > spin locks held:
> >
> > cpuid = 7
> > curthread = 0xffffff0078aae7b0: pid 18782 "thr2"
> > curpcb = 0xffffffffbf901d50
> > fpcurthread = none
> > idlethread = 0xffffff00b9a8f7b0: pid 10 "idle: cpu7"
> > spin locks held:
> >
> > For some reason ddb doesn't give sensible backtraces for the running threads:
> >
> > db> wh 18944
> > Tracing pid 18944 tid 130433 td 0xffffff009daa7290
> > fork_trampoline() at fork_trampoline
> > db> wh 18736
> > Tracing pid 18736 tid 165977 td 0xffffff00632b2cd0
> > fork_trampoline() at fork_trampoline
> > db> wh 18747
> > Tracing pid 18747 tid 165890 td 0xffffff0037403000
> > fork_trampoline() at fork_trampoline
> > db> wh 18743
> > Tracing pid 18743 tid 165929 td 0xffffff004f59e000
> > fork_trampoline() at fork_trampoline
> > db> wh 18942
> > Tracing pid 18942 tid 130531 td 0xffffff000a166520
> > fork_trampoline() at fork_trampoline
> > db> wh 18708
> > Tracing pid 18708 tid 166269 td 0xffffff005c28a290
> > fork_trampoline() at fork_trampoline
> > db> wh 18718
> > Tracing pid 18718 tid 111088 td 0xffffff0081f51a40
> > fork_trampoline() at fork_trampoline
> > db> wh 18782
> > Tracing pid 18782 tid 166078 td 0xffffff0052b4c000
> > fork_trampoline() at fork_trampoline
>
> Is the backtrace for faulted thread always the same ? And this is CURRENT ?

This is current, I haven't tried to reproduce on 6.x yet (but can do so). The trace through vm_thread_new() is always the same.

Kris