Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 09 Jun 2014 10:30:26 -0500
From:      Alan Cox <alc@rice.edu>
To:        "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>, alc@freebsd.org
Subject:   Re: svn commit: r266850 - in head/sys/arm/xscale: i80321 i8134x ixp425 pxa
Message-ID:  <5395D312.5000302@rice.edu>
In-Reply-To: <20140609042206.GQ31367@funkthat.com>
References:  <CAJ-Vmo=h39AYXhPFBx7dzUe%2BQtksPB8QMaAQcoqoM6UiKZe2XA@mail.gmail.com> <20140529173803.GA5294@ci0.org> <20140530063228.GD43976@funkthat.com> <5388ABF1.3030200@rice.edu> <20140601081153.GU43976@funkthat.com> <53935755.70908@rice.edu> <20140608003944.GK31367@funkthat.com> <53949D96.3060409@rice.edu> <20140608235611.GP31367@funkthat.com> <53950BB9.3090808@rice.edu> <20140609042206.GQ31367@funkthat.com>

next in thread | previous in thread | raw e-mail | index | archive | help
This is a multi-part message in MIME format.
--------------030200060505090800030104
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

On 06/08/2014 23:22, John-Mark Gurney wrote:
> Alan Cox wrote this message on Sun, Jun 08, 2014 at 20:19 -0500:
>> On 06/08/2014 18:56, John-Mark Gurney wrote:
>>> Alan Cox wrote this message on Sun, Jun 08, 2014 at 12:29 -0500:
>>>> On 06/07/2014 19:39, John-Mark Gurney wrote:
>>>>> Alan Cox wrote this message on Sat, Jun 07, 2014 at 13:17 -0500:
>>>>>> On 06/01/2014 03:11, John-Mark Gurney wrote:
>>>>>>> Alan Cox wrote this message on Fri, May 30, 2014 at 11:04 -0500:
>>>>>>>> On 05/30/2014 01:32, John-Mark Gurney wrote:
>>>>>>>>> Olivier Houchard wrote this message on Thu, May 29, 2014 at 19:38 +0200:
>>>>>>>>>> On Thu, May 29, 2014 at 10:19:18AM -0700, Adrian Chadd wrote:
>>>>>>>>>>> On 29 May 2014 10:16, Olivier Houchard <cognet@ci0.org> wrote:
>>>>>>>>>>>> On Thu, May 29, 2014 at 10:14:53AM -0700, Adrian Chadd wrote:
>>>>>>>>>>>>> Have you tested this on xscale hardware?
>>>>>>>>>>>> Yeah, my two last commits were an attempt to get the AVILA kernel to boot
>>>>>>>>>>>> again.
>>>>>>>>>>> Woo! What can I provide to help you do this? :-)
>>>>>>>>>>>
>>>>>>>>>>> (Drinks? Food? Donations?)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> Drinks and food are always appreciated ;)
>>>>>>>>>> It almost boots for me now, except a few userland programs gets SIGSEGV or
>>>>>>>>>> SIGILL along the way, trying to figure out why.
>>>>>>>>> Thanks for fixing ddb... I'm getting panic messages again...  bad
>>>>>>>>> news is that my panic is still around:
>>>>>>>>> panic: vm_page_alloc: page 0xc07e73b0 is wired
>>>>>>>>>
>>>>>>>>> Though, interestingly, it looks like sparc64 has a similar panic:
>>>>>>>>> https://www.freebsd.org/cgi/query-pr.cgi?pr=187080
>>>>>>>>>
>>>>>>>>> kib, Alan, any clue to why this is happening?  Any suggestions as to
>>>>>>>>> help track it down?
>>>>>>>> I'm afraid not.  The dump below shows a perfectly normal, in-use page. 
>>>>>>>> If this page had actually been free prior to the vm_page_alloc() call,
>>>>>>>> then other fields, like dirty, would have been different.  In other
>>>>>>>> words, this isn't just a problem with the wire count.
>>>>>>>>
>>>>>>>> What object is vm_page_alloc() being performed on?
>>>>>>> Is this enough?  Or do you need more?
>>>>>>>
>>>>>>> panic: vm_page_alloc: page 0xc07e73b0 is wired, obj: 0xc1500b40
>>>>>>> KDB: enter: panic
>>>>>>> [ thread pid 781 tid 100051 ]
>>>>>>> Stopped at      kdb_enter+0x40: ldrb    r15, [r15, r15, ror r15]!
>>>>>>> db> show object/f 0xc1500b40
>>>>>>> Object 0xc1500b40: type=2, size=0xa, res=9, ref=0, flags=0x0 ruid -1 charge 0
>>>>>>>  sref=0, backing_object(0)=(0)+0x0
>>>>>>>   memory:=(off=0x0,page=0x8f0000),(off=0x1,page=0x8f1000),(off=0x2,page=0x8ee000),(off=0x3,page=0x8ef000),(off=0x4,page=0x8f3000),(off=0x5,page=0x8f4000)
>>>>>>>    ...(off=0x6,page=0x8fa000),(off=0x7,page=0x8fb000),(off=0x8,page=0x8fc000)
>>>>>>>
>>>>>>> If you need more, let me know what/how to get it, and I will...
>>>>>>>
>>>>>> Anyone who has seen the "wired page" panic, please try the attached
>>>>>> patch.  It introduces some new KASSERT()s that may help me to narrow
>>>>>> down the problem.  I haven't been able to trigger these KASSERT()s on
>>>>>> amd64, but the symptoms that you guys are reporting are consistent with
>>>>>> a bug that would trigger these KASSERT()s.
>>>>> Ok, it triggered the xxx one:
>>>>> Starting sendmail_msp_queue.
>>>>> panic: vm_phys_free_contig: xxx
>>>>> KDB: enter: panic
>>>>> [ thread pid 782 tid 100051 ]
>>>>> Stopped at      kdb_enter+0x40: ldrb    r15, [r15, r15, ror r15]!
>>>>> db> bt
>>>>> Tracing pid 782 tid 100051 td 0xc1470000
>>>>> db_trace_self() at db_trace_self
>>>>>          pc = 0xc0566ec8  lr = 0xc0566f54 (db_trace_thread+0x50)
>>>>>          sp = 0xcd830850  fp = 0xc03db694
>>>>> db_trace_thread() at db_trace_thread+0x50
>>>>>          pc = 0xc0566f54  lr = 0xc022cd14 (db_command_init+0x620)
>>>>>          sp = 0xcd8308b0  fp = 0xc03db694
>>>>> db_command_init() at db_command_init+0x620
>>>>>          pc = 0xc022cd14  lr = 0xc022c3ec (db_skip_to_eol+0x480)
>>>>>          sp = 0xcd8308c8  fp = 0xc03db694
>>>>>          r4 = 0xc0683c30  r5 = 0x00000000
>>>>> db_skip_to_eol() at db_skip_to_eol+0x480
>>>>>          pc = 0xc022c3ec  lr = 0xc022c554 (db_command_loop+0x5c)
>>>>>          sp = 0xcd830968  fp = 0xc03db694
>>>>>          r4 = 0xcd83097c  r5 = 0xc0683efc
>>>>>          r6 = 0x00000000  r7 = 0x00000000
>>>>>          r8 = 0x00000001 r10 = 0x600000d3
>>>>> db_command_loop() at db_command_loop+0x5c
>>>>>          pc = 0xc022c554  lr = 0xc022e99c (X_db_sym_numargs+0xec)
>>>>>          sp = 0xcd830970  fp = 0xc03db694
>>>>> X_db_sym_numargs() at X_db_sym_numargs+0xec
>>>>>          pc = 0xc022e99c  lr = 0xc03db8c4 (kdb_trap+0x94)
>>>>>          sp = 0xcd830a88  fp = 0xc03db694
>>>>>          r4 = 0x00000000
>>>>> kdb_trap() at kdb_trap+0x94
>>>>>          pc = 0xc03db8c4  lr = 0xc0578eb0 (undefinedinstruction+0x2c8)
>>>>>          sp = 0xcd830aa8  fp = 0xc03db694
>>>>>          r4 = 0x00000000  r5 = 0x00000000
>>>>>          r6 = 0x00000000  r7 = 0xcd830b20
>>>>>          r8 = 0xe7ffffff r10 = 0xe7ffffff
>>>>> undefinedinstruction() at undefinedinstruction+0x2c8
>>>>>          pc = 0xc0578eb0  lr = 0xc0568a0c (exception_exit)
>>>>>          sp = 0xcd830b20  fp = 0xc0613e70
>>>>>          r4 = 0xffffffff  r5 = 0xffff1004
>>>>>          r6 = 0xc06d0ebc  r7 = 0xcd830ba4
>>>>>          r8 = 0xc1470000  r9 = 0x00000013
>>>>>         r10 = 0x00000010
>>>>> exception_exit() at exception_exit
>>>>>          pc = 0xc0568a0c  lr = 0xc03db68c (kdb_enter+0x38)
>>>>>          sp = 0xcd830b70  fp = 0xc0613e70
>>>>>          r0 = 0x00000012  r1 = 0x60000013
>>>>>          r2 = 0xc06df2ac  r3 = 0xc06d0ee8
>>>>>          r4 = 0xc05e5258  r5 = 0xc06155e8
>>>>>          r6 = 0xc06d0ebc  r7 = 0xcd830ba4
>>>>>          r8 = 0xc1470000  r9 = 0x00000013
>>>>>         r10 = 0x00000010 r12 = 0xc05e2518
>>>>> kdb_enter() at kdb_enter+0x44
>>>>>          pc = 0xc03db698  lr = 0xc03aa094 (kern_reboot+0x948)
>>>>>          sp = 0xcd830b78  fp = 0xc0613e70
>>>>>          r4 = 0x00000100
>>>>> kern_reboot() at kern_reboot+0x948
>>>>>          pc = 0xc03aa094  lr = 0xc03aa164 (kassert_panic+0x68)
>>>>>          sp = 0xcd830b90  fp = 0xc0613e70
>>>>>          r4 = 0xc06155e8  r5 = 0xc07e74a0
>>>>>          r6 = 0xc07e6fa0  r7 = 0x00000004
>>>>>          r8 = 0x00000010
>>>>> kassert_panic() at kassert_panic+0x68
>>>>>          pc = 0xc03aa164  lr = 0xc055a0a8 (vm_phys_free_contig+0x8c)
>>>>>          sp = 0xcd830bb0  fp = 0xc0613e70
>>>>>          r0 = 0xc06155e8  r1 = 0xc07e6d20
>>>>>          r2 = 0xc06e6a70  r3 = 0x00000000
>>>>>          r4 = 0xc07e73b0
>>>>> vm_phys_free_contig() at vm_phys_free_contig+0x8c
>>>>>          pc = 0xc055a0a8  lr = 0xc055ca70 (vm_reserv_startup+0x4bc)
>>>>>          sp = 0xcd830bd0  fp = 0xc0613e70
>>>>>          r4 = 0xc08fb2cc  r5 = 0x00000008
>>>>>          r6 = 0x000000e8  r7 = 0xc08fb280
>>>>>          r8 = 0x00000005 r10 = 0x00000001
>>>>> vm_reserv_startup() at vm_reserv_startup+0x4bc
>>>>>          pc = 0xc055ca70  lr = 0xc055cb40 (vm_reserv_startup+0x58c)
>>>>>          sp = 0xcd830be8  fp = 0xc0613e70
>>>>>          r4 = 0xc08fb280  r5 = 0x00000000
>>>>>          r6 = 0xc14b7280  r7 = 0x00000040
>>>>>          r8 = 0x00000000
>>>>> vm_reserv_startup() at vm_reserv_startup+0x58c
>>>>>          pc = 0xc055cb40  lr = 0xc055ce08 (vm_reserv_reclaim_inactive+0x34)
>>>>>          sp = 0xcd830bf0  fp = 0xc0613e70
>>>>>          r4 = 0xc06e6550
>>>>> vm_reserv_reclaim_inactive() at vm_reserv_reclaim_inactive+0x34
>>>>>          pc = 0xc055ce08  lr = 0xc0554cb8 (vm_page_alloc+0x280)
>>>>>          sp = 0xcd830bf8  fp = 0xc0613e70
>>>>> vm_page_alloc() at vm_page_alloc+0x280
>>>>>          pc = 0xc0554cb8  lr = 0xc0540eb0 (vm_fault_hold+0x60c)
>>>>>          sp = 0xcd830c30  fp = 0xcd830dac
>>>>>          r4 = 0xc14b7280  r5 = 0xc0618d00
>>>>>          r6 = 0xcd830eb0  r7 = 0xc1470000
>>>>>          r8 = 0xcd830e60  r9 = 0x00000000
>>>>>         r10 = 0x00000000
>>>>> vm_fault_hold() at vm_fault_hold+0x60c
>>>>>          pc = 0xc0540eb0  lr = 0xc05426b8 (vm_fault+0x44)
>>>>>          sp = 0xcd830db0  fp = 0x00000002
>>>>>          r4 = 0xc14c8a0c  r5 = 0xc0618d00
>>>>>          r6 = 0xcd830eb0  r7 = 0xc1470000
>>>>>          r8 = 0xcd830e60  r9 = 0x00000001
>>>>>         r10 = 0x00000000
>>>>> vm_fault() at vm_fault+0x44
>>>>>          pc = 0xc05426b8  lr = 0xc05782d0 (data_abort_handler+0x35c)
>>>>>          sp = 0xcd830dc0  fp = 0x00000002
>>>>> data_abort_handler() at data_abort_handler+0x35c
>>>>>          pc = 0xc05782d0  lr = 0xc0568a0c (exception_exit)
>>>>>          sp = 0xcd830dc0  fp = 0x00000002
>>>>> data_abort_handler() at data_abort_handler+0x35c
>>>>>          pc = 0xc05782d0  lr = 0xc0568a0c (exception_exit)
>>>>>          sp = 0xcd830e60  fp = 0x20c43000
>>>>>          r4 = 0xffffffff  r5 = 0xffff1004
>>>>>          r6 = 0x00000000  r7 = 0x20443740
>>>>>          r8 = 0x0009b8e4  r9 = 0x00000001
>>>>>         r10 = 0x00000004
>>>>> exception_exit() at exception_exit
>>>>>          pc = 0xc0568a0c  lr = 0x204140d0 (0x204140d0)
>>>>>          sp = 0xcd830eb0  fp = 0x20c43000
>>>>>          r0 = 0x00000000  r1 = 0x20c4302c
>>>>>          r2 = 0x00000004  r3 = 0x00000000
>>>>>          r4 = 0x20446190  r5 = 0x20c4302c
>>>>>          r6 = 0x00000000  r7 = 0x20443740
>>>>>          r8 = 0x0009b8e4  r9 = 0x00000001
>>>>>         r10 = 0x00000004 r12 = 0x00000001
>>>>> Unable to unwind into user mode
>>>>>
>>>>> Hope this helps, let me know if you need anything else...
>>>>>
>>>> Please try the attached patch.  It adds another KASSERT() loop.
>>>>
>>>> Depending on which KASSERT() fires, that will tell us whether to look
>>>> deeper at this function or its caller for the source of the problem.
>>> Ok, that panic is:
>>> panic: vm_phys_free_contig: start 0xc07e6d20 21 24
>>>
>>> Let me know if you need any more info...  oh, btw, the last %u needed
>>> to be %lu since it was a u_long, not an unsigned...
>>>
>> Ok.  Here is the next debug patch.
> so, it's crashing in the same place:
> panic: vm_phys_free_contig: start 0xc07e6d20 21 24
>
> so, I commented out this KASSERT, and now it panics with:
> panic: vm_phys_free_contig: xxx 0xc07e6fa0 13 16
>
> so I commented out this KASSERT too, and it panics back w/ the original
> panic..  So it didn't hit the new KASSERT in vm_reserv_break...
>

Next patch...It should panic in vm_reserv_break this time and tell me if
the reservation being broken belongs to the same object as the inuse
page that is being inappropriately freed.



--------------030200060505090800030104
Content-Type: text/plain; charset=ISO-8859-15;
 name="arm_debug4.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="arm_debug4.patch"

Index: vm/vm_phys.c
===================================================================
--- vm/vm_phys.c	(revision 267209)
+++ vm/vm_phys.c	(working copy)
@@ -693,9 +693,16 @@ vm_phys_free_pages(vm_page_t m, int order)
 void
 vm_phys_free_contig(vm_page_t m, u_long npages)
 {
+	vm_page_t m_tmp;
 	u_int n;
 	int order;
 
+	for (m_tmp = m; m_tmp < &m[npages]; m_tmp++)
+		KASSERT(m_tmp->object == NULL ||
+		    (m_tmp->flags & PG_CACHED) != 0,
+		    ("vm_phys_free_contig: start %p %td %lu",
+		    m, m_tmp - m, npages));
+
 	/*
 	 * Avoid unnecessary coalescing by freeing the pages in the largest
 	 * possible power-of-two-sized subsets.
@@ -714,6 +721,11 @@ vm_phys_free_contig(vm_page_t m, u_long npages)
 		n = 1 << order;
 		if (npages < n)
 			break;
+		for (m_tmp = m; m_tmp < &m[n]; m_tmp++)
+			KASSERT(m_tmp->object == NULL ||
+			    (m_tmp->flags & PG_CACHED) != 0,
+			    ("vm_phys_free_contig: xxx %p %td %u",
+			    m, m_tmp - m, n));
 		vm_phys_free_pages(m, order);
 		m += n;
 	}
@@ -721,6 +733,11 @@ vm_phys_free_contig(vm_page_t m, u_long npages)
 	for (; npages > 0; npages -= n) {
 		order = flsl(npages) - 1;
 		n = 1 << order;
+		for (m_tmp = m; m_tmp < &m[n]; m_tmp++)
+			KASSERT(m_tmp->object == NULL ||
+			    (m_tmp->flags & PG_CACHED) != 0,
+			    ("vm_phys_free_contig: yyy %p %td %u",
+			    m, m_tmp - m, n));
 		vm_phys_free_pages(m, order);
 		m += n;
 	}
Index: vm/vm_reserv.c
===================================================================
--- vm/vm_reserv.c	(revision 267213)
+++ vm/vm_reserv.c	(working copy)
@@ -646,7 +646,8 @@ found:
 static void
 vm_reserv_break(vm_reserv_t rv, vm_page_t m)
 {
-	int begin_zeroes, hi, i, lo;
+	int begin_zeroes, hi, i, lo, x;
+	vm_object_t saved_object;
 
 	mtx_assert(&vm_page_queue_free_mtx, MA_OWNED);
 	KASSERT(rv->object != NULL,
@@ -653,6 +654,7 @@ vm_reserv_break(vm_reserv_t rv, vm_page_t m)
 	    ("vm_reserv_break: reserv %p is free", rv));
 	KASSERT(!rv->inpartpopq,
 	    ("vm_reserv_break: reserv %p's inpartpopq is TRUE", rv));
+	saved_object = rv->object;
 	LIST_REMOVE(rv, objq);
 	rv->object = NULL;
 	if (m != NULL) {
@@ -703,6 +705,19 @@ vm_reserv_break(vm_reserv_t rv, vm_page_t m)
 		if (i != NPOPMAP)
 			/* Convert from ffsl() to ordinary bit numbering. */
 			hi--;
+		for (x = begin_zeroes; x < NBPOPMAP * i + hi - begin_zeroes;
+		    x++) {
+			KASSERT(isclr(rv->popmap, x),
+			    ("vm_reserv_break: reserv %p popmap[%d]", rv, x));
+		}
+		for (x = begin_zeroes; x < NBPOPMAP * i + hi - begin_zeroes;
+		    x++) {
+			vm_page_t m_tmp = &rv->pages[x];
+			KASSERT(m_tmp->object == NULL ||
+			    (m_tmp->flags & PG_CACHED) != 0,
+			    ("vm_reservs_break: saved_object=%p x=%d m_tmp->object=%p (%d)",
+			    saved_object, x, m_tmp->object, m_tmp->object == kmem_object));
+		}
 		vm_phys_free_contig(&rv->pages[begin_zeroes], NBPOPMAP * i +
 		    hi - begin_zeroes);
 	} while (i < NPOPMAP);

--------------030200060505090800030104--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5395D312.5000302>