Date:      Sun, 08 Jun 2014 12:29:58 -0500
From:      Alan Cox <alc@rice.edu>
To:        "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>, alc@freebsd.org
Subject:   Re: svn commit: r266850 - in head/sys/arm/xscale: i80321 i8134x ixp425 pxa
Message-ID:  <53949D96.3060409@rice.edu>
In-Reply-To: <20140608003944.GK31367@funkthat.com>
References:  <201405291656.s4TGudoD002868@svn.freebsd.org> <CAJ-Vmon2sup%2Bvd%2Bpi2fdjv5DaxS%2BxtG1FxmfSV%2B%2BrK1KydXRvw@mail.gmail.com> <20140529171641.GA5246@ci0.org> <CAJ-Vmo=h39AYXhPFBx7dzUe%2BQtksPB8QMaAQcoqoM6UiKZe2XA@mail.gmail.com> <20140529173803.GA5294@ci0.org> <20140530063228.GD43976@funkthat.com> <5388ABF1.3030200@rice.edu> <20140601081153.GU43976@funkthat.com> <53935755.70908@rice.edu> <20140608003944.GK31367@funkthat.com>

This is a multi-part message in MIME format.
--------------010506080105090907030104
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

On 06/07/2014 19:39, John-Mark Gurney wrote:
> Alan Cox wrote this message on Sat, Jun 07, 2014 at 13:17 -0500:
>> On 06/01/2014 03:11, John-Mark Gurney wrote:
>>> Alan Cox wrote this message on Fri, May 30, 2014 at 11:04 -0500:
>>>> On 05/30/2014 01:32, John-Mark Gurney wrote:
>>>>> Olivier Houchard wrote this message on Thu, May 29, 2014 at 19:38 +0200:
>>>>>> On Thu, May 29, 2014 at 10:19:18AM -0700, Adrian Chadd wrote:
>>>>>>> On 29 May 2014 10:16, Olivier Houchard <cognet@ci0.org> wrote:
>>>>>>>> On Thu, May 29, 2014 at 10:14:53AM -0700, Adrian Chadd wrote:
>>>>>>>>> Have you tested this on xscale hardware?
>>>>>>>> Yeah, my two last commits were an attempt to get the AVILA kernel to boot
>>>>>>>> again.
>>>>>>> Woo! What can I provide to help you do this? :-)
>>>>>>>
>>>>>>> (Drinks? Food? Donations?)
>>>>>>>
>>>>>>>
>>>>>> Drinks and food are always appreciated ;)
>>>>>> It almost boots for me now, except that a few userland programs get
>>>>>> SIGSEGV or SIGILL along the way; I'm trying to figure out why.
>>>>> Thanks for fixing ddb... I'm getting panic messages again...  bad
>>>>> news is that my panic is still around:
>>>>> panic: vm_page_alloc: page 0xc07e73b0 is wired
>>>>>
>>>>> Though, interestingly, it looks like sparc64 has a similar panic:
>>>>> https://www.freebsd.org/cgi/query-pr.cgi?pr=187080
>>>>>
>>>>> kib, Alan, any clue as to why this is happening?  Any suggestions to
>>>>> help track it down?
>>>> I'm afraid not.  The dump below shows a perfectly normal, in-use page. 
>>>> If this page had actually been free prior to the vm_page_alloc() call,
>>>> then other fields, like dirty, would have been different.  In other
>>>> words, this isn't just a problem with the wire count.
>>>>
>>>> What object is vm_page_alloc() being performed on?
>>> Is this enough?  Or do you need more?
>>>
>>> panic: vm_page_alloc: page 0xc07e73b0 is wired, obj: 0xc1500b40
>>> KDB: enter: panic
>>> [ thread pid 781 tid 100051 ]
>>> Stopped at      kdb_enter+0x40: ldrb    r15, [r15, r15, ror r15]!
>>> db> show object/f 0xc1500b40
>>> Object 0xc1500b40: type=2, size=0xa, res=9, ref=0, flags=0x0 ruid -1 charge 0
>>>  sref=0, backing_object(0)=(0)+0x0
>>>   memory:=(off=0x0,page=0x8f0000),(off=0x1,page=0x8f1000),(off=0x2,page=0x8ee000),(off=0x3,page=0x8ef000),(off=0x4,page=0x8f3000),(off=0x5,page=0x8f4000)
>>>    ...(off=0x6,page=0x8fa000),(off=0x7,page=0x8fb000),(off=0x8,page=0x8fc000)
>>>
>>> If you need more, let me know what/how to get it, and I will...
>>>
>>
>> Anyone who has seen the "wired page" panic, please try the attached
>> patch.  It introduces some new KASSERT()s that may help me to narrow
>> down the problem.  I haven't been able to trigger these KASSERT()s on
>> amd64, but the symptoms that you guys are reporting are consistent with
>> a bug that would trigger these KASSERT()s.
> Ok, it triggered the xxx one:
> Starting sendmail_msp_queue.
> panic: vm_phys_free_contig: xxx
> KDB: enter: panic
> [ thread pid 782 tid 100051 ]
> Stopped at      kdb_enter+0x40: ldrb    r15, [r15, r15, ror r15]!
> db> bt
> Tracing pid 782 tid 100051 td 0xc1470000
> db_trace_self() at db_trace_self
>          pc = 0xc0566ec8  lr = 0xc0566f54 (db_trace_thread+0x50)
>          sp = 0xcd830850  fp = 0xc03db694
> db_trace_thread() at db_trace_thread+0x50
>          pc = 0xc0566f54  lr = 0xc022cd14 (db_command_init+0x620)
>          sp = 0xcd8308b0  fp = 0xc03db694
> db_command_init() at db_command_init+0x620
>          pc = 0xc022cd14  lr = 0xc022c3ec (db_skip_to_eol+0x480)
>          sp = 0xcd8308c8  fp = 0xc03db694
>          r4 = 0xc0683c30  r5 = 0x00000000
> db_skip_to_eol() at db_skip_to_eol+0x480
>          pc = 0xc022c3ec  lr = 0xc022c554 (db_command_loop+0x5c)
>          sp = 0xcd830968  fp = 0xc03db694
>          r4 = 0xcd83097c  r5 = 0xc0683efc
>          r6 = 0x00000000  r7 = 0x00000000
>          r8 = 0x00000001 r10 = 0x600000d3
> db_command_loop() at db_command_loop+0x5c
>          pc = 0xc022c554  lr = 0xc022e99c (X_db_sym_numargs+0xec)
>          sp = 0xcd830970  fp = 0xc03db694
> X_db_sym_numargs() at X_db_sym_numargs+0xec
>          pc = 0xc022e99c  lr = 0xc03db8c4 (kdb_trap+0x94)
>          sp = 0xcd830a88  fp = 0xc03db694
>          r4 = 0x00000000
> kdb_trap() at kdb_trap+0x94
>          pc = 0xc03db8c4  lr = 0xc0578eb0 (undefinedinstruction+0x2c8)
>          sp = 0xcd830aa8  fp = 0xc03db694
>          r4 = 0x00000000  r5 = 0x00000000
>          r6 = 0x00000000  r7 = 0xcd830b20
>          r8 = 0xe7ffffff r10 = 0xe7ffffff
> undefinedinstruction() at undefinedinstruction+0x2c8
>          pc = 0xc0578eb0  lr = 0xc0568a0c (exception_exit)
>          sp = 0xcd830b20  fp = 0xc0613e70
>          r4 = 0xffffffff  r5 = 0xffff1004
>          r6 = 0xc06d0ebc  r7 = 0xcd830ba4
>          r8 = 0xc1470000  r9 = 0x00000013
>         r10 = 0x00000010
> exception_exit() at exception_exit
>          pc = 0xc0568a0c  lr = 0xc03db68c (kdb_enter+0x38)
>          sp = 0xcd830b70  fp = 0xc0613e70
>          r0 = 0x00000012  r1 = 0x60000013
>          r2 = 0xc06df2ac  r3 = 0xc06d0ee8
>          r4 = 0xc05e5258  r5 = 0xc06155e8
>          r6 = 0xc06d0ebc  r7 = 0xcd830ba4
>          r8 = 0xc1470000  r9 = 0x00000013
>         r10 = 0x00000010 r12 = 0xc05e2518
> kdb_enter() at kdb_enter+0x44
>          pc = 0xc03db698  lr = 0xc03aa094 (kern_reboot+0x948)
>          sp = 0xcd830b78  fp = 0xc0613e70
>          r4 = 0x00000100
> kern_reboot() at kern_reboot+0x948
>          pc = 0xc03aa094  lr = 0xc03aa164 (kassert_panic+0x68)
>          sp = 0xcd830b90  fp = 0xc0613e70
>          r4 = 0xc06155e8  r5 = 0xc07e74a0
>          r6 = 0xc07e6fa0  r7 = 0x00000004
>          r8 = 0x00000010
> kassert_panic() at kassert_panic+0x68
>          pc = 0xc03aa164  lr = 0xc055a0a8 (vm_phys_free_contig+0x8c)
>          sp = 0xcd830bb0  fp = 0xc0613e70
>          r0 = 0xc06155e8  r1 = 0xc07e6d20
>          r2 = 0xc06e6a70  r3 = 0x00000000
>          r4 = 0xc07e73b0
> vm_phys_free_contig() at vm_phys_free_contig+0x8c
>          pc = 0xc055a0a8  lr = 0xc055ca70 (vm_reserv_startup+0x4bc)
>          sp = 0xcd830bd0  fp = 0xc0613e70
>          r4 = 0xc08fb2cc  r5 = 0x00000008
>          r6 = 0x000000e8  r7 = 0xc08fb280
>          r8 = 0x00000005 r10 = 0x00000001
> vm_reserv_startup() at vm_reserv_startup+0x4bc
>          pc = 0xc055ca70  lr = 0xc055cb40 (vm_reserv_startup+0x58c)
>          sp = 0xcd830be8  fp = 0xc0613e70
>          r4 = 0xc08fb280  r5 = 0x00000000
>          r6 = 0xc14b7280  r7 = 0x00000040
>          r8 = 0x00000000
> vm_reserv_startup() at vm_reserv_startup+0x58c
>          pc = 0xc055cb40  lr = 0xc055ce08 (vm_reserv_reclaim_inactive+0x34)
>          sp = 0xcd830bf0  fp = 0xc0613e70
>          r4 = 0xc06e6550
> vm_reserv_reclaim_inactive() at vm_reserv_reclaim_inactive+0x34
>          pc = 0xc055ce08  lr = 0xc0554cb8 (vm_page_alloc+0x280)
>          sp = 0xcd830bf8  fp = 0xc0613e70
> vm_page_alloc() at vm_page_alloc+0x280
>          pc = 0xc0554cb8  lr = 0xc0540eb0 (vm_fault_hold+0x60c)
>          sp = 0xcd830c30  fp = 0xcd830dac
>          r4 = 0xc14b7280  r5 = 0xc0618d00
>          r6 = 0xcd830eb0  r7 = 0xc1470000
>          r8 = 0xcd830e60  r9 = 0x00000000
>         r10 = 0x00000000
> vm_fault_hold() at vm_fault_hold+0x60c
>          pc = 0xc0540eb0  lr = 0xc05426b8 (vm_fault+0x44)
>          sp = 0xcd830db0  fp = 0x00000002
>          r4 = 0xc14c8a0c  r5 = 0xc0618d00
>          r6 = 0xcd830eb0  r7 = 0xc1470000
>          r8 = 0xcd830e60  r9 = 0x00000001
>         r10 = 0x00000000
> vm_fault() at vm_fault+0x44
>          pc = 0xc05426b8  lr = 0xc05782d0 (data_abort_handler+0x35c)
>          sp = 0xcd830dc0  fp = 0x00000002
> data_abort_handler() at data_abort_handler+0x35c
>          pc = 0xc05782d0  lr = 0xc0568a0c (exception_exit)
>          sp = 0xcd830dc0  fp = 0x00000002
> data_abort_handler() at data_abort_handler+0x35c
>          pc = 0xc05782d0  lr = 0xc0568a0c (exception_exit)
>          sp = 0xcd830e60  fp = 0x20c43000
>          r4 = 0xffffffff  r5 = 0xffff1004
>          r6 = 0x00000000  r7 = 0x20443740
>          r8 = 0x0009b8e4  r9 = 0x00000001
>         r10 = 0x00000004
> exception_exit() at exception_exit
>          pc = 0xc0568a0c  lr = 0x204140d0 (0x204140d0)
>          sp = 0xcd830eb0  fp = 0x20c43000
>          r0 = 0x00000000  r1 = 0x20c4302c
>          r2 = 0x00000004  r3 = 0x00000000
>          r4 = 0x20446190  r5 = 0x20c4302c
>          r6 = 0x00000000  r7 = 0x20443740
>          r8 = 0x0009b8e4  r9 = 0x00000001
>         r10 = 0x00000004 r12 = 0x00000001
> Unable to unwind into user mode
>
> Hope this helps, let me know if you need anything else...
>

Please try the attached patch.  It adds another KASSERT() loop.

Which KASSERT() fires will tell us whether to look deeper at this
function or at its caller for the source of the problem.
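
To illustrate the reasoning, here is a rough userland sketch, not the
kernel code.  Everything in it (free_contig_checked(), release_fn,
MAX_ORDER, the busy[] array standing in for per-page state) is made up
for illustration.  It checks the whole run on entry ("start") and then
each power-of-two chunk just before releasing it ("xxx"/"yyy"):

```c
#include <assert.h>
#include <stddef.h>

enum check { CHECK_NONE, CHECK_START, CHECK_XXX, CHECK_YYY };

#define MAX_ORDER 4	/* hypothetical cap on the chunk order */

/* Hypothetical stand-in for the per-chunk free routine. */
typedef void (*release_fn)(unsigned char *busy, size_t first, int order);

/* Well-behaved release: touches nothing outside its own chunk. */
static void
release_ok(unsigned char *busy, size_t first, int order)
{
	(void)busy; (void)first; (void)order;
}

/* Buggy release: marks the page just past its chunk as in use. */
static void
release_buggy(unsigned char *busy, size_t first, int order)
{
	busy[first + ((size_t)1 << order)] = 1;
}

/* Returns 1 if none of the n pages starting at first is in use. */
static int
all_free(const unsigned char *busy, size_t first, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (busy[first + i] != 0)
			return (0);
	return (1);
}

/* Index of the lowest set bit; the caller guarantees x != 0. */
static int
low_bit(size_t x)
{
	int b = 0;

	while ((x & 1) == 0) {
		x >>= 1;
		b++;
	}
	return (b);
}

/* Index of the highest set bit; the caller guarantees x != 0. */
static int
high_bit(size_t x)
{
	int b = -1;

	while (x != 0) {
		x >>= 1;
		b++;
	}
	return (b);
}

/* Releases npages pages starting at first; reports the first failed check. */
static enum check
free_contig_checked(unsigned char *busy, size_t first, size_t npages,
    release_fn release)
{
	size_t n;
	int order;

	/* "start": the caller handed us a page that is still in use. */
	if (!all_free(busy, first, npages))
		return (CHECK_START);
	/* Release the largest aligned chunks while they still fit. */
	while (npages > 0) {
		order = first == 0 ? MAX_ORDER : low_bit(first);
		if (order > MAX_ORDER)
			order = MAX_ORDER;
		n = (size_t)1 << order;
		if (npages < n)
			break;
		if (!all_free(busy, first, n))
			return (CHECK_XXX);
		release(busy, first, order);
		first += n;
		npages -= n;
	}
	/* Release the tail in decreasing power-of-two chunks. */
	for (; npages > 0; npages -= n) {
		order = high_bit(npages);
		n = (size_t)1 << order;
		if (!all_free(busy, first, n))
			return (CHECK_YYY);
		release(busy, first, order);
		first += n;
	}
	return (CHECK_NONE);
}
```

In this model, CHECK_START means the bad page was already there when
the function was entered, so the caller is suspect.  CHECK_XXX or
CHECK_YYY after a clean entry check means the state changed while the
function was running, which is what release_buggy() simulates.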

Alan



--------------010506080105090907030104
Content-Type: text/plain; charset=ISO-8859-15;
 name="arm_debug2.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="arm_debug2.patch"

Index: vm/vm_phys.c
===================================================================
--- vm/vm_phys.c	(revision 267209)
+++ vm/vm_phys.c	(working copy)
@@ -693,9 +693,16 @@ vm_phys_free_pages(vm_page_t m, int order)
 void
 vm_phys_free_contig(vm_page_t m, u_long npages)
 {
+	vm_page_t m_tmp;
 	u_int n;
 	int order;
 
+	for (m_tmp = m; m_tmp < &m[npages]; m_tmp++)
+		KASSERT(m_tmp->object == NULL ||
+		    (m_tmp->flags & PG_CACHED) != 0,
+		    ("vm_phys_free_contig: start %p %td %lu",
+		    m, m_tmp - m, npages));
+
 	/*
 	 * Avoid unnecessary coalescing by freeing the pages in the largest
 	 * possible power-of-two-sized subsets.
@@ -714,6 +721,11 @@ vm_phys_free_contig(vm_page_t m, u_long npages)
 		n = 1 << order;
 		if (npages < n)
 			break;
+		for (m_tmp = m; m_tmp < &m[n]; m_tmp++)
+			KASSERT(m_tmp->object == NULL ||
+			    (m_tmp->flags & PG_CACHED) != 0,
+			    ("vm_phys_free_contig: xxx %p %td %u",
+			    m, m_tmp - m, n));
 		vm_phys_free_pages(m, order);
 		m += n;
 	}
@@ -721,6 +733,11 @@ vm_phys_free_contig(vm_page_t m, u_long npages)
 	for (; npages > 0; npages -= n) {
 		order = flsl(npages) - 1;
 		n = 1 << order;
+		for (m_tmp = m; m_tmp < &m[n]; m_tmp++)
+			KASSERT(m_tmp->object == NULL ||
+			    (m_tmp->flags & PG_CACHED) != 0,
+			    ("vm_phys_free_contig: yyy %p %td %u",
+			    m, m_tmp - m, n));
 		vm_phys_free_pages(m, order);
 		m += n;
 	}

--------------010506080105090907030104--


