Date:      Mon, 9 Jun 2014 14:23:31 -0500
From:      Alan Cox <alc@rice.edu>
To:        John-Mark Gurney <jmg@funkthat.com>
Cc:        alc@freebsd.org, "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>
Subject:   Re: svn commit: r266850 - in head/sys/arm/xscale: i80321 i8134x ixp425 pxa
Message-ID:  <9100CDFA-0C40-4BC8-AA9C-1DE37EEA6208@rice.edu>
In-Reply-To: <20140609174431.GT31367@funkthat.com>
References:  <20140601081153.GU43976@funkthat.com> <53935755.70908@rice.edu> <20140608003944.GK31367@funkthat.com> <53949D96.3060409@rice.edu> <20140608235611.GP31367@funkthat.com> <53950BB9.3090808@rice.edu> <20140609042206.GQ31367@funkthat.com> <5395D312.5000302@rice.edu> <20140609163302.GS31367@funkthat.com> <5395E725.7020807@rice.edu> <20140609174431.GT31367@funkthat.com>


On Jun 9, 2014, at 12:44 PM, John-Mark Gurney wrote:

> Alan Cox wrote this message on Mon, Jun 09, 2014 at 11:56 -0500:
>> On 06/09/2014 11:33, John-Mark Gurney wrote:
>>> Alan Cox wrote this message on Mon, Jun 09, 2014 at 10:30 -0500:
>>>> On 06/08/2014 23:22, John-Mark Gurney wrote:
>>>>> Alan Cox wrote this message on Sun, Jun 08, 2014 at 20:19 -0500:
>>>>>> On 06/08/2014 18:56, John-Mark Gurney wrote:
>>>>>>> Alan Cox wrote this message on Sun, Jun 08, 2014 at 12:29 -0500:
>>>>>>>> On 06/07/2014 19:39, John-Mark Gurney wrote:
>>>>>>>>> Alan Cox wrote this message on Sat, Jun 07, 2014 at 13:17 -0500:
>>>>>>>>>> On 06/01/2014 03:11, John-Mark Gurney wrote:
>>>>>>>>>>> Alan Cox wrote this message on Fri, May 30, 2014 at 11:04 -0500:
>>>>>>>>>>>> On 05/30/2014 01:32, John-Mark Gurney wrote:
>>>>>>>>>>>>> Olivier Houchard wrote this message on Thu, May 29, 2014 at 19:38 +0200:
>>>>>>>>>>>>>> On Thu, May 29, 2014 at 10:19:18AM -0700, Adrian Chadd wrote:
>>>>>>>>>>>>>>> On 29 May 2014 10:16, Olivier Houchard <cognet@ci0.org> wrote:
>>>>>>>>>>>>>>>> On Thu, May 29, 2014 at 10:14:53AM -0700, Adrian Chadd wrote:
>>>>>>>>>>>>>>>>> Have you tested this on xscale hardware?
>>>>>>>>>>>>>>>> Yeah, my two last commits were an attempt to get the AVILA kernel to boot
>>>>>>>>>>>>>>>> again.
>>>>>>>>>>>>>>> Woo! What can I provide to help you do this? :-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (Drinks? Food? Donations?)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Drinks and food are always appreciated ;)
>>>>>>>>>>>>>> It almost boots for me now, except a few userland programs gets SIGSEGV or
>>>>>>>>>>>>>> SIGILL along the way, trying to figure out why.
>>>>>>>>>>>>> Thanks for fixing ddb... I'm getting panic messages again...  bad
>>>>>>>>>>>>> news is that my panic is still around:
>>>>>>>>>>>>> panic: vm_page_alloc: page 0xc07e73b0 is wired
>>>>>>>>>>>>>
>>>>>>>>>>>>> Though, interestingly, it looks like sparc64 has a similar panic:
>>>>>>>>>>>>> https://www.freebsd.org/cgi/query-pr.cgi?pr=187080
>>>>>>>>>>>>>
>>>>>>>>>>>>> kib, Alan, any clue to why this is happening?  Any suggestions as to
>>>>>>>>>>>>> help track it down?
>>>>>>>>>>>> I'm afraid not.  The dump below shows a perfectly normal, in-use page.
>>>>>>>>>>>> If this page had actually been free prior to the vm_page_alloc() call,
>>>>>>>>>>>> then other fields, like dirty, would have been different.  In other
>>>>>>>>>>>> words, this isn't just a problem with the wire count.
>>>>>>>>>>>>
>>>>>>>>>>>> What object is vm_page_alloc() being performed on?
>>>>>>>>>>> Is this enough?  Or do you need more?
>>>>>>>>>>>=20
>>>>>>>>>>> panic: vm_page_alloc: page 0xc07e73b0 is wired, obj: 0xc1500b40
>>>>>>>>>>> KDB: enter: panic
>>>>>>>>>>> [ thread pid 781 tid 100051 ]
>>>>>>>>>>> Stopped at      kdb_enter+0x40: ldrb    r15, [r15, r15, ror r15]!
>>>>>>>>>>> db> show object/f 0xc1500b40
>>>>>>>>>>> Object 0xc1500b40: type=2, size=0xa, res=9, ref=0, flags=0x0 ruid -1 charge 0
>>>>>>>>>>> sref=0, backing_object(0)=(0)+0x0
>>>>>>>>>>>  memory:=(off=0x0,page=0x8f0000),(off=0x1,page=0x8f1000),(off=0x2,page=0x8ee000),(off=0x3,page=0x8ef000),(off=0x4,page=0x8f3000),(off=0x5,page=0x8f4000)
>>>>>>>>>>>   ...(off=0x6,page=0x8fa000),(off=0x7,page=0x8fb000),(off=0x8,page=0x8fc000)
>>>>>>>>>>>
>>>>>>>>>>> If you need more, let me know what/how to get it, and I will...
>>>>>>>>>>>
>>>>>>>>>> Anyone who has seen the "wired page" panic, please try the attached
>>>>>>>>>> patch.  It introduces some new KASSERT()s that may help me to narrow
>>>>>>>>>> down the problem.  I haven't been able to trigger these KASSERT()s on
>>>>>>>>>> amd64, but the symptoms that you guys are reporting are consistent with
>>>>>>>>>> a bug that would trigger these KASSERT()s.
>>>>>>>>> Ok, it triggered the xxx one:
>>>>>>>>> Starting sendmail_msp_queue.
>>>>>>>>> panic: vm_phys_free_contig: xxx
>>>>>>>>> KDB: enter: panic
>>>>>>>>> [ thread pid 782 tid 100051 ]
>>>>>>>>> Stopped at      kdb_enter+0x40: ldrb    r15, [r15, r15, ror r15]!
>>>>>>>>> db> bt
>>>>>>>>> Tracing pid 782 tid 100051 td 0xc1470000
>>>>>>>>> db_trace_self() at db_trace_self
>>>>>>>>>         pc = 0xc0566ec8  lr = 0xc0566f54 (db_trace_thread+0x50)
>>>>>>>>>         sp = 0xcd830850  fp = 0xc03db694
>>>>>>>>> db_trace_thread() at db_trace_thread+0x50
>>>>>>>>>         pc = 0xc0566f54  lr = 0xc022cd14 (db_command_init+0x620)
>>>>>>>>>         sp = 0xcd8308b0  fp = 0xc03db694
>>>>>>>>> db_command_init() at db_command_init+0x620
>>>>>>>>>         pc = 0xc022cd14  lr = 0xc022c3ec (db_skip_to_eol+0x480)
>>>>>>>>>         sp = 0xcd8308c8  fp = 0xc03db694
>>>>>>>>>         r4 = 0xc0683c30  r5 = 0x00000000
>>>>>>>>> db_skip_to_eol() at db_skip_to_eol+0x480
>>>>>>>>>         pc = 0xc022c3ec  lr = 0xc022c554 (db_command_loop+0x5c)
>>>>>>>>>         sp = 0xcd830968  fp = 0xc03db694
>>>>>>>>>         r4 = 0xcd83097c  r5 = 0xc0683efc
>>>>>>>>>         r6 = 0x00000000  r7 = 0x00000000
>>>>>>>>>         r8 = 0x00000001 r10 = 0x600000d3
>>>>>>>>> db_command_loop() at db_command_loop+0x5c
>>>>>>>>>         pc = 0xc022c554  lr = 0xc022e99c (X_db_sym_numargs+0xec)
>>>>>>>>>         sp = 0xcd830970  fp = 0xc03db694
>>>>>>>>> X_db_sym_numargs() at X_db_sym_numargs+0xec
>>>>>>>>>         pc = 0xc022e99c  lr = 0xc03db8c4 (kdb_trap+0x94)
>>>>>>>>>         sp = 0xcd830a88  fp = 0xc03db694
>>>>>>>>>         r4 = 0x00000000
>>>>>>>>> kdb_trap() at kdb_trap+0x94
>>>>>>>>>         pc = 0xc03db8c4  lr = 0xc0578eb0 (undefinedinstruction+0x2c8)
>>>>>>>>>         sp = 0xcd830aa8  fp = 0xc03db694
>>>>>>>>>         r4 = 0x00000000  r5 = 0x00000000
>>>>>>>>>         r6 = 0x00000000  r7 = 0xcd830b20
>>>>>>>>>         r8 = 0xe7ffffff r10 = 0xe7ffffff
>>>>>>>>> undefinedinstruction() at undefinedinstruction+0x2c8
>>>>>>>>>         pc = 0xc0578eb0  lr = 0xc0568a0c (exception_exit)
>>>>>>>>>         sp = 0xcd830b20  fp = 0xc0613e70
>>>>>>>>>         r4 = 0xffffffff  r5 = 0xffff1004
>>>>>>>>>         r6 = 0xc06d0ebc  r7 = 0xcd830ba4
>>>>>>>>>         r8 = 0xc1470000  r9 = 0x00000013
>>>>>>>>>        r10 = 0x00000010
>>>>>>>>> exception_exit() at exception_exit
>>>>>>>>>         pc = 0xc0568a0c  lr = 0xc03db68c (kdb_enter+0x38)
>>>>>>>>>         sp = 0xcd830b70  fp = 0xc0613e70
>>>>>>>>>         r0 = 0x00000012  r1 = 0x60000013
>>>>>>>>>         r2 = 0xc06df2ac  r3 = 0xc06d0ee8
>>>>>>>>>         r4 = 0xc05e5258  r5 = 0xc06155e8
>>>>>>>>>         r6 = 0xc06d0ebc  r7 = 0xcd830ba4
>>>>>>>>>         r8 = 0xc1470000  r9 = 0x00000013
>>>>>>>>>        r10 = 0x00000010 r12 = 0xc05e2518
>>>>>>>>> kdb_enter() at kdb_enter+0x44
>>>>>>>>>         pc = 0xc03db698  lr = 0xc03aa094 (kern_reboot+0x948)
>>>>>>>>>         sp = 0xcd830b78  fp = 0xc0613e70
>>>>>>>>>         r4 = 0x00000100
>>>>>>>>> kern_reboot() at kern_reboot+0x948
>>>>>>>>>         pc = 0xc03aa094  lr = 0xc03aa164 (kassert_panic+0x68)
>>>>>>>>>         sp = 0xcd830b90  fp = 0xc0613e70
>>>>>>>>>         r4 = 0xc06155e8  r5 = 0xc07e74a0
>>>>>>>>>         r6 = 0xc07e6fa0  r7 = 0x00000004
>>>>>>>>>         r8 = 0x00000010
>>>>>>>>> kassert_panic() at kassert_panic+0x68
>>>>>>>>>         pc = 0xc03aa164  lr = 0xc055a0a8 (vm_phys_free_contig+0x8c)
>>>>>>>>>         sp = 0xcd830bb0  fp = 0xc0613e70
>>>>>>>>>         r0 = 0xc06155e8  r1 = 0xc07e6d20
>>>>>>>>>         r2 = 0xc06e6a70  r3 = 0x00000000
>>>>>>>>>         r4 = 0xc07e73b0
>>>>>>>>> vm_phys_free_contig() at vm_phys_free_contig+0x8c
>>>>>>>>>         pc = 0xc055a0a8  lr = 0xc055ca70 (vm_reserv_startup+0x4bc)
>>>>>>>>>         sp = 0xcd830bd0  fp = 0xc0613e70
>>>>>>>>>         r4 = 0xc08fb2cc  r5 = 0x00000008
>>>>>>>>>         r6 = 0x000000e8  r7 = 0xc08fb280
>>>>>>>>>         r8 = 0x00000005 r10 = 0x00000001
>>>>>>>>> vm_reserv_startup() at vm_reserv_startup+0x4bc
>>>>>>>>>         pc = 0xc055ca70  lr = 0xc055cb40 (vm_reserv_startup+0x58c)
>>>>>>>>>         sp = 0xcd830be8  fp = 0xc0613e70
>>>>>>>>>         r4 = 0xc08fb280  r5 = 0x00000000
>>>>>>>>>         r6 = 0xc14b7280  r7 = 0x00000040
>>>>>>>>>         r8 = 0x00000000
>>>>>>>>> vm_reserv_startup() at vm_reserv_startup+0x58c
>>>>>>>>>         pc = 0xc055cb40  lr = 0xc055ce08 (vm_reserv_reclaim_inactive+0x34)
>>>>>>>>>         sp = 0xcd830bf0  fp = 0xc0613e70
>>>>>>>>>         r4 = 0xc06e6550
>>>>>>>>> vm_reserv_reclaim_inactive() at vm_reserv_reclaim_inactive+0x34
>>>>>>>>>         pc = 0xc055ce08  lr = 0xc0554cb8 (vm_page_alloc+0x280)
>>>>>>>>>         sp = 0xcd830bf8  fp = 0xc0613e70
>>>>>>>>> vm_page_alloc() at vm_page_alloc+0x280
>>>>>>>>>         pc = 0xc0554cb8  lr = 0xc0540eb0 (vm_fault_hold+0x60c)
>>>>>>>>>         sp = 0xcd830c30  fp = 0xcd830dac
>>>>>>>>>         r4 = 0xc14b7280  r5 = 0xc0618d00
>>>>>>>>>         r6 = 0xcd830eb0  r7 = 0xc1470000
>>>>>>>>>         r8 = 0xcd830e60  r9 = 0x00000000
>>>>>>>>>        r10 = 0x00000000
>>>>>>>>> vm_fault_hold() at vm_fault_hold+0x60c
>>>>>>>>>         pc = 0xc0540eb0  lr = 0xc05426b8 (vm_fault+0x44)
>>>>>>>>>         sp = 0xcd830db0  fp = 0x00000002
>>>>>>>>>         r4 = 0xc14c8a0c  r5 = 0xc0618d00
>>>>>>>>>         r6 = 0xcd830eb0  r7 = 0xc1470000
>>>>>>>>>         r8 = 0xcd830e60  r9 = 0x00000001
>>>>>>>>>        r10 = 0x00000000
>>>>>>>>> vm_fault() at vm_fault+0x44
>>>>>>>>>         pc = 0xc05426b8  lr = 0xc05782d0 (data_abort_handler+0x35c)
>>>>>>>>>         sp = 0xcd830dc0  fp = 0x00000002
>>>>>>>>> data_abort_handler() at data_abort_handler+0x35c
>>>>>>>>>         pc = 0xc05782d0  lr = 0xc0568a0c (exception_exit)
>>>>>>>>>         sp = 0xcd830dc0  fp = 0x00000002
>>>>>>>>> data_abort_handler() at data_abort_handler+0x35c
>>>>>>>>>         pc = 0xc05782d0  lr = 0xc0568a0c (exception_exit)
>>>>>>>>>         sp = 0xcd830e60  fp = 0x20c43000
>>>>>>>>>         r4 = 0xffffffff  r5 = 0xffff1004
>>>>>>>>>         r6 = 0x00000000  r7 = 0x20443740
>>>>>>>>>         r8 = 0x0009b8e4  r9 = 0x00000001
>>>>>>>>>        r10 = 0x00000004
>>>>>>>>> exception_exit() at exception_exit
>>>>>>>>>         pc = 0xc0568a0c  lr = 0x204140d0 (0x204140d0)
>>>>>>>>>         sp = 0xcd830eb0  fp = 0x20c43000
>>>>>>>>>         r0 = 0x00000000  r1 = 0x20c4302c
>>>>>>>>>         r2 = 0x00000004  r3 = 0x00000000
>>>>>>>>>         r4 = 0x20446190  r5 = 0x20c4302c
>>>>>>>>>         r6 = 0x00000000  r7 = 0x20443740
>>>>>>>>>         r8 = 0x0009b8e4  r9 = 0x00000001
>>>>>>>>>        r10 = 0x00000004 r12 = 0x00000001
>>>>>>>>> Unable to unwind into user mode
>>>>>>>>>
>>>>>>>>> Hope this helps, let me know if you need anything else...
>>>>>>>>>
>>>>>>>> Please try the attached patch.  It adds another KASSERT() loop.
>>>>>>>>
>>>>>>>> Depending on which KASSERT() fires, that will tell us whether to look
>>>>>>>> deeper at this function or its caller for the source of the problem.
>>>>>>> Ok, that panic is:
>>>>>>> panic: vm_phys_free_contig: start 0xc07e6d20 21 24
>>>>>>>
>>>>>>> Let me know if you need any more info...  oh, btw, the last %u needed
>>>>>>> to be %lu since it was a u_long, not an unsigned...
>>>>>>>
>>>>>> Ok.  Here is the next debug patch.
>>>>> so, it's crashing in the same place:
>>>>> panic: vm_phys_free_contig: start 0xc07e6d20 21 24
>>>>>
>>>>> so, I commented out this KASSERT, and now it panics with:
>>>>> panic: vm_phys_free_contig: xxx 0xc07e6fa0 13 16
>>>>>
>>>>> so I commented out this KASSERT too, and it panics back w/ the original
>>>>> panic..  So it didn't hit the new KASSERT in vm_reserv_break...
>>>> Next patch...It should panic in vm_reserv_break this time and tell me if
>>>> the reservation being broken belongs to the same object as the inuse
>>>> page that is being inappropriately freed.
>>> So, bad news...  still panics with:
>>> panic: vm_phys_free_contig: start 0xc07e6d20 21 24
>>>
>>> This panic seems to be consistent now, in that the start address is
>>> always the same...  Is there a way you could add various debugging
>>> for this specific vm page to catch a stack trace (stack(9)) where it's
>>> going wrong?
>>>
>>
>> I made a mistake with the new KASSERT()s in vm_reserv_break().  Try this.
>
> No worried, the new patch panics:
> panic: vm_reserv_break: 2 saved_object=0xc06e6378 x=253 m_tmp->object=0xc06e6378 (1)
>


Is your arm processor running in big-endian or little-endian mode?


> w/ a bt of:
> [...]
> vm_reserv_startup() at vm_reserv_startup+0x570
>         pc = 0xc055cd94  lr = 0xc055cec8 (vm_reserv_startup+0x6a4)
>         sp = 0xcd833be8  fp = 0xc06142d0
>         r4 = 0xc08fb280  r5 = 0x00000000
>         r6 = 0xc14b76e0  r7 = 0x00000000
>         r8 = 0x00000000  r9 = 0x00000033
>        r10 = 0x00000001
> vm_reserv_startup() at vm_reserv_startup+0x6a4
>         pc = 0xc055cec8  lr = 0xc055d190 (vm_reserv_reclaim_inactive+0x34)
>         sp = 0xcd833bf0  fp = 0xc06142d0
>         r4 = 0xc06e6550
> vm_reserv_reclaim_inactive() at vm_reserv_reclaim_inactive+0x34
>         pc = 0xc055d190  lr = 0xc0554eb0 (vm_page_alloc+0x280)
>         sp = 0xcd833bf8  fp = 0xc06142d0
> vm_page_alloc() at vm_page_alloc+0x280
>         pc = 0xc0554eb0  lr = 0xc0540ebc (vm_fault_hold+0x60c)
>         sp = 0xcd833c30  fp = 0xcd833dac
>         r4 = 0xc14b76e0  r5 = 0xc0619288
>         r6 = 0xcd833eb0  r7 = 0xc0f7ec80
>         r8 = 0xcd833e60  r9 = 0x00000000
>        r10 = 0x00000000
> vm_fault_hold() at vm_fault_hold+0x60c
>         pc = 0xc0540ebc  lr = 0xc05426c4 (vm_fault+0x44)
>         sp = 0xcd833db0  fp = 0x00000002
>         r4 = 0xc14c66ec  r5 = 0xc0619288
>         r6 = 0xcd833eb0  r7 = 0xc0f7ec80
>         r8 = 0xcd833e60  r9 = 0x00000001
>        r10 = 0x00000000
> vm_fault() at vm_fault+0x44
>         pc = 0xc05426c4  lr = 0xc05786d0 (data_abort_handler+0x35c)
>         sp = 0xcd833dc0  fp = 0x00000002
> data_abort_handler() at data_abort_handler+0x35c
>         pc = 0xc05786d0  lr = 0xc0568dc8 (exception_exit)
>         sp = 0xcd833e60  fp = 0x00000000
>         r4 = 0xffffffff  r5 = 0xffff1004
>         r6 = 0x001b7740  r7 = 0x00052ec4
>         r8 = 0x00000000  r9 = 0x000cc4b0
>        r10 = 0x00000000
> exception_exit() at exception_exit
>         pc = 0xc0568dc8  lr = 0x203f1208 (0x203f1208)
>         sp = 0xcd833eb0  fp = 0x00000000
>         r0 = 0x20c53e60  r1 = 0x00000000
>         r2 = 0x000eeb40  r3 = 0x00000001
>         r4 = 0x00000000  r5 = 0x000e9654
>         r6 = 0x001b7740  r7 = 0x00052ec4
>         r8 = 0x00000000  r9 = 0x000cc4b0
>        r10 = 0x00000000 r12 = 0x200d26a4
>
> Let me know if you need any more information..
>
> Thanks for tracking this down.
>
> -- 
>  John-Mark Gurney				Voice: +1 415 225 5579
>
>     "All that I will do, has been done, All that I have, has not."
>



