Date:      Tue, 12 Jan 2021 22:53:00 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        "mhorne@freebsd.org" <mhorne@FreeBSD.org>, "rwatson@freebsd.org" <rwatson@FreeBSD.org>, Ed Maste <emaste@freebsd.org>
Cc:        freebsd-arm <freebsd-arm@freebsd.org>, bob prohaska <fbsd@www.zefox.net>, Gordon Bergling <gbe@freebsd.org>, Klaus Küchemann <maciphone2@googlemail.com>
Subject:   Re: panic: Too many early devmap mappings
Message-ID:  <3C95F8C0-73FA-4DEC-9F0A-6FFF9846E8A3@yahoo.com>
In-Reply-To: <04FEAC11-5603-4D4E-8651-43AB37A10B46@yahoo.com>
References:  <20210112233607.GA79348@www.zefox.net> <90C90797-A8A5-457C-AF07-800EA82F5F12@yahoo.com> <20210113002432.GA79600@www.zefox.net> <04FEAC11-5603-4D4E-8651-43AB37A10B46@yahoo.com>

On 2021-Jan-12, at 16:55, Mark Millard <marklmi at yahoo.com> wrote:

> [I have a git bisect result for the failure: bbfa199cbc16.]
>
> On 2021-Jan-12, at 16:24, bob prohaska <fbsd at www.zefox.net> wrote:
>
>> On Tue, Jan 12, 2021 at 03:59:44PM -0800, Mark Millard wrote:
>>>
>>>
>>> On 2021-Jan-12, at 15:49, bob prohaska <fbsd at www.zefox.net> wrote:
>>>
>>>> An RPi3 running -current updated on Jan. 10 installed a new world/kernel and
>>>> when rebooted promptly crashed with
>>>>
>>>> ---<<BOOT>>---
>>>> panic: Too many early devmap mappings
>>>> cpuid = 0
>>>> time = 1
>>>> KDB: stack backtrace:
>>>> (null)() at 0xffff00000011ad90
>>>> 	 pc = 0xffff000000760f70  lr = 0xffff00000011ad90
>>>> 	 sp = 0xffff0000011df330  fp = 0xffff0000011df530
>>>>
>>>> (null)() at 0xffff00000045c2d4
>>>> 	 pc = 0xffff00000011ad90  lr = 0xffff00000045c2d4
>>>> 	 sp = 0xffff0000011df540  fp = 0xffff0000011df5a0
>>>>
>>>> (null)() at 0xffff00000045c07c
>>>> 	 pc = 0xffff00000045c2d4  lr = 0xffff00000045c07c
>>>> 	 sp = 0xffff0000011df5b0  fp = 0xffff0000011df660
>>>>
>>>> (null)() at 0xffff0000007d8380
>>>> 	 pc = 0xffff00000045c07c  lr = 0xffff0000007d8380
>>>> 	 sp = 0xffff0000011df670  fp = 0xffff0000011df670
>>>>
>>>> (null)() at 0xffff00000075dc98
>>>> 	 pc = 0xffff0000007d8380  lr = 0xffff00000075dc98
>>>> 	 sp = 0xffff0000011df680  fp = 0xffff0000011df6a0
>>>>
>>>> (null)() at 0xffff0000007710e4
>>>> 	 pc = 0xffff00000075dc98  lr = 0xffff0000007710e4
>>>> 	 sp = 0xffff0000011df6b0  fp = 0xffff0000011df6d0
>>>>
>>>> (null)() at 0xffff00000028850c
>>>> 	 pc = 0xffff0000007710e4  lr = 0xffff00000028850c
>>>> 	 sp = 0xffff0000011df6e0  fp = 0xffff0000011df7a0
>>>>
>>>> (null)() at 0xffff0000007c8788
>>>> 	 pc = 0xffff00000028850c  lr = 0xffff0000007c8788
>>>> 	 sp = 0xffff0000011df7b0  fp = 0xffff0000011df830
>>>>
>>>> (null)() at 0xffff00000028a64c
>>>> 	 pc = 0xffff0000007c8788  lr = 0xffff00000028a64c
>>>> 	 sp = 0xffff0000011df840  fp = 0xffff0000011df850
>>>>
>>>> (null)() at 0xffff00000039b340
>>>> 	 pc = 0xffff00000028a64c  lr = 0xffff00000039b340
>>>> 	 sp = 0xffff0000011df860  fp = 0xffff0000011df870
>>>>
>>>> (null)() at 0xffff0000004a6950
>>>> 	 pc = 0xffff00000039b340  lr = 0xffff0000004a6950
>>>> 	 sp = 0xffff0000011df880  fp = 0xffff0000011df8b0
>>>>
>>>> (null)() at 0xffff00000076d73c
>>>> 	 pc = 0xffff0000004a6950  lr = 0xffff00000076d73c
>>>> 	 sp = 0xffff0000011df8c0  fp = 0xffff0000011dfa00
>>>>
>>>> (null)() at 0xffff00000000089c
>>>> 	 pc = 0xffff00000076d73c  lr = 0xffff00000000089c
>>>> 	 sp = 0xffff0000011dfa10  fp = 0x0000000000000000
>>>>
>>>> KDB: enter: panic
>>>> [ thread pid 0 tid 0 ]
>>>> Stopped at      0xffff0000004a6550
>>>> db> reboot
>>>> cpu_reset failed
>>>>
>>>> It had to be power-cycled to restart. It came back up readily with
>>>> kernel.old, which reports main-c255664-g4d64c7243d26 compiled Jan 9.
>>>>
>>>> In particular, how does one recognize which revision fixes
>>>> this problem, assuming it's a bug and not operator error?
>>>> Presumably, it'll take at least several days to reach git.
>>>
>>> Discovered last night on 8 GiByte RPi4B's, relevant to this:
>>> Booting without a monitor changes the memory use and avoids
>>> the panic. With the 1920x1080 monitor it fails. (Only kernels
>>> with INVARIANTS make the check that panics, but that need not
>>> mean that others are operating well, even if it is not
>>> obvious in a specific context.)
>>>
>>> Quoted from part of a message list item from last night:
>>>
>>> QUOTE
>>> Going back to my 19cca0b9613d based debug kernel build that
>>> has the printf's reporting the values used in the test, but
>>> with no monitor attached, it boots fine and reports:
>>>
>>> pmap_mapdev early_boot: akva_devmap_vaddr: ffff007ffffff000 size: 1000
>>> pmap_mapdev early_boot: va: ffff007fffffe000 VM_MAX_KERNEL_ADDRESS: ffff008000000000 L2_SIZE: 200000
>>>
>>> That compares to the previously reported failure figures from
>>> having the monitor attached for that debug kernel:
>>>
>>> pmap_mapdev early_boot: akva_devmap_vaddr: ffff007fff816000 size: 1000
>>> pmap_mapdev early_boot: va: ffff007fff815000 VM_MAX_KERNEL_ADDRESS: ffff008000000000 L2_SIZE: 200000
>>> panic: Too many early devmap mappings
>>>
>>> where the code does:
>>>
>>>             KASSERT(va >= VM_MAX_KERNEL_ADDRESS - L2_SIZE,
>>>                 ("Too many early devmap mappings"));
>>>
>>>
>>> Looks like akva_devmap_vaddr gets smaller to make room above
>>> for monitor related data and so va can end up being too small
>>> by the criteria of this test.
>>>
>>> I've no clue who would be appropriate for dealing with this.
>>> END QUOTE
>>>
>>> You may have provided a bound for a bisection
>>>
>>
>> It looks as if unplugging the HDMI monitor (1920x1200) fixed the
>> panic on the RPi3B+ as well.
>>
>> [the original subject line said "devmatch", which confused me hugely 8-)]
>>
>
> A git bisect sequence on an 8 GiByte RPi4B with a monitor plugged
> in (to make it use more high kernel RAM such that the KASSERT
> indicated above fails) resulted in:
>
> # git bisect good
> bbfa199cbc1698631a0e932848e62dd76559d4d7 is the first bad commit
> commit bbfa199cbc1698631a0e932848e62dd76559d4d7
> Author: mhorne <mhorne@FreeBSD.org>
> Date:   Wed Dec 9 16:38:42 2020 -0400
>
>    arm64: gdb(4) machine-dependent bits
>
>    Everything required for remote kernel debugging over a serial
>    connection. For FDT-based systems, a debug port can be specified by
>    setting hw.fdt.dbgport to the desired device tree node in loader.conf.
>    For example, hw.fdt.dbgport="uart1", or
>    hw.fdt.dbgport="serial@ff1a0000".
>
>    Looks good:     emaste
>    Tested by:      rwatson
>    MFC after:      2 weeks
>    Sponsored by:   The FreeBSD Foundation
>    Differential Revision:  https://reviews.freebsd.org/D27727
>
> sys/arm64/arm64/gdb_machdep.c   | 112 ++++++++++++++++++++++++++++++++++++++++
> sys/arm64/conf/GENERIC          |   2 +-
> sys/arm64/include/gdb_machdep.h |  81 +++++++++++++++++++++++++++++
> sys/conf/files.arm64            |   1 +
> 4 files changed, 195 insertions(+), 1 deletion(-)
> create mode 100644 sys/arm64/arm64/gdb_machdep.c
> create mode 100644 sys/arm64/include/gdb_machdep.h
>

I forgot to list the bugzilla for this:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252541

I have added the notes, including:

QUOTE
Turns out that this "too much high kernel memory in use" issue happens for
a combination of 2 things being true at the same time:

A) Monitor attached (sufficiently large pixel count?)
B) GDB enabled, per bbfa199cbc16.
END QUOTE
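
The debug figures quoted earlier in this message can be checked against
the KASSERT condition with a little arithmetic. What follows is only a
sketch in Python that uses the values printed by the debug kernel; the
helper name early_devmap_ok is made up for illustration, and the
4-bytes-per-pixel figure in the last step is an assumption, not something
confirmed from the kernel source or the framebuffer driver.

```python
# Values copied from the pmap_mapdev early_boot printf output quoted above.
VM_MAX_KERNEL_ADDRESS = 0xffff008000000000
L2_SIZE = 0x200000

VA_NO_MONITOR = 0xffff007fffffe000   # boots fine
VA_MONITOR = 0xffff007fff815000      # panics


def early_devmap_ok(va):
    """Mirror the KASSERT condition: va >= VM_MAX_KERNEL_ADDRESS - L2_SIZE.

    Hypothetical helper name; only the comparison comes from the quoted code.
    """
    return va >= VM_MAX_KERNEL_ADDRESS - L2_SIZE


# Lowest va the check accepts: the early devmap area is one L2 block.
limit = VM_MAX_KERNEL_ADDRESS - L2_SIZE
print(f"lowest va allowed: {limit:#x}")

for label, va in (("no monitor", VA_NO_MONITOR), ("monitor", VA_MONITOR)):
    below_top = VM_MAX_KERNEL_ADDRESS - va
    print(f"{label}: va={va:#x} below_top={below_top:#x} "
          f"ok={early_devmap_ok(va)}")

# Extra address space consumed when the monitor is attached.
extra = VA_NO_MONITOR - VA_MONITOR
# ASSUMPTION: a 1920x1080 framebuffer at 4 bytes per pixel.
framebuffer_guess = 1920 * 1080 * 4
print(f"extra={extra:#x} framebuffer_guess={framebuffer_guess:#x}")
```

The lowest acceptable va works out to ffff007fffe00000, so the check
allows 200000 (2 MiByte) below VM_MAX_KERNEL_ADDRESS. With the monitor
attached va sits 7eb000 below the top, well past that allowance. The
extra 7e9000 matches 1920*1080*4 exactly, which would be consistent with
a 32-bit-per-pixel framebuffer mapping, though nothing here proves that
is where the space went.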


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



