Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 30 Apr 2017 18:31:03 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        freebsd-arm <freebsd-arm@freebsd.org>, freebsd-hackers@freebsd.org, FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: FYI: example "panic: ARM64TODO: reclaim_pv_chunk" on a Pine64+ 2GB with head -r317015 [mkimg okay; "cp -p" generates the bad context]
Message-ID:  <7C5C4725-984C-4461-B76C-9A6BE052BA00@dsl-only.net>
In-Reply-To: <302D1255-4D34-4C1B-8F3A-9180A6AF8768@dsl-only.net>
References:  <4BC7E6BC-4BF9-4F5E-9851-E022AC9A3082@dsl-only.net> <51050A07-B951-45C0-82CE-73BB342F012E@dsl-only.net> <302D1255-4D34-4C1B-8F3A-9180A6AF8768@dsl-only.net>

next in thread | previous in thread | raw e-mail | index | archive | help
On 2017-Apr-27, at 10:26 PM, Mark Millard <markmi at dsl-only.net> =
wrote:

> [As the text does not really follow from the earlier text
> I'd sent directly I'm top posting a hypothesis about where
> so much active memory came from to be so low in available
> memory to get an reclaim_pv_chunk attempt.]
>=20
> My hypothesis is that the "mdconfig -d"s from vm_copy_base
> (from /usr/src/release/tools/vmimage.subr ) did not actually
> release the memory resources involved (from vnode backed
> mdconfig use):

I watched with "vmstat 1" and "mdconfig -d" did release memory
(including RAM) each time.

> . . .
> =46rom the prior top report for the failure,
> partially repeated here:
>=20
> . . .
> Mem: 1618M Active, 17M Inact, 315M Wired, 204M Buf, 15M Free
> Swap: 6144M Total, 34M Used, 6110M Free, 348K Out
>=20
>  PID USERNAME    THR PRI NICE   SIZE    RES   SWAP STATE   C   TIME    =
 CPU COMMAND
> 48988 root          4  31    0   651M 27048K     0K RUN     0   0:03  =
87.60% xz -T 0 =
/usr/obj/DESTDIRs/vmimage-aarch64/vmimages/FreeBSD-12.0-CURRENT-
> . . .
>=20
> The combination 1618M Mem:Active but 34M Swap:Used
> and 651M xz:SIZE but 27M xz:RES and 0K xz:SWAP just
> seems very odd, like it should not happen. The 17M
> Mem:Inact is odd for the context as well. (Mem:Wired,
> Mem:Buf, and Mem:Free look normal.)
>=20
> An alternate hypothesis would be the memory "leak"
> is from mkimg not having it memory-use cleaned up.
> This happens after vm_copy_base but before the cp/xz
> sequence and is what produces vm.raw.

mkimg also release its memory (including RAM)
each time.

But the later "cp -p" of the large vm.raw that
I was producing ended up without leaving the
free memory that is expected after it finished.
I worked around the issue with (just a personal
workaround that helps in other respects as well):

Index: Makefile.vm
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
--- Makefile.vm	(revision 317015)
+++ Makefile.vm	(working copy)
@@ -119,15 +119,20 @@
 vm-install:
 .if defined(WITH_VMIMAGES) && !empty(WITH_VMIMAGES)
 	mkdir -p ${DESTDIR}/vmimages
-. for FORMAT in ${VMFORMATS}
-	cp -p ${VMBASE}.${FORMAT} \
-		${DESTDIR}/vmimages/${OSRELEASE}.${FORMAT}
-. endfor
 . if defined(WITH_COMPRESSED_VMIMAGES) && =
!empty(WITH_COMPRESSED_VMIMAGES)
 .  for FORMAT in ${VMFORMATS}
-	# Don't keep the originals.  There is a copy in ${.OBJDIR} if =
needed.
-	${XZ_CMD} ${DESTDIR}/vmimages/${OSRELEASE}.${FORMAT}
+	# Tradeoff "cp -p" property for less memory, I/O, and time use.
+	# Also: -3 got > 30 MiByte/sec effective (source file) for 32 =
GiByte, mostly-zero image
+	# on a Pine64+ 2GB and used about 600 MiBytes of "active virtual =
memory". The about 16
+	# minutes was vastly faster than the "cp -p" --and the sha512 =
and sha256 are vastly
+	# faster on the compressed file as well.
+	${XZ_CMD} -T 0 -3 --stdout --memlimit-compress=3D50% -v =
${VMBASE}.${FORMAT} > ${DESTDIR}/vmimages/${OSRELEASE}.${FORMAT}.xz
 .  endfor
+. else
+.  for FORMAT in ${VMFORMATS}
+	cp -p ${VMBASE}.${FORMAT} \
+		${DESTDIR}/vmimages/${OSRELEASE}.${FORMAT}
+.  endfor
 . endif
 	cd ${DESTDIR}/vmimages && sha512 ${OSRELEASE}* > \
 		${DESTDIR}/vmimages/CHECKSUM.SHA512

and using WITH_COMPRESSED_VMIMAGES to avoid a "cp -p"
based copy of the large file.


Prior reports from capturing text:

On 2017-Apr-27, at 7:31 PM, Mark Millard <markmi at dsl-only.net> wrote:

> [Another example panic. Again no dump. But I have what
> a top -PCwaopid froze at this time.]
>=20
> On 2017-Apr-27, at 4:22 PM, Mark Millard <markmi at dsl-only.net> =
wrote:
>=20
>> Unfortunately for this FYI the attempt to produce a dump
>> failed and so all the information that I have is what I
>> first captured from the console output: a backtrace.
>>=20
>> The context was head -r317015 on a Pine64+ 2GB. At the
>> time I was experimenting with trying to build a vm.raw
>> from my own build of FreeBSD. The (root) file system
>> is on a USB SSD off of a powered USB hub.
>>=20
>> panic: ARM64TODO: reclaim_pv_chunk
>> cpuid =3D 1
>> time =3D 1493332968
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self_wrapper+0x28
>>       pc =3D 0xffff000000605cc0  lr =3D 0xffff0000000869cc
>>       sp =3D 0xffff000083ba4f00  fp =3D 0xffff000083ba5110
>>=20
>> db_trace_self_wrapper() at vpanic+0x164
>>       pc =3D 0xffff0000000869cc  lr =3D 0xffff00000031d464
>>       sp =3D 0xffff000083ba5120  fp =3D 0xffff000083ba5190
>>=20
>> vpanic() at panic+0x4c
>>       pc =3D 0xffff00000031d464  lr =3D 0xffff00000031d2fc
>>       sp =3D 0xffff000083ba51a0  fp =3D 0xffff000083ba5220
>>=20
>> panic() at reclaim_pv_chunk+0x10
>>       pc =3D 0xffff00000031d2fc  lr =3D 0xffff00000061a234
>>       sp =3D 0xffff000083ba5230  fp =3D 0xffff000083ba5230
>>=20
>> reclaim_pv_chunk() at get_pv_entry+0x240
>>       pc =3D 0xffff00000061a234  lr =3D 0xffff000000616184
>>       sp =3D 0xffff000083ba5240  fp =3D 0xffff000083ba5260
>>=20
>> get_pv_entry() at pmap_enter+0x694
>>       pc =3D 0xffff000000616184  lr =3D 0xffff0000006156a0
>>       sp =3D 0xffff000083ba5270  fp =3D 0xffff000083ba5300
>>=20
>> pmap_enter() at vm_fault_hold+0x28c
>>       pc =3D 0xffff0000006156a0  lr =3D 0xffff0000005b9740
>>       sp =3D 0xffff000083ba5310  fp =3D 0xffff000083ba5460
>>=20
>> vm_fault_hold() at vm_fault+0x70
>>       pc =3D 0xffff0000005b9740  lr =3D 0xffff0000005b9464
>>       sp =3D 0xffff000083ba5470  fp =3D 0xffff000083ba54a0
>>=20
>> vm_fault() at data_abort+0xe0
>>       pc =3D 0xffff0000005b9464  lr =3D 0xffff00000061ad94
>>       sp =3D 0xffff000083ba54b0  fp =3D 0xffff000083ba5560
>>=20
>> data_abort() at handle_el1h_sync+0x70
>>       pc =3D 0xffff00000061ad94  lr =3D 0xffff000000607870
>>       sp =3D 0xffff000083ba5570  fp =3D 0xffff000083ba5680
>>=20
>> handle_el1h_sync() at kern_select+0x9fc
>>       pc =3D 0xffff000000607870  lr =3D 0xffff00000037db3c
>>       sp =3D 0xffff000083ba5690  fp =3D 0xffff000083ba58f0
>>=20
>> kern_select() at sys_select+0x5c
>>       pc =3D 0xffff00000037db3c  lr =3D 0xffff00000037dc58
>>       sp =3D 0xffff000083ba5900  fp =3D 0xffff000083ba5930
>>=20
>> sys_select() at do_el0_sync+0xa48
>>       pc =3D 0xffff00000037dc58  lr =3D 0xffff00000061b91c
>>       sp =3D 0xffff000083ba5940  fp =3D 0xffff000083ba5a70
>>=20
>> do_el0_sync() at handle_el0_sync+0x6c
>>       pc =3D 0xffff00000061b91c  lr =3D 0xffff0000006079e8
>>       sp =3D 0xffff000083ba5a80  fp =3D 0xffff000083ba5b90
>>=20
>> handle_el0_sync() at 0x4948c
>>       pc =3D 0xffff0000006079e8  lr =3D 0x000000000004948c
>>       sp =3D 0xffff000083ba5ba0  fp =3D 0x0000ffffffffd960
>=20
>=20
> This time I got to record from top:
> (swap is on a swap partition)
> (pid 49888's SIZE vs. RES and SWAP might be interesting)
> (as might the Active figure)
>=20
> last pid: 48988;  load averages:  0.64,  0.44,  0.38                   =
                                                                         =
                                up 0+04:21:01  19:19:50
> 32 processes:  2 running, 30 sleeping
> CPU 0: 13.2% user,  0.0% nice, 23.2% system,  0.3% interrupt, 63.3% =
idle
> CPU 1:  4.6% user,  0.0% nice, 23.9% system,  0.0% interrupt, 71.5% =
idle
> CPU 2:  2.1% user,  0.0% nice, 23.2% system,  0.0% interrupt, 74.8% =
idle
> CPU 3:  3.3% user,  0.0% nice, 23.8% system,  0.0% interrupt, 72.8% =
idle
> Mem: 1618M Active, 17M Inact, 315M Wired, 204M Buf, 15M Free
> Swap: 6144M Total, 34M Used, 6110M Free, 348K Out
>=20
> PID USERNAME    THR PRI NICE   SIZE    RES   SWAP STATE   C   TIME     =
CPU COMMAND
> 48988 root          4  31    0   651M 27048K     0K RUN     0   0:03  =
87.60% xz -T 0 =
/usr/obj/DESTDIRs/vmimage-aarch64/vmimages/FreeBSD-12.0-CURRENT-arm64-aarc=
h64.raw
> 11983 root          1  22    0  5068K     0K     0K wait    3   0:00   =
0.00% make vm-image vm-install DESTDIR=3D/usr/obj/DESTDIRs/vmimage-aarch64=
 (<make>)
> 11981 root          1  42    0  7320K     0K  1516K wait    1   0:00   =
0.00% sh =
/root/sys_build_scripts.aarch64-host/make_noscript_aarch64_nodebug_clang_b=
ootstrap-aarch64-host.sh vm-image vm-install=20
> 11980 root          1  20    0  6656K  1548K     0K select  0   0:02   =
0.00% [script]
> 11977 root          1  30    0  7320K     0K  1516K wait    3   0:00   =
0.00% /bin/sh =
/root/sys_build_scripts.aarch64-host/make_aarch64_nodebug_clang_bootstrap-=
aarch64-host.sh vm-image vm-install DEST
> 2694 root          1  20    0  8804K  2072K     0K CPU2    2   0:07   =
0.17% top -PCwaopid
> 827 root          1  20    0  7320K     0K   360K wait    0   0:00   =
0.00% su (<sh>)
> 826 markmi        1  22    0 10372K     0K  1532K wait    3   0:00   =
0.00% su (<su>)
> 820 markmi        1  24    0  7320K     0K  1516K wait    1   0:00   =
0.00% -sh (<sh>)
> 819 markmi        1  20    0 18416K  1152K     0K select  1   0:21   =
0.00% sshd: markmi@pts/1 (sshd)
> 816 root          1  20    0 18416K  3276K     0K select  0   0:00   =
0.00% sshd: markmi [priv] (sshd)
> 765 root          1  20    0  7320K     0K   224K wait    2   0:00   =
0.00% su (<sh>)
> 764 markmi        1  23    0 10372K     0K  1532K wait    0   0:00   =
0.00% su (<su>)
> 758 markmi        1  31    0  7320K     0K  1516K wait    1   0:00   =
0.00% -sh (<sh>)
> 757 markmi        1  20    0 18416K   228K   904K select  3   0:01   =
0.01% sshd: markmi@pts/0 (sshd)
> 754 root          1  25    0 18416K  3276K     0K select  1   0:00   =
0.00% sshd: markmi [priv] (sshd)
> 746 root          1  27    0  7320K  1532K     0K ttyin   0   0:00   =
0.00% -sh (sh)
> 745 root          1  20    0 10372K     0K  1532K wait    1   0:00   =
0.00% login [pam] (<login>)
> 700 root          1  20    0  6948K     0K   168K nanslp  1   0:00   =
0.00% /usr/sbin/cron -s (<cron>)
> 696 smmsp         1  20    0 10460K     0K   184K pause   0   0:00   =
0.00% sendmail: Queue runner@00:30:00 for /var/spool/clientmqueue =
(<sendmail>)
> 693 root          1  20    0 10460K  1392K     0K select  1   0:00   =
0.03% sendmail: accepting connections (sendmail)
> 690 root          1  20    0 15800K   968K     0K select  2   0:00   =
0.00% /usr/sbin/sshd
> 661 root          1  20    0  6656K   344K     0K select  2   0:01   =
0.00% /usr/sbin/powerd
> 658 root          2  20    0 12788K 12672K     0K select  0   0:02   =
0.01% /usr/sbin/ntpd -g -c /etc/ntp.conf -p /var/run/ntpd.pid -f =
/var/db/ntpd.drift
> 620 root         32  52    0  6384K  1100K     0K rpcsvc  1   0:00   =
0.00% nfsd: server (nfsd)
> 619 root          1  52    0  6384K   704K     0K select  1   0:00   =
0.00% nfsd: master (nfsd)
> 617 root          1  20    0  6684K   688K     0K select  1   0:00   =
0.00% /usr/sbin/mountd -r
> 478 root          1  20    0  6676K   596K     0K select  3   0:00   =
0.00% /usr/sbin/rpcbind
> 469 root          1  20    0  6680K   572K     0K select  2   0:00   =
0.00% /usr/sbin/syslogd -s
> 396 root          1  20    0  9580K    32K     0K select  0   0:00   =
0.00% /sbin/devd
> 308 _dhcp         1  20    0  6800K   532K     0K select  2   0:00   =
0.00% dhclient: awg0 (dhclient)
> 307 root          1  52    0  6800K   424K     0K select  2   0:00   =
0.00% dhclient: awg0 [priv] (dhclient)
>=20
> And here is the backtrace:
>=20
> timeout stopping cpus
> panic: ARM64TODO: reclaim_pv_chunk
> cpuid =3D 0
> time =3D 1493345992
> KDB: stack backtrace:
> db_trace_self() at db_trace_self_wrapper+0x28
>        pc =3D 0xffff000000605cc0  lr =3D 0xffff0000000869cc
>        sp =3D 0xffff000083d301d0  fp =3D 0xffff000083d303e0
>=20
> db_trace_self_wrapper() at vpanic+0x164
>        pc =3D 0xffff0000000869cc  lr =3D 0xffff00000031d464
>        sp =3D 0xffff000083d303f0  fp =3D 0xffff000083d30460
>=20
> vpanic() at panic+0x4c
>        pc =3D 0xffff00000031d464  lr =3D 0xffff00000031d2fc
>        sp =3D 0xffff000083d30470  fp =3D 0xffff000083d304f0
>=20
> panic() at reclaim_pv_chunk+0x10
>        pc =3D 0xffff00000031d2fc  lr =3D 0xffff00000061a234
>        sp =3D 0xffff000083d30500  fp =3D 0xffff000083d30500
>=20
> reclaim_pv_chunk() at get_pv_entry+0x240
>        pc =3D 0xffff00000061a234  lr =3D 0xffff000000616184
>        sp =3D 0xffff000083d30510  fp =3D 0xffff000083d30530
>=20
> get_pv_entry() at pmap_enter+0x694
>        pc =3D 0xffff000000616184  lr =3D 0xffff0000006156a0
>        sp =3D 0xffff000083d30540  fp =3D 0xffff000083d305d0
>=20
> pmap_enter() at vm_fault_hold+0x28c
>        pc =3D 0xffff0000006156a0  lr =3D 0xffff0000005b9740
>        sp =3D 0xffff000083d305e0  fp =3D 0xffff000083d30730
>=20
> vm_fault_hold() at vm_fault+0x70
>        pc =3D 0xffff0000005b9740  lr =3D 0xffff0000005b9464
>        sp =3D 0xffff000083d30740  fp =3D 0xffff000083d30770
>=20
> vm_fault() at data_abort+0xe0
>        pc =3D 0xffff0000005b9464  lr =3D 0xffff00000061ad94
>        sp =3D 0xffff000083d30780  fp =3D 0xffff000083d30830
>=20
> data_abort() at handle_el0_sync+0x6c
>        pc =3D 0xffff00000061ad94  lr =3D 0xffff0000006079e8
>        sp =3D 0xffff000083d30840  fp =3D 0xffff000083d30950
>=20
> handle_el0_sync() at 0x400a3de4
>        pc =3D 0xffff0000006079e8  lr =3D 0x00000000400a3de4
>        sp =3D 0xffff000083d30960  fp =3D 0x0000ffffbfdfcd30
>=20
> KDB: enter: panic
> [ thread pid 48988 tid 100230 ]
> Stopped at      kdb_enter+0x44: undefined       d4200000
>=20

=3D=3D=3D
Mark Millard
markmi at dsl-only.net

_______________________________________________
freebsd-arm@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-arm
To unsubscribe, send any mail to "freebsd-arm-unsubscribe@freebsd.org"




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?7C5C4725-984C-4461-B76C-9A6BE052BA00>