Date:      Sun, 25 Mar 2018 13:15:08 -0700
From:      Mark Millard <marklmi26-fbsd@yahoo.com>
To:        Mark Johnston <markj@FreeBSD.org>, FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: head -r331499 amd64/threadripper panic in vm_page_free_prep during "poudriere bulk -a", after 14h 22m or so.
Message-ID:  <0612D846-F99F-4C55-AAD2-C2BCE098F069@yahoo.com>
In-Reply-To: <44821CA4-19C2-4265-8E83-568452DF6471@yahoo.com>
References:  <8D9C49CB-957E-40A5-8EB0-D90D8AC02060@yahoo.com> <20180325183421.GA74365@raichu> <44821CA4-19C2-4265-8E83-568452DF6471@yahoo.com>

[Just an added note about where in the sequence panic
messages are sent to the console vs. could potentially
be sent to the console.]

> On 2018-Mar-25, at 12:32 PM, Mark Millard <marklmi26-fbsd at yahoo.com> wrote:
>
> On 2018-Mar-25, at 11:34 AM, Mark Johnston <markj at FreeBSD.org> wrote:
>
>> On Sun, Mar 25, 2018 at 10:41:38AM -0700, Mark Millard wrote:
>>> FreeBSD panic'd while attempting to see if a "poudriere bulk -w -a"
>>> would get the "unnecessary swapping" problem in my UFS-only context,
>>> -r331499 (non-debug but with symbols), under Hyper-V. This is a
>>> Ryzen Threadripper context, but I've no clue if that is important
>>> to the problem. This was after 14 hours or so of building:
>>>
>>> . . .
>>> [14:22:05] [18] [00:01:16] Finished devel/p5-Test-HTML-Tidy | p5-Test-HTML-Tidy-1.00_1: Success
>>> [14:22:08] [18] [00:00:00] Building devel/ocaml-camlp5 | ocaml-camlp5-6.16
>>>
>>> So I've no clue if or how to repeat this.
>>>
>>> Unfortunately dump was unsuccessful.
>>
>> What happened?
>
> It reported:
>
> (da1:storvsc1:0:0:0) WRITE(10). CDB 2a 00 35 24 37 c7 00 00 0 00
> (da1:storvsc1:0:0:0) CAM status Command timeout
> (da1:storvsc1:0:0:0) Error 5, Retries exhausted
> Aborting dump due to I/O error.
>
> ** DUMP FAILED (ERROR 5) **
> = 0x5
>
>>> So all I have is the
>>> backtrace. Hand typed from a screen shot of the console
>>> window:
>>
>> Do you know what the panic message was? There are multiple calls to
>> panic() in vm_page_free_prep().
>
> No. I listed what I could see. The console screen does not have many
> lines or rows and I was sleeping when the panic happened.

I sometimes wonder if panic should repeat the panic message at the
end of the backtrace in order to deal with keeping it visible in
row-restricted console contexts.
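
A minimal user-space sketch of that idea, just to show the output
ordering I have in mind. This is not kernel code: report_panic(), its
parameters, and the "panic (repeated):" prefix are all hypothetical
names made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical mock (not the kernel's panic()): print the message,
 * then the backtrace frames, then the message a second time so that
 * it is still the last thing visible on a console with few rows.
 */
void
report_panic(FILE *out, const char *msg, const char *const *frames,
    size_t nframes)
{
	size_t i;

	fprintf(out, "panic: %s\n", msg);
	for (i = 0; i < nframes; i++)
		fprintf(out, "#%zu %s\n", i, frames[i]);
	/* Repeat the message after the backtrace has scrolled by. */
	fprintf(out, "panic (repeated): %s\n", msg);
}
```

Calling it with an array of frame strings and stdout prints the
message both before and after the frames, so a console that only
keeps the last screenful would still show which panic() call fired.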

> I redid a buildworld buildkernel installkernel installworld sequence
> since then and it looks like the detailed addresses changed (as seen
> in objdump now vs. what was on the console). But the relative offset
> in vm_page_free_prep seems to be a match, at least for the instruction
> after the "callq panic".
>
> Looking at the kernel code I see:
>
> . . .
> <vm_page_free_prep+0x10> mov    0xffffffff81843690,%rax
> <vm_page_free_prep+0x18> mov    $0xffffffff81d6d880,%rcx
> <vm_page_free_prep+0x1f> sub    %rcx,%rax
> <vm_page_free_prep+0x22> addq   $0x1,%gs:(%rax)
> <vm_page_free_prep+0x27> mov    0x54(%rbx),%eax
> <vm_page_free_prep+0x2f> and    $0x1,%eax
> <vm_page_free_prep+0x32> jne    <vm_page_free_prep+0x15a>
> . . .
> (several paths reach +0x106)
> <vm_page_free_prep+0x106> movw   $0x0,0x64(%rbx)
> <vm_page_free_prep+0x10c> cmpl   $0x0,0x50(%rbx)
> <vm_page_free_prep+0x110> jne    <vm_page_free_prep+0x163>
> . . .
> <vm_page_free_prep+0x15a> mov    $0xffffffff8116628b,%rdi
> <vm_page_free_prep+0x161> jmp    <vm_page_free_prep+0x16a>
> <vm_page_free_prep+0x163> mov    $0xffffffff8120ca97,%rdi
> <vm_page_free_prep+0x16a> xor    %eax,%eax
> <vm_page_free_prep+0x16c> mov    %rbx,%rsi
> <vm_page_free_prep+0x16f> callq  <panic>
> <vm_page_free_prep+0x174> nopw   %cs:0x0(%rax,%rax,1)
>
> No KASSERTS present (a non-debug build). That leaves:
>
>        if (vm_page_sbusied(m))
>                panic("vm_page_free: freeing busy page %p", m);
> and:
>
>        if (m->wire_count != 0)
>                panic("vm_page_free: freeing wired page %p", m);
>
> I do not have anything that lets me differentiate which
> occurred based on the above detail. Sorry.
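
The reason the backtrace offset cannot tell the two apart can be
sketched in C: in the listing above each check only selects a format
string (+0x15a or +0x163) and both then reach the single callq at
+0x16f, so the return address (+0x174) is the same either way. The
struct, its field names and layout, the bit constant, and the helper
name below are hypothetical stand-ins, not the kernel's definitions.
Mapping 0x54(%rbx) to the busy test and 0x50(%rbx) to the wire count
is my assumption from the order of the checks.

```c
#include <stddef.h>
#include <stdint.h>

#define	FAKE_SHARED_BIT	0x1u	/* assumed: the bit tested by "and $0x1,%eax" */

/* Hypothetical stand-in for the two fields the disassembly tests. */
struct fake_page {
	uint32_t wire_count;	/* assumed: cmpl $0x0,0x50(%rbx) */
	uint32_t busy_lock;	/* assumed: mov 0x54(%rbx),%eax */
};

/*
 * Return the format string that would be handed to panic(), or NULL
 * if neither check fires.  Both paths end at the same return, which
 * models the shared "callq panic": the caller sees one call site no
 * matter which string was selected.
 */
const char *
free_prep_panic_fmt(const struct fake_page *m)
{
	const char *fmt = NULL;

	if ((m->busy_lock & FAKE_SHARED_BIT) != 0)
		fmt = "vm_page_free: freeing busy page %p";	/* like +0x15a */
	else if (m->wire_count != 0)
		fmt = "vm_page_free: freeing wired page %p";	/* like +0x163 */
	return (fmt);		/* like the shared +0x16a..+0x16f tail */
}
```

Only the value loaded into %rdi differs between the two paths, and
that value is not part of what the backtrace printed on the console.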


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



