Date:      Wed, 11 Nov 2015 10:01:36 +0200
From:      Andriy Gapon <avg@FreeBSD.org>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        freebsd-current@FreeBSD.org, Hans Petter Selasky <hps@selasky.org>, FreeBSD Hackers <freebsd-hackers@FreeBSD.org>
Subject:   Re: strange kernel crash
Message-ID:  <5642F5E0.4050402@FreeBSD.org>
In-Reply-To: <18887451.3zmRk4crln@ralph.baldwin.cx>
References:  <563C8CED.3020101@FreeBSD.org> <2278845.gkxYBUMIWE@ralph.baldwin.cx> <5641AF48.1000507@FreeBSD.org> <18887451.3zmRk4crln@ralph.baldwin.cx>

On 10/11/2015 20:42, John Baldwin wrote:
> On Tuesday, November 10, 2015 10:48:08 AM Andriy Gapon wrote:
>> On 09/11/2015 22:16, John Baldwin wrote:
>>> On Friday, November 06, 2015 07:02:59 PM Hans Petter Selasky wrote:
>>>> On 11/06/15 12:20, Andriy Gapon wrote:
>>>>> Now the strange part:
>>>>>
>>>>>     0xffffffff80619a18 <+744>:   jne    0xffffffff80619a61 <__mtx_lock_flags+817>
>>>>>     0xffffffff80619a1a <+746>:   mov    %rbx,(%rsp)
>>>>> => 0xffffffff80619a1e <+750>:   movq   $0x0,0x18(%rsp)
>>>>>     0xffffffff80619a27 <+759>:   movq   $0x0,0x10(%rsp)
>>>>>     0xffffffff80619a30 <+768>:   movq   $0x0,0x8(%rsp)
>>>>
>>>> Were these instructions dumped from RAM or from the kernel ELF file?
>>>
>>> Probably not from RAM.  You can use 'info files' in gdb to see what is
>>> handling the address range in question (core vs executable).  x/i in ddb
>>> would have been the "real" truth.
>>
>> Yes, according to the output of 'info files' it looks like gdb would read
>> that data from the text section of the kernel file.
>>
>> How about libkvm?  Would kvm_read read data from the core file?
> 
> kvm_read should only access the vmcore, yes.
> 
>> I've written the following small program (cut down dmesg.c, actually):
>> https://people.freebsd.org/~avg/vmcore_read.c
>>
>> (kgdb) disassemble /r
>> => 0xffffffff80619a1e <+750>:   48 c7 44 24 18 00 00 00 00      movq   $0x0,0x18(%rsp)
>>
>> $ vmcore_read -N /boot/kernel.29/kernel -M /var/crash/vmcore.29 0xffffffff80619a1e 9
>> 48 c7 44 24 18 00 00 00 00
>>
>> Seems like the code is intact.
>>
>> P.S.
>> 1. To correct something I said earlier, the fault is #UD, not #GP.
>> 2. The only "suspicious" activity at the time of the crash was the execution of
>> a bhyve VM.
> 
> Was the crash in the guest or the host?  #UD seems even more bizarre.

It was the host.  This is bizarre indeed.  I can think of only two possibilities:
- a new CPU erratum
- corrupted data somehow getting into the instruction cache, while the correct
  data was read back from memory during the crash dump (i.e. flaky memory)
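
As an aside, the byte comparison quoted above (the bytes kgdb shows from the
kernel file against the bytes read from the vmcore) could be automated with
something like the following.  This is only a sketch: parse_hex_bytes() and
first_mismatch() are hypothetical helpers, not part of vmcore_read.c, and the
input is assumed to be space-separated hex bytes as printed above.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse a string of space-separated hex bytes ("48 c7 44 ...") into buf.
 * Returns the number of bytes parsed, or -1 on malformed input or if
 * buf is too small.  (Hypothetical helper, for illustration only.)
 */
static int
parse_hex_bytes(const char *s, unsigned char *buf, size_t buflen)
{
	size_t n = 0;
	char *end;
	unsigned long v;

	for (;;) {
		while (*s == ' ')
			s++;
		if (*s == '\0')
			break;
		v = strtoul(s, &end, 16);
		if (end == s || v > 0xff || n >= buflen)
			return (-1);
		buf[n++] = (unsigned char)v;
		s = end;
	}
	return ((int)n);
}

/*
 * Compare two byte sequences of the same length.
 * Returns the index of the first differing byte, or -1 if they match.
 */
static int
first_mismatch(const unsigned char *a, const unsigned char *b, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (a[i] != b[i])
			return ((int)i);
	return (-1);
}
```

Feeding it the two nine-byte dumps quoted above should yield -1 from
first_mismatch(), which is the "code is intact" case.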

-- 
Andriy Gapon


