Date:      Mon, 12 Nov 2018 11:52:25 +0100
From:      Guido Falsi <mad@madpilot.net>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-current@freebsd.org
Subject:   Re: 13.0 failing to boot multiuser on one PC due to system utilities crashing during rc scipt
Message-ID:  <d23f5ea2-c55d-607e-989b-502807b60fe2@madpilot.net>
In-Reply-To: <20181111211434.GS2378@kib.kiev.ua>
References:  <62bdb5ff-4d68-cf52-4dd5-f0a3cfa1c788@madpilot.net> <791e3488-b838-5cfd-8dca-8db8c74167a0@madpilot.net> <20181110230744.GN2378@kib.kiev.ua> <5176caee-126f-2709-d09a-0dcf5190e319@madpilot.net> <fa1aedd3-a7ea-ffe2-9614-03054cbed30b@madpilot.net> <20181111211434.GS2378@kib.kiev.ua>

On 11/11/18 22:14, Konstantin Belousov wrote:
> On Sun, Nov 11, 2018 at 08:44:24PM +0100, Guido Falsi wrote:
>> On 11/11/18 11:10, Guido Falsi wrote:
>>> On 11/11/18 00:07, Konstantin Belousov wrote:
>> I performed these tests. I downloaded the 12.0-BETA4 and 11.2
>> installation images and replaced the kernels in there. This was faster
>> than working with jails on a crippled system.
>>
>> r339895 kernel on 11.2-RELEASE causes fsck (launched by rc) to dump core
>> and this stops the boot procedure.
>>
>> r339894 kernel on 12.0-BETA4 works fine.
> 
> Ok, let try to find some reason.

The requested files are accessible here:

https://www.madpilot.net/cloud/s/Q9DAGrntnneomSs

> 
> - When you build your kernels, you do not use any cpu-specific optimization
>   flags, do you ?  More, you follow the standard build procedure and your
>   make.conf and src.conf are empty, right ?

At the start I did have some optimizations, but I disabled them all.

I'm building with 'make -j buildkernel'. I usually enable META_MODE, but
I also disabled that and even wiped out the contents of /usr/obj
multiple times to make sure I was getting a clean build.

> - Do you preload a microcode update from the loader ?

Not at present; I load it later via rc scripts.

I do want to test this, though, and I'll report back if it changes
anything.

> - Show the output of sysctl vm.pmap.
> - Show verbose dmesg from the boot of the problematic kernel.
>   You posted non-verbose dmesg for 12.0-BETA4.

Posted at the link above.

> - Enter ddb, when booted the problematic kernel.  Do
>   db> x/x cpu_stdext_feature

cpu_stdext_feature:     281

>   db> x/x cpu_stdext_feature+4

cpu_stdext_feature2:    0
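
For readers following along, the first value can be decoded by hand. The
sketch below assumes that ddb's x/x prints hexadecimal (so the value is
0x281) and that cpu_stdext_feature holds the EBX output of CPUID leaf 7,
subleaf 0. The table is a hypothetical, partial list of those bits, written
for illustration and not copied from the kernel headers; the function name
decode_stdext is likewise made up here.

```python
# Partial table of CPUID.(EAX=7,ECX=0):EBX feature bits.
# Assumption: this is the word the kernel stores in cpu_stdext_feature.
# Only a few well-known bits are listed; anything else prints as "bitN".
STDEXT_BITS = {
    0: "FSGSBASE",
    1: "TSC_ADJUST",
    3: "BMI1",
    4: "HLE",
    5: "AVX2",
    7: "SMEP",
    8: "BMI2",
    9: "ERMS",
    10: "INVPCID",
    11: "RTM",
}


def decode_stdext(value):
    """Return the names of the bits set in a 32-bit feature word."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            names.append(STDEXT_BITS.get(bit, "bit%d" % bit))
    return names


# Value reported by ddb above, read as hexadecimal.
print(decode_stdext(0x281))
```

Under those assumptions the set bits are FSGSBASE, SMEP and ERMS, which may
be worth reading alongside the set_pcb_flags_raw call in the disassembly
further down.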

> - From the same ddb session, disassemble e.g. cpu_set_user_tls().
>   You could paste me whole disassembling, but really I want to know
>   the single line with the call to set_pcb_flagsXXXX, it should be
>   either set_pcb_flags_raw or set_pcb_flags_fsgsbase.  To disassemble
>   in ddb, do
>   db> x/i cpu_set_user_tls
>   and then press <enter> more to get next and next instructions.
>   (I want the disassembly from ddb and not from gdb/kgdb).

cpu_set_user_tls+0x2d:  call    set_pcb_flags_raw


The full ddb session capture is posted at the link above.

> - Try the following patch.
> 

The patch does produce a working kernel. In fact I'm running that kernel
now.

I've also added the broken kernel, together with its kernel.debug file,
as a txz archive at the URL posted above.

Hope this helps. Thanks for your time and effort!

-- 
Guido Falsi <mad@madpilot.net>


