Date:      Sun, 11 Nov 2018 01:07:44 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Guido Falsi <mad@madpilot.net>
Cc:        freebsd-current@freebsd.org
Subject:   Re: 13.0 failing to boot multiuser on one PC due to system utilities crashing during rc scipt
Message-ID:  <20181110230744.GN2378@kib.kiev.ua>
In-Reply-To: <791e3488-b838-5cfd-8dca-8db8c74167a0@madpilot.net>
References:  <62bdb5ff-4d68-cf52-4dd5-f0a3cfa1c788@madpilot.net> <791e3488-b838-5cfd-8dca-8db8c74167a0@madpilot.net>

On Sat, Nov 10, 2018 at 05:27:09PM +0100, Guido Falsi wrote:
> On 10/11/18 13:08, Guido Falsi wrote:
> > Hi,
> > 
> > Today I was updating my home machines to recent head, r340303.
> > Previously I was running r339449.
> > 
> > I have a build machine where I build base packages (it also runs
> > poudriere). I updated that machine using packages I built successfully.
> > It is running fine and has also successfully rebuilt a full ports package
> > set on the new head.
> > 
> > After that I upgraded another machine, using the same package set: a PC
> > from around three years ago with an i5. After the upgrade the kernel boots
> > fine, but when running the rc scripts to go multiuser some system
> > utilities fail, especially zfs, making it impossible for the machine to
> > complete the boot process.
> > 
> > I have also tested booting from these memstick snapshot images:
> > 
> > FreeBSD-13.0-CURRENT-amd64-20181107-r340239-memstick.img
> > FreeBSD-13.0-CURRENT-amd64-20181101-r339979-memstick.img
> > 
> > and both are also failing to go multiuser. The utility failing in this
> > case is fsck, which, like zfs before, dumps core.
> > 
> > I see a pattern where only disk related utilities crash.
> > 
> > The 12.0-BETA4 installation memstick works fine though.
> > 
> > So clearly something changed between r339449 and r340303 which causes
> > incompatibility with my hardware.
> > 
> > I'll try to bisect things, but it will be a slow process.
> 
> I narrowed it down to r339895.
I somehow doubt that this is the case.

If you take a post-r339895 kernel and start e.g. an 11.2-RELEASE userspace
(untar the installation into a jail to avoid reinstalling), does it
still demonstrate the behaviour?

Also try running a pre-r339895 kernel with a 12.0 userspace, e.g. from the
12.0-BETA4 builds.

> 
> I'm not sure why it fails; it goes beyond my knowledge. The change looks
> harmless, but clearly isn't.
Usually it means that the bisect went wrong and your environment failed
to cleanly isolate the change.
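
For reference, a minimal sketch of the bisection logic in Python. It
assumes that the starting revision is known good, the ending revision is
known bad, and every revision after the first bad one is also bad. The
is_bad callback is hypothetical: it stands for "build, install, boot and
see whether fsck or zfs dump core", and is not something that exists in
the tree.

```python
def first_bad_revision(good, bad, is_bad):
    """Return the first revision in (good, bad] for which is_bad() is true.

    Assumes `good` is known good, `bad` is known bad, and that every
    revision after the first bad one is also bad.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid   # the first bad revision is at or before mid
        else:
            good = mid  # the first bad revision is after mid
    return bad


# Illustration only: pretend the breakage really starts at r339895.
print(first_bad_revision(339449, 340303, lambda rev: rev >= 339895))
```

The search trusts every answer from is_bad. If one test step gives the
wrong answer, for instance because kernel and userspace were not from
the same revision, the search still converges, but possibly on an
unrelated revision.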

> 
> My impression is that the other conditions not moved inside the ifunc
> also play a role, so such an optimization is not possible on all systems.
> 
> > 
> > I have put dmesg and pciconf output here in case it could be useful:
> > 
> > https://people.freebsd.org/~madpilot/boot_fail/
This is Haswell, right?  It is exactly the same micro-architecture as the
machine where I tested this series of changes.


