From owner-freebsd-stable@freebsd.org Mon Oct 22 00:12:02 2018
Subject: Re: head -r339076's boot loader fails to boot threadripper 1950X system (BTX halted); an earlier version works [ WITHOUT_ZFS= fixes it ]
From: Mark Millard
Date: Sun, 21 Oct 2018 17:11:50 -0700
To: Warner Losh
Cc: FreeBSD Current, FreeBSD-STABLE Mailing List
Message-Id: <83E0D72D-04AD-48A6-89B0-4E4D6B79D749@yahoo.com>
In-Reply-To: <30DD2F47-C8CB-4CEC-8563-C7083D0EAEEF@yahoo.com>
References: <2A425DE4-2B5B-474D-8B95-81890DE4D8A1@yahoo.com> <9D2A6528-F888-4833-A52B-8F9B4D66592C@yahoo.com> <30DD2F47-C8CB-4CEC-8563-C7083D0EAEEF@yahoo.com>
List-Id: Production branch of FreeBSD source code

[Building and installing based on WITHOUT_ZFS= allows the
resulting loader to work correctly on the 1950X.]

On 2018-Oct-21, at 12:05 AM, Mark Millard wrote:

> On 2018-Oct-20, at 10:32 PM, Warner Losh wrote:
>
>> On Sat, Oct 20, 2018 at 11:04 PM Mark Millard wrote:
>> [I found what change led to the 1950X boot crashing
>> with BTX halted.]
>>
>>> On 2018-Oct-20, at 12:44 PM, Mark Millard wrote:
>>>
>>>> [Adding some vintage information for a loader
>>>> that allowed a native boot.]
>>>>
>>>> On 2018-Oct-20, at 4:00 AM, Mark Millard wrote:
>>>>
>>>>> I attempted to jump from head -r334014 to -r339076
>>>>> on a threadripper 1950X board and the native
>>>>> FreeBSD boot failed very early. (Hyper-V use of
>>>>> the same media did not have this issue.)
>>>>>
>>>>> But copying over an older /boot/loader from another
>>>>> storage device with a FreeBSD head version that has
>>>>> not been updated yet got past the problem being
>>>>> reported here. (For other reasons, the kernel has
>>>>> been moved back to -r338804 --and with that,
>>>>> and the older /boot/loader, the 1950X native-boots
>>>>> FreeBSD all the way just fine.)
>>>>
>>>> I found one /boot/loader.old that was dated
>>>> in the updated file system as 2018-May-20,
>>>> instead of 2018-Apr-03 from the older file
>>>> system. May 20 would apparently mean a little
>>>> below -r334014. It native-booted okay, as did
>>>> the April one.
>>>>
>>>> [I do not know how to inspect a /boot/loader*
>>>> to find out what -r?????? it is from.]
>>>>
>>>> Unfortunately, I had done more than one -r339076
>>>> install from -r334014 before rebooting and
>>>> no -r334014 loaders were still present:
>>>> the other *.old files from a few minutes before
>>>> the ones I had the boot problem with.
>>>>
>>>> I might be able to extract loaders from various:
>>>>
>>>> https://artifact.ci.freebsd.org/snapshot/head/r*/amd64/amd64/base.txz
>>>>
>>>> materials and try substituting them in order to
>>>> narrow the range for works -> fails. If I can,
>>>> this likely would take a fair amount of time in
>>>> my context.
>>>>
>>>> Other notes:
>>>>
>>>> It turns out that only Hyper-V based use needed
>>>> a -r334804 kernel: Native booting with the older
>>>> loaders and newer kernels works fine.
>>>>
>>>> Windows 10 Pro 64bit also has no problems
>>>> booting and operating the machine.
>>>>
>>>> The native-boot problem does seem to be FreeBSD
>>>> loader-vintage specific.
>>>>
>>>>> For the BTX failure the display ends up with
>>>>> (hand transcribed, ". . ." for an omission):
>>>>>
>>>>> BTX loader 1.00 BTX version is 1.02
>>>>> Console: internal video/keyboard
>>>>> BIOS drive C: is disk0
>>>>> . . .
>>>>> BIOS drive P: is disk13
>>>>> -
>>>>> int=00000000 err=00000000 efl=00010246 eip=000096fd
>>>>> eax=74d48000 ebx=74d4e5e0 ecx=00000011 edx=00000000
>>>>> esi=74d4e380 edi=74d4e5b0 ebp=00091da0 esp=00091d60
>>>>> cs=002b ds=0033 es=0033 fs=0033 gs=0033 ss=0033
>>>>> cs:eip=66 f7 77 04 0f b7 c0 89-44 24 0c 89 5c 24 04 8b
>>>>>        45 08 89 04 24 83 64 24-10 00 c7 44 24 08 01 00
>>>>> ss:esp=00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
>>>>>        00 00 00 00 00 00 00 00-f0 1d 89 00 00 00 00 00
>>>>> BTX halted
>>>>
>>>> I've no clue what of that output might be loader vintage
>>>> specific. It might not be of use without knowing the
>>>> exact build of the loader.
>>>>
>>>>> The board is a GIGABYTE X399 AORUS Gaming 7 (rev 1.0).
>>>>> It has 96 GiBytes of ECC RAM, just 6 DIMMs installed.
>>>>
>>>> For reference for the board's BIOS:
>>>>
>>>> Version: F11e
>>>> Dated: 2018-Sep-17
>>>> Description: Update AGESA 1.1.0.1a
>>>
>>> Using:
>>>
>>> https://artifact.ci.freebsd.org/snapshot/head/r*/amd64/amd64/base.txz
>>>
>>> materials I found that:
>>>
>>> -r336492: worked (loader vs. zfsloader: not linked)
>>> (no more amd64 builds until . . .)
>>> -r336538: failed (loader vs. zfsloader: linked)
>>>
>>> (Later ones that I tried also failed.)
>>>
>>> Looks like this broke for booting the 1950X
>>> system in question when the following was
>>> checked in:
>>>
>>> Author: imp
>>> Date: Fri Jul 20 05:17:37 2018
>>> New Revision: 336532
>>> URL: https://svnweb.freebsd.org/changeset/base/336532
>>>
>>>
>>> Log:
>>> Collapse zfsloader functionality back down into loader.
>>>
>> Yea, this shouldn't matter. It worked on all the systems I tried it on.
>>
>> So my first question: is this a ZFS system? Second, does it also have UFS? If yes to both, which one do you want it to boot off of?
>
> No zfs in use at all.
> It has been years since
> I experimented with ZFS and reverted back to
> UFS.
>
> # gpart show -l
> =>        40  937703008  da0  GPT  (447G)
>           40       1024    1  FBSDFSSDboot  (512K)
>         1064  746586112    2  FBSDFSSDroot  (356G)
>    746587176   31457280    3  FBSDFSSDswap  (15G)
>    778044456  159383552    4  FBSDFSSDswap2  (76G)
>    937428008     275040       - free -  (134M)
> . . .
>
> Doing:
>
> gpart bootcode -p /boot/gptboot -i 1 da0
>
> and then trying a modern /boot/loader
> did not change anything: still "BTX halted"
> for a native boot. (No problem under Hyper-V.)

I added WITHOUT_ZFS= to my equivalents of src.conf files
for targeting amd64, built, and installed. The result
native-boots just fine.

The crash is somehow specific to loader code tied to
LOADER_ZFS_SUPPORT being defined.

Of course, this leaves me unable to native-boot an official,
modern, unmodified build on the 1950X machine.

While I do not actively use ZFS these days, I'd always left
it built and installed in case I decided to do something with
it at some point. I do not normally try to minimize
configurations.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away
in early 2018-Mar)
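
The loader extraction mentioned in the quoted text above (pulling
/boot/loader out of the base.txz snapshot for a given revision) could
be scripted roughly as below. This is only a sketch and not
necessarily the commands actually used. It assumes that the "r*" in
the artifact URL is the numeric svn revision and that the archive
stores the file as ./boot/loader. The function names snapshot_url and
grab_loader and the ./loaders output directory are made up for
illustration.

```shell
#!/bin/sh
# Sketch only: helper names and output layout are hypothetical.

# Build the snapshot URL for one head revision, following the
# URL pattern quoted in the message (r* taken to be the revision).
snapshot_url() {
    printf 'https://artifact.ci.freebsd.org/snapshot/head/r%s/amd64/amd64/base.txz\n' "$1"
}

# Download base.txz for one revision and extract only the loader.
# Assumes FreeBSD's fetch(1) and a tar that auto-detects xz
# compression (bsdtar does).
grab_loader() {
    rev=$1
    out=${2:-./loaders}
    mkdir -p "$out/r$rev" || return 1
    fetch -o "$out/r$rev/base.txz" "$(snapshot_url "$rev")" || return 1
    # Assumption: the member is named ./boot/loader inside the archive.
    tar -xf "$out/r$rev/base.txz" -C "$out/r$rev" ./boot/loader
}
```

For example, "grab_loader 336492" followed by "grab_loader 336538"
would leave the two loaders under ./loaders/r336492/boot/ and
./loaders/r336538/boot/, ready to be copied into place one at a time
(after saving the current /boot/loader) to compare works vs. fails.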