Date:      Sun, 07 May 2017 13:56:46 +0300
From:      Toomas Soome <tsoome@me.com>
To:        Julian Elischer <julian@freebsd.org>
Cc:        Warner Losh <imp@bsdimp.com>, freebsd-current <freebsd-current@freebsd.org>, Toomas Soome <tsoome@freebsd.org>, Andriy Gapon <avg@freebsd.org>, Colin Percival <cperciva@freebsd.org>
Subject:   Re: bootcode capable of booting both UFS and ZFS? (Amazon/ec2)
Message-ID:  <BC4D86D1-E17D-48CE-AB37-40AA4002BE8B@me.com>
In-Reply-To: <55ef7994-eac7-5639-0905-345a2a2d5bea@freebsd.org>
References:  <963c5c97-2f92-9983-cf90-ec9d59d87bba@freebsd.org> <053354DF-651F-423C-8057-494496DA3B91@me.com> <972d2a0b-862c-2510-090d-7e8f5d1fce4d@freebsd.org> <CANCZdfoPF2rxD50n=HgYRkwZLhX2XOAeVcMiJ8Z=3Q9wvcog-w@mail.gmail.com> <55ef7994-eac7-5639-0905-345a2a2d5bea@freebsd.org>


> On 7. mai 2017, at 13:18, Julian Elischer <julian@freebsd.org> wrote:
>
> On 7/5/17 1:45 pm, Warner Losh wrote:
>> On Sat, May 6, 2017 at 10:03 PM, Julian Elischer <julian@freebsd.org> wrote:
>>> On 6/5/17 4:01 am, Toomas Soome wrote:
>>>>
>>>>> On 5. mai 2017, at 22:07, Julian Elischer <julian@freebsd.org> wrote:
>>>>>
>>>>> Subject says it all really; is this an option at this time?
>>>>>
>>>>> We'd like to try booting the main ZFS root partition and then fall back to a
>>>>> small UFS-based recovery partition. Is that possible?
>>>>>
>>>>> I know we could use GRUB, but I'd prefer to keep it in the family.
>>>>>
>>>>
>>>> It is, sure. But there is a compromise to be made for it.
>>>>
>>>> Let's start with what I have done in the illumos port, as the idea there is
>>>> exactly to have binaries as "universal" as possible (just the binaries
>>>> are listed below to show the sizes):
>>>>
>>>> -r-xr-xr-x   1 root     sys       171008 apr 30 19:55 bootia32.efi
>>>> -r-xr-xr-x   1 root     sys       148992 apr 30 19:55 bootx64.efi
>>>> -r--r--r--   1 root     sys         1255 okt 25  2015 cdboot
>>>> -r--r--r--   1 root     sys       154112 apr 30 19:55 gptzfsboot
>>>> -r-xr-xr-x   1 root     sys       482293 mai  2 21:10 loader32.efi
>>>> -r-xr-xr-x   1 root     sys       499218 mai  2 21:10 loader64.efi
>>>> -r--r--r--   1 root     sys          512 okt 15  2015 pmbr
>>>> -r--r--r--   1 root     sys       377344 mai  2 21:10 pxeboot
>>>> -r--r--r--   1 root     sys       376832 mai  2 21:10 zfsloader
>>>>
>>>> The loader (BIOS/EFI) is built with the full complement: zfs, ufs, dosfs,
>>>> cd9660, nfs, tftp + gzipfs. The cdboot starts zfsloader (that's a trivial
>>>> string change).
>>>>
>>>> The gptzfsboot in the illumos case is built with only zfs, dosfs and ufs, as
>>>> it only has to support disk-based media to read out the loader. Also, I am
>>>> building gptzfsboot with libstand and libi386 to get as much shared code as
>>>> possible, which has both good and bad sides, as usual ;)
>>>>
>>>> The gptzfsboot size means that with ufs a dedicated boot partition is
>>>> needed (freebsd-boot); with zfs the illumos port always uses the 3.5MB
>>>> boot area after the first 2 labels (as there is no geli, illumos does not
>>>> need a dedicated boot partition with zfs).
>>>>
>>>> As freebsd-boot is currently created at 512k, the size is not an issue.
>>>> Also, using common code allows the generic partition code to be used, so
>>>> GPT/MBR/BSD (VTOC in the illumos case) labels are not a problem.
>>>>
>>>> So, even just with CD boot (iso) starting zfsloader (which in fbsd has
>>>> built-in ufs, zfs etc.), you can already get rescue capability.
>>>>
>>>> Now, even just by adding a ufs reader to gptzfsboot, we can use GPT +
>>>> freebsd-boot with a ufs root loading zfsloader on a usb image, so it can be
>>>> used for both live/install and rescue, because zfsloader itself has support
>>>> for all file systems + partition types.
>>>>
>>>> I have kept myself a bit away from the freebsd gptzfsboot for a simple
>>>> reason: the older setups have a smaller freebsd-boot size, and not
>>>> everyone is necessarily happy about size changes :D Also, in the freebsd case
>>>> there is another factor called geli; it most certainly contributes some
>>>> bits, but it also needs to be properly addressed in the IO call stack (as we
>>>> have seen with the zfsbootcfg bits). But then again, here too the shared
>>>> code can help to reduce the complexity.
>>>>
>>>> Yes, the zfsloader/loader*.efi in the listing above is actually built
>>>> with framebuffer code and a compiled-in 8x16 default font (lz4-compressed
>>>> ascii+boxdrawing, basically; because zfs has lz4, the decompressor is always
>>>> there), and ficl 4.1, so that's a bit of a difference from the fbsd loader.
>>>>
>>>> Also note that we can still build the smaller dedicated blocks like boot2,
>>>> just that we cannot use those blocks for the more universal cases, and
>>>> eventually those special cases will diminish.
>>>
>>> Thanks for that.
>>>
>>> So, here's the exact problem I need to solve:
>>> FreeBSD 10 (or newer) on Amazon EC2.
>>> We need a plan for recovering from the scenario where something goes
>>> wrong (e.g. during an upgrade) and we are left with a system where the
>>> default zpool rootfs points to a dataset that doesn't boot. It is possible
>>> that maybe the entire pool is unbootable into multi-user. Maybe somehow it
>>> filled up? Who knows. It's hard to predict future problems.
>>> There is no console access at all, so there is no possibility of human
>>> intervention. So all recovery paths that start with "enter single-user mode
>>> and..." are unusable.
>>>
>>> The customers who own the Amazon account are not crazy about giving us the
>>> keys to the kingdom as far as all their EC2 instances go, so taking the root
>>> drive off a 'sick' VM and grafting it onto a freebsd instance to 'repair' it
>>> becomes a task we don't really want to have to ask them to do. They may not
>>> have the in-house expertise to do it confidently.
>>>
>>> This leaves us with automatic recovery, or at least automatic methods of
>>> getting access to that drive from the network.
>>> Since the regular root is zfs, my gut feeling is that to reduce the chances
>>> of confusion during recovery, I'd like the (recovery) system itself to be
>>> running off a UFS partition, and potentially with a memory root filesystem.
>>> As long as it can be reached over the network, we can then take over.
>>>
>>> We'd also like to have boot environment support in the bootcode.
>>> So, what would be the minimum set we'd need?
>>>
>>> UFS support, zfs support, BE support, and support for selecting a completely
>>> different boot procedure after some number of boot attempts without getting
>>> all the way to multi-user.
>>>
>>> How does that come out size-wise? And what do I need to configure to get
>>> that?
>>>
>>> The current EC2 instances have a 64kB boot partition, but I have a window
>>> to convince management to expand that if I have a good enough argument
>>> (since we are doing a repartition on the next upgrade, which is "special":
>>> it's our upgrade to 10.3 from 8.0).
>>> Being able to self-heal, or at least 'get at' a sick instance, might be a
>>> good enough argument and would make the EC2 instances the same as all the
>>> other versions of the product.
>> You should convince them to move to 512k post-haste. I doubt 64k will
>> suffice, and 512k is enough to get all the features you desire.
>
> Yeah, I know, but sometimes convincing management of things is like
> banging one's head against a wall.
> Don't think I haven't tried, and won't keep trying.
>

To support recovery there are 2 scenarios:

1. Something has gone bad and you boot from alternate media (iso/usb/net), log in, and fix the setup.
2. If alternate media is not available, there has to be a recovery "image", preferably isolated from the rest of the system, such as a recovery partition.

The second option needs a mechanism to get activated; something like "try normal boot X times, then use recovery". The zfsbootcfg Andriy did currently provides the reverse option: try this config, and if it fails, fall back to normal. But that work can nevertheless be used as a base, to provide not a one-time [next] boot config, but a fallback.
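A sketch of how such an attempt counter could look as an early boot script; the counter path, the retry limit, and the recovery-activation step are all assumptions for illustration (a real setup would persist the counter somewhere that survives reboot and clear it once multi-user is reached):

```shell
#!/bin/sh
# Hypothetical boot-attempt counter: after MAX_TRIES failed boots,
# switch to the recovery config. Paths and names are illustrative.
COUNTER=/tmp/boot_attempts    # assumption: real setup needs a reboot-safe location
MAX_TRIES=3

n=$(cat "$COUNTER" 2>/dev/null || echo 0)
n=$((n + 1))
echo "$n" > "$COUNTER"

if [ "$n" -gt "$MAX_TRIES" ]; then
    # here one could e.g. invoke zfsbootcfg to point at the recovery dataset
    echo "too many failed boots, activating recovery"
else
    echo "normal boot attempt $n"
fi
# A late rc script, run once multi-user is reached, would do: rm -f "$COUNTER"
```

The counterpart script that clears the counter at the end of a successful boot is what turns "attempts" into "failed attempts".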

Of course, something like a "recovery partition" would need to be architected to be as foolproof as possible, but it definitely is possible.
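For sizing reference, creating the 512k freebsd-boot partition discussed above and installing gptzfsboot into it looks roughly like this on FreeBSD; the device name ada0 and the partition index are just examples, and these commands are of course destructive:

```shell
# Illustrative only -- wipes the disk; device name is an example.
gpart create -s gpt ada0                  # fresh GPT table
gpart add -t freebsd-boot -s 512k ada0    # 512k boot partition (index 1)
gpart add -t freebsd-zfs ada0             # rest of the disk for the pool
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
```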

BTW: this is a bit specific to illumos and zfs, but some concerns and ideas from the comments are still worth noting: https://www.illumos.org/rb/r/249/ - in particular, the pad area should actually hold not a simple string, but some structure to allow different semantics (next boot or fallback boot, maybe something else).

rgds,
toomas




