Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 26 Dec 2016 19:39:23 -0800
From:      perryh@pluto.rain.com (Perry Hutchison)
To:        smithi@nimnet.asn.au
Cc:        marcel@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: bugs in memstick GPT table, and in gpart(8)
Message-ID:  <5861e26b.HX9c2TvEoZnz0K2K%perryh@pluto.rain.com>
In-Reply-To: <20161227012948.K26979@sola.nimnet.asn.au>
References:  <57e87099.y0HRwon108cNG6uj%perryh@pluto.rain.com> <57ddefcf.jk4kZ2Kp%2BEcHfcFr%perryh@pluto.rain.com> <57d4c674.bo2VPtTSitHw8Suf%perryh@pluto.rain.com> <58606c72.4VCGEwexwvhtqyXW%perryh@pluto.rain.com> <20161227012948.K26979@sola.nimnet.asn.au>

next in thread | previous in thread | raw e-mail | index | archive | help
Ian Smith <smithi@nimnet.asn.au> wrote:
> On Sun, 25 Dec 2016 17:03:46 -0800, Perry Hutchison wrote:
> > ...
> > The 10.3-RELEASE-i386-memstick has the smallest possible GPT table
> > (one sector), three GPT table entries (boot, rootfs, & swap), and
> > hdr_entries == 3 in the GPT header ...
> > 
> > Since the GPT table is always a whole number of sectors, and each
> > sector has room for four GPT table entries, there's no reason for
> > hdr_entries not to be a multiple of 4.  Whatever constructed the
> > 10.3-RELEASE-i386-memstick image should have rounded it up.
>
> That would be mkimg(1).  Recall that the 10.3-R amd64 memstick.img still 
> (or rather, again?) used the BSD scheme (daXa) rather than GPT, due to a 
> some-BIOSes issue.  In 11+ both use mkimg, both adding a 1MB 'vestigial' 
> swap partition to keep some BIOSes happy (thanks to Glen who pointed me 
> to the relevant r265017)

There is something truly horrifying about a report that "some BIOSes"
require a FreeBSD-swap partition in order to boot.  (I could understand
FreeBSD _itself_ insisting on having a swap partition, but the BIOS?)

> However in both cases, this is affected by a quite early commit to 
> mkimg.c: https://svnweb.freebsd.org/base?view=revision&revision=263382
>
> "Also (...), remove the enforcement of creating a GPT table with at 
> least 128 entries. While this is generally advised as the default or 
> minimum, it's not actually a hard requirement. We now recreate a table 
> that's precisely enough (rounded of course)."
>
> So '3' definitely seems an unrounded result, long after this commit.
>
> Have you checked this against an 11.0-RELEASE-i386-memstick.img ?

I hadn't, because I don't generally deal with .0 releases :)
but I just checked and it still has gpt_hdr.hdr_entries == 3.

> > The GPT support in the kernel and in gpart(8) should handle
> > hdr_entries consistently:  either always round it up to a multiple
> > of four, or always take it at face value.  It should not be possible
> > for gpart(8) to create a partition which will effectively disappear
> > the next time the provider is tasted.
>
> Agreed.  I'm not sure, even after checking the _possibly somewhat_ 
> authoritative https://en.wikipedia.org/wiki/GUID_Partition_Table
> whether this refers to the maximum (as I suspect) or the current
> number of partitions, as listed in 'GPT header format':
>
> 72 (0x48)  8 bytes  Starting LBA of array of partition entries ...
> 80 (0x50)  4 bytes  Number of partition entries in array
> 84 (0x54)  4 bytes  Size of a single partition entry (usually 80h or 128)
>
> Also from that, we're apparently disregarding: "The UEFI specification 
> stipulates that a minimum of 16,384 bytes, regardless of sector size,
> be allocated for the Partition Entry Array."  Ie, 128 entries.
>
> Which is probably just as well, though losing 16KB at the start and end 
> of a physical disk seems fairly minute these days, even on 1GB sticks,

It won't matter, until someday when it makes the difference between an
image fitting onto a stick of a given size or not :)

I have no complaint with minimizing wasted space, but limiting the size
to 3 when it necessarily occupies a multiple of 4 goes too far IMO.

> Still, there are various references, as in mkimg(1) above, stating this 
> is not required.  Similarly in gpart(8), optional for 'create' we have:
>
>     -n entries  The number of entries in the partition table ...
>                 By default, partition tables are created with the
>                 minimum number of entries.
>
> In none of the instances of use of gpart(8) I've seen in any /release is 
> that -n flag used.  mkimg(1) doesn't take anything like the -n switch, 
> though perhaps it could / should?

mkimg(1) seems to set the number of partitions created according to
the number of -p switches provided on the command line -- but I'd
think it ought to round up rather than leave part of a sector unused.

> > The willingness of gpart(8) to add a partition exceeding the defined
> > size of the GPT table is a recent regression (as well as, arguably,
> > a buffer overrun):
>
> I'm surprised it didn't extend that '3' count (clearly bogus, not
> at all rounded) to 4 when it added another partition.

IMO it should either have enforced the limit of 3 -- with an accurate
message (not FreeBSD 8's misleading "No space left on device") -- or
incremented it (that being possible in this case since there was room
to enlarge the table).  However ...

> It didn't even need to allocate another table sector, as going from
> 4 to 5 or more would.

Allocating another sector would be a _very_ large and risky
operation if, as would usually be the case, the sector needed for
expansion of the primary table is part of an already-created (and,
in this case, not-empty) partition.  Every partition up to the
first unallocated sector on the stick would need to be moved, and
interrupting that lengthy process would corrupt the partition being
moved at the time.  It gets even worse if one of the partitions to
be moved contained any absolute LBAs (not relative to the base of
the partition).

> I totally agree that memsticks ought to be extendable; the root fs is RO 
> anyway, but there's no reason not to allow other RW partition/s that one 
> could mount from 'live CD' or a single user boot, as you wanted.

In this case the plan was to use the added partition for packages,
so that an entire system (not just base) could be installed without
needing network connectivity.  The only reason I was using the memstick
to resize itself was that the earlier FreeBSD system I'm starting from
does not implement "gpart recover" to resize the GPT to the media.

Speaking of "gpart recover", it might make sense for that operation
to round up the table size to whole sectors if it is not already a
multiple.  The recovery process has to rewrite the primary header
anyway, so that it will point to the new location of the secondary.

> > Trying to perform this same operation on FreeBSD 6 produces:
> > 
> > # gpt -r show da0
> >     start     size  index  contents
> >         0        1         PMBR
> >         1        1         Pri GPT header
> >         2        1         Pri GPT table
> >         3       32      1  GPT part - 83bd6b9d-7f41-11dc-be0b-001560b84f0f
> >        35  1348832      2  GPT part - FreeBSD UFS/UFS2
> >   1348867     2048      3  GPT part - FreeBSD swap
> >   1350915  6460155         
> >   7811070        1         Sec GPT table
> >   7811071        1         Sec GPT header
> > # gpt add da0
> > gpt add: da0: error: no available table entries
> > # gpt add -i 4 da0
> > gpt add: da0: error: index 4 out of range (3 max)
>
> I missed gpt(8) altogether through 6.x and 7.x.  I wish gpart could at 
> least optionally report the Pri / Sec header / table LBAs that clearly.

Time for a -v (verbose) switch to "gpart show" -- or a separate
"gpart dump" operation -- perhaps?

My next move is to hack together a program -- incorporating parts
of the existing FreeBSD GPT code -- to round up that table size,
recompute the checksums, and rewrite the GPT headers.  It "should
not" be all that difficult.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5861e26b.HX9c2TvEoZnz0K2K%perryh>