Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 18 Feb 2019 09:51:09 -0800 (PST)
From:      "Rodney W. Grimes" <freebsd@pdx.rh.CN85.dnsmgr.net>
To:        Warner Losh <imp@bsdimp.com>
Cc:        Ian Lepore <ian@freebsd.org>, Toomas Soome <tsoome@me.com>, src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r344238 - head/stand/common
Message-ID:  <201902181751.x1IHp9gc006395@pdx.rh.CN85.dnsmgr.net>
In-Reply-To: <CANCZdfrs80%2B9_foTy_hDHcezgKwQ8RVJyc4GEJO9AiEpTAirEg@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
> On Mon, Feb 18, 2019 at 7:31 AM Ian Lepore <ian@freebsd.org> wrote:
> 
> > On Mon, 2019-02-18 at 15:09 +0200, Toomas Soome wrote:
> > > > On 18 Feb 2019, at 01:32, Ian Lepore <ian@FreeBSD.org> wrote:
> > > >
> > > > Author: ian
> > > > Date: Sun Feb 17 23:32:09 2019
> > > > New Revision: 344238
> > > > URL: https://svnweb.freebsd.org/changeset/base/344238
> > > >
> > > > Log:
> > > >  Restore loader(8)'s ability for lsdev to show partitions within a bsd
> > slice.
> > > >
> > > >  I'm pretty sure this used to work at one time, perhaps long ago.  It
> > has
> > > >  been failing recently because if you call disk_open() with
> > dev->d_partition
> > > >  set to -1 when d_slice refers to a bsd slice, it assumes you want it
> > to
> > > >  open the first partition within that slice.  When you then pass that
> > open
> > > >  dev instance to ptable_open(), it tries to read the start of the 'a'
> > > >  partition and decides there is no recognizable partition type there.
> > > >
> > > >  This restores the old functionality by resetting d_offset to the start
> > > >  of the raw slice after disk_open() returns.  For good measure,
> > d_partition
> > > >  is also set back to -1, although that doesn't currently affect
> > anything.
> > > >
> > > >  I would have preferred to make disk_open() avoid such rude
> > assumptions and
> > > >  if you ask for partition -1 you get the raw slice.  But the commit
> > history
> > > >  shows that someone already did that once (r239058), and had to revert
> > it
> > > >  (r239232), so I didn't even try to go down that road.
> > >
> > >
> > > What was the reason for the revert? I still do think the current
> > > disk_open() approach is not good because it does break the promise to
> > > get MBR partition (see common/disk.h).
> > >
> > > Of course I can see the appeal for something like ?boot disk0:? but
> > > case like that should be solved by iterating partition table, and not
> > > by making API to do wrong thing - if I did ask to for disk0s0: I
> > > really would expect to get disk0s0: and not disk0s0a:
> > >
> > > But anyhow, it would be good to understand the actual reasoning
> > > behind that decision.
> > >
> >
> > I have no idea. As is so often the case, the commit message for the
> > revert said what the commit did ("revert to historic behavior") without
> > any hint of why the change was made. One has to assume that it broke
> > some working cases and people complained.
> >
> > Part of the problem for the disk_open() "api" is that there is not even
> > a comment block with some hints in it. I was thinking one potential
> > solution might instead of using "if (partition < 0)" we could define a
> > couple magical partition number values, PNUM_GETBEST = -1,
> > PNUM_RAWSLICE = -2, that sort of thing. But it would require carefully
> > combing through the existing code looking at all calls to disk_open()
> > and all usage of disk_devdesc.d_partition.
> >
> 
> I think that we should fix the disk_open() api. And then fix everything
> that uses it to comply with the new rules. The current hodge-podge that we
> have causes a number of issues for different environments that are only
> mostly worked around. I've done some work in this area to make things more
> regular, but it's still a dog's breakfast of almost-stackable devices that
> we overload too many things on, resulting in the mis-mash we have today.
> When the whole world was just MBR + BSD label, we could get away with it.
> But now that we have GPT as well as GELI support and ZFS pool devices on
> top of disk devices, the arrangements aren't working as well as could be
> desired.

Yes please.
Could this be some of the cause of issues with legacy mbr boots
that use to work, that now stop working?  I have observed this
on zfs in mbr on a machine here.  I did not have time to debug
it, it was faster to nuke and pave the machine that was critical
to operations, the error came as part of a failed freenas upgrade
to the 11.2 based version.

-- 
Rod Grimes                                                 rgrimes@freebsd.org



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201902181751.x1IHp9gc006395>