Date:      Fri, 27 Jul 2012 20:25:58 +0200
From:      Marius Strobl <marius@alchemy.franken.de>
To:        Kurt Lidl <lidl@pix.net>
Cc:        freebsd-sparc64@freebsd.org
Subject:   Re: zfs booting feedback
Message-ID:  <20120727182558.GH58433@alchemy.franken.de>
In-Reply-To: <20120714004335.GD92944@pix.net>
References:  <20120708025435.GA12487@pix.net> <20120709140019.GA67276@alchemy.franken.de> <20120710165433.GA98707@pix.net> <CAGEduPJ+KpEacYuPVfUV+MXRM+y1-j8k1Gb2wA7MYJ3s71vuBw@mail.gmail.com> <20120712172208.GA47484@pix.net> <20120713195807.GU63893@alchemy.franken.de> <20120714004335.GD92944@pix.net>

On Fri, Jul 13, 2012 at 08:43:35PM -0400, Kurt Lidl wrote:
> On Fri, Jul 13, 2012 at 09:58:07PM +0200, Marius Strobl wrote:
> > On Thu, Jul 12, 2012 at 01:22:08PM -0400, Kurt Lidl wrote:
> > > On Thu, Jul 12, 2012 at 03:02:56PM +0800, Gavin Mu wrote:
> > > > On Wed, Jul 11, 2012 at 12:54 AM, Kurt Lidl <lidl@pix.net> wrote:
> > > > 
> > > > > On Mon, Jul 09, 2012 at 04:00:19PM +0200, Marius Strobl wrote:
> > > > > > On Sat, Jul 07, 2012 at 10:54:35PM -0400, Kurt Lidl wrote:
> > > > > > > I built a full 9.0-stable distribution on Friday night, and got to play
> > > > > > > with installing it on a spare Netra T1-105 today.  Mostly I was
> > > > > > > interested in testing out the integrated ZFS boot support that
> > > > > > > > was committed recently.
> > > > > > >
> > > > > > > > First of all -- it works!  Thanks very much to all who made it
> > > > > > > > possible!
> > > > > > >
> > > > > > > > After working through a couple of nits in my script that installs it
> > > > > > > > all, I've got a fully functioning, ZFS-only sparc64 machine.  Nice.
> > > > > > >
> > > > > > > > The zfsboot bootblock's warnings about not being able to open
> > > > > > > > nonexistent devices are pretty extraneous, but other than that, it
> > > > > > > > seems to function OK.
> > > > > >
> > > > > > That's more or less a cosmetic problem for now; there's no standard
> > > > > > Open Firmware method allowing to test whether the device corresponding
> > > > > > to a (automatically) created device alias actually exists short of
> > > > > > trying to open it, with OFW causing at least the "Drive not ready"
> > > > > > part on its own. There are some Sun specific extensions to the
> > > > > > default methods whose names sound like they could be of some help
> > > > > > > here. I haven't gotten around to actually testing whether this is the
> > > > > > case or whether they actually exist in all OFW implementations of
> > > > > > all sun4u models.
> > > > > > If the aliases were artificially created via the `nvalias` command
> > > > > > ("disk9" sounds a bit unusual for the automatically created ones)
> > > > > > > you can get rid of the nonexistent ones via `nvunalias` (needs
> > > > > > a `reset-all` or power-cycle to take effect).
> > > > >
> > > > > All the disks that were probed were part of the normally
> > > > > defined devices on the machine.  I only have two devices defined
> > > > > in my nvramrc:
> > > > >
> > > > > ok nvramrc type
> > > > > devalias rootdisk /pci@1f,0/pci@1,1/scsi@2/disk@0,0
> > > > > devalias rootmirror /pci@1f,0/pci@1,1/scsi@2/disk@1,0
> > > > >
> > > > > And I have the system configured to boot from "rootdisk rootmirror".
> > > > >
> > > > > Here's the full output of a 'devalias' from the prom on the machine:
> > > > >
> > > > > ok devalias
> > > > > cdrom1                   /pci@1f,0/pci@1,1/scsi@2/disk@6,0:f
> > > > > cdrom                    /pci@1f,0/pci@1/pci@1/ide@e/cdrom@2:f
> > > > > ide-disk                 /pci@1f,0/pci@1/pci@1/ide@e/disk@0:f
> > > > > ide-cdrom                /pci@1f,0/pci@1/pci@1/ide@e/cdrom@2:f
> > > > > ide                      /pci@1f,0/pci@1/pci@1/ide@e
> > > > > rootmirror               /pci@1f,0/pci@1,1/scsi@2/disk@1,0
> > > > > rootdisk                 /pci@1f,0/pci@1,1/scsi@2/disk@0,0
> > > > > userprom2                /pci@1f,0/pci@1,1/ebus@1/flashprom@10,800000
> > > > > userprom1                /pci@1f,0/pci@1,1/ebus@1/flashprom@10,400000
> > > > > i2c-cs2                  /pci@1f,0/pci@1,1/ebus@1/i2c@14,100000
> > > > > i2c                      /pci@1f,0/pci@1,1/ebus@1/i2c@14,600000
> > > > > systemprom               /pci@1f,0/pci@1,1/ebus@1/flashprom@10,0
> > > > > pcic                     /pci@1f,0/pci@1/pci@1
> > > > > pcib                     /pci@1f,0/pci@1,1
> > > > > pcia                     /pci@1f,0/pci@1
> > > > > ebus                     /pci@1f,0/pci@1,1/ebus@1
> > > > > net2                     /pci@1f,0/pci@1,1/network@3,1
> > > > > net                      /pci@1f,0/pci@1,1/network@1,1
> > > > > floppy                   /pci@1f,0/pci@1,1/ebus@1/fdthree
> > > > > disk                     /pci@1f,0/pci@1,1/scsi@2/disk@0,0
> > > > > cdrom                    /pci@1f,0/pci@1,1/scsi@2/disk@6,0:f
> > > > > tape                     /pci@1f,0/pci@1,1/scsi@2/tape@4,0
> > > > > tape1                    /pci@1f,0/pci@1,1/scsi@2/tape@5,0
> > > > > tape0                    /pci@1f,0/pci@1,1/scsi@2/tape@4,0
> > > > > diskf                    /pci@1f,0/pci@1,1/scsi@2/disk@f,0
> > > > > diske                    /pci@1f,0/pci@1,1/scsi@2/disk@e,0
> > > > > diskd                    /pci@1f,0/pci@1,1/scsi@2/disk@d,0
> > > > > diskc                    /pci@1f,0/pci@1,1/scsi@2/disk@c,0
> > > > > diskb                    /pci@1f,0/pci@1,1/scsi@2/disk@b,0
> > > > > diska                    /pci@1f,0/pci@1,1/scsi@2/disk@a,0
> > > > > disk9                    /pci@1f,0/pci@1,1/scsi@2/disk@9,0
> > > > > disk8                    /pci@1f,0/pci@1,1/scsi@2/disk@8,0
> > > > > disk7                    /pci@1f,0/pci@1,1/scsi@2/disk@7,0
> > > > > disk6                    /pci@1f,0/pci@1,1/scsi@2/disk@6,0
> > > > > disk5                    /pci@1f,0/pci@1,1/scsi@2/disk@5,0
> > > > > disk4                    /pci@1f,0/pci@1,1/scsi@2/disk@4,0
> > > > > disk3                    /pci@1f,0/pci@1,1/scsi@2/disk@3,0
> > > > > disk2                    /pci@1f,0/pci@1,1/scsi@2/disk@2,0
> > > > > disk1                    /pci@1f,0/pci@1,1/scsi@2/disk@1,0
> > > > > disk0                    /pci@1f,0/pci@1,1/scsi@2/disk@0,0
> > > > > scsi                     /pci@1f,0/pci@1,1/scsi@2
> > > > > ttyb                     /pci@1f,0/pci@1,1/ebus@1/su@14,3602f8
> > > > > ttya                     /pci@1f,0/pci@1,1/ebus@1/su@14,3803f8
> > > > > ttyd                     /pci@1f,0/pci@1,1/ebus@1/se@14,400000:b
> > > > > ttyc                     /pci@1f,0/pci@1,1/ebus@1/se@14,400000:a
> > > > >
> > > > > As you can see, the devices disk0..diskf exist, but something in the
> > > > > boot code "only" probes the first 10 devices.  It's certainly not
> > > > > > attempting to open *all* the disk devices listed by 'devalias'.
> > > > >
> > > > > It looks like from the code in .../sys/boot/sparc64/loader/main.c
> > > > > that the first MAXDEV (==31) disk devices are probed (well, whatever
> > > > > disk%d is an alias to, I suppose) and the vtoc's
> > > > > loaded and examined for zfs partitions.
> > > > >
> > > > oops, I think I assumed that the disk name should be disk9, disk10,
> > > > disk11, instead of disk9, diska, diskb...
> > > > Is there any standard for naming those disks?
> > > 
> > > I do not really know.  The above 'devalias' output is the same on
> > > the two netra-T1 105s that I tested.  I looked on my SunFire V240,
> > > and it has many fewer entries:
> > > 
> > > {1} ok devalias
> > > usb                      /pci@1e,600000/ide@d/disk
> > > xnet2                    /pci@1d,700000/pci@1/SUNW,hme@0,1:dhcp,
> > > xnet1                    /pci@1e,600000/pci@3/SUNW,hme@0,1:dhcp,
> > > xnet                     /pci@1e,600000/pci@2/SUNW,hme@0,1:dhcp,
> > > net3                     /pci@1d,700000/network@2,1
> > > net2                     /pci@1d,700000/network@2
> > > net1                     /pci@1f,700000/network@2,1
> > > net                      /pci@1f,700000/network@2
> > > cdrom                    /pci@1e,600000/ide@d/cdrom@0,0:f
> > > ide                      /pci@1e,600000/ide@d
> > > disk3                    /pci@1c,600000/scsi@2/disk@3,0
> > > disk2                    /pci@1c,600000/scsi@2/disk@2,0
> > > disk1                    /pci@1c,600000/scsi@2/disk@1,0
> > > disk0                    /pci@1c,600000/scsi@2/disk@0,0
> > > disk                     /pci@1c,600000/scsi@2/disk@0,0
> > > scsi                     /pci@1c,600000/scsi@2
> > > sc-control               /pci@1e,600000/isa@7/rmc-comm@0,3e8
> > > ttyb                     /pci@1e,600000/isa@7/serial@0,2e8
> > > ttya                     /pci@1e,600000/isa@7/serial@0,3f8
> > > name                     aliases
> > > 
> > > > I would argue that the loader ought to be looking at the
> > > > devices/devalias entries in the value of the "boot-device" property.
> > > 
> > > That way, if I wanted to boot from something like a zmirror of
> > > disk2 and disk3 on my sunfire, I would just set the
> > > "boot-device" to be "disk2 disk3", and the zfs boot code would
> > > > just try to iterate through those devices, rather than going
> > > from 0..31 and trying disk%d...
> > > 
> > > If I had valid boot-code on disk0 and disk2, and I set the
> > > "boot-device" to "disk2 disk3", I think current code will do
> > > this:
> > > 	- prom load "zfsboot" block off disk2
> > > 	- zfsboot block loads in the zfsloader binary from current disk (disk2)
> > > 	- which then probes disk0, disk1 .... and finally boots
> > > 	the kernel from the first freebsd-zfs partition that it finds
> > > 	on any of those disks.
> > > 
> > > I think this is wrong, as there could be some data-only zfs
> > > partition on disk0, which doesn't have a kernel to boot from...
> > > 
> > > > Also, one other thing to keep in mind is that the boot-device property
> > > can be a devalias entry or just a straight-up device specifier,
> > > like this:
> > > 
> > > 	/pci@1c,600000/scsi@2/disk@0,0:a
> > > 
> > > (That's what I have on my SunFire, for various arcane reasons...)
> > > 
> > > I guess we also have to worry when someone breaks into the prom
> > > and says "boot disk4", and that user input should override the
> > > "boot-device" settings in the prom.
> > > 
> > 
> > What's currently implemented in the sparc64 ZFS loader resembles
> > how the x86 version works as closely as possible, i.e. basically
> > trying to detect ZFS pools on all disks available via the firmware
> > (i.e. BIOS in the x86 case). The current approach may not be
> > ideal for sparc64, but before inventing yet another one it would
> > be great if someone could check how this is done in (Open)Solaris,
> > either by digging up the relevant documentation or by actually
> > giving it a try.
> 
> The "rootdisk" and "rootmirror" devaliases that I have on the netra-T1
> date back to the time when I ran Solaris 9 with disk mirroring
> (remember DiskSuite?) on this machine.  That's when/how I learned about
> using "boot-device" with a space separated list of things to probe
> and boot from.
> 
> I continued doing exactly the same thing for Solaris 10 - first with
> DiskSuite, and later when they introduced ZFS booting, the same thing
> continued on.
> 

Okay, in r238851, I've changed the sparc64 ZFS loader to no longer
try to detect providers based on diskN aliases but to only probe
disks listed in the boot-device environment variable. I've a pre-ACK
from re@ for merging this change in time for 9.1.
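
To illustrate the idea only (a minimal sketch, not the code committed in
r238851): boot-device may hold several space separated entries, each
either a device alias or a full device path, so the first step is to
split the value into its entries before trying to open any of them. The
names split_boot_device, MAX_BOOT_DEVS and MAX_DEV_LEN are made up for
this example and do not come from the loader sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BOOT_DEVS	8	/* hypothetical cap on entries kept */
#define MAX_DEV_LEN	128	/* hypothetical cap on one entry's length */

/*
 * Split a space separated boot-device value (e.g. "disk2 disk3") into
 * individual entries.  Each entry is copied verbatim, so aliases and
 * full device paths are treated alike.  Entries that do not fit into
 * MAX_DEV_LEN are skipped.  Returns the number of entries stored.
 */
static int
split_boot_device(const char *value, char devs[][MAX_DEV_LEN], int maxdevs)
{
	size_t len;
	int n;

	assert(value != NULL && devs != NULL);
	n = 0;
	while (n < maxdevs) {
		value += strspn(value, " ");	/* skip separators */
		if (*value == '\0')
			break;
		len = strcspn(value, " ");	/* length of this entry */
		if (len < MAX_DEV_LEN) {
			memcpy(devs[n], value, len);
			devs[n][len] = '\0';
			n++;
		}
		value += len;
	}
	return (n);
}
```

A caller would then try to open only the returned entries, in order,
instead of generating disk%d names itself.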

Marius



