Date:      Fri, 25 Jan 2013 20:48:51 -0700 (MST)
From:      Warren Block <wblock@wonkity.com>
To:        Jeremy Chadwick <jdc@koitsu.org>
Cc:        freebsd@deman.com, freebsd-stable@freebsd.org
Subject:   Re: RFC: Suggesting ZFS "best practices" in FreeBSD
Message-ID:  <alpine.BSF.2.00.1301252014160.37256@wonkity.com>
In-Reply-To: <20130126025929.GA2777@icarus.home.lan>
References:  <20130124174039.GA35811@icarus.home.lan> <alpine.BSF.2.00.1301251249500.5564@wonkity.com> <20130126025929.GA2777@icarus.home.lan>

On Fri, 25 Jan 2013, Jeremy Chadwick wrote:

> On Fri, Jan 25, 2013 at 12:58:15PM -0700, Warren Block wrote:
>> On Thu, 24 Jan 2013, Jeremy Chadwick wrote:
>>
>>>>> #1.  Map the physical drive slots to how they show up in FBSD so if a
>>>>> disk is removed and the machine is rebooted all the disks after that
>>>>> removed one do not have an 'off by one error'.  i.e. if you have
>>>>> ada0-ada14 and remove ada8 then reboot - normally FBSD skips that
>>>>> missing ada8 drive and the next drive (that used to be ada9) is now
>>>>> called ada8 and so on...
>>>>
>>>> How do you do that?  If I'm in that situation, I think I could find the
>>>> bad drive, or at least the good ones, with diskinfo and the drive serial
>>>> number.  One suggestion I saw somewhere was to use disk serial numbers
>>>> for label values.
>>>
>>> The term FreeBSD uses for this is called "wiring down" or "wired down",
>>> and is documented in CAM(4).  It's come up repeatedly over the years but
>>> for whatever reason people overlook it or can't find it.
>>
>> I was aware of it, it just seems like there ought to be a better way
>> to identify drives than by messing with the hardware configuration.
>
> I understand what you mean, but it's actually messing with a software
> configuration (specifically CAM).
>
> It's a one-time change that solves the dilemma; it only has to be
> adjusted if you change controller brands or models, which is a lot less
> often than changing disks.
>
>> Something more elegant, less tied to changing the hardware
>> configuration of the host.  Assigning the drive serial number as a
>> label, for example.
>
> Hmm...  all this does is change the nature of the problem, no?  You
> still have the issue of "having to know some magical number" to
> determine what path name refers to what physical disk in your system.
> Can you expand on how this would solve it?

It's not so much a solution as an approach in the right domain.  The 
point, as I see it, is being able to identify individual disks uniquely. 
Forcing static device names does that, sort of.  But plug a different 
disk into the same port as an existing one, and that disk is now 
identified as the old one.

Using a unique identifier already built into those drives helps. 
Serial numbers are unique, built into the drive, and even printed on the 
paper label.  They can be queried through software and take no disk 
space.  If a drive fails electronically to the point it can't be 
queried, that serial number can be identified from a current list of all 
the drive serial numbers in the array--it's the one not there.
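
As a rough sketch of that "it's the one not there" check, assuming a 
list of serial numbers was recorded when the array was built: the 
serial numbers, bay names, and function names below are made up for 
illustration, and collecting the currently detected serials (from 
diskinfo or similar) is left out.

```python
# Hypothetical inventory recorded when the array was built:
# drive serial number -> physical location (bay or sticky label).
INVENTORY = {
    "WD-WCAV00000001": "bay 0",
    "WD-WCAV00000002": "bay 1",
    "WD-WCAV00000003": "bay 2",
}


def missing_drives(inventory, detected_serials):
    """Return {serial: location} for inventoried drives that were not detected."""
    detected = set(detected_serials)
    return {serial: location
            for serial, location in inventory.items()
            if serial not in detected}


def unknown_drives(inventory, detected_serials):
    """Return detected serials that are not in the inventory (e.g. a swapped-in disk)."""
    return sorted(set(detected_serials) - set(inventory))


if __name__ == "__main__":
    # Assumed result of querying the drives that still respond.
    detected = ["WD-WCAV00000001", "WD-WCAV00000003"]
    for serial, location in missing_drives(INVENTORY, detected).items():
        print(f"not responding: {serial} ({location})")
```

The second function covers the case above where a different disk is 
plugged into an existing port: its serial number shows up as unknown 
instead of silently taking over the old disk's name.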

There are problems: serial numbers aren't like LEDs on each drive that 
could flash to identify it.  Some enclosures don't make drive labels 
easy to see. 
Some of that can be addressed with labels.  Er, sticky labels, on the 
outside of the drive or enclosure.  And serial numbers are often 
inconveniently long.

> As for a unique number per disk, disks within the past ~5 years (SATA,
> SAS, and some SCSI) all tend to have this: it's called a WWN:
>
> http://en.wikipedia.org/wiki/World_Wide_Name
>
> But older ATA disks (and by older I don't mean ancient, I mean even
> semi-old) may not have this, which means you get to use something else.
> UUIDs come to mind, but then the question becomes what do you base the
> generation off of?  Model string + serial number + firmware?
>
> There are also complexities depending on HBAs (disk controllers) as
> well; I've seen references, at least on Solaris, of people having one
> disk showing up twice across 2 separate controllers (i.e. only 1
> physical disk in the machine, but showing up as both c8d0 and c9d0, both
> with the same model string and serial number).  I imagine some RAID
> controllers would do this (when a drive isn't part of an array; it might
> show up as both /dev/adaX and /dev/somedriverX).  I know at some point I
> saw this with FreeBSD too during an OS install, I just can't remember
> what the names were that I saw.

Surely that ought to be considered a bug.  Any drive ID system is going 
to be vulnerable to certain

> Linux has by-uuid and by-id (the latter is what you'd like), but there
> are caveats to that too:
>
> https://wiki.archlinux.org/index.php/Persistent_block_device_naming
> http://www.terabyteunlimited.com/kb/article.php?id=389
>
> So at the end of the day I prefer CAM's "wired down" method -- the
> reason is that by modifying loader.conf I **know for sure** bay/cable X
> maps to /dev/adaX, and it's a one-time deal until I decide to move from
> my ICH9 controller to, say, an Areca.

That illustrates one problem with making the configuration specific to 
the host hardware rather than to the drive.

As far as "best practices", situations vary so much that I don't know if 
any drive ID method can be recommended.  For a FreeBSD ZFS document, a 
useful sample configuration is going to be small enough that anything 
would work.  A survey of the techniques in use at various data centers 
would be interesting.


