Date:      Wed, 17 Apr 2013 20:28:22 -0400
From:      Outback Dingo <outbackdingo@gmail.com>
To:        Devin Teske <dteske@freebsd.org>
Cc:        FreeBSD Questions <freebsd-questions@freebsd.org>
Subject:   Re: gmultipath, ses and shared disks / cant seem to share between local nodes
Message-ID:  <CAKYr3zz8094fbP7c21CfV+ie1YVfVOE4FEb9zH2pPrDsMB-HwQ@mail.gmail.com>
In-Reply-To: <13CA24D6AB415D428143D44749F57D7201F0613F@ltcfiswmsgmb21>
References:  <CAKYr3zy7fXriB_rD6XuvsZT+19JmTAX-882f=MNjz=uhGHwFjA@mail.gmail.com> <13CA24D6AB415D428143D44749F57D7201F05E0A@ltcfiswmsgmb21> <CAKYr3zw8eK3kHExKynvfngewa7+tzxyxPKo5N+4tzuoA+z-YbA@mail.gmail.com> <13CA24D6AB415D428143D44749F57D7201F05FD4@ltcfiswmsgmb21> <CAKYr3zypR2Y4Zo+CqPdY+J6jMJ2f11zQnHBvmzvYzE8HOTEVjA@mail.gmail.com> <13CA24D6AB415D428143D44749F57D7201F0613F@ltcfiswmsgmb21>

On Wed, Apr 17, 2013 at 8:05 PM, Teske, Devin <Devin.Teske@fisglobal.com> wrote:

>
>  On Apr 17, 2013, at 4:56 PM, Outback Dingo wrote:
>
>
>
>
> On Wed, Apr 17, 2013 at 7:29 PM, Teske, Devin <Devin.Teske@fisglobal.com> wrote:
>
>>
>>   On Apr 17, 2013, at 4:10 PM, Outback Dingo wrote:
>>
>>
>>
>>
>> On Wed, Apr 17, 2013 at 6:39 PM, Teske, Devin <Devin.Teske@fisglobal.com> wrote:
>>
>>>
>>> On Apr 17, 2013, at 3:26 PM, Outback Dingo wrote:
>>>
>>> > Ok, maybe I'm at a loss here in the way my brain is viewing this
>>> >
>>> > we have a box; it's got 2 nodes in the chassis, and 32 SATA drives
>>> > attached to a SATA/SAS backplane via 4 (2 per node) LSI MPT SAS2 cards.
>>> > Should I not logically be seeing 4 controllers x drive count?
>>> >
>>> > camcontrol devlist shows 32 devices, daX,passX and sesX,passX
>>> >
>>> > <SEAGATE ST33000650SS 0004>        at scbus0 target 9 lun 0 (da0,pass0)
>>> > <STORBRICK-3 1400>        at scbus0 target 10 lun 0 (ses0,pass1)
>>> > <SEAGATE ST33000650SS 0004>        at scbus0 target 11 lun 0 (da1,pass2)
>>> > <STORBRICK-1 1400>        at scbus0 target 12 lun 0 (ses1,pass3)
>>> > <SEAGATE ST33000650SS 0004>        at scbus0 target 13 lun 0 (da2,pass4)
>>> > <STORBRICK-2 1400>        at scbus0 target 14 lun 0 (ses2,pass5)
>>> > <SEAGATE ST33000650SS 0004>        at scbus0 target 15 lun 0 (da3,pass6)
>>> > <STORBRICK-4 1400>        at scbus0 target 16 lun 0 (ses3,pass7)
>>> > <SEAGATE ST33000650SS 0004>        at scbus0 target 17 lun 0 (da4,pass8)
>>> > <STORBRICK-6 1400>        at scbus0 target 18 lun 0 (ses4,pass9)
>>> > <SEAGATE ST33000650SS 0004>        at scbus0 target 19 lun 0 (da5,pass10)
>>> > <STORBRICK-0 1400>        at scbus0 target 20 lun 0 (ses5,pass11)
>>> > <SEAGATE ST33000650SS 0004>        at scbus0 target 21 lun 0 (da6,pass12)
>>> > <STORBRICK-7 1400>        at scbus0 target 22 lun 0 (ses6,pass13)
>>> > <SEAGATE ST33000650SS 0004>        at scbus0 target 23 lun 0 (da7,pass14)
>>> > <STORBRICK-5 1400>        at scbus0 target 24 lun 0 (ses7,pass15)
>>> > <SEAGATE ST9300605SS 0004>         at scbus1 target 0 lun 0 (da8,pass16)
>>> > <SEAGATE ST9300605SS 0004>         at scbus1 target 1 lun 0 (da9,pass17)
>>> > <STORBRICK-3 1400>        at scbus8 target 10 lun 0 (ses8,pass19)
>>> > <SEAGATE ST33000650SS 0004>        at scbus8 target 11 lun 0 (da11,pass20)
>>> > <STORBRICK-1 1400>        at scbus8 target 12 lun 0 (ses9,pass21)
>>> > <SEAGATE ST33000650SS 0004>        at scbus8 target 13 lun 0 (da12,pass22)
>>> > <STORBRICK-2 1400>        at scbus8 target 14 lun 0 (ses10,pass23)
>>> > <SEAGATE ST33000650SS 0004>        at scbus8 target 15 lun 0 (da13,pass24)
>>> > <STORBRICK-4 1400>        at scbus8 target 16 lun 0 (ses11,pass25)
>>> > <SEAGATE ST33000650SS 0004>        at scbus8 target 17 lun 0 (da14,pass26)
>>> > <STORBRICK-6 1400>        at scbus8 target 18 lun 0 (ses12,pass27)
>>> > <SEAGATE ST33000650SS 0004>        at scbus8 target 19 lun 0 (da15,pass28)
>>> > <STORBRICK-0 1400>        at scbus8 target 20 lun 0 (ses13,pass29)
>>> > <SEAGATE ST33000650SS 0004>        at scbus8 target 21 lun 0 (da16,pass30)
>>> > <STORBRICK-7 1400>        at scbus8 target 22 lun 0 (ses14,pass31)
>>> > <SEAGATE ST33000650SS 0004>        at scbus8 target 23 lun 0 (da17,pass32)
>>> > <STORBRICK-5 1400>        at scbus8 target 24 lun 0 (ses15,pass33)
>>> > <USB 2.0 Flash Drive 8.07>         at scbus9 target 0 lun 0 (da18,pass34)
>>> >
>>> >
>>> > we would like to create a zpool from all the devices, so that in theory
>>> > if nodeA failed, then nodeB could force-import the pool,
>>>
>>>  gmultipath (which you mention in the subject) is the appropriate tool
>>> for this, but there's no need for an import of the pool if you build the
>>> pool out of multipath devices. In our experience, we can pull a cable and
>>> zfs continues working just fine.
>>>
>>> In other words, don't build the pool out of the raw devices; put a
>>> gmultipath label on each device and then use /dev/multipath/LABEL for the
>>> zpool devices.
>>>
>>>
>>> > nodeA and NodeB are attached through dual LSI controllers to the
>>> > SATA/SAS backplane, but I can't seem to create a zpool from sesX or
>>> > passX devices. I can, however, create a 16-drive zpool on either node
>>> > from any daX device. What did I miss? I've looked at gmirror, and also
>>> > the ses documents. Any insight is appreciated; thanks in advance.
>>>
>>>  gmirror is the wrong tool; gmultipath is what you want. The basic task
>>> is to use "gmultipath label FOO da#" to write a cookie on the disk (used
>>> to identify new/existing paths during GEOM "taste" events, for example).
>>>
>>> After you've labeled the da# devices with gmultipath, you say "gmultipath
>>> status" to see the components of each label, and you use "multipath/LABEL"
>>> as your disk name when creating the zpool (these correspond directly to
>>> /dev/multipath/LABEL, but "zpool create …" or "zpool add …" allow you to
>>> omit the leading "/dev").
>>>
>>
>>  sanity check me: on node A I did
>>
>>  zpool destroy master
>>
>>  gmultipath label FOO da0
>>
>>  gmultipath status
>>                     Name    Status  Components
>>            multipath/FOO  DEGRADED  da0 (ACTIVE)
>>  multipath/FOO-619648737  DEGRADED  da1 (ACTIVE)
>>  multipath/FOO-191725652  DEGRADED  da2 (ACTIVE)
>> multipath/FOO-1539342315  DEGRADED  da3 (ACTIVE)
>> multipath/FOO-1276041606  DEGRADED  da4 (ACTIVE)
>> multipath/FOO-2000832198  DEGRADED  da5 (ACTIVE)
>> multipath/FOO-1285640577  DEGRADED  da6 (ACTIVE)
>> multipath/FOO-1816092574  DEGRADED  da7 (ACTIVE)
>> multipath/FOO-1102254444  DEGRADED  da8 (ACTIVE)
>>  multipath/FOO-330300690  DEGRADED  da9 (ACTIVE)
>>   multipath/FOO-92140635  DEGRADED  da10 (ACTIVE)
>>  multipath/FOO-855257672  DEGRADED  da11 (ACTIVE)
>> multipath/FOO-1003634134  DEGRADED  da12 (ACTIVE)
>>    multipath/FOO-2449862  DEGRADED  da13 (ACTIVE)
>> multipath/FOO-1137080233  DEGRADED  da14 (ACTIVE)
>> multipath/FOO-1696804371  DEGRADED  da15 (ACTIVE)
>> multipath/FOO-1304457562  DEGRADED  da16 (ACTIVE)
>>  multipath/FOO-912159854  DEGRADED  da17 (ACTIVE)
>>
>>  now on node B I should do the same? reboot both nodes, and I should be
>> able to "see" 32 multipath/FOO devices to create a pool from?
>>
>>
>>   It appears from the above output that you labeled all of the block
>> devices (da0 through da17) with the same label.
>>
>>  This is not what you want.
>>
>>  Use "gmultipath clear da#" on each of the block devices and have
>> another go using unique values.
>>
>>  For example:
>>
>>  gmultipath label SATA_LUN01 da0
>> gmultipath label SATA_LUN02 da1
>> gmultipath label SATA_LUN03 da2
>> gmultipath label SATA_LUN04 da3
>> gmultipath label SATA_LUN05 da4
>> gmultipath label SATA_LUN06 da5
>> gmultipath label SATA_LUN07 da6
>> gmultipath label SATA_LUN08 da7
>> gmultipath label SATA_LUN09 da8
>> gmultipath label SATA_LUN10 da9
>> gmultipath label SATA_LUN11 da10
>> gmultipath label SATA_LUN12 da11
>> gmultipath label SATA_LUN13 da12
>> gmultipath label SATA_LUN14 da13
>> gmultipath label SATA_LUN15 da14
>> gmultipath label SATA_LUN16 da15
>> gmultipath label SATA_LUN17 da16
>> gmultipath label SATA_LUN18 da17
>> ..
>>
>>  Then "gmultipath status" should show your unique labels each with a
>> single component.
>>
>>  Then you would do:
>>
>>  zpool create master multipath/SATA_LUN{01,02,03,04,05,06,…}
>>
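The per-disk labeling above can be scripted rather than typed out. A minimal sketch, assuming the da0..da17 devices from the status listing earlier in this thread; it only prints the commands so they can be reviewed before piping the output to sh:

```shell
#!/bin/sh
# Print one unique "gmultipath label" command per disk (review, then pipe to sh).
# Assumes 18 disks, da0..da17, as shown in the "gmultipath status" output above.
i=0
while [ "$i" -lt 18 ]; do
    # LUN names are 1-based and zero-padded: SATA_LUN01 .. SATA_LUN18
    printf 'gmultipath label SATA_LUN%02d da%d\n' "$((i + 1))" "$i"
    i=$((i + 1))
done
```

Piping the output through sh would apply the labels; "gmultipath status" should then show one component per unique label.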
>>
>  ahh ok got it, and probably on the other node
>
>  gmultipath label SATA_LUN19 da0
> gmultipath label SATA_LUN20 da1
>
>  -------------------snip------------------------------
>
>  gmultipath label SATA_LUN36 da15
>
>
>  No. You do not need to label from the other "node".
>
>  Since the "gmultipath label …" command writes data to the disk, you do
> not need to label the disk multiple times (in fact, it would be an error
> to do so). Rather, as the system is probing and adding disks, it will
> automatically detect multiple paths based on this data stored on the disk.
>
>  Read: If da0 and another da# device are indeed two paths to the same
> device, then as those devices are probed by the kernel, "gmultipath status"
> will dynamically show the newly discovered paths.
>
>  If, after labeling all the devices on a single path, you find that
> "gmultipath status" still shows only one component for each label, try
> rebooting. If "gmultipath status" still shows a single component per label
> after a reboot, then clearly you are not configured (hardware-wise) for
> multiple paths to the same components. This may be where the "gmultipath"
> versus "gmirror" nit that I caught in your original post comes into play:
> maybe "gmultipath" was the wrong thing to put in the subject. If you don't
> have multiple paths to the same components, but instead have a mirrored set
> of components that you want to mirror all your data onto as a second pool,
> then rather than gmirror I would actually recommend a zfs send/receive
> cron job based on snapshots, to take advantage of ZFS copy-on-write
> performance; but your mileage may vary.
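
If it does come down to the send/receive route instead of gmultipath, the cron job could look something like the sketch below. It only prints the commands for review; the pool name "master" comes from this thread, while the peer host "nodeB" and receiving pool "backup" are placeholder assumptions:

```shell
#!/bin/sh
# Sketch of one snapshot-based replication step (prints commands for review).
# "master" is the pool name used in this thread; the peer host "nodeB" and
# the receiving pool "backup" are placeholders.
snap="master@repl-$(date -u +%Y%m%d%H%M%S)"
echo "zfs snapshot -r $snap"
# First run is a full send; later runs would substitute an incremental
# "zfs send -R -i <previous-snap> $snap".
echo "zfs send -R $snap | ssh nodeB zfs receive -dF backup"
```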
> --
>

well nodeA sees daX devices, and nodeB does also; however, the serials for
da0 are different on the two nodes

it seems NodeA sees NodeB's drives as sesX/(daX,passX)

and NodeB sees NodeA's drives as sesX/(daX,passX)

each node sees pass0 to pass32, so I would think the 4 LSI controllers
connected to the backplane see all 32 SATA drives in the enclosure
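
One way to check whether the two nodes really see the same physical disks is to compare serial numbers per path. A sketch that emits one camcontrol inquiry per da# device ("-S" prints only the serial); running the emitted commands on each node and matching the serials identifies shared disks. The da0..da17 count is assumed from the nodeA listing:

```shell
#!/bin/sh
# Emit one "camcontrol inquiry <dev> -S" command per disk; -S prints only the
# serial number. Matching serials across the two nodes identify two paths to
# the same physical disk. Device count (da0..da17) assumed from the nodeA
# listing; adjust per node.
i=0
while [ "$i" -lt 18 ]; do
    echo "camcontrol inquiry da$i -S"
    i=$((i + 1))
done
```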

nodeA drive list

camcontrol devlist
<SEAGATE ST33000650SS 0004>        at scbus0 target 9 lun 0 (da0,pass0)
<SGI CORP STORBRICK-3 1400>        at scbus0 target 10 lun 0 (ses0,pass1)
<SEAGATE ST33000650SS 0004>        at scbus0 target 11 lun 0 (da1,pass2)
<SGI CORP STORBRICK-1 1400>        at scbus0 target 12 lun 0 (ses1,pass3)
<SEAGATE ST33000650SS 0004>        at scbus0 target 13 lun 0 (da2,pass4)
<SGI CORP STORBRICK-2 1400>        at scbus0 target 14 lun 0 (ses2,pass5)
<SEAGATE ST33000650SS 0004>        at scbus0 target 15 lun 0 (da3,pass6)
<SGI CORP STORBRICK-4 1400>        at scbus0 target 16 lun 0 (ses3,pass7)
<SEAGATE ST33000650SS 0004>        at scbus0 target 17 lun 0 (da4,pass8)
<SGI CORP STORBRICK-6 1400>        at scbus0 target 18 lun 0 (ses4,pass9)
<SEAGATE ST33000650SS 0004>        at scbus0 target 19 lun 0 (da5,pass10)
<SGI CORP STORBRICK-0 1400>        at scbus0 target 20 lun 0 (ses5,pass11)
<SEAGATE ST33000650SS 0004>        at scbus0 target 21 lun 0 (da6,pass12)
<SGI CORP STORBRICK-7 1400>        at scbus0 target 22 lun 0 (ses6,pass13)
<SEAGATE ST33000650SS 0004>        at scbus0 target 23 lun 0 (da7,pass14)
<SGI CORP STORBRICK-5 1400>        at scbus0 target 24 lun 0 (ses7,pass15)
<SEAGATE ST9300605SS 0004>         at scbus1 target 0 lun 0 (da8,pass16)
<SEAGATE ST9300605SS 0004>         at scbus1 target 1 lun 0 (da9,pass17)
<SEAGATE ST33000650SS 0004>        at scbus8 target 9 lun 0 (da10,pass18)
<SGI CORP STORBRICK-3 1400>        at scbus8 target 10 lun 0 (ses8,pass19)
<SEAGATE ST33000650SS 0004>        at scbus8 target 11 lun 0 (da11,pass20)
<SGI CORP STORBRICK-1 1400>        at scbus8 target 12 lun 0 (ses9,pass21)
<SEAGATE ST33000650SS 0004>        at scbus8 target 13 lun 0 (da12,pass22)
<SGI CORP STORBRICK-2 1400>        at scbus8 target 14 lun 0 (ses10,pass23)
<SEAGATE ST33000650SS 0004>        at scbus8 target 15 lun 0 (da13,pass24)
<SGI CORP STORBRICK-4 1400>        at scbus8 target 16 lun 0 (ses11,pass25)
<SEAGATE ST33000650SS 0004>        at scbus8 target 17 lun 0 (da14,pass26)
<SGI CORP STORBRICK-6 1400>        at scbus8 target 18 lun 0 (ses12,pass27)
<SEAGATE ST33000650SS 0004>        at scbus8 target 19 lun 0 (da15,pass28)
<SGI CORP STORBRICK-0 1400>        at scbus8 target 20 lun 0 (ses13,pass29)
<SEAGATE ST33000650SS 0004>        at scbus8 target 21 lun 0 (da16,pass30)
<SGI CORP STORBRICK-7 1400>        at scbus8 target 22 lun 0 (ses14,pass31)
<SEAGATE ST33000650SS 0004>        at scbus8 target 23 lun 0 (da17,pass32)
<SGI CORP STORBRICK-5 1400>        at scbus8 target 24 lun 0 (ses15,pass33)




nodeB drive list

camcontrol devlist
<SEAGATE ST33000650SS 0004>        at scbus0 target 9 lun 0 (pass0,da0)
<STEC Z16IZF2E-200UCV E46F>        at scbus0 target 10 lun 0 (pass1,da1)
<SGI CORP STORBRICK-3 1400>        at scbus0 target 11 lun 0 (ses0,pass2)
<SEAGATE ST33000650SS 0004>        at scbus0 target 12 lun 0 (pass3,da2)
<SGI CORP STORBRICK-4 1400>        at scbus0 target 13 lun 0 (ses1,pass4)
<SEAGATE ST33000650SS 0004>        at scbus0 target 14 lun 0 (pass5,da3)
<SGI CORP STORBRICK-5 1400>        at scbus0 target 15 lun 0 (ses2,pass6)
<SEAGATE ST33000650SS 0004>        at scbus0 target 16 lun 0 (pass7,da4)
<SGI CORP STORBRICK-7 1400>        at scbus0 target 17 lun 0 (ses3,pass8)
<SEAGATE ST33000650SS 0004>        at scbus0 target 18 lun 0 (pass9,da5)
<SGI CORP STORBRICK-6 1400>        at scbus0 target 19 lun 0 (ses4,pass10)
<SEAGATE ST33000650SS 0004>        at scbus0 target 20 lun 0 (pass11,da6)
<STEC Z16IZF2E-200UCV E46F>        at scbus0 target 21 lun 0 (pass12,da7)
<SGI CORP STORBRICK-0 1400>        at scbus0 target 22 lun 0 (ses5,pass13)
<SEAGATE ST33000650SS 0004>        at scbus0 target 23 lun 0 (pass14,da8)
<SGI CORP STORBRICK-1 1400>        at scbus0 target 25 lun 0 (ses6,pass15)
<SEAGATE ST33000650SS 0004>        at scbus0 target 26 lun 0 (pass16,da9)
<SGI CORP STORBRICK-2 1400>        at scbus0 target 28 lun 0 (ses7,pass17)
<SEAGATE ST9300605SS 0004>         at scbus1 target 0 lun 0 (pass18,da10)
<SEAGATE ST9300605SS 0004>         at scbus1 target 1 lun 0 (pass19,da11)
<SEAGATE ST33000650SS 0004>        at scbus8 target 9 lun 0 (pass20,da12)
<SGI CORP STORBRICK-3 1400>        at scbus8 target 10 lun 0 (ses8,pass21)
<SEAGATE ST33000650SS 0004>        at scbus8 target 11 lun 0 (pass22,da13)
<SGI CORP STORBRICK-7 1400>        at scbus8 target 12 lun 0 (ses9,pass23)
<SEAGATE ST33000650SS 0004>        at scbus8 target 13 lun 0 (pass24,da14)
<SGI CORP STORBRICK-0 1400>        at scbus8 target 14 lun 0 (ses10,pass25)
<SEAGATE ST33000650SS 0004>        at scbus8 target 15 lun 0 (pass26,da15)
<SGI CORP STORBRICK-1 1400>        at scbus8 target 16 lun 0 (ses11,pass27)
<SEAGATE ST33000650SS 0004>        at scbus8 target 17 lun 0 (pass28,da16)
<SGI CORP STORBRICK-5 1400>        at scbus8 target 18 lun 0 (ses12,pass29)
<SEAGATE ST33000650SS 0004>        at scbus8 target 19 lun 0 (pass30,da17)
<SGI CORP STORBRICK-4 1400>        at scbus8 target 20 lun 0 (ses13,pass31)
<SEAGATE ST33000650SS 0004>        at scbus8 target 21 lun 0 (pass32,da18)
<SGI CORP STORBRICK-6 1400>        at scbus8 target 22 lun 0 (ses14,pass33)
<SEAGATE ST33000650SS 0004>        at scbus8 target 23 lun 0 (pass34,da19)
<SGI CORP STORBRICK-2 1400>        at scbus8 target 24 lun 0 (ses15,pass35)
<USB 2.0 Flash Drive 8.07>         at scbus10 target 0 lun 0 (da20,pass36)

the logic looks right, correct?


> Devin
>
>
>
>   then create the zpool from the "36" multipath devices?
>
>  so if I create a 36-drive multipath zpool on nodeA, when it fails do I
> just import it to nodeB?
> I was thinking to use CARP for failover... so nodeB would continue NFS
> sessions and import the zpool to nodeB
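
A hedged sketch of what that failover step might look like on nodeB once CARP promotes it to MASTER; again it only prints the commands, and the nfsd restart is an assumption, not something established in this thread:

```shell
#!/bin/sh
# Print the failover commands nodeB would run after becoming CARP MASTER.
# "master" is the pool name used earlier in the thread; the nfsd restart is
# an assumption about how the NFS sessions would be resumed.
echo "zpool import -f master"      # -f: pool was last active on the failed node
echo "service nfsd onerestart"     # resume serving NFS from the imported pool
```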
>
>
>
>>  --
>> Devin
>>
>>   _____________
>> The information contained in this message is proprietary and/or
>> confidential. If you are not the intended recipient, please: (i) delete the
>> message and all copies; (ii) do not disclose, distribute or use the message
>> in any manner; and (iii) notify the sender immediately. In addition, please
>> be aware that any message addressed to our domain is subject to archiving
>> and review by persons other than the intended recipient. Thank you.
>>
>
>
>


