Date:      Sat, 11 Feb 2017 04:56:05 +0000
From:      John <jwd@FreeBSD.org>
To:        FreeBSD-scsi <freebsd-scsi@freebsd.org>
Subject:   multipath device never failing - loops over providers instead
Message-ID:  <20170211045605.GA43225@FreeBSD.org>


--17pEHd4RhPHOinZp
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi Folks,

   Running 10.3-STABLE  r308246 from Nov 3, 2016

   I thought I saw a commit in this area a while back, but I
cannot find it and Google is not helping.

   I have SAS drives behind 2 multiplexers (4 paths total) which
are all configured similar to the following:

# gmultipath status Z76
         Name   Status  Components
multipath/Z76  OPTIMAL  da92 (ACTIVE)
                        da236 (PASSIVE)
                        da428 (PASSIVE)
                        da572 (PASSIVE)

   For each path on the components above, the following sequence occurs:

kernel: (da92:mpr0:0:399:0): READ(10). CDB: 28 00 0b a7 20 c0 00 00 10 00
kernel: (da92:mpr0:0:399:0): CAM status: SCSI Status Error
kernel: (da92:mpr0:0:399:0): SCSI status: Check Condition
kernel: (da92:mpr0:0:399:0): SCSI sense: HARDWARE FAILURE asc:32,0 (No defect spare location available)
kernel: (da92:mpr0:0:399:0): Info: 0xba720c0
kernel: (da92:mpr0:0:399:0): Field Replaceable Unit: 157
kernel: (da92:mpr0:0:399:0): Command Specific Info: 0x80010000
kernel: (da92:mpr0:0:399:0): Actual Retry Count: 255
kernel: (da92:mpr0:0:399:0): Retrying command (per sense data)

   After each path has failed, the following is seen:

kernel: GEOM_MULTIPATH: Error 5, da92 in Z76 marked FAIL
kernel: GEOM_MULTIPATH: all paths in Z76 were marked FAIL, restore da572
kernel: GEOM_MULTIPATH: all paths in Z76 were marked FAIL, restore da428
kernel: GEOM_MULTIPATH: all paths in Z76 were marked FAIL, restore da236
kernel: GEOM_MULTIPATH: da572 is now active path in Z76

   and the entire failure loop occurs again. The multipath device
itself is never failed, so the ZFS pool can never go into degraded mode
and the faulty drive is never replaced with a spare.

   Once I pulled the drive, the multipath device Z76 failed and
things went as expected.

   It seems g_multipath_fault() in this instance should just fail the device.
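
   To make the loop concrete, here is a minimal Python sketch of the
policy as it appears from the logs. It is a toy model under stated
assumptions, not the actual g_multipath code: the function names, the
restore_when_all_failed flag, and the order in which paths are restored
are all hypothetical, and the drive is assumed to return the same error
on every path.

```python
# Toy model only: NOT the real g_multipath implementation.
# All names here (handle_path_error, simulate, restore_when_all_failed)
# are hypothetical and exist purely for illustration.

def handle_path_error(paths, active, restore_when_all_failed=True):
    """Mark the active path FAIL and decide what happens next.

    paths: dict mapping provider name -> "OK" or "FAIL"
    Returns (new_active, device_failed).
    """
    paths[active] = "FAIL"
    healthy = [name for name, state in paths.items() if state == "OK"]
    if healthy:
        # Ordinary failover: switch to the first path still marked OK.
        return healthy[0], False
    if restore_when_all_failed:
        # Behaviour suggested by the logs: once every path is marked FAIL,
        # the other paths are restored and I/O is retried on one of them.
        for name in paths:
            if name != active:
                paths[name] = "OK"
        restored = [name for name, state in paths.items() if state == "OK"]
        return restored[0], False
    # Proposed behaviour: with no usable path left, fail the device.
    return None, True


def simulate(restore_when_all_failed, max_errors=20):
    """Feed the model a drive that errors on every path.

    Returns (errors_handled, device_failed).
    """
    paths = {"da92": "OK", "da236": "OK", "da428": "OK", "da572": "OK"}
    active = "da92"
    for errors in range(1, max_errors + 1):
        active, failed = handle_path_error(
            paths, active, restore_when_all_failed
        )
        if failed:
            return errors, True
    return max_errors, False


if __name__ == "__main__":
    print(simulate(True))   # restore policy: device is never failed
    print(simulate(False))  # proposed policy: device fails on the 4th error
```

   With the restore policy the model never reports a device failure,
which matches what I see; without it the device is failed after the
fourth path error, at which point ZFS could react.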

   Does anyone have any pointers on this issue?

Thanks,
John


--17pEHd4RhPHOinZp--


