Date:      Fri, 12 Feb 2021 21:15:17 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 253471] Change 348906: vdev_geom_attach_by_guids() breaks other use cases
Message-ID:  <bug-253471-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253471

            Bug ID: 253471
           Summary: Change 348906: vdev_geom_attach_by_guids() breaks
                    other use cases
           Product: Base System
           Version: 12.2-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: daveb@spectralogic.com

Change MFC r344316, SVN Importer: FreeBSD/base rev 348906 breaks the
following scenario:

- Assume 3 blank disks in the system, all with their ZFS labels cleared.
- Shut down, remove one of the blanks, put it on the shelf, reboot.
- Create a mirror pool on the 2 remaining disks, shut down.
- Replace one of the mirror disks with the previously removed blank,
  put the ZFS mirror member on the shelf, reboot.

All these machinations are required to ensure the blank disk enumerates
the same as the mirror disk member (has the same vdev_path).

zpool status will now report the pool as a healthy "ONLINE" with
a missing disk:

# zpool status -x
  pool: pool200
 state: ONLINE
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-4J
  scan: none requested
config:

        NAME                     STATE     READ WRITE CKSUM
        pool200                  ONLINE       0     0     0
          mirror-0               ONLINE       0     0     0
            1165392866195823403  UNAVAIL      0     0     0  was /dev/da8
            da9                  ONLINE       0     0     0


This is because change 348906 allows vdev_geom_attach_by_guids() to attach a
vdev that has a "NO_MATCH" guid but satisfies the requested vdev_path:


} else if (match == best_match) {
   /* match = NO_MATCH and best_match is initialized to NO_MATCH;
    * if the paths match we will attach it.
    */
   if (strcmp(pp->name, vdpath) == 0) {
        best_pp = pp;
   }
}
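
The effect of that branch can be reproduced outside the kernel. What follows
is a minimal sketch, not the vdev_geom.c code: struct provider,
pick_provider(), the three-level enum and the path_tiebreak switch are
hypothetical names made up for illustration, and each provider carries a
precomputed match level instead of having its labels read. With the tie-break
enabled, a blank disk whose name equals the wanted path is selected even
though its match level is still NO_MATCH.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the label-match levels (assumed, not the kernel's). */
enum match { NO_MATCH = 0, PARTIAL_MATCH = 1, FULL_MATCH = 2 };

/* Hypothetical model of a provider: a name plus a precomputed match level. */
struct provider {
	const char *name;
	enum match match;
};

/*
 * Toy selection loop.  path_tiebreak != 0 models the
 * "else if (match == best_match)" branch quoted above;
 * path_tiebreak == 0 models the loop without that branch.
 */
static const struct provider *
pick_provider(const struct provider *pps, size_t n, const char *vdpath,
    int path_tiebreak)
{
	const struct provider *best_pp = NULL;
	enum match best_match = NO_MATCH;
	size_t i;

	assert(vdpath != NULL);
	for (i = 0; i < n; i++) {
		const struct provider *pp = &pps[i];

		if (pp->match > best_match) {
			best_match = pp->match;
			best_pp = pp;
		} else if (path_tiebreak && pp->match == best_match) {
			/* Also taken while both sides are still NO_MATCH. */
			if (strcmp(pp->name, vdpath) == 0)
				best_pp = pp;
		}
		if (pp->match == FULL_MATCH)
			break;
	}
	return (best_pp);
}
```

Given two unlabeled providers named "da8" and "da9" and a wanted path of
"da8", pick_provider() returns the "da8" entry when path_tiebreak is set and
NULL when it is not. A provider with a real label match still wins over the
path-only candidate.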


From ZFS_LOG() via vdev_geom_open_by_guids() and others:
vdev_geom_attach:240[1]: Attaching to da8.
vdev_geom_attach:309[1]: Created consumer for da8.
vdev_geom_open_by_guids:797[1]: Attach by guid
[17284178551510092527:1165392866195823403] succeeded, provider da8.


Later, back in vdev_validate(), this condition is detected but never gets
vdev_propagate()ed, because for this one error vdev_set_state() is called
with "is_open" true:

if ((label = vdev_label_read_config(vd, txg)) == NULL) {
   vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN,
      VDEV_AUX_BAD_LABEL);
   return (0);
}

It is curious to me that for this one error path vdev_validate() calls
vdev_set_state(..., B_TRUE, ...) whereas all the others pass B_FALSE.


For my purposes, reverting 348906 fixes the issue, but changing
   vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN, VDEV_AUX_BAD_LABEL);
to
   vdev_set_state(vd, B_FALSE, VDEV_STATE_CANT_OPEN, VDEV_AUX_BAD_LABEL);

also solves the problem.
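
To illustrate why the flag matters, here is a toy model of the state handling.
It is a sketch under stated assumptions, not the ZFS implementation:
struct toy_vdev, toy_set_state(), toy_propagate_state() and the three-value
state enum are all hypothetical, and the only behavior modeled is that a state
change is pushed up to the parent when the "is_open" argument (called isopen
in the sketch) is false.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-value state, ordered worst to best. */
enum vstate { ST_CANT_OPEN, ST_DEGRADED, ST_ONLINE };

/* Hypothetical vdev: a state, a parent, and up to two children. */
struct toy_vdev {
	enum vstate state;
	struct toy_vdev *parent;
	struct toy_vdev *child[2];
	size_t nchildren;
};

/* Recompute an interior vdev's state from its children, then walk upward. */
static void
toy_propagate_state(struct toy_vdev *vd)
{
	size_t i, healthy = 0;

	for (i = 0; i < vd->nchildren; i++)
		if (vd->child[i]->state == ST_ONLINE)
			healthy++;
	if (healthy == vd->nchildren)
		vd->state = ST_ONLINE;
	else if (healthy > 0)
		vd->state = ST_DEGRADED;
	else
		vd->state = ST_CANT_OPEN;
	if (vd->parent != NULL)
		toy_propagate_state(vd->parent);
}

/* Assumed rule: the parent is only updated when isopen is false. */
static void
toy_set_state(struct toy_vdev *vd, int isopen, enum vstate state)
{
	assert(vd != NULL);
	vd->state = state;
	if (!isopen && vd->parent != NULL)
		toy_propagate_state(vd->parent);
}

/* Build a two-way mirror, fail one leaf, and report the mirror's state. */
static enum vstate
toy_mirror_after_bad_label(int isopen)
{
	struct toy_vdev mirror = { ST_ONLINE, NULL, { NULL, NULL }, 2 };
	struct toy_vdev da8 = { ST_ONLINE, &mirror, { NULL, NULL }, 0 };
	struct toy_vdev da9 = { ST_ONLINE, &mirror, { NULL, NULL }, 0 };

	mirror.child[0] = &da8;
	mirror.child[1] = &da9;
	toy_set_state(&da8, isopen, ST_CANT_OPEN);
	return (mirror.state);
}
```

In this model toy_mirror_after_bad_label(1) leaves the mirror at ST_ONLINE
with one child at ST_CANT_OPEN, which mirrors the zpool status output above,
while toy_mirror_after_bad_label(0) yields ST_DEGRADED.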
