From owner-svn-src-all@freebsd.org Fri Oct 28 18:18:55 2016 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8EBE8C248E5; Fri, 28 Oct 2016 18:18:55 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 508D0F8F; Fri, 28 Oct 2016 18:18:55 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id u9SIIsSX064502; Fri, 28 Oct 2016 18:18:54 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id u9SIIs2X064498; Fri, 28 Oct 2016 18:18:54 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201610281818.u9SIIs2X064498@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Fri, 28 Oct 2016 18:18:54 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org Subject: svn commit: r308057 - in stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-10 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Oct 2016 18:18:55 -0000 Author: mav Date: Fri Oct 28 18:18:53 2016 New Revision: 308057 URL: https://svnweb.freebsd.org/changeset/base/308057 Log: MFC r294329 (by asomers): Disallow zvol-backed ZFS pools Using zvols as backing devices for ZFS pools is fraught with panics and deadlocks. For example, attempting to online a missing device in the presence of a zvol can cause a panic when vdev_geom tastes the zvol. Better to completely disable vdev_geom from ever opening a zvol. The solution relies on setting a thread-local variable during vdev_geom_open, and returning EOPNOTSUPP during zvol_open if that thread-local variable is set. Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open. Its intent was to prevent a recursive mutex acquisition panic. However, the new check for the thread-local variable also fixes that problem. Also, fix a panic in vdev_geom_taste_orphan. For an unknown reason, this function was set to panic. But it can occur that a device disappears during tasting, and it causes no problems to ignore this departure. Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Directory Properties: stable/10/ (props changed) Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h ============================================================================== --- stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h Fri Oct 28 18:09:08 2016 (r308056) +++ stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h Fri Oct 28 18:18:53 2016 (r308057) @@ -381,6 +381,7 @@ extern void vdev_set_min_asize(vdev_t *v */ /* zdb uses this tunable, so it must be declared here to make lint happy. */ extern int zfs_vdev_cache_size; +extern uint_t zfs_geom_probe_vdev_key; #ifdef illumos /* Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c ============================================================================== --- stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c Fri Oct 28 18:09:08 2016 (r308056) +++ stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c Fri Oct 28 18:18:53 2016 (r308057) @@ -63,6 +63,13 @@ TUNABLE_INT("vfs.zfs.vdev.bio_delete_dis SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, bio_delete_disable, CTLFLAG_RW, &vdev_geom_bio_delete_disable, 0, "Disable BIO_DELETE"); +/* + * Thread local storage used to indicate when a thread is probing geoms + * for their guids. If NULL, this thread is not tasting geoms. If non NULL, + * it is looking for a replacement for the vdev_t* that is its value. + */ +uint_t zfs_geom_probe_vdev_key; + static void vdev_geom_set_rotation_rate(vdev_t *vd, struct g_consumer *cp) { @@ -329,9 +336,8 @@ vdev_geom_io(struct g_consumer *cp, int static void vdev_geom_taste_orphan(struct g_consumer *cp) { - - KASSERT(1 == 0, ("%s called while tasting %s.", __func__, - cp->provider->name)); + ZFS_LOG(0, "WARNING: Orphan %s while tasting its VDev GUID.", + cp->provider->name); } static int @@ -578,7 +584,6 @@ vdev_geom_attach_by_guids(vdev_t *vd) g_topology_assert(); zgp = g_new_geomf(&zfs_vdev_class, "zfs::vdev::taste"); - /* This orphan function should be never called. */ zgp->orphan = vdev_geom_taste_orphan; zcp = g_new_consumer(zgp); @@ -714,6 +719,9 @@ vdev_geom_open(vdev_t *vd, uint64_t *psi size_t bufsize; int error; + /* Set the TLS to indicate downstack that we should not access zvols*/ + VERIFY(tsd_set(zfs_geom_probe_vdev_key, vd) == 0); + /* * We must have a pathname, and it must be absolute. */ @@ -764,6 +772,9 @@ vdev_geom_open(vdev_t *vd, uint64_t *psi } } + /* Clear the TLS now that tasting is done */ + VERIFY(tsd_set(zfs_geom_probe_vdev_key, NULL) == 0); + if (cp == NULL) { ZFS_LOG(1, "Provider %s not found.", vd->vdev_path); error = ENOENT; Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c ============================================================================== --- stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Fri Oct 28 18:09:08 2016 (r308056) +++ stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Fri Oct 28 18:18:53 2016 (r308057) @@ -207,6 +207,7 @@ extern void zfs_fini(void); uint_t zfs_fsyncer_key; extern uint_t rrw_tsd_key; static uint_t zfs_allow_log_key; +extern uint_t zfs_geom_probe_vdev_key; typedef int zfs_ioc_legacy_func_t(zfs_cmd_t *); typedef int zfs_ioc_func_t(const char *, nvlist_t *, nvlist_t *); @@ -6735,6 +6736,7 @@ zfs__init(void) tsd_create(&zfs_fsyncer_key, NULL); tsd_create(&rrw_tsd_key, rrw_tsd_destroy); tsd_create(&zfs_allow_log_key, zfs_allow_log_destroy); + tsd_create(&zfs_geom_probe_vdev_key, NULL); printf("ZFS storage pool version: features support (" SPA_VERSION_STRING ")\n"); root_mount_rel(zfs_root_token); Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c ============================================================================== --- stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Fri Oct 28 18:09:08 2016 (r308056) +++ stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Fri Oct 28 18:18:53 2016 (r308057) @@ -1123,36 +1123,30 @@ zvol_open(struct g_provider *pp, int fla return (err); } #else /* !illumos */ - boolean_t locked = B_FALSE; - - /* - * Protect against recursively entering spa_namespace_lock - * when spa_open() is used for a pool on a (local) ZVOL(s). - * This is needed since we replaced upstream zfsdev_state_lock - * with spa_namespace_lock in the ZVOL code. - * We are using the same trick as spa_open(). - * Note that calls in zvol_first_open which need to resolve - * pool name to a spa object will enter spa_open() - * recursively, but that function already has all the - * necessary protection. - */ - if (!MUTEX_HELD(&zfsdev_state_lock)) { - mutex_enter(&zfsdev_state_lock); - locked = B_TRUE; + if (tsd_get(zfs_geom_probe_vdev_key) != NULL) { + /* + * if zfs_geom_probe_vdev_key is set, that means that zfs is + * attempting to probe geom providers while looking for a + * replacement for a missing VDEV. In this case, the + * spa_namespace_lock will not be held, but it is still illegal + * to use a zvol as a vdev. Deadlocks can result if another + * thread has spa_namespace_lock + */ + return (EOPNOTSUPP); } + mutex_enter(&zfsdev_state_lock); + zv = pp->private; if (zv == NULL) { - if (locked) - mutex_exit(&zfsdev_state_lock); + mutex_exit(&zfsdev_state_lock); return (SET_ERROR(ENXIO)); } if (zv->zv_total_opens == 0) { err = zvol_first_open(zv); if (err) { - if (locked) - mutex_exit(&zfsdev_state_lock); + mutex_exit(&zfsdev_state_lock); return (err); } pp->mediasize = zv->zv_volsize; @@ -1186,8 +1180,7 @@ zvol_open(struct g_provider *pp, int fla mutex_exit(&zfsdev_state_lock); #else zv->zv_total_opens += count; - if (locked) - mutex_exit(&zfsdev_state_lock); + mutex_exit(&zfsdev_state_lock); #endif return (err); @@ -1197,8 +1190,7 @@ out: #ifdef illumos mutex_exit(&zfsdev_state_lock); #else - if (locked) - mutex_exit(&zfsdev_state_lock); + mutex_exit(&zfsdev_state_lock); #endif return (err); }