Date:      Mon, 14 Oct 2013 21:50:57 +0000 (UTC)
From:      Alan Somers <asomers@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-projects@freebsd.org
Subject:   svn commit: r256463 - projects/zfsd/head/cddl/sbin/zfsd
Message-ID:  <201310142150.r9ELovC1082211@svn.freebsd.org>

Author: asomers
Date: Mon Oct 14 21:50:56 2013
New Revision: 256463
URL: http://svnweb.freebsd.org/changeset/base/256463

Log:
  Fix a bug in zfsd: when a drive is experiencing a rapid storm of IO or
  checksum errors, zfsd will not degrade/fault it until hundreds or thousands
  of errors have occurred.
  
  	cddl/sbin/zfsd/case_file.cc
  		RefreshVdevState() iterates through all the system's zpools,
  		which involves the ioctls ZFS_IOC_POOL_CONFIGS and
  		ZFS_IOC_POOL_STATS.  Both of those acquire
  		spa_namespace_lock, which may block for a long time under
  		certain circumstances, including when the system has a storm
  		of IO or checksum errors.  This change eliminates the call
  		to RefreshVdevState() whenever a ZfsEvent is received.
  		Instead, RefreshVdevState() will only be called when a
  		CaseFile is closed, if necessary.  This way, zfsd won't
  		spend too much time blocking on ioctl()s and miss reading
  		events from devd.
  
  Submitted by:	alans
  Approved by:	ken (mentor)
  Sponsored by:	Spectra Logic Corporation

Modified:
  projects/zfsd/head/cddl/sbin/zfsd/case_file.cc

Modified: projects/zfsd/head/cddl/sbin/zfsd/case_file.cc
==============================================================================
--- projects/zfsd/head/cddl/sbin/zfsd/case_file.cc	Mon Oct 14 21:41:36 2013	(r256462)
+++ projects/zfsd/head/cddl/sbin/zfsd/case_file.cc	Mon Oct 14 21:50:56 2013	(r256463)
@@ -298,28 +298,6 @@ CaseFile::ReEvaluate(const ZfsEvent &eve
 {
 	bool consumed(false);
 
-	if (!RefreshVdevState()) {
-		/*
-		 * The pool or vdev for this case file is no longer
-		 * part of the configuration.  This can happen
-		 * if we process a device arrival notification
-		 * before seeing the ZFS configuration change
-		 * event.
-		 */
-		syslog(LOG_INFO,
-		       "CaseFile::ReEvaluate(%s,%s) Pool/Vdev unconfigured.  "
-		       "Closing\n",
-		       PoolGUIDString().c_str(),
-		       VdevGUIDString().c_str());
-		Close();
-
-		/*
-		 * Since this event was not used to close this
-		 * case, do not report it as consumed.
-		 */
-		return (/*consumed*/false);
-	}
-
 	if (event.Value("type") == "misc.fs.zfs.vdev_remove") {
 		/*
 		 * The Vdev we represent has been removed from the
@@ -333,6 +311,28 @@ CaseFile::ReEvaluate(const ZfsEvent &eve
 	if (event.Value("class") == "resource.fs.zfs.removed") {
 		bool spare_activated;
 
+		if (!RefreshVdevState()) {
+			/*
+			 * The pool or vdev for this case file is no longer
+			 * part of the configuration.  This can happen
+			 * if we process a device arrival notification
+			 * before seeing the ZFS configuration change
+			 * event.
+			 */
+			syslog(LOG_INFO,
+			       "CaseFile::ReEvaluate(%s,%s) Pool/Vdev "
+			       "unconfigured.  Closing\n",
+			       PoolGUIDString().c_str(),
+			       VdevGUIDString().c_str());
+			Close();
+
+			/*
+			 * Since this event was not used to close this
+			 * case, do not report it as consumed.
+			 */
+			return (/*consumed*/false);
+		}
+
 		/*
 		 * Discard any tentative I/O error events for
 		 * this case.  They were most likely caused by the
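
For illustration only, here is a minimal sketch of the pattern the diff
above applies. It is not the zfsd code. The expensive state refresh is
skipped for ordinary events and only performed in the branch that handles
"resource.fs.zfs.removed". SketchEvent, SketchCase, RefreshState() and
the counters are hypothetical names invented for this example. The real
RefreshVdevState() issues ioctls that may block on spa_namespace_lock,
which the sketch replaces with a simple call counter.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an event: a bag of key/value strings.
struct SketchEvent {
    std::map<std::string, std::string> fields;

    // Return the value for key, or an empty string if it is absent.
    std::string Value(const std::string &key) const {
        auto it = fields.find(key);
        return it == fields.end() ? std::string() : it->second;
    }
};

// Hypothetical stand-in for a case file tracking one vdev.
class SketchCase {
public:
    explicit SketchCase(bool configured) : m_configured(configured) {}

    // Stands in for the potentially blocking refresh. It only counts
    // how often it is called and reports whether the (pretend) vdev
    // is still part of the configuration.
    bool RefreshState() {
        ++m_refreshCalls;
        return m_configured;
    }

    // Returns true if the event was consumed by this case.
    bool ReEvaluate(const SketchEvent &event) {
        assert(!m_closed);  // a closed case receives no more events

        if (event.Value("class") == "resource.fs.zfs.removed") {
            // Refresh only here, where current state is needed.
            if (!RefreshState()) {
                m_closed = true;
                return false;  // event did not close the case itself
            }
            m_tentativeErrors = 0;  // discard tentative error events
            return true;
        }

        // Any other event: record it without touching the slow path.
        ++m_tentativeErrors;
        return true;
    }

    int RefreshCalls() const { return m_refreshCalls; }
    int TentativeErrors() const { return m_tentativeErrors; }
    bool Closed() const { return m_closed; }

private:
    bool m_configured;
    bool m_closed = false;
    int m_refreshCalls = 0;
    int m_tentativeErrors = 0;
};
```

In this sketch a storm of error events never reaches RefreshState(), so
the time spent per event does not depend on how long the refresh would
block.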


