Date:      Wed, 03 Oct 2012 11:12:34 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Nikolay Denev <ndenev@gmail.com>
Cc:        "<freebsd-fs@freebsd.org>" <freebsd-fs@FreeBSD.org>
Subject:   Re: nfs + zfs hangs on RELENG_9
Message-ID:  <506BF372.1090208@FreeBSD.org>
In-Reply-To: <906543F2-96BD-4519-B693-FD5AFB646F87@gmail.com>
References:  <906543F2-96BD-4519-B693-FD5AFB646F87@gmail.com>

on 02/10/2012 13:26 Nikolay Denev said the following:
> 7 100537 zfskern          txg_thread_enter mi_switch+0x186 sleepq_wait+0x42
> _cv_wait+0x121 zio_wait+0x61 dsl_pool_sync+0xe0 spa_sync+0x336
> txg_sync_thread+0x136 fork_exit+0x11f fork_trampoline+0xe

In my experience, threads stuck in zio_wait have always meant an I/O operation
stuck in a storage controller driver, controller firmware, etc.
That is not necessarily the case here, but it is a possibility.

Perhaps try camcontrol tags <disk devname> -v to see the state of the disk queues.

P.S.
It would be nice if, for debugging purposes, we had some place in the zio to
record the bio that it depends upon.
E.g. something like:
diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
index 80d9336..75b2fcf 100644
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
@@ -432,6 +432,7 @@ struct zio {
 #ifdef _KERNEL
 	/* FreeBSD only. */
 	struct ostask	io_task;
+	void		*io_bio;
 #endif
 };

diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
index 7d146ff..36bb5ad 100644
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
@@ -684,6 +684,7 @@ vdev_geom_io_intr(struct bio *bp)
 			vd->vdev_delayed_close = B_TRUE;
 		}
 	}
+	zio->io_bio = NULL;
 	g_destroy_bio(bp);
 	zio_interrupt(zio);
 }
@@ -732,6 +733,7 @@ sendreq:
 	}
 	bp = g_alloc_bio();
 	bp->bio_caller1 = zio;
+	zio->io_bio = bp;
 	switch (zio->io_type) {
 	case ZIO_TYPE_READ:
 	case ZIO_TYPE_WRITE:

Then, in a situation like yours, you could use kgdb, switch to the thread in
zio_wait, go to the zio_wait frame and get the bio pointer from the zio.  From
there you could try to deduce what is going on with the I/O request.
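
A rough sketch of such a kgdb session, assuming the patch above is applied
(io_bio does not exist otherwise) and the kernel has debug symbols.  The thread
id is the one from the quoted trace; the kernel path and the frame number are
placeholders, so take the real frame number from the bt output:

```
# Shell: attach to the live kernel (path is an assumption; a crash dump works too):
#   kgdb /boot/kernel/kernel /dev/mem

# Switch to the stuck thread by kernel thread id (100537 in the quoted trace).
tid 100537
# List the frames and note the number of the zio_wait frame.
bt
# Placeholder number: use the zio_wait frame reported by bt.
frame 4
# The zio being waited on.
p *zio
# io_bio is a void * in the patch, so cast it before dereferencing.
p *(struct bio *)zio->io_bio
# Name of the GEOM provider (disk) the request was sent to.
p ((struct bio *)zio->io_bio)->bio_to->name
```

If io_bio is NULL, the zio being waited on may be a parent that issued no bio
itself, in which case the pending request would be recorded in one of its
child zios.  The zio variable may also be optimized out in some builds.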

-- 
Andriy Gapon