Date:      Tue, 15 Oct 2013 00:01:10 -0400
From:      Garrett Wollman <wollman@bimajority.org>
To:        freebsd-stable@freebsd.org
Cc:        freebsd-fs@freebsd.org
Subject:   How to unstick ZFS resilver?
Message-ID:  <21084.48646.196295.776944@hergotha.csail.mit.edu>

I have a large (88-drive) zpool in which a drive was recently
replaced.  (The pool has a bunch of duff Toshiba MK2001TRKB drives --
never ever pay money for these! -- and I'm trying to replace them one
by one before they fail completely.)  The resilver on the first drive
replacement has been taking far too long, and it is currently
stuck in this state:

  pool: export
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Oct  9 14:54:47 2013
        86.5T scanned out of 86.8T at 1/s, (scan is slow, no estimated time)
        982G resilvered, 99.62% done

The overall progress hasn't changed in twelve hours, even across a
reboot, and the server is fairly lightly loaded.  Searching the Web is
no help; can anyone suggest a remedial action?  (This is on
9.1-RELEASE, with our local patches, and all the drives are SAS.)

In exchange, I offer the following DTrace script which I used to
identify the slow SAS drives:

#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option dynvarsize=2m

inline int TOO_SLOW = 100000000;	/* 100 ms */

dtrace:::BEGIN
{
        printf("Tracing... Hit Ctrl-C to end.\n");
}

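/* Record the time at which each request enters the da strategy routine. */
fbt::dastrategy:entry
{
        start_time[(struct buf *)arg0] = timestamp;
}

/* Recover the request pointer from the CCB's peripheral-private area. */
fbt::dadone:entry
{
        this->bp = (struct buf *)args[1]->ccb_h.periph_priv.entries[1].ptr;
}

/* Count, per drive, the requests that took longer than TOO_SLOW. */
fbt::dadone:entry
/this->bp && start_time[this->bp] && (timestamp - start_time[this->bp]) > TOO_SLOW/
{
        @[strjoin("da", lltostr(args[0]->unit_number))] = count();
}

/* Release the timestamp for every completed request, not only slow ones. */
fbt::dadone:entry
/this->bp/
{
        start_time[this->bp] = 0;
}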
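
A possible variation on the script above, offered only as an untested
sketch: instead of counting requests over a fixed threshold, it
aggregates the whole latency distribution for each drive with
quantize(), which may make a marginal drive stand out even when few of
its requests cross 100 ms.  It assumes the same dastrategy and dadone
fbt probes and the same CCB layout as the script above; the
aggregation name @lat is made up for this example.

```
#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option dynvarsize=2m

/* Record the time at which each request enters the da strategy routine. */
fbt::dastrategy:entry
{
        start_time[(struct buf *)arg0] = timestamp;
}

/* Recover the request pointer from the CCB, as in the script above. */
fbt::dadone:entry
{
        this->bp = (struct buf *)args[1]->ccb_h.periph_priv.entries[1].ptr;
}

/* Add the request's latency (in ns) to its drive's histogram. */
fbt::dadone:entry
/this->bp && start_time[this->bp]/
{
        @lat[strjoin("da", lltostr(args[0]->unit_number))] =
            quantize(timestamp - start_time[this->bp]);
        start_time[this->bp] = 0;
}
```

As with the original, the aggregation is printed when the script is
interrupted with Ctrl-C, one power-of-two histogram per drive.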

-GAWollman