Date: Mon, 22 Oct 2007 17:35:21 +0200
From: Peter Schuller <peter.schuller@infidyne.com>
To: freebsd-fs@freebsd.org
Subject: ZFS: reproducible inability to access a pool (process hangs; other pools fine)
Message-ID: <20071022153521.GB27594@hyperion.scode.org>
Hello,

On the same system I recently posted about on -stable, with RELENG_7 from a few days ago, I am now running a raidz2 pool on a SiL 3114 controller in degraded mode with one disk missing (it is degraded by design, because I wanted to create a 5-disk array but only had 4 disks).

For the purpose of discovering any stability issues with the 3114 controller I did some stress tests. These have yet to reveal controller problems, but have triggered what appears to be a ZFS problem.

Test case:

  /promraid       - root of the pool in question
  /promraid/ports - copy of /usr/ports tree from my machine
  /promraid/1     - empty directory
  /promraid/2     - empty directory

I now run concurrently in two shells:

  while [ 1 ] ; do rsync -a /promraid/ports /promraid/1/pp ; rm -rf /promraid/1/pp ; done

and:

  while [ 1 ] ; do rsync -a /promraid/ports /promraid/2/pp ; rm -rf /promraid/2/pp ; done

This runs fine for some hours, but eventually I end up with hung rsyncs in "zfs" state according to top. Attempting to e.g. ls /promraid hangs as well. Yet ZFS otherwise continues working (another pool is entirely fine), and there are no errors in dmesg. iostat -x does NOT indicate that it is perpetually waiting on I/O from a disk or something like that (0% utilization). The processes are unkillable, even by SIGKILL.

I should have this environment for a few more days, so I can hopefully reproduce this again. It has happened at least twice already (the first time I was in X and X hung; I thought I had a panic so re-ran the tests in the console; these two times I didn't get a panic but I am unsure whether the failure case is different).

Does anyone have suggestions for what to do to produce the best information possible, given that there are no errors, no panic, etc.?
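For anyone wanting to rerun the test case, the two loops above could be combined into one script along the following lines. This is only a sketch, not the exact commands used in the report: the loops are bounded and timestamped so that the last completed iteration is visible if a hang occurs, and the names POOL, ITERATIONS, copy_delete_loop and run_stress are made up for illustration. The default of 100 iterations is arbitrary; the hang described here took hours to appear.

```shell
#!/bin/sh
# Sketch of the two concurrent copy/delete loops from the test case.
# POOL and ITERATIONS are hypothetical knobs, not part of the original test;
# the original loops ran forever ("while [ 1 ]").
POOL="${POOL:-/promraid}"
ITERATIONS="${ITERATIONS:-100}"

# copy_delete_loop N: copy $POOL/ports into $POOL/N/pp, then delete it again.
copy_delete_loop() {
    n=$1
    i=1
    while [ "$i" -le "$ITERATIONS" ]; do
        rsync -a "$POOL/ports" "$POOL/$n/pp" || return 1
        rm -rf "$POOL/$n/pp" || return 1
        # Timestamp each pass so the log shows when progress stopped.
        echo "$(date '+%Y-%m-%d %H:%M:%S') loop $n: iteration $i done"
        i=$((i + 1))
    done
}

# run_stress: start both loops in the background and wait for them.
run_stress() {
    copy_delete_loop 1 &
    pid1=$!
    copy_delete_loop 2 &
    pid2=$!
    rc=0
    wait "$pid1" || rc=1
    wait "$pid2" || rc=1
    return "$rc"
}
```

The script only defines functions; calling run_stress (for example with ITERATIONS=100000 in the environment and output redirected to a log on a different pool) starts the test.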
One obvious bit, I realize, is to ktrace them, if that gives me anything (the size of the trace, if I were to trace from the beginning, would I suspect be prohibitive). Will do that next time.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org