Date: Thu, 20 Feb 2014 10:51:42 -0800
From: <dteske@FreeBSD.org>
To: <dcamp@alumni.ufl.edu>, <freebsd-questions@freebsd.org>
Cc: dteske@FreeBSD.org
Subject: RE: System freezes up during long-running ZFS disk activity
Message-ID: <10c801cf2e6c$cc6599f0$6530cdd0$@FreeBSD.org>
In-Reply-To: <CADbaceJ00rk8RFMwi-S-HLNBX673j2DGe6SngUcvYTFTd5KFxw@mail.gmail.com>
References: <CADbaceJ00rk8RFMwi-S-HLNBX673j2DGe6SngUcvYTFTd5KFxw@mail.gmail.com>
> -----Original Message-----
> From: Christian Campbell [mailto:dcamp@alumni.ufl.edu]
> Sent: Wednesday, February 19, 2014 12:07 PM
> To: freebsd-questions@freebsd.org
> Subject: System freezes up during long-running ZFS disk activity
>
> I recently installed 9.2-RELEASE-p3 on a Dell Precision T5400. I'm using ZFS
> filesystem version: 5, ZFS storage pool version: features support (5000). The
> pool was imported from a previous 9.2 box on which it worked without issue.
>
> I don't know if my problem is ZFS-related, but my ZFS use is why I noticed it and
> I seem to be able to reproduce it reliably. Every so often, from minutes to
> hours, my computer will freeze up while ZFS has been busy. This happens
> during a resilver, a scrub, and a long-running process reading millions of files
> from the pool. When it freezes, all output and input
> freezes: tasks like zpool iostat -v 1 or top stop updating their output, whether on
> the console or an ssh terminal over Ethernet. Pressing keys does not garner a
> response.* Sometimes a freeze lasts minutes and then proceeds on its own.
> Sometimes it goes on for hours. An action that typically, but not always, jogs it
> is unplugging the USB keyboard -- the disk activity resumes immediately, and
> any queued keyboard input immediately plays out whether on the console or
> over ssh. Lastly, my ssh terminal (PuTTY) will stay connected for hours during a
> freeze-up, *i.e.* the TCP circuit is not closed or timed out, as opposed to
> closing pretty quickly after the server is powered off.
>
> In all cases, the system clock lags by the sum of the durations of the freezes.
>
> * During an initial resilver, I noticed that pressing a key such as Ctrl on the USB
> keyboard would jog it, but pressing Ctrl or other keys doesn't jog my process of
> long-running IO activity. But in all cases, even when unplugging and replugging
> the USB keyboard doesn't jog it, Ctrl-Alt-Del prompts an orderly shutdown.
>
> Debugging advice is very welcome!
>

[Devin Teske]

I had this exact same problem on a Dell 1U F1DH server. I didn't send any
e-mail to the mailing lists, because I feared I was going crazy.

Of course, it's been 30 days since I had that problem... if I try to remember
what it was... it was either the bad SATA port (which had loose soldering), or
it was the drive which said SATA port had fubar'd (putting that drive into
another system saw the same thing happen in said new system).

So what I did was rsync all the data off that drive to another one. And yes,
because I had to "jog" the system to get it to be responsive, in the same
exact situation you describe above, it took a very _very_ long time. But...
once I got off of that drive, everything looked much, much better.

I also found that other ways to jog it were Alt+FN, and even the occasional
ping would jog it too. It appeared to be interrupt-driven in some way.

Might I suggest that you have a drive acting up in your pool.
-- 
Devin

_____________
The information contained in this message is proprietary and/or confidential.
If you are not the intended recipient, please: (i) delete the message and all
copies; (ii) do not disclose, distribute or use the message in any manner; and
(iii) notify the sender immediately. In addition, please be aware that any
message addressed to our domain is subject to archiving and review by persons
other than the intended recipient. Thank you.