Date:      Fri, 6 Mar 2015 19:13:22 -0800
From:      Nick Sivo <nick@ycombinator.com>
To:        freebsd-fs <freebsd-fs@freebsd.org>
Subject:   ZFS Deadlock?
Message-ID:  <CAM72HBb0C-DQszMZFH706rP4hJK_V9dJdVzmtSJqqafue2ud9Q@mail.gmail.com>

Hi,

One of our servers occasionally exhibits strange behavior under heavy
IO load. I think, based on the output from procstat -kk -a, it may be
a ZFS or VFS deadlock. Certain operations, including anything
involving ZFS commands such as zfs and zpool, will hang. Running ls at
the root of a ZFS filesystem will also hang. Trying to access
snapshots in the .zfs/ folder will hang. None of these hung processes
can be killed. Eventually the machine will panic, if we don't reboot
it first, but that can take days after we start seeing this issue.
Strangely, our primary application (Hacker News) will keep running
without interruption until the panic.
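
For anyone skimming the procstat -kk -a output for threads that are sitting in ZFS code, here is a minimal sketch of a filter. It assumes each thread line starts with a numeric PID and TID and that stack frames look like name+0xoffset. The substrings in NEEDLES are my own guess at interesting function-name fragments, not an authoritative list, and the function name zfs_threads is just illustrative.

```python
import re
import sys

# A kernel stack frame as printed by procstat -kk, e.g. "zfs_lookup+0x45".
FRAME = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-fA-F]+")

# Assumption: function-name fragments that suggest a thread is in ZFS code.
NEEDLES = ("zfs", "zio", "txg", "spa_", "dmu_")


def zfs_threads(text, needles=NEEDLES):
    """Return (pid, tid, frames) for threads whose stack mentions a needle."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that does not start with PID and TID.
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        frames = FRAME.findall(line)  # function names only, offsets dropped
        if any(n in f.lower() for f in frames for n in needles):
            hits.append((int(fields[0]), int(fields[1]), frames))
    return hits


if __name__ == "__main__":
    # Reads procstat output from stdin and prints one matching thread per line.
    for pid, tid, frames in zfs_threads(sys.stdin.read()):
        print(pid, tid, " ".join(frames))
```

Saved under a hypothetical name such as zfs_threads.py, it could be run as: procstat -kk -a | python3 zfs_threads.py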

Details of three occurrences can be found at
https://gist.github.com/kogir/acbd6d0e28ade0ee3aac

The occurrences this month were on:
9.3-RELEASE-p10 FreeBSD 9.3-RELEASE-p10 #0: Tue Feb 24 21:28:03 UTC 2015

Those from October of last year were running an earlier 9.3 (exact
version unknown). The same hardware running 9.2 was solid for months
at a time. We never saw this issue on 9.2.

top output from the dying box right now:

last pid: 48083;  load averages:  0.24,  0.31,  0.27
120 processes: 1 running, 119 sleeping
CPU:  5.6% user,  0.0% nice,  1.7% system,  0.2% interrupt, 92.5% idle
Mem: 5722M Active, 249M Inact, 67G Wired, 352K Cache, 51G Free
ARC: 32G Total, 14G MFU, 8824M MRU, 52M Anon, 1800M Header, 7962M Other
Swap:

I'd show you the zpool configuration, but that would hang. We're not
using L2ARC or deduplication.

In any case, it's happening more frequently (twice this week), so I'd
like to get to the bottom of it if I can. Does this look like it could
be a filesystem issue? This will undoubtedly happen again. Is there
more information I should try to collect?

Thanks for your time and any ideas or help you can throw my way :)

Best,
Nick
