Date:      Thu, 08 Dec 2011 15:06:31 -0500
From:      Dan Pritts <danno@internet2.edu>
To:        freebsd-fs@freebsd.org
Subject:   ZFS hangs with 8.2-release
Message-ID:  <4EE118C7.8030803@internet2.edu>

Hi all,

I've got a data archive on several ZFS filesystems on a single 
8.2-release system.

When scrubbing, or sometimes even when copying data between ZFS 
filesystems, the system frequently hangs.

System info:

Sun x4200, 16GB RAM.

FreeBSD 8.2-RELEASE #0: Thu Feb 17 02:41:51 UTC 2011
root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
CPU: Dual Core AMD Opteron(tm) Processor 285 SE (2592.62-MHz K8-class CPU)
real memory  = 17179869184 (16384 MB)
avail memory = 16439779328 (15678 MB)

Internal LSI hardware RAID (mpt driver) for boot.
3x LSI parallel-SCSI cards for primary storage.  48 SATA disks 
attached, using Infortrend RAIDs as JBODs.

Five 9-disk RAIDZ2 zpools, each independent of the others.  Three of 
them now have 2TB disks; the other two still have 500GB disks.

The pools were originally created under Solaris/amd64.  The system ran 
on that for several years with no apparent issues, including doing 
weekly scrubs. I switched the system to FreeBSD when my contract came up 
for renewal at the new Oracle rates.

I've posted some system information (zpool status output, loader.conf, 
some screenshots from system hangs) at:
http://people.internet2.edu/~danno/zfs/

I've adjusted some values in loader.conf based on recommendations I 
found in the ZFS tuning wiki (see the site above).

With the defaults, a single zpool scrub of one of the pools would 
reliably crash the system within a couple of minutes.  With the tuning 
it's less crashy: it will stay up for many hours, but it still hangs 
every 6-24 hours when the filesystems are being actively used (a copy 
from one to the other, or a single scrub).

So:
1) suggestions for fixing this with the current system?
2) is it expected that FreeBSD 9 will be improved in the Solaris 
compatibility layer (which I assume is what's crashing)?

I have been unable to obtain crash dumps, apparently due to a bug in the 
mpt driver, or my hardware, or something.  I've filed a bug in the 
tracker about that.

thanks
danno
-- 
Dan Pritts, Sr. Systems Engineer
Internet2
office: +1-734-352-4953 | mobile: +1-734-834-7224



