Date: Thu, 24 Mar 2011 13:36:32 -0700
From: Freddie Cash
To: FreeBSD-Current
Cc: FreeBSD Filesystems, FreeBSD Stable
Subject: Any success stories for HAST + ZFS?

[Not sure which list is most appropriate since it's using HAST + ZFS on -RELEASE, -STABLE, and -CURRENT. Feel free to trim the CC: on replies.]

I'm having a hell of a time making this work on real hardware, and am not ruling out hardware issues as yet, but wanted to get some reassurance that someone out there is using this combination (FreeBSD + HAST + ZFS) successfully, without kernel panics, without core dumps, without deadlocks, without issues, etc. I need to know I'm not chasing a dead rabbit.

In tests using VirtualBox and FreeBSD 8-STABLE from when HAST was first MFC'd, everything worked wonderfully. HAST-based pool would come up, data would sync to the slave node, fail-over worked nicely, bringing the other box back online as the slave worked, data synced back, etc. It was a thing of beauty.

Now, on real hardware, I cannot get the system to stay online for more than an hour. :( hastd causes kernel panics with "bufwrite: buffer not busy" errors. ZFS pools get corrupted. System deadlocks (no log messages, no onscreen errors, not even NumLock key works) at random points.

The hardware is fairly standard fare:
  - SuperMicro H8DGi-F motherboard
  - AMD Opteron 6100-series CPU (8-cores @ 2.0 GHz)
  - 8 GB DDR3 SDRAM
  - 64 GB Kingston V-Series SSD for the OS install (using ahci(4) and the motherboard SATA controller)
  - 3x SuperMicro AOC-USAS2-8Li SATA controllers with IT firmware
  - 6x 1.5 TB Seagate 7200.11 drives (1x raidz2 vdev)
  - 12x 1.0 TB Seagate 7200.12 drives (2x raidz2 vdev)
  - 6x 0.5 TB WD RE3 drives (1x raidz2 vdev)

The motherboard BIOS is up-to-date. I do not see any way to update the firmware on the SATA controllers.
Using the onboard IPMI-based sensors, CPU, motherboard, RAM temps and voltages are in the nominal range.

I've tried with FreeBSD 8.2-RELEASE, 8-STABLE, 8-STABLE w/ZFSv28 patches, and 9-CURRENT (after the ZFSv28 commit). Things work well until I start hastd. Then either the system locks up, or hastd causes a kernel panic, or hastd dumps core.

Each hard drive is glabel'd as "disk-a1" through "disk-d6". hast.conf has 24 resources listed, one for each glabel'd device. The pool is created using the /dev/hast/* devices with disk-a1 through disk-a6 being one raidz2 vdev, and so on through disk-b*, disk-c*, and disk-d*, for a total of 4 raidz2 vdevs of 6 drives each.

A fairly standard setup, I would think. Even using a GENERIC kernel, I can't keep things stable and running.

So, please, someone, somewhere, share a success story, where you're using FreeBSD, ZFS, and HAST. Let me know that it does work. I'm starting to lose faith in my abilities here. :( Or point out where I'm doing things wrong so I can correct the issues.

Thanks.

-- 
Freddie Cash
fjwcash@gmail.com
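
P.S. For anyone trying to picture the layout described above (24 glabel'd disks, one HAST resource per disk, four 6-disk raidz2 vdevs built on /dev/hast/*), here is a rough Python sketch that generates an equivalent hast.conf and the matching zpool command. It is only an illustration, not the actual config from these boxes: the node names (hast-a, hast-b), the peer addresses, and the pool name (tank) are all made up, and the /dev/label/ path assumes the labels came from "glabel label".

```python
# Sketch only. Node names, peer addresses, and the pool name are hypothetical;
# only the disk labels and the 4 x 6-disk raidz2 layout come from the post.
ROWS = "abcd"        # one raidz2 vdev per letter: disk-a*, disk-b*, ...
DISKS_PER_ROW = 6    # disk-a1 .. disk-a6, and so on

# Hypothetical node name -> address of its peer (what "remote" points at).
NODES = {"hast-a": "10.0.0.2", "hast-b": "10.0.0.1"}


def labels():
    """Return the 24 glabel names, disk-a1 through disk-d6."""
    return [f"disk-{row}{n}" for row in ROWS for n in range(1, DISKS_PER_ROW + 1)]


def hast_resource(label):
    """Build one hast.conf resource stanza for a single labelled disk."""
    lines = [f"resource {label} {{"]
    for node, peer in NODES.items():
        lines += [
            f"\ton {node} {{",
            f"\t\tlocal /dev/label/{label}",  # assumes glabel(8) labels
            f"\t\tremote {peer}",
            "\t}",
        ]
    lines.append("}")
    return "\n".join(lines)


def hast_conf():
    """Build the whole hast.conf: one resource per disk."""
    return "\n\n".join(hast_resource(label) for label in labels()) + "\n"


def zpool_create(pool="tank"):
    """Build the zpool command: one raidz2 vdev per row of six HAST devices."""
    args = ["zpool", "create", pool]
    for row in ROWS:
        args.append("raidz2")
        args += [f"/dev/hast/disk-{row}{n}" for n in range(1, DISKS_PER_ROW + 1)]
    return " ".join(args)


if __name__ == "__main__":
    print(hast_conf())
    print(zpool_create())
```

The generated zpool command would only be run on whichever node currently holds the primary role for all 24 resources, since the /dev/hast/* devices exist only there.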