Date:      Fri, 11 Mar 2011 09:19:32 +1000
From:      Stephen McKay <mckay@freebsd.org>
To:        Chris Forgeron <cforgeron@acsi.ca>
Cc:        freebsd-fs@freebsd.org, Stephen McKay <mckay@freebsd.org>
Subject:   Re: Constant minor ZFS corruption 
Message-ID:  <201103102319.p2ANJWxN002125@dungeon.home>
In-Reply-To: <BEBC15BA440AB24484C067A3A9D38D7E014DA6658521@server7.acsi.ca> from Chris Forgeron at "Thu, 10 Mar 2011 16:43:43 -0400"
References:  <201103081425.p28EPQtM002115@dungeon.home> <BEBC15BA440AB24484C067A3A9D38D7E014DA66584F0@server7.acsi.ca> <201103091241.p29CfUM1003302@dungeon.home> <BEBC15BA440AB24484C067A3A9D38D7E014DA6658521@server7.acsi.ca>

On Thursday, 10th March 2011, Chris Forgeron wrote:

>You know,  I've had better luck with v28 and FreeBSD-9-CURRENT.  Make
>a very minimal compile, test it well, and you should be fine. I just
>upgraded my last 8.2 v14 ZFS FreeBSD system earlier this week, so I'm
>now 9-Current with v28 across the board. The only issue I've found so
>far is a small oddity with displaying files across ZFS, but pjd has
>already patched that in r219404. (I'm about to test it now)

We are OK using -current if we really have to, but would prefer to
stick with an official release (maybe with one or two hand-rolled
patches if they are important enough).

We've already noticed the -current "upgrade treadmill", having to build
a new kernel every day of our testing because important bug fixes are
arriving.  And in the end, we saw no difference in behaviour, so -current
doesn't fix our problems.

It's important to test -current, but not in production. :-)

>Oh - and you're AMD64, correct, not i386? I think we (royal we) should
>remove support for i386 in ZFS, it has never been stable for me, and
>I see a lot of grief about it on the boards.  I also think you need 8
>GB of RAM to play seriously. I've had reasonable success with 4GB and
>a light load, but any serious file traffic needs 8GB of breathing room
>as ZFS gobbles up the RAM in a very aggressive manner.

Yes, we are running the amd64 kernel.  Currently we're low on memory
(2GB) because I swapped out the RAM, but that, again, didn't affect
our failures.

>Lastly, check what Mike Tancsa said about his hardware - All of my
>gear is quality,  1000W dual redundant power supplies, LSI SAS
>controllers, ECC registered ram, no overclocking, etc, etc.  You may
>have a software issue, but it's more likely that ZFS is just exposing
>some instability in your system. Has your RAM checked out with a Memtest
>run overnight? We're talking small, intermittent errors here, not big
>red flags that will be obvious to spot.

The ASUS PIKE2008 card is LSI based.  Our RAM is ECC.  We're not
overclocking (in fact I disabled turbo-boost).  We haven't run memtest
but we have done a few "make buildworld" runs.  All of these completed
without error.  And with ECC RAM, we should see log messages if anything
is wrong there anyway.

We have tried to buy quality hardware.  At least, we didn't deliberately
skimp (except to build our own box vs buy a big name brand pre-built ZFS
server).

We're starting to get suspicious of the PIKE card though.  Is there
anyone here who is using an ASUS PIKE2008 (as opposed to other
LSI SAS 2008 cards)?  We're kinda wishing we'd gotten an older
PIKE 1068E instead...

Cheers,

Stephen.


