Date:      Fri, 20 Jul 2012 16:09:28 +0100
From:      Dr Josef Karthauser <joe@tao.org.uk>
To:        James Snow <snow@teardrop.org>
Cc:        "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject:   Re: Checksum errors across ZFS array
Message-ID:  <BC2AD7AE-4D82-4989-9D51-F1F2329C00EB@tao.org.uk>
In-Reply-To: <20120719171548.GM32960@teardrop.org>
References:  <20120719152909.GL32960@teardrop.org> <002D6A20-D2A4-4909-B2EA-3DB562326050@tao.org.uk> <20120719171548.GM32960@teardrop.org>

On 19 Jul 2012, at 18:15, James Snow wrote:

> On Thu, Jul 19, 2012 at 06:05:32PM +0100, Dr Joe Karthauser wrote:
>
>> Hi James,
>>
>> It's almost definitely a memory problem. I'd change it ASAP if I were
>> you.
>>
>> I lost about 70mb from my zfs pool for this very reason just a few
>> weeks ago. Luckily I had enough snapshots from before the rot set in
>> to recover most of what I lost.
>
> Thanks for the input. I will run a memory test against it.
>
> If I may, why "almost definitely" a memory problem and not an issue with
> the controller? (Or did you mean the controller memory?)

Hey Snow,

OK, it's not definite. Of course, it could be anything. But memory is
where I'd look first.

Take care though: my system had been working fine for about a year when
I noticed the ZFS rot (all of which appears to be recent). I ran
memcheck+ on it for 8 hours or so, and it showed no errors at all.
However, when I replaced the memory with modules from a different
vendor, the problems went away. (Reboots and power off/on restarts
hadn't fixed the problem before!)

So take care: even if the memory test doesn't report any failures, the
memory might still be faulty.

Joe

p.s. It was my fault that I wasn't running ECC memory on the system! :/.