Date:      Mon, 8 Jan 2007 10:08:33 +1100
From:      "Jan Mikkelsen" <janm@transactionware.com>
To:        "'Ian West'" <ian@niw.com.au>, <freebsd-stable@freebsd.org>
Subject:   RE: kernel panic on 6.2-RC2 with GENERIC.
Message-ID:  <001a01c732b0$c22f4860$0204a8c0@transactionware.com>
In-Reply-To: <20070107213350.GA61293@aleph.niw.com.au>

(Scott:  I should have emailed you this earlier, but Christmas and various
other things got in the way.)

Ian West wrote:
> On Sun, Jan 07, 2007 at 02:25:02PM -0500, Mike Tancsa wrote:
> > At 11:43 AM 1/7/2007, Craig Rodrigues wrote:
> > >On Fri, Jan 05, 2007 at 06:59:10PM +0200, Nikolay Pavlov wrote:
>>> [ Areca kernel panic, IO failures ... ]
> I have seen this identical fault with the new areca driver, my machine
> is opteron hardware, but running a regular i386/SMP kernel/world. With
> everything at 6.2RC2 (as of 29th of December) except the areca driver
> the machine is rock solid, with the 29th of december version of the
> areca driver the box will crash on extract of a large tar file, removal
> of a large directory structure, or pretty much anything that does a lot
> of disk io to different files/locations. There is no error log prior to
> seeing the following messages..
>
> Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433078272, length=8192)]error = 5
> Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433111040, length=16384)]error = 5
> Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433209344, length=16384)]error = 5
> Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433242112, length=32768)]error = 5
> Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437612544, length=4096)]error = 5
> Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437616640, length=12288)]error = 5
> Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437633024, length=6144)]error = 5
> Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437639168, length=2048)]error = 5
> Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437641216, length=6144)]error = 5
>
> There are a string of these, followed by a crash and reboot. The file system
> state can be left very dirty to the point where background fsck seems unable
> to recover it.
>
> The areca card in question is running the latest firmware/boot and
> has shown no problems either before, or since backing out the areca
> driver.
>
> The volume I ran the tests on was a 250G volume on a raid6 raid set.
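
For anyone collecting these messages from several machines, here is a rough
sketch of how the g_vfs_done() lines quoted above could be parsed and
summarised.  The regular expression is based only on the lines in this
message, and the function names (parse_g_vfs_done, summarize) are made up for
illustration.  Error 5 is EIO (input/output error) in the standard errno
numbering, which is what the lookup below reports.

```python
import errno
import re

# Pattern derived from the log lines quoted in this message (an assumption:
# other kernels or versions may format the line differently).
LINE_RE = re.compile(
    r"g_vfs_done\(\):(?P<dev>[^\[\s]+)\[(?P<op>[A-Z]+)\("
    r"offset=(?P<offset>\d+),\s*length=(?P<length>\d+)\)\]\s*"
    r"error\s*=\s*(?P<err>\d+)"
)


def parse_g_vfs_done(lines):
    """Return one dict per matching log line; non-matching lines are skipped."""
    records = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue
        records.append({
            "dev": m.group("dev"),
            "op": m.group("op"),
            "offset": int(m.group("offset")),
            "length": int(m.group("length")),
            "error": int(m.group("err")),
        })
    return records


def summarize(records):
    """Summarise the failed requests: count, byte range touched, error names."""
    if not records:
        return None
    return {
        "count": len(records),
        "first_offset": min(r["offset"] for r in records),
        "end_offset": max(r["offset"] + r["length"] for r in records),
        "bytes": sum(r["length"] for r in records),
        # Map errno numbers to symbolic names where known (5 -> "EIO").
        "errors": sorted(
            {errno.errorcode.get(r["error"], str(r["error"])) for r in records}
        ),
    }
```

Feeding it the contents of /var/log/messages (after undoing any mail encoding)
gives the affected device, the range of offsets and the error names in one
place, which may make reports from different machines easier to compare.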

I have seen various problems with various Areca driver versions, all on
6.2-RC1/amd64 with an Areca RAID-6 volume.

Areca 1.20.00.02 seems to work fine.

Areca 1.20.00.12 (from the Areca website) seems to have data corruption
problems.  My tests involve doing a "diff -r" on a filesystem with 2GB of
data.  It will occasionally find differences in files.  On examination, the
last 640 bytes of the first block of the affected file contain data from
another file "nearby" in the filesystem.  Unmounting and remounting the
filesystems and rerunning the test shows either no problem or a difference in
another file entirely.  I think this is the cause of the g_vfs_done failures
with this version of the driver; the offsets are wrong because the data is
corrupted.
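
For anyone who wants to check for the same pattern, the following is a minimal
sketch of a comparison that reports where in each file the differences fall,
which "diff -r" on its own does not show.  It is not the procedure used for
the results above.  It assumes a known-good reference copy of the tree exists
on other storage, and the block size has to be supplied by the caller since I
have not stated it here.  All function names are hypothetical.

```python
import os


def mismatch_range(path_a, path_b, chunk_size=1 << 16):
    """Return (first, last) differing byte offsets, or None if identical."""
    first = last = None
    pos = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break
            if a != b:
                for i in range(max(len(a), len(b))):
                    # Bytes past the end of the shorter file count as differences.
                    if i >= len(a) or i >= len(b) or a[i] != b[i]:
                        if first is None:
                            first = pos + i
                        last = pos + i
            pos += max(len(a), len(b))
    return None if first is None else (first, last)


def compare_trees(ref_root, test_root):
    """Map relative path -> (first, last) mismatch range, or "missing"."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(ref_root):
        for name in filenames:
            ref_path = os.path.join(dirpath, name)
            rel = os.path.relpath(ref_path, ref_root)
            test_path = os.path.join(test_root, rel)
            if not os.path.isfile(test_path):
                results[rel] = "missing"
                continue
            rng = mismatch_range(ref_path, test_path)
            if rng is not None:
                results[rel] = rng
    return results


def in_tail_of_first_block(rng, block_size, tail=640):
    """True if every differing byte lies in the last `tail` bytes of block 0.

    block_size is an assumption supplied by the caller (the filesystem block
    size); tail defaults to the 640 bytes described above.
    """
    first, last = rng
    return block_size - tail <= first and last < block_size
```

Because the corruption seems to come and go across unmount and remount, it
would make sense to run the comparison more than once and keep each set of
results rather than only the latest.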

Areca 1.20.00.13 (as currently in the tree) does not seem to have data
corruption problems, but I can trigger g_vfs_done failures under heavy I/O.

I have raised this with Areca support, and I'm waiting to hear back from
Erich Chen.

Regards,

Jan Mikkelsen



