Date:      Mon, 18 Oct 2004 09:14:21 +0200
From:      "Daniel Eriksson" <daniel_k_eriksson@telia.com>
To:        "'Pawel Jakub Dawidek'" <pjd@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   RE: Current crash on today's kernel
Message-ID:  <!~!UENERkVCMDkAAQACAAAAAAAAAAAAAAAAABgAAAAAAAAA0VcX9IoJqUaXPS8MjT1PdsKAAAAQAAAA7hVFr5C480y5g+E9Xo26ugEAAAAA@telia.com>
In-Reply-To: <20041018055448.GE73767@darkness.comp.waw.pl>

Pawel Jakub Dawidek wrote:

>> The machine is using both ataraid and gvinum. (But cannot use geom_stripe
>> since it doesn't want to play nice with ataraid.)
>
> What are the problems exactly? I'm happy to help.

I'm not sure if it is limited to my particular setup or not. Unfortunately I
don't have enough hardware to test it on anything other than a production
machine, so that limits my willingness to try things.

I had a nice log of this with extra geom debugging enabled, but it seems I
have misplaced it. I don't think there was anything that stood out in the
log though.

When geom_stripe starts to taste providers it messes up the ataraid arrays,
making the discs in the arrays time out. This of course results in the
arrays being marked as broken. And because of bugs somewhere else, having a
disc/array disappear from under a live filesystem usually results in a
system panic.

Again, if this is a local problem it will be hard to debug given that the
machine needs to be up. However, verifying whether it is a local or a general
problem is easy: Just hook two or more discs up as an ataraid RAID0 array
and then try to create a geom_stripe array using some other discs. If that
generates a bunch of ATA timeouts that eventually tear down the ataraid
array then it's a general problem.

I also remember waiting for all the ataraid arrays to fail (I have 4 in the
machine, takes 15-30 sec for all of them to fail). Once they had all failed
I tried to access the newly created geom_stripe array, and it worked just
fine. I then ran 'atacontrol delete' to remove one of the failed arrays and
tried to rebuild it with 'atacontrol create'. As soon as the arX device was
created, geom wanted to taste it, which again generated ATA timeouts that
tore the array down.

Sorry about the lack of details.

/Daniel Eriksson



