Date:      Wed, 10 Aug 2005 15:51:01 -0500
From:      Karl Denninger <karl@denninger.net>
To:        freebsd-stable@freebsd.org
Cc:        sos@freebsd.org
Subject:   Re: ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599
Message-ID:  <20050810205101.GA17483@FS.denninger.net>
In-Reply-To: <4A1BF8DF-EC50-4067-A69B-84D9BE5B22C7@FreeBSD.ORG>
References:  <42F9009E.3030601@mac.com> <42F9609E.1010207@goldsword.com> <20050810023111.GA2913@FS.denninger.net> <20050810024618.GA8198@drjekyll.mkbuelow.net> <6.2.1.2.0.20050810081251.05298ff0@64.7.153.2> <20050810133159.GA10150@FS.denninger.net> <6.2.1.2.0.20050810094204.06c46098@64.7.153.2> <20050810144148.GB10150@FS.denninger.net> <790a9fff0508100844a7e5435@mail.gmail.com> <4A1BF8DF-EC50-4067-A69B-84D9BE5B22C7@FreeBSD.ORG>

On Wed, Aug 10, 2005 at 07:24:01PM +0200, Søren Schmidt wrote:
> 
> On 10/08/2005, at 17:44, Scot Hetzel wrote:
> 
> >On 8/10/05, Karl Denninger <karl@denninger.net> wrote:
> >
> >>On Wed, Aug 10, 2005 at 09:51:03AM -0400, Mike Tancsa wrote:
> >>
> >>>At 09:31 AM 10/08/2005, Karl Denninger wrote:
> >>>
> >>>
> >>>>Also, I've yet to see a developer commit on the list that they WILL fix it
> >>>>if such a controller board is forthcoming (and will return the board when
> >>>>they're done) - I've got two of these cards here (choose between Adaptec
> >>>>and Bustek) and would be happy to UPS one to someone IF I had a firm
> >>>>commitment that 6.x would NOT go out without this being addressed and that
> >>>>the board would be returned to me when work was complete.
> >>>>
> >>>
> >>>You demand to see support for this chipset fixed, yet, you can't pony up a
> >>>measly hundred bucks to donate the card to the developer who is not being
> >>>paid to develop anything.
> >>>
> >>>        ---Mike
> >>>
> >>
> >>I have "demanded" nothing Mike.
> >>
> >>
> >I agree Mike's wording was a little strong, but we have seen you
> >request strongly that someone fix this problem.
> >
> >Have you contacted Søren, to see if he has the troublesome hardware?
> >
> >Also contact Søren directly with your offer to supply a troublesome
> >board, and/or access to a system that displays the problem.  More than
> >likely he would agree to return the board once he has a proper fix for
> >the problem.
> 
> Since I came in late in this, I need to know what kind of controller  
> we are talking about, and if the problem is still present in 6.0.
> I plan to backport ATA from 6.0 to 5-stable when it has settled, so  
> 6.0 is the one and only (pre)release to test with and get back to me  
> with the result.
> 
> - Søren

6.0 BETA1 AND 5.4 BOTH fail with the SiI 3112 chipsets.  Reliably.

I have two controllers here that are from different manufacturers and both
exhibit the same problem.  The SAME disks (two different manufacturers -
Hitachi and Maxtor) work perfectly on a motherboard ICH5 adapter, and
smartmontools says all 4 (I have two examples of each) are healthy.  They
ALSO work perfectly on a 3ware 8502, and are declared healthy by its
internal diags and operating kernel (smartmontools won't talk to them on
the 8502.)

This is the subject of the PR I filed back in February.

Again, if you want either a controller shipped to you OR access to a
development machine (e.g. ssh in and play) which has the suspect
configuration on it, I'm willing to provide either.  The latter is probably
the best option, since making it fail is simple.  My only caveat is that if
I send hardware I want it back when you're done, and I believe it's
reasonable to expect that 6.0 will get HELD in its release cycle until this
is resolved.

The latter offer (ssh access) has been on the table for several months.  The
former I just put on the table as I threw up my hands and bought a 3ware
card - which means I now have TWO of the suspect cards and need only one 
for my own testing (in the sandbox).

I'm willing to go WELL out of my way to make it possible for this to get
fixed, since there appears to be an issue with access to hardware that
breaks reliably.  However, I, and others, would like to know that we're
going to see the problem get resolved.

Again - this is hardware that is STABLE and works under 4.x - in the case of
my specific configuration I ran under 4.x for over a year without a single
incident.  With 5.4 and 6.0-BETA I can kill it inside of 2 minutes with 
nothing more complicated than a "make -j4 buildworld".

Let me know if you'd like to take me up on either of my offers.  Note that
with 6.0-BETA (what's currently on the sandbox machine) when it blows up it
does so in such a way that a reboot FAILS (it hangs during the shutdown
sequence!) so you need to hit the red button to get a clean restart (and
wait for the FSCK).

I have a PATA drive in the sandbox machine on the motherboard adapter that
is part of a mirror with the "bad" controller, so there is no risk of data
corruption - when it fails the "bad" disk disconnects from the array but
the boot drive remains "safe".

-- 
Karl Denninger (karl@denninger.net) Internet Consultant & Kids Rights Activist
http://www.denninger.net	My home on the net - links to everything I do!
http://scubaforum.org		Your UNCENSORED place to talk about DIVING!
http://genesis3.blogspot.com	Musings Of A Sentient Mind




