From: Karl Denninger
To: freebsd-stable@freebsd.org
Date: Wed, 10 Aug 2005 08:31:59 -0500
Message-ID: <20050810133159.GA10150@FS.denninger.net>
In-Reply-To: <6.2.1.2.0.20050810081251.05298ff0@64.7.153.2>
Subject: Re: ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599

On Wed, Aug 10, 2005 at 08:15:50AM -0400, Mike Tancsa wrote:
> At 10:46 PM 09/08/2005, Matthias Buelow wrote:
> >Karl Denninger wrote:
> >
> >>SII chipsets were ok in 4.x, but the newer ATA code broke badly with them.
> >>I've had a PR open on this since February, and many others have reported
> >>similar issues. The problems still exist in the 6.x-BETA releases I've
> >>checked out, and are in some cases MORE severe (for me anyway) than they are
> >>in 5.4.
> >
> >Well, it doesn't affect just the SII chips.. I see the same on an
> >Intel ICH6 chipset but never after the kernel has mounted the root
> >fs. Sometimes it takes several attempts until it manages to do so,
> >though. The machine works w/o any such problems on other OSes. I've
>
> I have ICH6 boxes and they run just fine without issue. Have you checked
> to see if it actually has bad sectors or is a problem with your tray (if
> you use one)
>
> [verify1]% dmesg | grep -i ich
> uhci0: port 0xe000-0xe01f at device 29.0 on pci0
> usb0: on uhci0
> uhci1: port 0xe100-0xe11f at device 29.1 on pci0
> usb1: on uhci1
> uhci2: port 0xe200-0xe21f at device 29.2 on pci0
> usb2: on uhci2
> uhci3: port 0xe300-0xe31f at device 29.3 on pci0
> usb3: on uhci3
> fxp0: port 0xd000-0xd03f mem 0xd0000000-0xd0000fff irq 10 at device 8.0 on pci1
> atapci0: port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.2 on pci0
> [verify1]%

I have an ICH5 on my motherboard and it works fine - it is under heavy use
and has had no trouble.

atapci1: port 0xfea0-0xfeaf,0xfe30-0xfe33,0xfe20-0xfe27,0xfe10-0xfe13,0xfe00-0xfe07 irq 18 at device 31.2 on pci0

There are two potential issues here - if it is failing during the boot
process, that's a completely different set of code - and thus a different
potential problem - than failing while running. I've not had booting
problems with the ICH5, and no failures while running.

On the other hand, the SII chipset boards I have (two of them) can be
RELIABLY forced to fail within minutes. If there are no drives in the
mirror set on something else, the data on the disk(s) is toast - if there
are, the SII-attached disks detach and the non-SII-attached disks end up
carrying the data.

This is across two different manufacturers of drives (Hitachi and Maxtor)
and FOUR separate disks, all four of which smartmontools blesses as
operating properly and all of which ran just fine under 4.x. Oh, and all
of which work just fine on a 3ware 8502 card.

I've read the reports that basically boil down to "the SII chipset sucks,
don't use it" BUT (1) it works under 4.x, (2) it works under other
operating systems and (3) the FreeBSD folks who are saying it doesn't work
don't have the courage to make those statements in the official release
documents (e.g. the release notes, hardware compatibility guide or errata.)

So while the chipset may or may not be "less desirable", what is clear is
that the problems with it are not insurmountable - they've just not been
taken care of in the newer ATA code.

Arguments that this is about resources (e.g. the developers don't have a
card and need anything from a board to a complete system to have any
chance of fixing it) ring pretty hollow to me. This is an EXTREMELY
popular chipset, is on both the Adaptec and Bustek cards commonly sold
with machines and at retail, and cards with that chipset can be had for
as little as $30 (and up, of course.)

In addition, I've yet to find a SATA drive that WON'T fail with this board -
or a motherboard that is stable with it on FreeBSD 5.4 or 6.x - so it is
definitely NOT linked to the drive, and I have no confidence it's linked to
the motherboard chipset in any way. Further, smartmontools says the disks
that do fail aren't defective, and it worked just fine under 4.x.

Also, I've yet to see a developer commit on the list that they WILL fix it
if such a controller board is forthcoming (and will return the board when
they're done) - I've got two of these cards here (choose between Adaptec
and Bustek) and would be happy to UPS one to someone IF I had a firm
commitment that 6.x would NOT go out without this being addressed and that
the board would be returned to me when work was complete.

Finally, while the 3ware card works fine, it doesn't support hot plug. The
SII chipset claims to, and so does the ATA code, but the 3ware card runs on
a different driver - which doesn't (either claim to or actually accomplish
it.) So while using a 3ware card solves the "blows chunks and dies"
problem, you are back to the lack of functionality that was present in
4.x - no hot drive swap support. (This is mitigated somewhat by the 3ware
management tools, which do allow reconnection and work - but it's a manual
operation.) This entirely voids the argument for ATAng being a "step
forward" - support of hot plug and other functionality improvements - in
the first place, since you can't actually USE that capability if you are
forced to a 3ware board!

Again, I think the ATA-xx issues are of a magnitude sufficient to basically
kill FreeBSD going forward in the desktop application arena. If all
FreeBSD as an organization cares about is the large server marketplace, I
guess that works. But small office / home office file servers are going to
be SATA based and moderately-data systems with low entropy (e.g.
300-600GB) are FAR more cost effective to deploy on SATA than on SCSI, and
easily meet the performance and data stability requirements. If FreeBSD is
unstable on those systems without putting in specialized, vendor-supported
hardware, then FreeBSD may well be ceding those segments of the Unix
marketplace to something else (e.g. Linux.)

I believe that would be most unfortunate - I have been supporting FreeBSD
as a platform for the code I sell in these environments exclusively for
more than five years, and ran it as the only Unix OS we used at the ISP I
used to own.

My stance for the last five years (since selling my ISP) has been that if
you want me to support code that I sell, you have to be running it on
FreeBSD. FreeBSD has earned this position by being a superior solution in
all respects, but most particularly in the area that is most important -
operating stability.

If I start having customers run into stability problems with 5.x and beyond
on hardware that worked properly under 4.x, I will be forced to port the
code over to Linux, as I cannot force people to run 4.x as the base OS when
it's been EOL'd (other than for security fixes) and yet their hardware
simply doesn't work right with the current FreeBSD code.

As things stand, I am adding a STRONG warning in my product release notes
stating that if you have a SII chipset SATA controller and run any version
of FreeBSD from 5.4 onward, you are doing so at your own risk and against my
specific recommendations.

The warning will be removed from my products when the PR that I filed in
February is addressed OR FreeBSD places an equally strong warning in their
release notes, rendering the warning unnecessary.

--
Karl Denninger (karl@denninger.net) Internet Consultant & Kids Rights Activist
http://www.denninger.net       My home on the net - links to everything I do!
http://scubaforum.org          Your UNCENSORED place to talk about DIVING!
http://genesis3.blogspot.com   Musings Of A Sentient Mind