Date: Mon, 12 Mar 2007 05:17:07 -0500
From: "Matthew D. Fuller"
To: freebsd-current@freebsd.org
Subject: ATA kablooie
Message-ID: <20070312101707.GJ22586@over-yonder.net>

I have a box that until Friday night was running a Nov '05 -CURRENT
solidly. After an upgrade, it started spewing out

kernel: ad4: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=38617823

style warnings at the slightest provocation.
A "find / -xdev -print | xargs cat >> /dev/null" could bring it about
in a second or two; not uncommonly, the arduous effort of spawning off
'sh' for single-user mode was enough to put it over the cliff.

The system runs an ataraid RAID-1 across ad4 and ad6; which one got the
first errors was pretty much luck of the draw on any given boot.
They're on a Promise TX2200 card:

atapci0: port 0xc000-0xc07f,0xc400-0xc4ff mem 0xeb420000-0xeb420fff,0xeb400000-0xeb41ffff irq 15 at device 13.0 on pci0

The card/drives were tried in 3 very different motherboards, all of
which failed identically. BIOSen were scoured for "make PCI edgy"
options, which were all turned off (though none exhibited an "enable
bus master" option, as one seemingly-related mail thread ended with).
I tried using the loader variable to force the drives to PIO mode to
jam the brakes on, but it didn't seem to work at all (maybe it doesn't
affect SATA?). I tried splitting the RAID so it only dealt with one
drive; that made no difference.

The -CURRENT build was from identical sources to those currently
sitting on this machine, so I can supply $Id$'s if it'll help. Sadly,
the system needed to be running, so it's not available for further
experimentation. It ran flawlessly with that Nov '05 -CURRENT, and is
now running flawlessly on RELENG_6.

-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.
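
For anyone trying to check whether the errors really do land on ad4 or ad6 at random, a rough way to tally them from a saved kernel log is sketched below. This is only an illustration: the regular expression is modeled on the single warning line quoted in the message, the function name `tally_icrc` is made up for this sketch, and the second sample line is invented filler, not real output from the machine.

```python
import re
from collections import Counter

# Pattern modeled on the one warning line quoted in the message; other
# ata messages may be worded differently and would need their own pattern.
ICRC_RE = re.compile(
    r"(?P<dev>ad\d+): WARNING - (?P<cmd>\w+) UDMA ICRC error .*?LBA=(?P<lba>\d+)"
)


def tally_icrc(lines):
    """Count ICRC warnings per device and collect the LBAs they mention."""
    counts = Counter()
    lbas = {}
    for line in lines:
        m = ICRC_RE.search(line)
        if m is None:
            continue  # not an ICRC warning, skip it
        dev = m.group("dev")
        counts[dev] += 1
        lbas.setdefault(dev, []).append(int(m.group("lba")))
    return counts, lbas


if __name__ == "__main__":
    sample = [
        # Line quoted in the message:
        "kernel: ad4: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=38617823",
        # Hypothetical unrelated line, included only to show it is ignored:
        "kernel: some unrelated message",
    ]
    counts, lbas = tally_icrc(sample)
    for dev in sorted(counts):
        print(dev, counts[dev], lbas[dev])
```

Feeding it the lines of a saved log (for example `tally_icrc(open(path))` with whatever file the warnings were captured to) gives a per-drive count, which would make the "luck of the draw" pattern visible across several boots.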