From owner-freebsd-stable@FreeBSD.ORG  Tue Jul 24 01:10:21 2007
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 45D2A16A419
	for <freebsd-stable@freebsd.org>; Tue, 24 Jul 2007 01:10:21 +0000 (UTC)
	(envelope-from unfurl@dub.net)
Received: from toxic.magnesium.net (toxic.magnesium.net [207.154.84.15])
	by mx1.freebsd.org (Postfix) with ESMTP id 36AEE13C428
	for <freebsd-stable@freebsd.org>; Tue, 24 Jul 2007 01:10:21 +0000 (UTC)
	(envelope-from unfurl@dub.net)
Received: from bodhi.local (dsl081-065-206.sfo1.dsl.speakeasy.net
	[64.81.65.206])
	by toxic.magnesium.net (Postfix) with ESMTP id C50D5DA828
	for <freebsd-stable@freebsd.org>; Mon, 23 Jul 2007 20:44:37 -0400 (EDT)
Message-ID: <46A54B6F.9010100@dub.net>
Date: Mon, 23 Jul 2007 17:44:31 -0700
From: Bill Swingle <unfurl@dub.net>
User-Agent: Thunderbird 2.0.0.5 (Macintosh/20070716)
MIME-Version: 1.0
To: FreeBSD Stable <freebsd-stable@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: problems with Hitachi 1TB SATA drives
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
X-List-Received-Date: Tue, 24 Jul 2007 01:10:21 -0000

Hello all,

I've run across a problem that I hope someone can aid me with.

I have a fileserver that currently has a 4-disk RAID connected to an IDE 3ware card. I had hoped to 
replace this dying system with a pair of synchronized 1TB SATA drives. When I tried to newfs them, 
both eventually failed with DMA READ or WRITE timeouts. Here is some information:

FreeBSD rum.dub.net 6.2-STABLE FreeBSD 6.2-STABLE #2: Sat Jul 21 09:05:25 PDT 2007 
unfurl@rum.dub.net:/usr/obj/usr/src/sys/GENERIC  i386

<snip from dmesg>
ad0: 43979MB <IBM DTLA-307045 TX6OA50C> at ata0-master UDMA100 <-- system disk
ad4: 953869MB <Hitachi HDS721010KLA330 GKAOA70F> at ata2-master SATA150
ad6: 953869MB <Hitachi HDS721010KLA330 GKAOA70F> at ata3-master SATA150
twed0: <Unit 0, RAID5, Normal> on twe0
twed0: 583440MB (1194885120 sectors)

A complete dmesg is at http://dub.net/rum.dub.net.dmesg

Initially, attempting to newfs the drives produced this:

Jul 21 00:21:45 rum kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=54194911
Jul 21 00:22:20 rum kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=107260543
Jul 21 00:22:57 rum kernel: ad4: FAILURE - device detached
Jul 21 00:22:57 rum kernel: subdisk4: detached
Jul 21 00:22:57 rum kernel: ad4: detached
Jul 21 00:24:19 rum kernel: ad6: FAILURE - device detached
Jul 21 00:24:19 rum kernel: subdisk6: detached
Jul 21 00:24:19 rum kernel: ad6: detached
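
For anyone who wants to tally these events per device, here is a minimal sketch. The regular expression is an assumption based only on the lines quoted above, and the `summarize` helper is a hypothetical name, not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Sample taken from the messages quoted above.
LOG = """\
Jul 21 00:21:45 rum kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=54194911
Jul 21 00:22:20 rum kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=107260543
Jul 21 00:22:57 rum kernel: ad4: FAILURE - device detached
Jul 21 00:24:19 rum kernel: ad6: FAILURE - device detached
"""

# Assumed line shape: "... kernel: adN: TIMEOUT|FAILURE - detail"
EVENT = re.compile(r"kernel: (?P<dev>ad\d+): (?P<kind>TIMEOUT|FAILURE) - (?P<detail>.*)$")

def summarize(text):
    """Count TIMEOUT and FAILURE events per device (hypothetical helper)."""
    counts = Counter()
    for line in text.splitlines():
        m = EVENT.search(line)
        if m:
            counts[(m.group("dev"), m.group("kind"))] += 1
    return counts

summary = summarize(LOG)
for (dev, kind), n in sorted(summary.items()):
    print(dev, kind, n)
```

Lines such as "subdisk4: detached" are skipped on purpose, since they carry no TIMEOUT or FAILURE marker.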

After several tries I was able to get both disks newfs'd and mounted, but they quickly failed again 
with DMA timeouts. On one occasion the machine also panicked:

ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=1456106111
ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=1456106111
ad4: FAILURE - WRITE_DMA48 timed out LBA=1456106111
ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=54194911
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=461407775
ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=461407775
ad4: FAILURE - WRITE_DMA48 timed out LBA=461407775


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x66
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc07253c3
stack pointer           = 0x28:0xd9724b9c
frame pointer           = 0x28:0xd9724ba4
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 779 (mdnsd)
trap number             = 12
panic: page fault
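
As a rough sanity check on the LBAs in the timeout messages above, the sketch below converts each one to an approximate offset into the disk and notes whether it lies beyond the range of 28-bit LBA addressing. The split between WRITE_DMA and WRITE_DMA48 in the log appears consistent with that boundary. The 512-byte sector size is an assumption (the dmesg does not state it), and `describe` is a hypothetical helper name.

```python
SECTOR_BYTES = 512      # assumption: 512-byte logical sectors
LBA28_LIMIT = 1 << 28   # 28-bit addressing covers 268,435,456 sectors

def describe(lba):
    """Return (offset in GiB, True if the LBA is beyond the 28-bit range)."""
    offset_gib = lba * SECTOR_BYTES / 2**30
    beyond_28bit = lba >= LBA28_LIMIT
    return offset_gib, beyond_28bit

# LBAs copied from the timeout messages above.
for lba in (54194911, 107260543, 461407775, 1456106111):
    gib, beyond = describe(lba)
    print(f"LBA {lba}: about {gib:.1f} GiB in, beyond 28-bit range: {beyond}")
```

The timeouts are spread across the disk rather than clustered in one region, which would fit a cabling or controller problem at least as well as a localized media defect, though this alone proves nothing.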


I've read that bad SATA cables can cause this. The cables I'm using are brand new but are probably 
pretty cheap.

Help freebsd-stable, you're my only hope! :)

-Bill

-- 
-=| Bill Swingle - unfurl@dub.net