From owner-freebsd-current@FreeBSD.ORG  Wed May 11 16:32:45 2005
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 171F016A4CE; Wed, 11 May 2005 16:32:45 +0000 (GMT)
Received: from pne-smtpout2-sn2.hy.skanova.net
	(pne-smtpout2-sn2.hy.skanova.net [81.228.8.164])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id AFA2543D79; Wed, 11 May 2005 16:32:44 +0000 (GMT)
	(envelope-from daniel_k_eriksson@telia.com)
Received: from sentinel (195.198.193.104) by pne-smtpout2-sn2.hy.skanova.net
	(7.1.026.7)
	id 42662CF1005D1DF8; Wed, 11 May 2005 18:32:44 +0200
From: "Daniel Eriksson" <daniel_k_eriksson@telia.com>
To: <freebsd-current@FreeBSD.org>
Date: Wed, 11 May 2005 18:32:41 +0200
Organization: Home
Message-ID: <!~!UENERkVCMDkAAQACAAAAAAAAAAAAAAAAABgAAAAAAAAA0VcX9IoJqUaXPS8MjT1PdsKAAAAQAAAAK2qOm3PLmk2v7WAC+RJVOQEAAAAA@telia.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook, Build 11.0.6353
In-Reply-To: <20050511132427.GA64084@ip.net.ua>
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2527
Thread-Index: AcVWLODM4tlpdOBRT2a+ivYa8R5DYwAF6EUg
cc: 'Ruslan Ermilov' <ru@FreeBSD.org>
cc: =?us-ascii?Q?'Soren_Schmidt'?= <sos@DeepCore.dk>
Subject: RE: Accessing IDE disk with bad sectors freezes the box
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
X-List-Received-Date: Wed, 11 May 2005 16:32:45 -0000

Ruslan Ermilov wrote:

> I have a disk with lot of bad sectors.  When working with it on
> an AMD64 box running 6-CURRENT, accessing bad areas just freezes
> the box completely, without any diagnostics.  The same disk when
> plugged into another i386 box running 4-STABLE works properly by
> issuing errors from the kernel, and reporting EIO to userland.

I was just about to report the same problem. Three days ago one of my SATA
disks suddenly developed a few bad sectors. Smartd reported this to me, so I
set out to recover the data. This was on an AMD Athlon XP (i386) machine
running the latest CURRENT, with the disk hooked up to a Promise SATA150 TX4.
The machine locked up solid soon after I started reading from the disk. Even
a 'dd' from the raw disk caused a solid lock (I wanted to see whether the
problem was vfs related).
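
For anyone who wants to repeat the raw read test on a system that reports
EIO instead of locking up, here is a minimal sketch. It is not what was run
above (that was plain 'dd'). The function names, the 64 KB block size and the
idea of passing the device size by hand are all assumptions made for
illustration. It reads the device block by block and records every offset
where the kernel returns an I/O error.

```python
import errno
import os


def scan_for_bad_blocks(read_at, total_size, block_size=65536):
    """Read [0, total_size) in blocks and return the offsets that fail with EIO.

    read_at(offset, length) is a hypothetical callable that performs one read;
    it is injected so the loop can be exercised without a failing disk.
    """
    bad_offsets = []
    offset = 0
    while offset < total_size:
        length = min(block_size, total_size - offset)
        try:
            read_at(offset, length)
        except OSError as exc:
            if exc.errno != errno.EIO:
                raise  # only I/O errors are treated as bad blocks
            bad_offsets.append(offset)
        offset += length  # skip past the block, good or bad, and carry on
    return bad_offsets


def scan_device(path, total_size, block_size=65536):
    """Open a raw device (or file) read-only and scan it with os.pread."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return scan_for_bad_blocks(
            lambda off, n: os.pread(fd, n, off), total_size, block_size
        )
    finally:
        os.close(fd)
```

Something like scan_device("/dev/ad4", size_in_bytes) would then list the
failing offsets, provided the kernel survives the reads. On a box that
freezes, the script will of course hang together with everything else.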

Yesterday I hooked the disk up to a spare machine (also i386) running
5.4-RC4. The motherboard has a built-in SiL 3112-based controller, which I
used. I was quite surprised when, instead of crashing, it just printed some
errors on the console and then continued to read the data. This is what it
looked like:

ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=272855487
ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=429057631
ad4: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=1<ILLEGAL_LENGTH> LBA=416906399
ad4: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=1<ILLEGAL_LENGTH> LBA=416906399
ad4: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=478486239
spec_getpages:(ad4s1d) I/O read failure: (error=5) bp 0xc65ffb14 vp 0xc19ef738
               size: 65536, resid: 65536, a_count: 65536, valid: 0x0
               nread: 0, reqpage: 0, pindex: 96, pcount: 16
vm_fault: pager read error, pid 2990 (cp)
ad4: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=478507071
spec_getpages:(ad4s1d) I/O read failure: (error=5) bp 0xc65ffb14 vp 0xc19ef738
               size: 65536, resid: 65536, a_count: 65536, valid: 0x0
               nread: 0, reqpage: 0, pindex: 96, pcount: 16
vm_fault: pager read error, pid 2990 (cp)
ad4: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=478510815
spec_getpages:(ad4s1d) I/O read failure: (error=5) bp 0xc65ffb14 vp 0xc19ef738
               size: 65536, resid: 65536, a_count: 65536, valid: 0x0
               nread: 0, reqpage: 0, pindex: 192, pcount: 16
vm_fault: pager read error, pid 2990 (cp)
ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=73765143
ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=76399623
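
To pull the failing LBAs out of console output like the above, a small
parser along these lines may help. It is only a sketch: the regular
expression is fitted to the FAILURE lines shown here, and the bit tables
contain only the flag names that appear in this log. It also assumes that
the status and error values are printed in hex, which is an inference from
0x51 = 0x40 (READY) + 0x10 (DSC) + 0x01 (ERROR).

```python
import re

# Only the flags visible in the log above; these tables are not complete.
STATUS_BITS = {0x40: "READY", 0x10: "DSC", 0x01: "ERROR"}
ERROR_BITS = {0x40: "UNCORRECTABLE", 0x01: "ILLEGAL_LENGTH"}

# \s+ between fields lets the pattern match a line whether or not the mail
# client wrapped "LBA=..." onto the next line.
FAILURE_RE = re.compile(
    r"(?P<dev>\w+): FAILURE - (?P<cmd>\w+)\s+"
    r"status=(?P<status>[0-9a-fA-F]+)<[^>]*>\s+"
    r"error=(?P<error>[0-9a-fA-F]+)<[^>]*>\s+"
    r"LBA=(?P<lba>\d+)"
)


def decode_bits(value, table):
    """Return the names of the known bits set in value, highest bit first.

    Bits that are not in the table are silently ignored.
    """
    return [name for bit, name in sorted(table.items(), reverse=True) if value & bit]


def parse_failures(text):
    """Extract device, command, decoded flags and LBA from each FAILURE line."""
    results = []
    for match in FAILURE_RE.finditer(text):
        status = int(match.group("status"), 16)  # assumed hex, see above
        error = int(match.group("error"), 16)
        results.append({
            "device": match.group("dev"),
            "command": match.group("cmd"),
            "status_flags": decode_bits(status, STATUS_BITS),
            "error_flags": decode_bits(error, ERROR_BITS),
            "lba": int(match.group("lba")),
        })
    return results
```

Feeding it the saved console text and collecting the distinct "lba" values
gives a list of the sectors the drive could not read.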


I have a 6-CURRENT installation on a spare disk that I will hook up to the
machine later and see how it handles the bad sectors using the same
controller. I'll report back later tonight if I can find the time to do it.

/Daniel Eriksson