From owner-freebsd-questions Wed Jun 6 19:31:20 2001
Delivered-To: freebsd-questions@freebsd.org
Received: from panzer.kdm.org (panzer.kdm.org [216.160.178.169])
    by hub.freebsd.org (Postfix) with ESMTP
    id 8628637B406; Wed, 6 Jun 2001 19:31:01 -0700 (PDT)
    (envelope-from ken@panzer.kdm.org)
Received: (from ken@localhost)
    by panzer.kdm.org (8.9.3/8.9.1) id UAA32465;
    Wed, 6 Jun 2001 20:30:52 -0600 (MDT)
    (envelope-from ken)
Date: Wed, 6 Jun 2001 20:30:52 -0600
From: "Kenneth D. Merry"
To: Konstantinos.Dryllerakis@cec.eu.int
Cc: freebsd-questions@FreeBSD.ORG, freebsd-scsi@FreeBSD.ORG
Subject: Re: Help needed: MEDIUM ERRORs for scsi device
Message-ID: <20010606203052.A32387@panzer.kdm.org>
References: <5D802E6EDA71D411BFA900D0B76DEB1B0247EC92@EX2BEL86MBX02>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <5D802E6EDA71D411BFA900D0B76DEB1B0247EC92@EX2BEL86MBX02>;
    from Konstantinos.Dryllerakis@cec.eu.int on Wed, Jun 06, 2001 at 10:03:17AM +0200
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

On Wed, Jun 06, 2001 at 10:03:17 +0200, Konstantinos.Dryllerakis@cec.eu.int wrote:
> Dear All,
>
> A few days ago, I started receiving "MEDIUM ERROR"s from my FreeBSD 3.3
> machine (HP Netserver less than 1 year old). I have searched through the
> archives/FAQs but I could not locate enough information to understand if the
> drive is dying (and should be replaced immediately) or if this is a
> situation that you may recover gracefully from. Furthermore, I am having
> trouble decoding the SCSI errors...
>
> I would really appreciate any information/help on the subject.
>
> Thanks in advance,
>
> Kostis Dryllerakis (kd@belgacom.net)
>
>
> The errors received are the following:
> --------------
> /kernel: (da0:ahc0:0:0:0): READ(10). CDB: 28 0 1 9 20 60 0 0 10 0
> /kernel: (da0:ahc0:0:0:0): MEDIUM ERROR info:1092060 asc:11,0
> /kernel: (da0:ahc0:0:0:0): Unrecovered read error sks:80,35
> /kernel: (da0:ahc0:0:0:0): READ(10). CDB: 28 0 1 1 a9 10 0 0 4 0
> /kernel: (da0:ahc0:0:0:0): MEDIUM ERROR info:101a912 asc:11,0
> /kernel: (da0:ahc0:0:0:0): Unrecovered read error sks:80,35
> --------------

This means that you've got at least two bad blocks.

The first thing to do is make sure you've got auto read and write
reallocation turned on. To check it, type:

camcontrol modepage -n da -u 0 -m 1 -P 3

If AWRE and/or ARRE are set to 0, type this:

camcontrol modepage -n da -u 0 -m 1 -P 3 -e

And change the values to 1. That will make sure that, if possible, any
future bad blocks will be automatically remapped.

As for your current bad blocks, there are a couple of ways to handle them.

One way to deal with it would be to write zeros to those two bad blocks.
It will corrupt whatever those blocks are a part of, but may save the rest
of your data.

Another way to handle it is to back up your system, and then write zeros
over the entire disk.

Anyway, to write zeros to those two bad blocks:

camcontrol cmd -n da -u 0 -v -c "2a 0 v:i4 0 v:i2 0" 0x1092060 1 -o 512 - < /dev/zero
camcontrol cmd -n da -u 0 -v -c "2a 0 v:i4 0 v:i2 0" 0x101a912 1 -o 512 - < /dev/zero

I think the hex notation will work as an argument there. If you want to be
a little more sure it'll work right, you can do it like this, with the
block address written directly into the command bytes:

camcontrol cmd -n da -u 0 -v -c "2a 0 01 09 20 60 0 v:i2 0" 1 -o 512 - < /dev/zero
camcontrol cmd -n da -u 0 -v -c "2a 0 01 01 a9 12 0 v:i2 0" 1 -o 512 - < /dev/zero

So that might silence the drive about those two blocks. I would keep an
eye on the grown defects list, though. To do that:

camcontrol defects -n da -u 0 -G -f phys

If the grown defects list increases, your drive is probably on its way out.

In any event, you should probably make sure you've got good backups of the
machine and be prepared to install a new disk.
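
Since the question was also about decoding the errors, here is a rough
Python sketch of how the CDB bytes in the log relate to the block numbers
used in the commands above. It assumes the standard 10-byte READ(10) /
WRITE(10) layout (opcode, flags, 4-byte big-endian block address, group,
2-byte big-endian length, control) and that the kernel prints the CDB and
the info field in hex. The function names are made up for illustration;
nothing here is part of camcontrol.

```python
import struct

# Layout of a 10-byte READ(10)/WRITE(10) CDB, big-endian, no padding:
# opcode, flags, 4-byte LBA, group number, 2-byte transfer length, control.
RW10 = ">BBIBHB"


def parse_cdb(text):
    """Turn the kernel's space-separated hex dump into raw bytes."""
    return bytes(int(tok, 16) for tok in text.split())


def decode_rw10(cdb):
    """Return (opcode, lba, block_count) for a 10-byte CDB."""
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcode, _flags, lba, _group, count, _control = struct.unpack(RW10, cdb)
    return opcode, lba, count


def build_write10(lba, count):
    """Build a WRITE(10) CDB (opcode 0x2a) for `count` blocks at `lba`."""
    return struct.pack(RW10, 0x2A, 0, lba, 0, count, 0)


if __name__ == "__main__":
    # The two failing reads from the log, and the block each error reported.
    for dump, info in (("28 0 1 9 20 60 0 0 10 0", 0x1092060),
                       ("28 0 1 1 a9 10 0 0 4 0", 0x101A912)):
        opcode, lba, count = decode_rw10(parse_cdb(dump))
        inside = lba <= info < lba + count
        print(f"op={opcode:#x} lba={lba:#x} count={count} "
              f"bad={info:#x} inside_request={inside}")

    # The single-block write that the first camcontrol command sends.
    print(" ".join(f"{b:02x}" for b in build_write10(0x1092060, 1)))
```

Read this way, the first request covers 0x10 blocks starting at 0x1092060
and failed on its first block, while the second covers 4 blocks starting
at 0x101a910 and failed on 0x101a912, two blocks in. That is why the write
commands target 0x1092060 and 0x101a912 rather than the start of each
request.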

Ken
-- 
Kenneth Merry
ken@kdm.org

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message