From owner-freebsd-questions@FreeBSD.ORG  Mon Mar 10 23:59:35 2008
Message-ID: <47D5C705.2030909@endries.org>
Date: Mon, 10 Mar 2008 19:40:53 -0400
From: Josh Endries <josh@endries.org>
User-Agent: Thunderbird 2.0.0.12 (Windows/20080213)
MIME-Version: 1.0
To: freebsd-questions@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Questions about camcontrol, hot-swapping, ciss and Compaq SmartArray

Hello,

Today I noticed that one of the disks in my RAID 5 array seems to be dead or dying:

http://pastebin.ca/937249

<snip>
loki.domain.int ciss0: *** Fatal drive error, SCSI port 1 ID 0
loki.domain.int (da1:ciss0:0:1:0): WRITE(10). CDB: 2a 0 c ae 3f d0 0 0 20 0
loki.domain.int (da1:ciss0:0:1:0): CAM Status: SCSI Status Error
loki.domain.int (da1:ciss0:0:1:0): SCSI Status: Check Condition
loki.domain.int (da1:ciss0:0:1:0): MEDIUM ERROR asc:11,0
loki.domain.int (da1:ciss0:0:1:0): Unrecovered read error
loki.domain.int (da1:ciss0:0:1:0): Retrying Command (per Sense Data)
</snip>

I see messages for port 0 only, but with the ID varying from 0 to 3, and I'm not 
sure what the ID means (a partition?). After a while the error messages stopped, 
even though the disks were, and still are, in use. I found cciss_vol_status 
online, but it says the volume is OK (not degraded), which doesn't make sense 
to me:

# cciss_vol_status /dev/ciss0
/dev/ciss0: (Smart Array 642) RAID 0 Volume 0(?) status: OK.
/dev/ciss0: (Smart Array 642) RAID 5 Volume 1(?) status: OK.

Is there a way I can tell which port/disk is bad from these messages?
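
To make the question concrete, here is a rough sketch of how the "Fatal drive 
error" lines could be tallied per controller/port/ID. It assumes every such 
message has exactly the format shown in the snippet above; the function name 
and the sample list are only for illustration, and whether the port/ID pair 
really maps to a physical drive bay is the part I don't know.

```python
import re
from collections import Counter

# Assumed format, copied from the log snippet:
#   "ciss0: *** Fatal drive error, SCSI port 1 ID 0"
FATAL_RE = re.compile(
    r"(ciss\d+): \*\*\* Fatal drive error, SCSI port (\d+) ID (\d+)"
)


def tally_fatal_errors(lines):
    """Count fatal drive errors per (controller, port, id) tuple."""
    counts = Counter()
    for line in lines:
        match = FATAL_RE.search(line)
        if match:
            key = (match.group(1), int(match.group(2)), int(match.group(3)))
            counts[key] += 1
    return counts


# Two lines taken from the snippet; only the first is a fatal drive error.
sample = [
    "loki.domain.int ciss0: *** Fatal drive error, SCSI port 1 ID 0",
    "loki.domain.int (da1:ciss0:0:1:0): MEDIUM ERROR asc:11,0",
]

for (ctrl, port, scsi_id), n in tally_fatal_errors(sample).most_common():
    print(f"{ctrl} port {port} ID {scsi_id}: {n} fatal error(s)")
```

Feeding it the full log instead of the sample would show whether one port/ID 
pair dominates, which is presumably the failing disk.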

Assuming I can determine which disk it is, do I need to do anything in the OS 
before or after I swap out the drive? I've seen people mention rescanning and 
running other camcontrol commands...

Any other tips?

Thanks,
Josh