Date:      Mon, 10 Mar 2008 19:40:53 -0400
From:      Josh Endries <josh@endries.org>
To:        freebsd-questions@freebsd.org
Subject:   Questions about camcontrol, hot-swapping, ciss and Compaq SmartArray
Message-ID:  <47D5C705.2030909@endries.org>

Hello,

Today I noticed that one of the disks in my RAID 5 array seems to be dead or dying:

http://pastebin.ca/937249

<snip>
loki.domain.int ciss0: *** Fatal drive error, SCSI port 1 ID 0
loki.domain.int (da1:ciss0:0:1:0): WRITE(10). CDB: 2a 0 c ae 3f d0 0 0 20 0
loki.domain.int (da1:ciss0:0:1:0): CAM Status: SCSI Status Error
loki.domain.int (da1:ciss0:0:1:0): SCSI Status: Check Condition
loki.domain.int (da1:ciss0:0:1:0): MEDIUM ERROR asc:11,0
loki.domain.int (da1:ciss0:0:1:0): Unrecovered read error
loki.domain.int (da1:ciss0:0:1:0): Retrying Command (per Sense Data)
</snip>

I see messages for port 0 only, but with the ID varying from 0 to 3, and I'm not
sure what the ID refers to (a partition?). After a while the error messages
stopped, even though the disks were, and still are, in use. I found
cciss_vol_status online, but it reports the volume as OK (not degraded), which
doesn't make sense to me:

# cciss_vol_status /dev/ciss0
/dev/ciss0: (Smart Array 642) RAID 0 Volume 0(?) status: OK.
/dev/ciss0: (Smart Array 642) RAID 5 Volume 1(?) status: OK.

Is there a way I can tell which port/disk is bad from these messages?
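
To at least see which port/ID pairs and which CAM devices the errors mention,
the log could be tallied with something like the sketch below. It assumes the
log lines look exactly like the excerpt above, and that the parenthesised prefix
is ordered periph:controller:bus:target:lun (my understanding of how CAM prints
device paths). The function name tally_errors is made up for illustration, and
the counts alone do not say which physical bay a port/ID pair maps to.

```python
import re
from collections import Counter

# Controller-level line, e.g.
#   "ciss0: *** Fatal drive error, SCSI port 1 ID 0"
FATAL_RE = re.compile(
    r"(ciss\d+): \*\*\* Fatal drive error, SCSI port (\d+) ID (\d+)"
)

# CAM path prefix, e.g. "(da1:ciss0:0:1:0)".
# Assumed field order: periph, controller, bus, target, lun.
CAM_RE = re.compile(r"\(([a-z]+\d+):([a-z]+\d+):(\d+):(\d+):(\d+)\)")


def tally_errors(lines):
    """Count fatal-drive-error lines per (controller, port, id) and
    CAM-prefixed lines per (periph, controller, bus, target, lun)."""
    fatal = Counter()
    cam = Counter()
    for line in lines:
        m = FATAL_RE.search(line)
        if m:
            fatal[(m.group(1), int(m.group(2)), int(m.group(3)))] += 1
            continue
        m = CAM_RE.search(line)
        if m:
            periph, ctrl, bus, target, lun = m.groups()
            cam[(periph, ctrl, int(bus), int(target), int(lun))] += 1
    return fatal, cam


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python tally.py < messages.txt
    fatal, cam = tally_errors(sys.stdin)
    for key, count in sorted(fatal.items()):
        print("fatal", key, count)
    for key, count in sorted(cam.items()):
        print("cam", key, count)
```

If one port/ID pair dominates the "fatal" counts, that would at least narrow down
which entry to ask about.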

Assuming I can determine which disk it is, do I need to do anything in the OS
before or after I swap out the drive? I've seen people talk about rescanning and
running other camcontrol commands...

Any other tips?

Thanks,
Josh