Date:      Fri, 18 Jun 2010 16:47:11 +0100
From:      Matthew Lear <matt@bubblegen.co.uk>
To:        Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: 7.2-RELEASE-p4, IO errors & RAID1 failure
Message-ID:  <1276876031.7519.39.camel@almscliff.bubblegen.co.uk>
In-Reply-To: <20100618082127.GA34578@icarus.home.lan>
References:  <1276844904.7519.19.camel@almscliff.bubblegen.co.uk> <20100618082127.GA34578@icarus.home.lan>

Hello Jeremy,
Thanks very much for the feedback.

[snip]
> Could you please provide the full output from "smartctl -a /dev/ad0"
> here?  Your drive may be completely fine and you may not have to swap it
> at all; hard to say.

Sure. See below:

smartctl 5.39.1 2010-01-28 r3054 [FreeBSD 7.2-RELEASE-p4 i386] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Blue Serial ATA family
Device Model:     WDC WD3200AAKS-00VYA0
Serial Number:    WD-WCARW0164427
Firmware Version: 12.01B02
User Capacity:    320,072,933,376 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Fri Jun 18 16:27:54 2010 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever
					been run.
Total time to complete Offline 
data collection: 		 (8400) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 100) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x303f)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   218   150   021    Pre-fail  Always       -       2100
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       118
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000e   200   200   051    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       9316
 10 Spin_Retry_Count        0x0012   100   100   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0012   100   100   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       116
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       115
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       118
194 Temperature_Celsius     0x0022   109   103   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   051    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      9299         -
# 2  Short offline       Completed without error       00%      9298         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
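
For anyone who wants to triage a report like this programmatically, here is a minimal Python sketch; it is not part of the original exchange. It assumes each attribute row sits on a single line in the usual smartctl column order. The names `parse_attributes`, `suspicious`, and the `WATCHED` set are hypothetical, and which attributes deserve attention is a judgment call, not something smartctl itself defines.

```python
import re

# A few attribute rows taken from the report above, one row per line.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   109   103   000    Old_age   Always       -       38
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

# ID, name, flag, VALUE, WORST, THRESH, type, updated, when_failed, raw
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

# Assumption: these raw counters are the ones worth flagging when non-zero.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}


def parse_attributes(text):
    """Return {attribute_name: {"id", "value", "worst", "thresh", "raw"}}."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attrs[m.group(2)] = {
                "id": int(m.group(1)),
                "value": int(m.group(4)),
                "worst": int(m.group(5)),
                "thresh": int(m.group(6)),
                "raw": int(m.group(10)),
            }
    return attrs


def suspicious(attrs):
    """List attributes at or below threshold, or watched with a non-zero raw value."""
    flagged = []
    for name, a in attrs.items():
        # A threshold of 0 means the attribute can never trip, so skip that check.
        if a["thresh"] > 0 and a["value"] <= a["thresh"]:
            flagged.append(name)
        elif name in WATCHED and a["raw"] > 0:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    print(suspicious(parse_attributes(SAMPLE)))
```

On the rows sampled here nothing is flagged, which is consistent with the PASSED health result and the zero raw counts in the report.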

> > The drives in the RAID exist on two separate ATA channels:
> > [root@meshuga /home/matt]# atacontrol list
> > ATA channel 0:
> >     Master:  ad0 <WDC WD3200AAKS-00VYA0/12.01B02> SATA revision 2.x
> >     Slave:   ad1 <FB160C4081/HPF0> SATA revision 1.x
> > ATA channel 1:
> >     Master:  ad2 <WDC WD3200AAKS-00VYA0/12.01B02> SATA revision 2.x
> >     Slave:       no device present
> > ATA channel 2:
> >     Master: acd0 <HL-DT-ST DVDRAM GH22NS40/NL01> SATA revision 1.x
> >     Slave:       no device present
> > ATA channel 3:
> >     Master:      no device present
> >     Slave:       no device present
> > 
> > ad1 is a third 160G drive that I periodically back up to using cron.
> 
> So your RAID-1 array consists of ad0 and ad2?  You didn't provide
> "atacontrol status" output so I'm going to assume that's the case.

Correct. Apologies. Here's the output:

ar0: ATA RAID1 status: DEGRADED
 subdisks:
   0 ad0  OFFLINE
   1 ad2  ONLINE
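
As an illustration only (not part of the original exchange), status text in this shape could be checked from a monitoring script. The sketch below assumes the exact layout shown above; `parse_raid_status` is a hypothetical helper name, and other ata(4) RAID types or releases may print the status differently.

```python
import re

# The status text exactly as pasted above.
SAMPLE = """\
ar0: ATA RAID1 status: DEGRADED
 subdisks:
   0 ad0  OFFLINE
   1 ad2  ONLINE
"""

HEADER = re.compile(r"^(ar\d+): ATA (\S+) status: (\S+)")
SUBDISK = re.compile(r"^\s+(\d+)\s+(\S+)\s+(\S+)")


def parse_raid_status(text):
    """Parse one array's status block into a dict (layout assumed as in SAMPLE)."""
    result = {"array": None, "level": None, "status": None, "subdisks": {}}
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            result["array"], result["level"], result["status"] = header.groups()
            continue
        sub = SUBDISK.match(line)
        if sub:
            # Map device name to its reported state, e.g. "ad0" -> "OFFLINE".
            result["subdisks"][sub.group(2)] = sub.group(3)
    return result


if __name__ == "__main__":
    info = parse_raid_status(SAMPLE)
    offline = [d for d, s in info["subdisks"].items() if s != "ONLINE"]
    print(info["array"], info["status"], "needs attention:", offline)
```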

> What's odd to me is that you somehow have two disks on a single ATA
> channel -- look closely at channel 0.  SATA has a 1:1 device-to-channel
> mapping, so I'm a little surprised to see there's two devices on channel
> 0.  To me, this indicates your system BIOS is configured to run in
> "Emulation" mode -- where the ATA controller pretends to be a PATA/IDE
> controller, thus SATA-0 and SATA-1 devices appear as primary master and
> primary slave, respectively.

The two devices in the array are on channels 0 and 1. There is indeed a
second drive on channel 0 (160G). As I said above, I use that as an
additional backup device but it's not part of the array.

> 
> What motherboard is this?  Can you change the setting to either
> "Native", "Enhanced", or (even better) "AHCI"?  I've seen some systems
> where the Serial ATA option in the BIOS has an "Auto" option, which does
> totally bizarre things at times.
> 

I think this has been covered in subsequent postings. I could try it but
as you say below, I'd like to resolve the disk issue first.

> But before changing the setting, I would recommend dealing with the disk
> problem first.  Changing the SATA controller operation mode will almost
> certainly change all of your device names (you'll have to go into
> single-user mode, mount filesystems by hand, fix /etc/fstab, etc.).
> 
> Also, can you please provide output from "dmesg | grep -i ata"?

Sure. See below:

[root@meshuga /home/matt]# dmesg | grep -i ata
atapci0: <Intel ICH9 SATA300 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1c10-0x1c1f,0x1c00-0x1c0f at device 31.2 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
atapci1: <Intel ICH9 SATA300 controller> port 0x1c68-0x1c6f,0x1c5c-0x1c5f,0x1c60-0x1c67,0x1c58-0x1c5b,0x1c30-0x1c3f,0x1c20-0x1c2f irq 18 at device 31.5 on pci0
atapci1: [ITHREAD]
ata2: <ATA channel 0> on atapci1
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci1
ata3: [ITHREAD]
ad0: 305245MB <WDC WD3200AAKS-00VYA0 12.01B02> at ata0-master SATA300
ad1: 152627MB <FB160C4081 HPF0> at ata0-slave SATA150
ad2: 305245MB <WDC WD3200AAKS-00VYA0 12.01B02> at ata1-master SATA300
acd0: DVDR <HL-DT-ST DVDRAM GH22NS40/NL01> at ata2-master SATA150
(probe0:ata2:0:0:0): TEST UNIT READY. CDB: 0 0 0 0 0 0 
(probe0:ata2:0:0:0): CAM Status: SCSI Status Error
(probe0:ata2:0:0:0): SCSI Status: Check Condition
(probe0:ata2:0:0:0): NOT READY asc:3a,1
(probe0:ata2:0:0:0): Medium not present - tray closed
(probe0:ata2:0:0:0): Unretryable error
ar0: disk0 READY (master) using ad0 at ata0-master
ar0: disk1 READY (mirror) using ad2 at ata1-master
cd0 at ata2 bus 0 target 0 lun 0
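
To make the channel layout easier to see, here is a small Python sketch, added for illustration and not part of the original exchange. It groups the `adX`/`acdX` probe lines above by ATA channel and reports any channel carrying more than one device. The function name `devices_by_channel` is hypothetical, and the regular expression assumes the line format shown in this dmesg output.

```python
import re
from collections import defaultdict

# Device probe lines from the dmesg output above, plus one ar0 line
# to show that RAID lines are deliberately ignored.
SAMPLE = """\
ad0: 305245MB <WDC WD3200AAKS-00VYA0 12.01B02> at ata0-master SATA300
ad1: 152627MB <FB160C4081 HPF0> at ata0-slave SATA150
ad2: 305245MB <WDC WD3200AAKS-00VYA0 12.01B02> at ata1-master SATA300
acd0: DVDR <HL-DT-ST DVDRAM GH22NS40/NL01> at ata2-master SATA150
ar0: disk0 READY (master) using ad0 at ata0-master
"""

# Only disk (adN) and optical (acdN) probe lines are matched.
PROBE = re.compile(r"^(ad\d+|acd\d+): .* at (ata\d+)-(master|slave)")


def devices_by_channel(text):
    """Return {channel: [device, ...]} in the order the devices were probed."""
    channels = defaultdict(list)
    for line in text.splitlines():
        m = PROBE.match(line)
        if m:
            channels[m.group(2)].append(m.group(1))
    return dict(channels)


if __name__ == "__main__":
    layout = devices_by_channel(SAMPLE)
    shared = {ch: devs for ch, devs in layout.items() if len(devs) > 1}
    print("channels with more than one device:", shared)
```

For the output above this reports ad0 and ad1 sharing ata0, while the two mirror members (ad0 and ad2) sit on different channels.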

> When you say "software RAID", I'm assuming you're referring to ata(4)'s
> native OS-level RAID (as in "atacontrol create RAID1 ad0 ad1").  Or are
> you using something like Intel MatrixRAID?

Correct (almost). The array was created using 'atacontrol create RAID1
ad0 ad2'.

> > Therefore I expect that I need to detach ad0 from the RAID, power down
> > the unit, replace the drive, power on the unit and rebuild the array in
> > order to fix things. Trouble is, I'm struggling to find out if this can
> > be done safely with atacontrol and the hw configuration I have, and if
> > so, how best to do it?
> 
> The atacontrol man page covers your situation:
> 
>   It is NOT recommended to create such arrays on a primary/secondary pair
>   on a SINGLE channel since the throughput of the mirror would be severely
>   compromised, the ability to rebuild the array in the event of a disk
>   failure would be greatly complicated, and if a disk controller
>   electronics failed it could wedge the channel and take both disks in the
>   mirror offline.  (which would defeat the purpose of having a mirror in
>   the first place)

I don't think this is the case for me since ad0 and ad2 are on separate
ATA channels.

> I realise ad0 is on channel 0 and ad2 is on channel 1, but you have a
> "mystery device" as a Slave on channel 0, which is going to be impacted.
> 
> You really need AHCI to be able to hot-swap effectively.  The procedure
> I've followed for years -- without ZFS in the picture (that should just
> add a few extra commands to the picture) -- relies on AHCI and a proper
> hot-swap bay/backplane.  Hot-swapping disks without such a backplane,
> in my experience, results in the system powering off suddenly.  Anyway,
> this is the procedure:

Indeed but my hw doesn't have hot-swap capability (at the moment!).

> - atacontrol detach ataX   (where ataX = channel disk is attached to)
> - Physically remove the bad disk
> - Physically insert a new disk
> - Wait 15 seconds for drive to settle
> - atacontrol attach ataX
> 
> The new disk should appear automatically, and should appear as the same
> device name (adX) that it did before.  At least that's my experience
> when using AHCI with ataahci.ko (I haven't tried when using ahci.ko,
> which uses CAM).  We can discuss the details/differences later.
> 
> If the disk doesn't reappear ("atacontrol list" shows no device
> attached) then do "atacontrol reinit ataX", which should make it appear.
> I've had to do this once or twice, and it worked fine.  I've also seen
> this command lock the system up or panic the kernel.
> 
> But as stated, you won't be able to do this because you have two SATA
> devices appearing under one channel.  Given that, I would recommend you
> follow this procedure instead:
> 
> - Power down system cleanly ("shutdown -p")
> - Remove power cable from PSU
> - Physically disconnect + remove the bad disk
> - Physically add + connect the new disk
> - Power up system
> - Go into system BIOS and make sure the new disk appears.  (FreeBSD
>   doesn't care what the BIOS thinks, so this step is done solely to
>   make sure that the PC sees the disk at all)
> - Let FreeBSD boot/etc. -- I believe ata(4) will automatically begin
>   rebuilding the array when it tastes the new/replacement disk and
>   sees it has no metadata.  "atacontrol status" should show the state.

Sounds good. Thanks. I'm guessing that there should be no difference in
the above steps just because ad0 and ad2 (the RAID drives) are on
separate channels? I don't think there should be but I thought I'd ask
anyway...

> > It may well be a case of RTFM (again) but I just wanted to run this by
> > the community to get some feedback. Losing data is not an option here
> > so hopefully I can get the machine back up on its feet soon.
> 
> Don't take this as a pot-shot, but you should have tested this whole
> ordeal before putting the machine into a mission-critical role.  It's
> important to do this rather than just blindly assume there won't be any
> complications; better to be safe than sorry.  :-)  Testing disk failures
> of this specific nature is pretty simple, especially if there's a
> hot-swap backplane involved. 
> 

You're absolutely right :-) When I say that losing data is not an
option perhaps that was a bit overzealous :-) The server stores lots of
personal stuff (music, photos, repositories for some private code dev,
etc.). I've got backups of all the data (created using dar) on the
160GB disk (ad1) plus periodic snapshots of all system files
(excluding /tmp /mnt /dev/pts /dev/fd /proc /cdrom etc.). If I lost the
array it wouldn't be a total disaster. That said, I'd rather it didn't
happen! :-)

Thanks for the feedback. Any further thoughts based on the smart data?



