From owner-freebsd-questions@FreeBSD.ORG Wed Feb 9 19:45:14 2011
Return-Path:
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 41B79106564A
    for ; Wed, 9 Feb 2011 19:45:14 +0000 (UTC)
    (envelope-from click@sgate.org)
Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com [209.85.216.175])
    by mx1.freebsd.org (Postfix) with ESMTP id CEF168FC14
    for ; Wed, 9 Feb 2011 19:45:13 +0000 (UTC)
Received: by qyk8 with SMTP id 8so1521915qyk.13
    for ; Wed, 09 Feb 2011 11:45:13 -0800 (PST)
MIME-Version: 1.0
Received: by 10.229.43.195 with SMTP id x3mr15182601qce.291.1297278959185;
    Wed, 09 Feb 2011 11:15:59 -0800 (PST)
Sender: click@sgate.org
Received: by 10.229.227.84 with HTTP; Wed, 9 Feb 2011 11:15:59 -0800 (PST)
Date: Wed, 9 Feb 2011 21:15:59 +0200
X-Google-Sender-Auth: 1NHlHEajbT0lzDKXkAjlleui0pU
Message-ID:
From: Daniel Zhelev
To: freebsd-questions@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Bad hard driver
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 09 Feb 2011 19:45:14 -0000

Hello all,

Today I received the following messages:

The following warning/error was logged by the smartd daemon:

Device: /dev/ad7, 1 Currently unreadable (pending) sectors

For details see host's SYSLOG (default: /var/log/messages).

And:

The following warning/error was logged by the smartd daemon:

Device: /dev/ad7, 3 Offline uncorrectable sectors

For details see host's SYSLOG (default: /var/log/messages).

The syslog shows some activity on the disk:

Feb 8 09:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 1 Currently unreadable (pending) sectors
Feb 8 09:53:43 wolfdale smartd[1198]: Device: /dev/ad7, 2 Currently unreadable (pending) sectors (changed +1)
Feb 8 10:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 2 Currently unreadable (pending) sectors
Feb 8 10:53:43 wolfdale smartd[1198]: Device: /dev/ad7, 2 Currently unreadable (pending) sectors
Feb 8 11:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors (changed +1)
Feb 8 11:53:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 12:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 12:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 12:53:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 12:53:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 13:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 13:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 13:53:44 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 13:53:44 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 14:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 14:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 14:53:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 14:53:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 15:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 15:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 15:38:08 wolfdale monit[1190]: monit: Socket 5 close failed -- Connection reset by peer
Feb 8 15:53:44 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 15:53:44 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 16:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 16:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 16:30:13 wolfdale monit[1190]: monit: Socket 5 close failed -- Connection reset by peer
Feb 8 16:53:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 16:53:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 17:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 17:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 17:53:44 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 17:53:44 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 18:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 18:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors
Feb 8 18:53:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Currently unreadable (pending) sectors
Feb 8 18:53:43 wolfdale smartd[1198]: Device: /dev/ad7, 3 Offline uncorrectable sectors

So far, not so good. I decided to run an offline self-test on the disk.
The result:

# 1  Extended offline    Completed without error       00%      5232         -

This is the output from smartctl:

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD1002FAEX-00Z3A0
Serial Number:    WD-WCATR0672307
Firmware Version: 05.01D05
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Feb 9 21:11:36 2011 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (16200) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 187) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3037) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   175   174   021    Pre-fail  Always       -       4233
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       38
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5236
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       34
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       31
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       6
194 Temperature_Celsius     0x0022   103   095   000    Old_age   Always       -       44
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   189   000    Old_age   Offline      -       3

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      5232         -
# 2  Extended offline    Aborted by host               90%      5229         -
# 3  Extended offline    Aborted by host               90%      5229         -
# 4  Short offline       Completed without error       00%      5218         -
# 5  Short offline       Completed without error       00%      5195         -
# 6  Short offline       Completed without error       00%      5171         -
# 7  Short offline       Completed without error       00%      5148         -
# 8  Extended offline    Completed without error       00%      5128         -
# 9  Short offline       Completed without error       00%      5125         -
#10  Short offline       Completed without error       00%      5102         -
#11  Short offline       Completed without error       00%      5078         -
#12  Short offline       Completed without error       00%      5055         -
#13  Short offline       Completed without error       00%      5031         -
#14  Short offline       Completed without error       00%      5008         -
#15  Short offline       Completed without error       00%      4984         -
#16  Extended offline    Completed without error       00%      4964         -
#17  Short offline       Completed without error       00%      4960         -
#18  Short offline       Completed without error       00%      4937         -
#19  Short offline       Completed without error       00%      4889         -
#20  Short offline       Completed without error       00%      4865         -
#21  Short offline       Completed without error       00%      4841         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I know about the HOWTO:
http://smartmontools.sourceforge.net/badblockhowto.html

But how can I get the LBA? And is there a diagnostic tool for WD drives
in ports?
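
In case it helps anyone reading the long syslog excerpt earlier in this message, here is a minimal Python sketch that boils it down to the moments where a count actually changed. It is not part of smartmontools. The regular expression only assumes the smartd message format visible in the lines quoted above, and the function name count_changes is made up for illustration.

```python
import re

# Assumption: smartd messages look like the syslog lines quoted in this
# message, e.g.
#   Feb 8 09:23:43 wolfdale smartd[1198]: Device: /dev/ad7, 1 Currently unreadable (pending) sectors
SMARTD_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+smartd\[\d+\]:\s+"
    r"Device:\s+(?P<dev>[^,\s]+),\s+(?P<count>\d+)\s+"
    r"(?P<kind>Currently unreadable \(pending\)|Offline uncorrectable) sectors"
)


def count_changes(lines):
    """Return (timestamp, device, kind, count) tuples, one for every line
    whose count differs from the previous report for that device and kind.
    Lines that are not smartd sector reports (monit, etc.) are skipped."""
    last_seen = {}
    changes = []
    for line in lines:
        match = SMARTD_LINE.match(line)
        if match is None:
            continue
        key = (match["dev"], match["kind"])
        count = int(match["count"])
        if last_seen.get(key) != count:
            changes.append((match["stamp"], match["dev"], match["kind"], count))
            last_seen[key] = count
    return changes


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python smartd_changes.py < /var/log/messages
    for stamp, dev, kind, count in count_changes(sys.stdin):
        print(f"{stamp}  {dev}  {kind}: {count}")
```

Fed the excerpt above, this should leave only the first report of each kind plus the lines where the pending count went up, which makes the timeline easier to see than the full half-hourly log.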