Date: Mon, 4 Jul 2011 09:45:47 +0200 (CEST)
From: Romain Tartiere <romain@FreeBSD.org>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: ports/158630: sysutils/smartmontools daily script improvement
Message-ID: <20110704074547.CEEDF1BF79@marvin.blogreen.org>
Resent-Message-ID: <201107040750.p647o4pb002860@freefall.freebsd.org>

>Number:         158630
>Category:       ports
>Synopsis:       sysutils/smartmontools daily script improvement
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-ports-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jul 04 07:50:04 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator:     Romain Tartiere
>Release:        FreeBSD 8.2-STABLE amd64
>Organization:
>Environment:
System: FreeBSD marvin.blogreen.org 8.2-STABLE FreeBSD 8.2-STABLE #7 r222417: Sat May 28 13:23:35 CEST 2011 root@marvin.blogreen.org:/usr/obj/usr/src/sys/MARVIN amd64

>Description:
The smartmontools port can install a daily script for checking disks.
When a disk is failing, a full report is included in the generated mail;
otherwise an 'OK' message is output instead.

When a disk has failed in the past but the situation is back to normal,
the script still outputs a verbose report, and it takes extra attention to
see that this is not a new problem.

>How-To-Repeat:
Enable smart status in /etc/periodic.conf:

daily_status_smart_enable="YES"
daily_status_smart_devices="ad4 ad6 ad10 ad12"

ad4 has failed in the past because of a cooling problem.  The daily mail
includes:

---------------------------------------8<-------------------------------------
Checking health of /dev/ad4:
=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10
Device Model:     ST3320620AS
[...12 lines...]
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
[...41 lines...]
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   057   044   045    Old_age   Always   In_the_past 43 (Min/Max 36/50)
194 Temperature_Celsius     0x0022   043   056   000    Old_age   Always       -       43 (0 22 0 0)
[...23 lines...]

Checking health of /dev/ad6: OK
Checking health of /dev/ad10: OK
Checking health of /dev/ad12: OK
---------------------------------------8<-------------------------------------

>Fix:
smartctl sets bit 5 (counting from 0) of the return value to 1 for such
cases.

---------------------------------------8<-------------------------------------
Bit 5: SMART status check returned "DISK OK" but we found that some (usage
       or prefail) Attributes have been <= threshold at some time in the
       past.
---------------------------------------8<-------------------------------------

This can be used to produce more concise output:

--- smart.diff begins here ---
--- files/smart.in	2011-07-04 09:28:22.164557351 +0200
+++ /tmp/smart.in	2011-07-04 09:29:24.213204043 +0200
@@ -63,6 +63,8 @@
         status=$?
         if [ ${status} -eq 0 ]; then
             echo "OK"
+        elif [ ${status} -eq 32 ]; then
+            echo "OK (but has failed in the past)"
         elif [ $((status & 3)) -ne 0 ]; then
             rc=2
             ${trim_junk} "${tmpfile}"
--- smart.diff ends here ---

>Release-Note:
>Audit-Trail:
>Unformatted:
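
For reference, the exit-status test used in the patch can be tried in isolation. The following is only a sketch: classify_status, PAST_FAILURE and USAGE_ERROR are hypothetical names that do not exist in the port, and the "ERROR" and "REPORT" strings are placeholders for whatever the real script does in those branches. It shows that 32 is 1 << 5, and that comparing with -eq 32 matches only when bit 5 is the sole bit set.

```sh
#!/bin/sh
# Illustrative only: classify_status is a hypothetical helper, not part of
# the port's periodic script. "ERROR" and "REPORT" are placeholder labels.

PAST_FAILURE=$((1 << 5))   # bit 5 of the smartctl exit status, i.e. 32
USAGE_ERROR=3              # bits 0 and 1, the mask the existing script tests

classify_status() {
    status=$1
    if [ "${status}" -eq 0 ]; then
        echo "OK"
    elif [ "${status}" -eq "${PAST_FAILURE}" ]; then
        # Only bit 5 is set: healthy now, below threshold at some point.
        echo "OK (but has failed in the past)"
    elif [ $((status & USAGE_ERROR)) -ne 0 ]; then
        echo "ERROR"
    else
        echo "REPORT"
    fi
}

classify_status 0
classify_status 32
classify_status 2
classify_status 40   # bit 5 plus bit 3: more than a past-only condition
```

Because the comparison is an equality rather than a mask test, a status such as 40 still falls through to the verbose branch, so the short message appears only when nothing else is flagged.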