From: Roland Smith
To: Kelly Martin
Cc: FreeBSD Questions
Date: Tue, 25 Aug 2009 00:32:47 +0200
Subject: Re: hard disk failure - now what?
Message-ID: <20090824223247.GD43410@slackbox.xs4all.nl>
In-Reply-To: <1338880b0908241129p75b6845cg26d21804e118364@mail.gmail.com>

On Mon, Aug 24, 2009 at 12:29:19PM -0600, Kelly Martin wrote:
> I just experienced a hard drive failure on one of my FreeBSD 7.2
> production servers with no backup! I am so mad at myself for not
> backing up!!

Welcome to the club. :-)

> Now it's a salvage operation. Here are the type of errors
> I was getting on the console, over-and-over:
>
> ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=441633503
> ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout -
> completing request directly
> ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout -
> completing request directly
> ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> ad4: FAILURE - WRITE_DMA48 timed out LBA=441633375
> g_vgs_done():ad4s1f[WRITE(offset=216338284544, length=16384)]error = 5

It _could_ just be a bad or improperly connected SATA cable. Try
changing or re-seating the cable.

Read errors cannot damage your data, but write errors can! Immediately
stop all writing to the disk. Re-mount the partitions on that disk as
read-only, or unmount them.

To see if a disk really is broken, install sysutils/smartmontools, and
run 'smartctl -a' on the disk. If you see errors in its report (e.g.
reallocated sectors), the disk is dying and should be unplugged to
prevent it from getting worse.

> My question: what kind of checks and/or repair tools should I run on
> the damaged drive after it's mounted?

As others have mentioned, first make a copy (with the disk unmounted) of
the partitions on that disk with dd, saving them to another drive.
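
A rough sketch of that imaging step might look like the following. The
helper name image_partition, the 64k block size and the example paths
are assumptions for illustration only; ad4s1f is simply the partition
named in the error log above. The conv=noerror,sync options tell dd to
continue after a read error and to pad each short or failed read with
zeros, so that offsets in the image stay aligned with the source.

```shell
#!/bin/sh
# Sketch only: image_partition is a hypothetical helper, not a FreeBSD tool.
# Copies a partition (or any file) to an image, continuing past read errors.
image_partition() {
    src=$1   # source, e.g. /dev/ad4s1f (must not be mounted read-write)
    dst=$2   # image file on a different, healthy drive
    # conv=noerror: do not abort on a read error.
    # conv=sync:    pad short or failed input blocks with zeros up to bs,
    #               which keeps later data at the right offset in the image.
    dd if="$src" of="$dst" bs=64k conv=noerror,sync
}
```

It could be called as, say, `image_partition /dev/ad4s1f
/mnt/rescue/ad4s1f.img`, where /mnt/rescue is a made-up mount point on
the healthy drive. A smaller block size loses less data around each bad
sector but makes the copy slower. Note that conv=sync also pads the
final block, so the image can be slightly larger than the source.
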
With a copy, you can experiment with the data without further
deterioration of the original. You can use this disk image as, e.g., a
vnode-backed memory disk; see mdconfig(8). If you cannot get a good copy
of the disk partitions, it might be a good idea to get a quote from a
professional hard drive data recovery company to do that for you. I've
never had occasion to try this (hooray for backups) but I've heard it
can be quite expensive. :-/

Try using fsck_ffs on (copies of) the disk image to see if that can
repair the damage. If the damage is beyond repair for fsck_ffs, you have
a real problem. Of course, if you have a good disk image, your data is
still there, but you might have to use a forensics program like
sysutils/sleuthkit or hexdump to try and piece files together. And even
then you cannot be sure that there is no corrupted data in the files
themselves. Good luck with that. :-(

Roland
-- 
R.F.Smith                                   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914 B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)