From: "Michael D. Norwick"
Date: Sun, 01 May 2011 19:57:38 +0000
To: freebsd-questions@freebsd.org
Subject: freebsd zfs question
Message-ID: <4DBDBB32.7010906@centurytel.net>

Good Day;

A week or so ago I hit an error while trying to compile openoffice.org from ports. The build failed with an error that I have since been able to resolve.

This machine is at

    #>uname -r
    rainey 8.2-RELEASE-p1 FreeBSD 8.2-RELEASE-p1 #2: Wed Apr 27 04:37:38 UTC 2011 michael@rainey:/usr/obj/usr/src/sys/KERNEL_042511 amd64

During that episode the workstation locked up, and the only way to recover was a hard reset (power off). During the next weekly scrub I noticed the error count for the root pool increasing:

    # zpool status
      pool: tank
     state: ONLINE
    status: One or more devices has experienced an error resulting in data
            corruption.  Applications may be affected.
    action: Restore the file in question if possible.  Otherwise restore the
            entire pool from backup.
       see: http://www.sun.com/msg/ZFS-8000-8A
     scrub: none requested
    config:

            NAME        STATE     READ WRITE CKSUM
            tank        ONLINE       0     0     0
              ad2p3     ONLINE       0     0     0
              ad3p1     ONLINE       0     0     0

    errors: 604 data errors, use '-v' for a list

      pool: tank1
     state: ONLINE
     scrub: none requested
    config:

            NAME        STATE     READ WRITE CKSUM
            tank1       ONLINE       0     0     0
              ad12p2    ONLINE       0     0     0

    errors: No known data errors

tank consists of two 300G PATA drives in a mirror. tank1 is a 500G SATA drive I added recently to use for data archiving. The data I wish to protect is backed up manually to a network file server via NFS.

I have scrubbed the pool showing errors several times now, and the error count neither increases nor decreases. I have issued

    #>zpool clear tank

a number of times with no change in the error count. The referenced document (www.sun.com/msg/ZFS-8000-8A) was of no apparent help for my condition. I have drives I can export to and import from, but I am unclear as to whether I would just be moving the bad blocks around.

Sample of the output of #>zpool status -v tank;

    <
    tank/root:<0x1097e7>
    tank/root:<0x1096e8>
    tank/root:<0x1097e8>
    tank/root:<0x1096e9>
    tank/root:<0x1097e9>
    tank/root:<0x1095ea>
    tank/root:<0x1097ea>
    tank/root:<0x1096eb>
    tank/root:<0x1097eb>
    tank/root:<0x1096ec>
    tank/root:<0x1097ec>
    tank/root:<0x1095ed>
    tank/root:<0x1096ed>
    tank/root:<0x1097ed>
    tank/root:<0x1094ee>
    tank/root:<0x1096ee>
    tank/root:<0x1095ef>
    >

I am not sure what to do with these. Why doesn't #>zpool clear tank delete them?

The directory /usr/ports/editors/openoffice.org-3/work could not be deleted after the failed build, so I moved it to /oldwork to get the port to build. /oldwork still cannot be deleted.

    rainey# rm -Rf ./oldwork
    rm: ./oldwork/OOO330_m20/dictionaries: Directory not empty
    rm: ./oldwork/OOO330_m20/lucene/unxfbsdx.pro/bin: Directory not empty
    rm: ./oldwork/OOO330_m20/lucene/unxfbsdx.pro/misc/build: Directory not empty
    rm: ./oldwork/OOO330_m20/lucene/unxfbsdx.pro/misc: Directory not empty
    rm: ./oldwork/OOO330_m20/lucene/unxfbsdx.pro: Directory not empty
    rm: ./oldwork/OOO330_m20/lucene: Directory not empty
    rm: ./oldwork/OOO330_m20/jfreereport: Directory not empty
    rm: ./oldwork/OOO330_m20/libxslt: Directory not empty
    rm: ./oldwork/OOO330_m20/sal: Directory not empty
    rm: ./oldwork/OOO330_m20: Directory not empty
    rm: ./oldwork: Directory not empty

Attempts to delete the individual directories listed above also fail. I've read articles about 'bit rot' and similar problems in ZFS metadata, but memtest86 completes without error on this machine's 3G of RAM. I see no applicable information in dmesg or /var/log/messages. The drives have been running 24/7 since the initial incident with no increase in the error count.

Thank You,
Michael
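
To make the long 'zpool status -v' list easier to work with, here is a minimal sketch that pulls the dataset:<0x...> entries out of saved output and tallies them per dataset. It assumes every entry has the form shown in the sample above and that the hex value is an object number within the dataset; that reading is an assumption, not something the output itself confirms. The function names (parse_error_objects, count_by_dataset) are hypothetical helpers, not part of any ZFS tool.

```python
import re

# Matches entries such as "tank/root:<0x1097e7>": a dataset name, a colon,
# then a hex value in angle brackets (assumed here to be an object number).
ENTRY_RE = re.compile(r"(?P<dataset>[^\s:<>]+):<0x(?P<obj>[0-9a-fA-F]+)>")


def parse_error_objects(status_text):
    """Return (dataset, object_number) pairs found in `zpool status -v` text."""
    return [
        (m.group("dataset"), int(m.group("obj"), 16))
        for m in ENTRY_RE.finditer(status_text)
    ]


def count_by_dataset(pairs):
    """Count how many error entries belong to each dataset."""
    counts = {}
    for dataset, _obj in pairs:
        counts[dataset] = counts.get(dataset, 0) + 1
    return counts


if __name__ == "__main__":
    # Three entries copied from the sample output above.
    sample = "tank/root:<0x1097e7> tank/root:<0x1096e8> tank/root:<0x1097e8>"
    pairs = parse_error_objects(sample)
    for dataset, obj in pairs:
        print(dataset, obj)
    print(count_by_dataset(pairs))
```

Saving the full output to a file and feeding its contents to parse_error_objects would show whether all 604 entries sit in tank/root or are spread across several datasets.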