Message-ID: <479CDC9E.8040604@skyrush.com>
Date: Sun, 27 Jan 2008 12:33:50 -0700
From: Joe Peterson
To: pjd@freebsd.org, freebsd-fs@freebsd.org
Subject: Unexpected "resilver" after reboot (after scrub found CKSUM problems)

Hi Pawel (or anyone else who might know),

I had a strange thing happen on ZFS the other day, and I cannot find any
info about it on the web - thought you might have some ideas. I am using
7.0-RC1 at the moment.

I found a checksum error in ZFS during a scrub. This is strange in
itself, since I believe the disk is OK (see below):

  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          ad0s1d    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /home/joe/music/jukebox/christmas/Esquivel/Merry_XMas_from_the_SpaceAge_Bachelor_Pad/07-Snowfall.mp3

This is how it appears after a recent reboot, however. After a scrub, I
see a varying number of non-zero counts under CKSUM. Not sure why it is
zero after reboot (maybe that's normal).

However, the strange thing is that after my first reboot after the scrub
found the issue, zpool status told me that "resilver completed with 0
errors", and there were no known errors. Only trying to read the file
and/or rescrubbing returned the status to the error state and made the
CKSUM column non-zero. Since I do not have a mirror or RAID config, I'm
not sure why it would resilver at all, and I did nothing explicit to
cause a resilver (as far as I know)... Any ideas?

As an aside, I, along with some others on freebsd-stable@freebsd.org,
have been seeing what "look" like disk errors in the system logs. I
suspect there could be some other cause (lots of discussion on that
list, if you are interested). Strangely, this disk checks out fine on
both short and long tests in Seatools, and smartctl shows it as OK.
Also, doing lots of reads from it under Linux does not show any issue or
produce any error logs. At this point, I am not sure if the CKSUM issue
is a real HW flaw or something else...

Thanks,
Joe