From: Martin von Gagern
Date: Sat, 05 Nov 2011 23:13:38 +0100
To: freebsd-questions@freebsd.org
Cc: 李森
Subject: zfs file names (inodes) without files (ENOENT)
Message-ID: <4EB5B512.2000601@gmx.net>

Hi!

A. SUMMARY

Long story short: I have a file name on my zfs without a file to it.
ls will include it in the dir content, but stat-ing that file will
result in an ENOENT error: "No such file or directory".

B. HISTORY

So how did I come to this situation? I've recently had to kill the
sending side of an rsync, with the receiving side on FreeBSD. For
reasons yet unknown, the next run of rsync started deleting stuff it
shouldn't. Details on this are in PR 162318 [1], but quoting the most
important things:

  Logging into the receiving FreeBSD as root, I found that large parts
  of the user's home directory content had disappeared, even outside
  the subdirectory used as the rsync destination!
  - All the .* config files in the home directory were gone
  - The .ssh directory was still present, but its content was gone as
    well
  - Both the home dir and the .ssh subdir contained a file
    "rsync.%stat", which should be the name of an extattr instead,
    used to implement the rsync --fake-super command line option.

[1] http://www.freebsd.org/cgi/query-pr.cgi?pr=162318

C. SYMPTOMS

I first assumed a problem in the binary rsync build for FreeBSD, but
devs on the above bug report favored RAM failure or an upstream source
code bug. So I gave it another try, and paid closer attention to the
error messages. Among them was the following:

> rsync: stat "/home/name/backup/etc/ca-certificates" failed: No such file or directory (2)

The strange thing is that this isn't specific to rsync at all; it can
be reproduced using simple command line tools like ls:

> # ls /home/name/backup/etc/ | grep ca-cert
> ca-certificates
> ca-certificates.conf
> ca-certificates.conf~
> # ls /home/name/backup/etc/ca-*
> ls: /home/name/backup/etc/ca-certificates: No such file or directory
> /home/name/backup/etc/ca-certificates.conf
> /home/name/backup/etc/ca-certificates.conf~

So as you see, the name is returned by readdir(3): both ls on the dir
and the shell's wildcard expansion find it. But anything that stat(2)s
the file will encounter an ENOENT error. "zpool status" says
everything's fine, so zfs isn't aware of any corruption.
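
The readdir(3)-versus-stat(2) mismatch described above can be searched
for mechanically. What follows is only a minimal sketch in Python, not
something taken from the session quoted above; the function name
find_dangling_entries is made up for illustration. It assumes that
os.listdir() reports the same names readdir(3) does, and it uses
lstat rather than stat so that ordinary dangling symlinks are not
mistaken for the problem.

```python
import errno
import os


def find_dangling_entries(directory):
    """Return names that the directory listing reports but lstat cannot find.

    Hypothetical helper for illustration only.
    """
    dangling = []
    for name in os.listdir(directory):      # names as seen in the directory listing
        path = os.path.join(directory, name)
        try:
            os.lstat(path)                  # does not follow symlinks
        except OSError as exc:
            if exc.errno == errno.ENOENT:   # name is listed, but no file behind it
                dangling.append(name)
            else:
                raise                       # other errors (e.g. EACCES) are not the symptom
    return sorted(dangling)
```

On a healthy file system this should return an empty list for any
directory that is not being modified concurrently. Called on the
directory from the example above, e.g.
find_dangling_entries("/home/name/backup/etc"), it would be expected to
report the affected name.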

I believe that no matter what errors user space programs might make,
the kernel zfs driver should never allow the above to happen. Either a
file is there or it isn't; there should be no such mixture. So what do
you think, is this likely to be a bug in the zfs implementation?

I found one other person describing problems like this: in threads
titled "file lose inode in Memory-Based file system.", lisen1001
described pretty much the same thing, except on a ramdisk on 8.2
instead of my own hdd-based raidz on 9.0-RC1 [2,3].

[2] http://thread.gmane.org/gmane.os.freebsd.questions/280183
[3] http://thread.gmane.org/gmane.os.freebsd.devel.file-systems/13153

D. NEXT STEPS

As I'm new to FreeBSD, I'm not yet sure how bug reports are handled
around here. As I said, I've filed a bug report against rsync, and it
has been closed on the grounds that this appears to be an upstream
problem.

Would it make sense to include the above information in the bug report
for reference? Would replying to the gnats address be enough to
accomplish that?

Should the bug be reopened, as I assume all my problems to be related,
and as the zfs corruption at least is specific to FreeBSD? If so, how
does one reopen a report? Or who can do that?

Do you agree that this looks like a problem in the ZFS implementation?
Should I file a new problem report for that?

Can you suggest any way I could resolve the corruption on my local ZFS
pool, short of destroying and recreating the whole file system? "rm"
for the file doesn't work, as it, too, encounters the ENOENT. Is there
any tool to check or rebuild the inode data structures of zfs? "zpool
scrub" doesn't seem to fit the bill, as its manpage indicates a
computation of file content checksums.

Greetings,
 Martin von Gagern