Date: Sat, 23 Jan 2010 18:40:13 -0600 (CST)
From: Wes Morgan <morganw@chemikals.org>
To: Rich <rincebrain@gmail.com>
Cc: freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: Errors on a file on a zpool: How to remove?
Message-ID: <alpine.BSF.2.00.1001231814210.2160@ibyngvyr>
In-Reply-To: <5da0588e1001231541l246769eao410c5ea6ccca0de4@mail.gmail.com>
References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com>
 <ed91d4a81001230011t7aef2da8h3be13d2494c06550@mail.gmail.com>
 <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com>
 <alpine.BSF.2.00.1001231519110.91898@ibyngvyr>
 <5da0588e1001231415t403f29ceq6e8dcd16edb4a28@mail.gmail.com>
 <alpine.BSF.2.00.1001231733570.2160@ibyngvyr>
 <5da0588e1001231541l246769eao410c5ea6ccca0de4@mail.gmail.com>
On Sat, 23 Jan 2010, Rich wrote:

> I have no files named 0x0.
>
> I have a number of files which, on attempting to do anything to them
> (stat, mv, rm), EIO occurs, the checksum error number on three of the
> disks in that pool ticks up, and /var/log/messages reports what I
> reported in my initial post. (i discovered this due to FreeBSD's daily
> check-for-setuid-bits-in-strange-places find command reporting EIO on
> some files.)
>
> My original post in this thread is about how to resolve this.

Do these bad files show up in "zpool status -v" after a scrub? This really
sounds much more like an issue of corrupt metadata. ZFS keeps multiple
copies of filesystem metadata even on non-redundant pools (ditto blocks).
You said there was bad RAM in this machine at one point, which may mean
that *all* copies of the metadata were corrupted.

In my encounter with a bad stick of RAM, the data was correct but the
stored checksums were wrong. I was able to "recover" the data by simply
changing zfs_read() to not report EIO when it encounters an ECKSUM error
from the ZFS layer -- essentially ignoring the checksum error. I have no
idea what this might do if the metadata itself is corrupt, so that could
be risky.

Another option is the zdb solution mentioned earlier.

>
> On Sat, Jan 23, 2010 at 6:34 PM, Wes Morgan <morganw@chemikals.org> wrote:
> > On Sat, 23 Jan 2010, Rich wrote:
> >
> >> On Sat, Jan 23, 2010 at 4:21 PM, Wes Morgan <morganw@chemikals.org> wrote:
> >> > On Sat, 23 Jan 2010, Rich wrote:
> >> >
> >> >> I already diagnosed the bad hardware - one of the two sticks of RAM
> >> >> had gone bad, and fails memtest in the other machine.
> >> >>
> >> >>   pool: rigatoni
> >> >>  state: ONLINE
> >> >> status: One or more devices has experienced an error resulting in data
> >> >>         corruption.  Applications may be affected.
> >> >> action: Restore the file in question if possible.  Otherwise restore the
> >> >>         entire pool from backup.
> >> >>    see: http://www.sun.com/msg/ZFS-8000-8A
> >> >>  scrub: scrub completed after 15h28m with 1 errors on Thu Jan 21 18:09:25 2010
> >> >> config:
> >> >>
> >> >>         NAME        STATE     READ WRITE CKSUM
> >> >>         rigatoni    ONLINE       0     0     1
> >> >>           da4       ONLINE       0     0     2
> >> >>           da5       ONLINE       0     0     2
> >> >>           da7       ONLINE       0     0     0
> >> >>           da6       ONLINE       0     0     0
> >> >>           da2       ONLINE       0     0     2
> >> >>
> >> >> errors: Permanent errors have been detected in the following files:
> >> >>
> >> >>         rigatoni/mirrors:<0x0>
> >> >
> >> > Can you post your entire pool filesystem structure? That message above
> >> > looks like an unreferenced block or corrupted metadata rather than an
> >> > actual file. Also, if it's part of a snapshot, you simply have to destroy
> >> > the snapshot.
> >> >
> >> > I had a pool become corrupted due to bad memory, and all of the files were
> >> > still able to be manipulated. The only time EIO popped up was on the
> >> > specific block that had a checksum error.
> >>
> >> # zfs list -r -t all rigatoni
> >> NAME                  USED  AVAIL  REFER  MOUNTPOINT
> >> rigatoni             5.73T   984G    19K  /rigatoni
> >> rigatoni/logs_bitch   269M   984G   269M  /rigatoni/logs_bitch
> >> rigatoni/mirrors     5.73T   984G  5.73T  /mirrors
> >>
> >> No snapshots here. :/
> >>
> >> EIO only pops up on the files I mentioned above - everything else in
> >> those directories, including renaming that directory, is fine.
> >
> > I must have missed it, what files is it showing besides the <0x0> address?
> > Or do you have a file named "<0x0>"?
> >
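Since the thread hinges on which devices show the CKSUM counter ticking up, here is a minimal Python sketch for pulling the non-zero CKSUM rows out of the config table of "zpool status" output. It is only an illustration: the helper name cksum_errors and the regular expression are hypothetical and not part of any ZFS tool, and the column spacing of the sample is an assumption, since the quoted output in this thread lost its original whitespace. The counts themselves are copied from the status output quoted above.

```python
import re

# Sample config table; the counts are copied from the "zpool status" output
# quoted in this thread. The exact column spacing is an assumption.
SAMPLE = """\
        NAME        STATE     READ WRITE CKSUM
        rigatoni    ONLINE       0     0     1
          da4       ONLINE       0     0     2
          da5       ONLINE       0     0     2
          da7       ONLINE       0     0     0
          da6       ONLINE       0     0     0
          da2       ONLINE       0     0     2
"""

# Hypothetical pattern: name, state, then three plain integer counters.
# Rows whose counters are abbreviated (for example "1.2K") will not match.
ROW = re.compile(r"^\s*(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def cksum_errors(status_text):
    """Return {name: cksum_count} for table rows with a non-zero CKSUM."""
    result = {}
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header row, blank line, or anything outside the table
        name, _state, _read, _write, cksum = m.groups()
        if int(cksum) > 0:
            result[name] = int(cksum)
    return result


if __name__ == "__main__":
    for name, count in cksum_errors(SAMPLE).items():
        print(name, count)
```

Running the same function on output captured before and after touching one of the problem files would show whether the counters on da4, da5 and da2 move together, which is what Rich describes.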