From owner-freebsd-stable@FreeBSD.ORG Tue Feb 14 18:54:25 2012 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DDE4B1065678; Tue, 14 Feb 2012 18:54:25 +0000 (UTC) (envelope-from arno@heho.snv.jussieu.fr) Received: from shiva.jussieu.fr (shiva.jussieu.fr [134.157.0.129]) by mx1.freebsd.org (Postfix) with ESMTP id 463378FC1F; Tue, 14 Feb 2012 18:54:24 +0000 (UTC) Received: from heho.snv.jussieu.fr (heho.snv.jussieu.fr [134.157.184.22]) by shiva.jussieu.fr (8.14.4/jtpda-5.4) with ESMTP id q1EIsKiN062109 ; Tue, 14 Feb 2012 19:54:20 +0100 (CET) X-Ids: 164 Received: from heho.snv.jussieu.fr (localhost [127.0.0.1]) by heho.snv.jussieu.fr (8.14.3/8.14.3) with ESMTP id q1EIrhgw084894; Tue, 14 Feb 2012 19:53:44 +0100 (CET) (envelope-from arno@heho.snv.jussieu.fr) Received: (from arno@localhost) by heho.snv.jussieu.fr (8.14.3/8.14.3/Submit) id q1EIrhIq084891; Tue, 14 Feb 2012 19:53:43 +0100 (CET) (envelope-from arno) To: Martin Simmons From: "Arno J. Klaassen" References: <201202141820.q1EIK1MP032526@higson.cam.lispworks.com> Date: Tue, 14 Feb 2012 19:53:43 +0100 In-Reply-To: <201202141820.q1EIK1MP032526@higson.cam.lispworks.com> (Martin Simmons's message of "Tue\, 14 Feb 2012 18\:20\:01 GMT") Message-ID: User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Miltered: at jchkmail.jussieu.fr with ID 4F3AADDC.000 by Joe's j-chkmail (http : // j-chkmail dot ensmp dot fr)! X-j-chkmail-Enveloppe: 4F3AADDC.000/134.157.184.22/heho.snv.jussieu.fr/heho.snv.jussieu.fr/ Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org Subject: Re: 9-stable : geli + one-disk ZFS fails X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Feb 2012 18:54:26 -0000 Hi, Martin Simmons writes: > Some random ideas: > > 1) Can you dd the whole of ada0s3.eli without errors? I just started it; will take some hours > 2) If you scrub a few more times, does it find the same number of errors each > time and are they always in that XNAT.tar file? I deleted the XNAT.tar; I also copied files by 'ssh tar -c | tar -xp' to rule out NFS, same type of errors; Looks like multiple scrubs give the same files but not the same number of chksum errors (to be confirmed) > 3) Can you try zfs without geli? sure, I will split the place in one partition with geli and one without > 4) Is the slice/partition layout definitely correct? I (still ???) use sysinstall to do the dirty computations in my place. This is what gpart says (looks OK (to me ...) : [root@cc ~]# gpart list ada0 Geom name: ada0 modified: false state: OK fwheads: 16 fwsectors: 63 last: 976773167 first: 63 entries: 4 scheme: MBR Providers: 1. Name: ada0s1 Mediasize: 40802001408 (38G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 32256 Mode: r0w0e0 rawtype: 7 length: 40802001408 offset: 32256 type: ntfs index: 1 end: 79691471 start: 63 2. Name: ada0s2 Mediasize: 34359607296 (32G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 2147328000 Mode: r3w3e5 attrib: active rawtype: 165 length: 34359607296 offset: 40802033664 type: freebsd index: 2 end: 146800079 start: 79691472 3. Name: ada0s3 Mediasize: 424946221056 (395G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 2147196928 Mode: r1w1e1 rawtype: 165 length: 424946221056 offset: 75161640960 type: freebsd index: 3 end: 976773167 start: 146800080 Consumers: 1. Name: ada0 Mediasize: 500107862016 (465G) Sectorsize: 512 Mode: r4w4e10 Merci, Arno > __Martin > > >>>>>> On Mon, 13 Feb 2012 23:39:06 +0100, Arno J Klaassen said: >> >> hello, >> >> to eventually gain interest in this issue : >> >> I updated to today's -stable, tested with vfs.zfs.debug=1 >> and vfs.zfs.prefetch_disable=0, no difference. >> >> I also tested to read the raw partition : >> >> [root@cc /usr/ports]# dd if=/dev/ada0s3 of=/dev/null bs=4096 conv=noerror >> 103746636+0 records in >> 103746636+0 records out >> 424946221056 bytes transferred in 13226.346738 secs (32128768 bytes/sec) >> [root@cc /usr/ports]# >> >> Disk is brand new, looks ok, either my setup is not good or there is >> a bug somewhere; I can play around with this box for some more time, >> please feel free to provide me with some hints what to do to be useful >> for you. >> >> Best, >> >> Arno >> >> >> "Arno J. Klaassen" writes: >> >> > Hello, >> > >> > >> > I finally decided to 'play' a bit with ZFS on a notebook, some years >> > old, but I installed a brand new disk and memtest passes OK. >> > >> > I installed base+ports on partition 2, using 'classical' UFS. >> > >> > I crypted partition 3 and created a single zpool on it containing >> > 4 Z-"file-systems" : >> > >> > [root@cc ~]# zfs list >> > NAME USED AVAIL REFER MOUNTPOINT >> > zfiles 10.7G 377G 152K /zfiles >> > zfiles/home 10.6G 377G 119M /zfiles/home >> > zfiles/home/arno 10.5G 377G 2.35G /zfiles/home/arno >> > zfiles/home/arno/.priv 192K 377G 192K /zfiles/home/arno/.priv >> > zfiles/home/arno/.scito 8.18G 377G 8.18G /zfiles/home/arno/.scito >> > >> > >> > I export the ZFS's via nfs and rsynced on the other machine some backup >> > of my current note-book (geli + UFS, (almost) same 9-stable version, no >> > problem) to the ZFS's. >> > >> > >> > Quite fast, I see on the notebook : >> > >> > >> > [root@cc /usr/temp]# zpool status -v >> > pool: zfiles >> > state: ONLINE >> > status: One or more devices has experienced an error resulting in data >> > corruption. Applications may be affected. >> > action: Restore the file in question if possible. Otherwise restore the >> > entire pool from backup. >> > see: http://www.sun.com/msg/ZFS-8000-8A >> > scan: scrub repaired 0 in 0h1m with 11 errors on Sat Feb 11 14:55:34 >> > 2012 >> > config: >> > >> > NAME STATE READ WRITE CKSUM >> > zfiles ONLINE 0 0 11 >> > ada0s3.eli ONLINE 0 0 23 >> > >> > errors: Permanent errors have been detected in the following files: >> > >> > /zfiles/home/arno/.scito/contrib/XNAT.tar >> > [root@cc /usr/temp]# md5 /zfiles/home/arno/.scito/contrib/XNAT.tar >> > md5: /zfiles/home/arno/.scito/contrib/XNAT.tar: Input/output error >> > [root@cc /usr/temp]# >> > >> > >> > As said, memtest is OK, nothing is logged to the console, UFS on the >> > same disk works OK (I did some tests copying and comparing random data) >> > and smartctl as well seems to trust the disk : >> > >> > SMART Self-test log structure revision number 1 >> > Num Test_Description Status Remaining LifeTime(hours) >> > # 1 Extended offline Completed without error 00% 388 >> > # 2 Short offline Completed without error 00% 387 >> > >> > >> > Am I doing something wrong and/or let me know what I could provide as >> > extra info to try to solve this (dmesg.boot at the end of this mail). >> > >> > Thanx a lot in advance, >> > >> > best, Arno >> > >> > >> > >> > [ dmesg.boot deleted ]