From owner-freebsd-fs@FreeBSD.ORG Sun Jan 17 06:17:20 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CF2FF106566B for ; Sun, 17 Jan 2010 06:17:20 +0000 (UTC) (envelope-from v.velox@vvelox.net) Received: from vulpes.vvelox.net (vulpes.vvelox.net [99.69.115.42]) by mx1.freebsd.org (Postfix) with ESMTP id 949368FC0C for ; Sun, 17 Jan 2010 06:17:20 +0000 (UTC) Received: from vixen42.vulpes (unknown [192.168.14.1]) (Authenticated sender: v.velox) by vulpes.vvelox.net (Postfix) with ESMTP id 6A126B842 for ; Sat, 16 Jan 2010 23:56:50 -0600 (CST) Date: Sat, 16 Jan 2010 23:58:57 -0600 From: "Zane C.B." To: freebsd-fs@FreeBSD.org Message-ID: <20100116235857.6f126ad0@vixen42.vulpes> X-Mailer: Claws Mail 3.7.3 (GTK+ 2.18.5; i386-portbld-freebsd7.2) Mime-Version: 1.0 Content-Type: multipart/signed; micalg=PGP-SHA1; boundary="Sig_/lU3kkps+inNs3fR1c.HoMbo"; protocol="application/pgp-signature" Cc: Subject: odd NFS issues X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Jan 2010 06:17:20 -0000 --Sig_/lU3kkps+inNs3fR1c.HoMbo Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Regardless of if I am mounting a NFS export remotely or locally, I get the error below if I do a ls or any thing. Any suggestions? Below is a example of what happens if I try it locally. 
# mount_nfs -o mntudp,ro 192.168.15.2:/arc /mnt/
# ls /mnt
ls: /mnt: Input/output error
# cat /etc/exports
/arc -ro -maproot=nobody -network 192.168.15 -mask 255.255.255.0
/arc -maproot=nobody 10.69.0.3 10.69.0.5
/home -maproot=nobody 192.168.15.10
/home -maproot=nobody 10.69.0.5
# grep nfs /etc/rc.conf
nfs_server_enable="YES"
nfs_server_flags="-u -n 10"
nfs_client_enable="YES"
nfsd_enable="YES"
# grep rpc /etc/rc.conf
rpcbind_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"
# grep mountd /etc/rc.conf
mountd_flags="-r"
mountd_enable="YES"
# uname -a
FreeBSD vixen42.vulpes 7.2-STABLE FreeBSD 7.2-STABLE #4: Sun Dec 20 10:19:46 CST 2009 root@vixen42.vulpes:/usr/obj/usr/src/sys/GENERIC i386

--Sig_/lU3kkps+inNs3fR1c.HoMbo Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAktSpyYACgkQqrJJy0yxYQALfQCfXxWqIvZCZzTJExbz91GKpffX WbUAnRUgicWZATQwrxD0C1tOqcGqRhq9 =1wG3 -----END PGP SIGNATURE----- --Sig_/lU3kkps+inNs3fR1c.HoMbo--

From owner-freebsd-fs@FreeBSD.ORG Sun Jan 17 13:26:50 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7CD791065670 for ; Sun, 17 Jan 2010 13:26:50 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from smtp-out2.tiscali.nl (smtp-out2.tiscali.nl [195.241.79.177]) by mx1.freebsd.org (Postfix) with ESMTP id 0DE178FC08 for ; Sun, 17 Jan 2010 13:26:49 +0000 (UTC) Received: from [212.123.145.58] (helo=sjakie.klop.ws) by smtp-out2.tiscali.nl with esmtp (Exim) (envelope-from ) id 1NWV9V-0001mK-0x; Sun, 17 Jan 2010 14:26:49 +0100 Received: from 212-123-145-58.ip.telfort.nl (localhost [127.0.0.1]) by sjakie.klop.ws (Postfix) with ESMTP id 498EFF442; Sun, 17 Jan 2010 14:26:43 +0100 (CET) Content-Type: text/plain; charset=us-ascii; format=flowed;
delsp=yes To: "Dan Naumov" , "Rick Macklem" , freebsd-questions@freebsd.org, freebsd-fs@freebsd.org References: Date: Sun, 17 Jan 2010 14:26:42 +0100 MIME-Version: 1.0 From: "Ronald Klop" Message-ID: In-Reply-To: User-Agent: Opera Mail/10.10 (FreeBSD) Content-Transfer-Encoding: quoted-printable Cc: Subject: Re: (SOLVED) Re: installing FreeBSD 8 on SSDs and UFS2 - partition alignment, block sizes, what does one need to know? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Jan 2010 13:26:50 -0000

On Fri, 15 Jan 2010 17:57:03 +0100, Dan Naumov wrote:

> On Fri, Jan 15, 2010 at 6:38 PM, Rick Macklem wrote:
>>
>>
>> On Tue, 12 Jan 2010, Dan Naumov wrote:
>>
>>> For my upcoming storage system, the OS install is going to be on an
>>> 80 GB Intel SSD disk and for various reasons, I am now pretty convinced
>>> to stick with UFS2 for the root partition (the actual data pool will
>>> be ZFS using traditional SATA disks). I am probably going to use GPT
>>> partitioning and have the SSD host the swap, boot, root and a few
>>> other partitions. What do I need to know in regards to partition
>>> alignment and filesystem block sizes to get the best performance out
>>> of the Intel SSDs?
>>>
>> I can't help with your question, but I thought I'd mention that there
>> was a recent post (on freebsd-current, I think?) w.r.t. using an SSD
>> for the ZFS log file. It suggested that that helped with ZFS perf., so
>> you might want to look for the message.
>>
>> rick
>
> I have managed to figure out the essential things to know by now, I
> just wish there was a single, easy to grasp webpage or HOWTO
> describing the whys and hows so I wouldn't have had to spend the
> entire day googling things to get a proper grasp on the issue :)

Maybe you can copy-paste your e-mail in a wiki somewhere.
And your wish has come true for other people.

Ronald.

> To (perhaps a bit too much) simplify things, if you are using an SSD
> with FreeBSD, you:
>
> 1) Should use GPT
>
> 2) Should create the freebsd-boot partition as normal (to ensure
> compatibility with some funky BIOSes)
>
> 3) All additional partitions should be aligned, meaning that their
> boundaries should be divisible by 1024 KB (that's 2048 "logical blocks"
> in gpart). I.e., having created your freebsd-boot, your next partition
> should start at block 2048 and the partition size should be divisible
> by 2048 blocks. This applies to ALL further partitions added to the
> disk, so you WILL end up having some empty space between them, but a
> few MBs worth of space will be lost at most.
>
> P.S: My oversimplification was in that MOST SSDs will be just fine
> with a 512 KB / 1024 block alignment. However, _ALL_ SSDs will be fine
> with 1024 KB / 2048 block alignment.
>
>
> - Sincerely,
> Dan Naumov
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG Sun Jan 17 18:29:08 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 401481065692 for ; Sun, 17 Jan 2010 18:29:08 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id EB7EB8FC08 for ; Sun, 17 Jan 2010 18:29:07 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEAInlUkuDaFvH/2dsb2JhbADUS4QyBA X-IronPort-AV: E=Sophos;i="4.49,292,1262581200"; d="scan'208";a="61761592" Received: from danube.cs.uoguelph.ca ([131.104.91.199]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 17 Jan 2010 13:29:06 -0500
Received: from localhost (localhost.localdomain [127.0.0.1]) by danube.cs.uoguelph.ca (Postfix) with ESMTP id F21101084244; Sun, 17 Jan 2010 13:29:06 -0500 (EST) X-Virus-Scanned: amavisd-new at danube.cs.uoguelph.ca Received: from danube.cs.uoguelph.ca ([127.0.0.1]) by localhost (danube.cs.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id fgQ6160XA0lM; Sun, 17 Jan 2010 13:29:06 -0500 (EST) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by danube.cs.uoguelph.ca (Postfix) with ESMTP id 1A33710841F3; Sun, 17 Jan 2010 13:29:06 -0500 (EST) Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id o0HIdSE06631; Sun, 17 Jan 2010 13:39:28 -0500 (EST) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Sun, 17 Jan 2010 13:39:28 -0500 (EST) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: "Zane C.B." In-Reply-To: <20100116235857.6f126ad0@vixen42.vulpes> Message-ID: References: <20100116235857.6f126ad0@vixen42.vulpes> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org Subject: Re: odd NFS issues X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Jan 2010 18:29:08 -0000 On Sat, 16 Jan 2010, Zane C.B. wrote: > Regardless of if I am mounting a NFS export remotely or locally, I > get the error below if I do a ls or any thing. > > Any suggestions? > 1 - Did anything get logged in /var/log/messages? 2 - What kind of file system is /arc? (Most tested is done with ufs/ffs and some use zfs. Some file system types don't know how to do an NFS export and some probably haven't been tested...) Beyond that, doing a packet capture and looking at it in Wireshark might tell you what's going on. 
rick From owner-freebsd-fs@FreeBSD.ORG Sun Jan 17 18:32:49 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3023410656A4 for ; Sun, 17 Jan 2010 18:32:49 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id DBDC68FC0C for ; Sun, 17 Jan 2010 18:32:48 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEAKzmUkuDaFvH/2dsb2JhbADUQoQyBA X-IronPort-AV: E=Sophos;i="4.49,292,1262581200"; d="scan'208";a="61761848" Received: from danube.cs.uoguelph.ca ([131.104.91.199]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 17 Jan 2010 13:32:47 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by danube.cs.uoguelph.ca (Postfix) with ESMTP id 995221084268; Sun, 17 Jan 2010 13:32:47 -0500 (EST) X-Virus-Scanned: amavisd-new at danube.cs.uoguelph.ca Received: from danube.cs.uoguelph.ca ([127.0.0.1]) by localhost (danube.cs.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id TkETkrugJw9u; Sun, 17 Jan 2010 13:32:46 -0500 (EST) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by danube.cs.uoguelph.ca (Postfix) with ESMTP id D7939108425F; Sun, 17 Jan 2010 13:32:46 -0500 (EST) Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id o0HIh9c06694; Sun, 17 Jan 2010 13:43:09 -0500 (EST) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Sun, 17 Jan 2010 13:43:09 -0500 (EST) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: "Zane C.B." 
In-Reply-To: Message-ID: References: <20100116235857.6f126ad0@vixen42.vulpes> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org Subject: Re: odd NFS issues X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Jan 2010 18:32:49 -0000 On Sun, 17 Jan 2010, Rick Macklem wrote: > (Most tested is done with ufs/ffs and some use zfs. Some file Oops, I meant "testing"... From owner-freebsd-fs@FreeBSD.ORG Sun Jan 17 20:14:11 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E22AD106566C; Sun, 17 Jan 2010 20:14:11 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id BB1778FC14; Sun, 17 Jan 2010 20:14:11 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id o0HKEBeL037141; Sun, 17 Jan 2010 20:14:11 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id o0HKEB5h037137; Sun, 17 Jan 2010 20:14:11 GMT (envelope-from linimon) Date: Sun, 17 Jan 2010 20:14:11 GMT Message-Id: <201001172014.o0HKEB5h037137@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/142914: [zfs] ZFS performance degradation over time X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Jan 2010 20:14:12 -0000 Old Synopsis: ZFS performance degratation over time New Synopsis: [zfs] ZFS 
performance degradation over time Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Sun Jan 17 20:12:46 UTC 2010 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=142914 From owner-freebsd-fs@FreeBSD.ORG Sun Jan 17 20:40:06 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 27396106568F for ; Sun, 17 Jan 2010 20:40:06 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id F1F9D8FC15 for ; Sun, 17 Jan 2010 20:40:05 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id o0HKe5kP055206 for ; Sun, 17 Jan 2010 20:40:05 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id o0HKe5wd055205; Sun, 17 Jan 2010 20:40:05 GMT (envelope-from gnats) Date: Sun, 17 Jan 2010 20:40:05 GMT Message-Id: <201001172040.o0HKe5wd055205@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: "Pedro F. Giffuni" Cc: Subject: Re: kern/142558: Minor updates to fs/msdosfs headers (from NetBSD) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: "Pedro F. Giffuni" List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Jan 2010 20:40:06 -0000 The following reply was made to PR kern/142558; it has been noted by GNATS. From: "Pedro F. 
Giffuni" To: FreeBSD-gnats-submit@FreeBSD.org, freebsd-bugs@FreeBSD.org Cc: Subject: Re: kern/142558: Minor updates to fs/msdosfs headers (from NetBSD) Date: Sun, 17 Jan 2010 12:39:41 -0800 (PST) --0-2114050911-1263760781=:30275 Content-Type: text/plain; charset=us-ascii Updated patch: - In direntry.h remove deExtension, this was always part of the deName and some of the code still abuses it. This basically undoes CVS Rev. 1.46 but fixes the issues more practically. - As a consequence to the above winChksum is now exactly as in NetBSD. --0-2114050911-1263760781=:30275 Content-Type: application/octet-stream; name=patch-msdosfs Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename=patch-msdosfs ZGlmZiAtcnUgbXNkb3Nmcy5vcmlnL2Jvb3RzZWN0LmggbXNkb3Nmcy9ib290 c2VjdC5oCi0tLSBtc2Rvc2ZzLm9yaWcvYm9vdHNlY3QuaAkyMDEwLTAxLTA5 IDE5OjI5OjQ1LjAwMDAwMDAwMCArMDAwMAorKysgbXNkb3Nmcy9ib290c2Vj dC5oCTIwMTAtMDEtMTcgMTQ6MTY6MjAuMDAwMDAwMDAwICswMDAwCkBAIC0x Niw2ICsxNiw4IEBACiAgKgogICogT2N0b2JlciAxOTkyCiAgKi8KKyNpZm5k ZWYgX0ZTX01TRE9TRlNfQk9PVFNFQ1RfSF8KKyNkZWZpbmUJX0ZTX01TRE9T RlNfQk9PVFNFQ1RfSF8KIAogLyoKICAqIEZvcm1hdCBvZiBhIGJvb3Qgc2Vj dG9yLiAgVGhpcyBpcyB0aGUgZmlyc3Qgc2VjdG9yIG9uIGEgRE9TIGZsb3Bw eSBkaXNrCkBAIC05MSwzICs5Myw1IEBACiAjZGVmaW5lCWJzSGlkZGVuU2Vj cwlic0JQQi5icGJIaWRkZW5TZWNzCiAjZGVmaW5lCWJzSHVnZVNlY3RvcnMJ YnNCUEIuYnBiSHVnZVNlY3RvcnMKICNlbmRpZgorCisjZW5kaWYgLyogIV9G U19NU0RPU0ZTX0JPT1RTRUNUX0hfICovCmRpZmYgLXJ1IG1zZG9zZnMub3Jp Zy9icGIuaCBtc2Rvc2ZzL2JwYi5oCi0tLSBtc2Rvc2ZzLm9yaWcvYnBiLmgJ MjAxMC0wMS0wOSAxOToyOTo0NS4wMDAwMDAwMDAgKzAwMDAKKysrIG1zZG9z ZnMvYnBiLmgJMjAxMC0wMS0xNyAxNDoxNjoyNy4wMDAwMDAwMDAgKzAwMDAK QEAgLTE3LDYgKzE3LDkgQEAKICAqIE9jdG9iZXIgMTk5MgogICovCiAKKyNp Zm5kZWYgX0ZTX01TRE9TRlNfQlBCX0hfCisjZGVmaW5lCV9GU19NU0RPU0ZT X0JQQl9IXworCiAvKgogICogQklPUyBQYXJhbWV0ZXIgQmxvY2sgKEJQQikg Zm9yIERPUyAzLjMKICAqLwpAQCAtNzgsNyArODEsNyBAQAogCXVfaW50MzJf dAlicGJSb290Q2x1c3Q7CS8qIHN0YXJ0IGNsdXN0ZXIgZm9yIHJvb3QgZGly 
ZWN0b3J5ICovCiAJdV9pbnQxNl90CWJwYkZTSW5mbzsJLyogZmlsZXN5c3Rl bSBpbmZvIHN0cnVjdHVyZSBzZWN0b3IgKi8KIAl1X2ludDE2X3QJYnBiQmFj a3VwOwkvKiBiYWNrdXAgYm9vdCBzZWN0b3IgKi8KLQkvKiBUaGVyZSBpcyBh IDEyIGJ5dGUgZmlsbGVyIGhlcmUsIGJ1dCB3ZSBpZ25vcmUgaXQgKi8KKwl1 X2ludDhfdAlicGJSZXNlcnZlZFsxMl07IC8qIHJlc2VydmVkIGZvciBmdXR1 cmUgZXhwYW5zaW9uICovCiB9OwogCiAvKgpAQCAtMTUzLDcgKzE1Niw3IEBA CiAJdV9pbnQ4X3QgYnBiUm9vdENsdXN0WzRdOwkvKiBzdGFydCBjbHVzdGVy IGZvciByb290IGRpcmVjdG9yeSAqLwogCXVfaW50OF90IGJwYkZTSW5mb1sy XTsJCS8qIGZpbGVzeXN0ZW0gaW5mbyBzdHJ1Y3R1cmUgc2VjdG9yICovCiAJ dV9pbnQ4X3QgYnBiQmFja3VwWzJdOwkJLyogYmFja3VwIGJvb3Qgc2VjdG9y ICovCi0JLyogVGhlcmUgaXMgYSAxMiBieXRlIGZpbGxlciBoZXJlLCBidXQg d2UgaWdub3JlIGl0ICovCisJdV9pbnQ4X3QgYnBiUmVzZXJ2ZWRbMTJdOwkv KiByZXNlcnZlZCBmb3IgZnV0dXJlIGV4cGFuc2lvbiAqLwogfTsKIAogLyoK QEAgLTE2OCwzICsxNzEsNCBAQAogCXVfaW50OF90IGZzaWZpbGwyWzEyXTsK IAl1X2ludDhfdCBmc2lzaWczWzRdOwogfTsKKyNlbmRpZiAvKiAhX0ZTX01T RE9TRlNfQlBCX0hfICovCmRpZmYgLXJ1IG1zZG9zZnMub3JpZy9kaXJlbnRy eS5oIG1zZG9zZnMvZGlyZW50cnkuaAotLS0gbXNkb3Nmcy5vcmlnL2RpcmVu dHJ5LmgJMjAxMC0wMS0wOSAxOToyOTo0NS4wMDAwMDAwMDAgKzAwMDAKKysr IG1zZG9zZnMvZGlyZW50cnkuaAkyMDEwLTAxLTE3IDE1OjAyOjQxLjAwMDAw MDAwMCArMDAwMApAQCAtNDcsMTYgKzQ3LDE3IEBACiAgKgogICogT2N0b2Jl ciAxOTkyCiAgKi8KKyNpZm5kZWYgX0ZTX01TRE9TRlNfRElSRU5UUllfSF8K KyNkZWZpbmUJX0ZTX01TRE9TRlNfRElSRU5UUllfSF8KIAogLyoKICAqIFN0 cnVjdHVyZSBvZiBhIGRvcyBkaXJlY3RvcnkgZW50cnkuCiAgKi8KIHN0cnVj dCBkaXJlbnRyeSB7Ci0JdV9pbnQ4X3QJZGVOYW1lWzhdOwkvKiBmaWxlbmFt ZSwgYmxhbmsgZmlsbGVkICovCisJdV9pbnQ4X3QJZGVOYW1lWzExXTsJLyog ZmlsZW5hbWUsIGJsYW5rIGZpbGxlZCAqLwogI2RlZmluZQlTTE9UX0VNUFRZ CTB4MDAJCS8qIHNsb3QgaGFzIG5ldmVyIGJlZW4gdXNlZCAqLwogI2RlZmlu ZQlTTE9UX0U1CQkweDA1CQkvKiB0aGUgcmVhbCB2YWx1ZSBpcyAweGU1ICov CiAjZGVmaW5lCVNMT1RfREVMRVRFRAkweGU1CQkvKiBmaWxlIGluIHRoaXMg c2xvdCBkZWxldGVkICovCi0JdV9pbnQ4X3QJZGVFeHRlbnNpb25bM107CS8q IGV4dGVuc2lvbiwgYmxhbmsgZmlsbGVkICovCiAJdV9pbnQ4X3QJZGVBdHRy aWJ1dGVzOwkvKiBmaWxlIGF0dHJpYnV0ZXMgKi8KICNkZWZpbmUJQVRUUl9O 
T1JNQUwJMHgwMAkJLyogbm9ybWFsIGZpbGUgKi8KICNkZWZpbmUJQVRUUl9S RUFET05MWQkweDAxCQkvKiBmaWxlIGlzIHJlYWRvbmx5ICovCkBAIC0xNTUs NyArMTU2LDggQEAKIAkgICAgaW50IGNoa3N1bSwgc3RydWN0IG1zZG9zZnNt b3VudCAqcG1wKTsKIGludAl3aW4ydW5peGZuKHN0cnVjdCBtYm5hbWJ1ZiAq bmJwLCBzdHJ1Y3Qgd2luZW50cnkgKndlcCwgaW50IGNoa3N1bSwKIAkgICAg c3RydWN0IG1zZG9zZnNtb3VudCAqcG1wKTsKLXVfaW50OF90IHdpbkNoa3N1 bShzdHJ1Y3QgZGlyZW50cnkgKmRlcCk7Cit1X2ludDhfdCB3aW5DaGtzdW0o dV9pbnQ4X3QgKm5hbWUpOwogaW50CXdpblNsb3RDbnQoY29uc3QgdV9jaGFy ICp1biwgc2l6ZV90IHVubGVuLCBzdHJ1Y3QgbXNkb3Nmc21vdW50ICpwbXAp Owogc2l6ZV90CXdpbkxlbkZpeHVwKGNvbnN0IHVfY2hhciAqdW4sIHNpemVf dCB1bmxlbik7CiAjZW5kaWYJLyogX0tFUk5FTCAqLworI2VuZGlmCS8qICFf RlNfTVNET1NGU19ESVJFTlRSWV9IXyAqLwpkaWZmIC1ydSBtc2Rvc2ZzLm9y aWcvbXNkb3Nmc19jb252LmMgbXNkb3Nmcy9tc2Rvc2ZzX2NvbnYuYwotLS0g bXNkb3Nmcy5vcmlnL21zZG9zZnNfY29udi5jCTIwMTAtMDEtMDkgMTk6Mjk6 NDUuMDAwMDAwMDAwICswMDAwCisrKyBtc2Rvc2ZzL21zZG9zZnNfY29udi5j CTIwMTAtMDEtMTcgMTU6MjU6MTAuMDAwMDAwMDAwICswMDAwCkBAIC03NDEs MjIgKzc0MSwxMyBAQAogICogQ29tcHV0ZSB0aGUgdW5yb2xsZWQgY2hlY2tz dW0gb2YgYSBET1MgZmlsZW5hbWUgZm9yIFdpbjk1IExGTiB1c2UuCiAgKi8K IHVfaW50OF90Ci13aW5DaGtzdW0oc3RydWN0IGRpcmVudHJ5ICpkZXApCit3 aW5DaGtzdW0odV9pbnQ4X3QgKm5hbWUpCiB7CisJaW50IGk7CiAJdV9pbnQ4 X3QgczsKIAotCXMgPSBkZXAtPmRlTmFtZVswXTsKLQlzID0gKChzIDw8IDcp IHwgKHMgPj4gMSkpICsgZGVwLT5kZU5hbWVbMV07Ci0JcyA9ICgocyA8PCA3 KSB8IChzID4+IDEpKSArIGRlcC0+ZGVOYW1lWzJdOwotCXMgPSAoKHMgPDwg NykgfCAocyA+PiAxKSkgKyBkZXAtPmRlTmFtZVszXTsKLQlzID0gKChzIDw8 IDcpIHwgKHMgPj4gMSkpICsgZGVwLT5kZU5hbWVbNF07Ci0JcyA9ICgocyA8 PCA3KSB8IChzID4+IDEpKSArIGRlcC0+ZGVOYW1lWzVdOwotCXMgPSAoKHMg PDwgNykgfCAocyA+PiAxKSkgKyBkZXAtPmRlTmFtZVs2XTsKLQlzID0gKChz IDw8IDcpIHwgKHMgPj4gMSkpICsgZGVwLT5kZU5hbWVbN107Ci0JcyA9ICgo cyA8PCA3KSB8IChzID4+IDEpKSArIGRlcC0+ZGVFeHRlbnNpb25bMF07Ci0J cyA9ICgocyA8PCA3KSB8IChzID4+IDEpKSArIGRlcC0+ZGVFeHRlbnNpb25b MV07Ci0JcyA9ICgocyA8PCA3KSB8IChzID4+IDEpKSArIGRlcC0+ZGVFeHRl bnNpb25bMl07Ci0KKwlmb3IgKHMgPSAwLCBpID0gMTE7IC0taSA+PSAwOyBz 
ICs9ICpuYW1lKyspCisJCXMgPSAocyA8PCA3KXwocyA+PiAxKTsKIAlyZXR1 cm4gKHMpOwogfQogCmRpZmYgLXJ1IG1zZG9zZnMub3JpZy9tc2Rvc2ZzX2xv b2t1cC5jIG1zZG9zZnMvbXNkb3Nmc19sb29rdXAuYwotLS0gbXNkb3Nmcy5v cmlnL21zZG9zZnNfbG9va3VwLmMJMjAxMC0wMS0wOSAxOToyOTo0NS4wMDAw MDAwMDAgKzAwMDAKKysrIG1zZG9zZnMvbXNkb3Nmc19sb29rdXAuYwkyMDEw LTAxLTE3IDE1OjA2OjAxLjAwMDAwMDAwMCArMDAwMApAQCAtMjc2LDcgKzI3 Niw3IEBACiAJCQkJLyoKIAkJCQkgKiBDaGVjayBmb3IgYSBjaGVja3N1bSBv ciBuYW1lIG1hdGNoCiAJCQkJICovCi0JCQkJY2hrc3VtX29rID0gKGNoa3N1 bSA9PSB3aW5DaGtzdW0oZGVwKSk7CisJCQkJY2hrc3VtX29rID0gKGNoa3N1 bSA9PSB3aW5DaGtzdW0oZGVwLT5kZU5hbWUpKTsKIAkJCQlpZiAoIWNoa3N1 bV9vawogCQkJCSAgICAmJiAoIW9sZGRvcyB8fCBiY21wKGRvc2ZpbGVuYW1l LCBkZXAtPmRlTmFtZSwgMTEpKSkgewogCQkJCQljaGtzdW0gPSAtMTsKQEAg LTYxNyw3ICs2MTcsNyBAQAogCSAqIE5vdyB3cml0ZSB0aGUgV2luOTUgbG9u ZyBuYW1lCiAJICovCiAJaWYgKGRkZXAtPmRlX2ZuZGNudCA+IDApIHsKLQkJ dV9pbnQ4X3QgY2hrc3VtID0gd2luQ2hrc3VtKG5kZXApOworCQl1X2ludDhf dCBjaGtzdW0gPSB3aW5DaGtzdW0obmRlcC0+ZGVOYW1lKTsKIAkJY29uc3Qg dV9jaGFyICp1biA9IChjb25zdCB1X2NoYXIgKiljbnAtPmNuX25hbWVwdHI7 CiAJCWludCB1bmxlbiA9IGNucC0+Y25fbmFtZWxlbjsKIAkJaW50IGNudCA9 IDE7CmRpZmYgLXJ1IG1zZG9zZnMub3JpZy9tc2Rvc2ZzX3Zub3BzLmMgbXNk b3Nmcy9tc2Rvc2ZzX3Zub3BzLmMKLS0tIG1zZG9zZnMub3JpZy9tc2Rvc2Zz X3Zub3BzLmMJMjAxMC0wMS0wOSAxOToyOTo0NS4wMDAwMDAwMDAgKzAwMDAK KysrIG1zZG9zZnMvbXNkb3Nmc192bm9wcy5jCTIwMTAtMDEtMTcgMTU6MDc6 MjAuMDAwMDAwMDAwICswMDAwCkBAIC0xMjg3LDcgKzEyODcsNyBAQAogCXN0 cnVjdCBkaXJlbnRyeSBkb3Q7CiAJc3RydWN0IGRpcmVudHJ5IGRvdGRvdDsK IH0gZG9zZGlydGVtcGxhdGUgPSB7Ci0JewkiLiAgICAgICAiLCAiICAgIiwJ CQkvKiB0aGUgLiBlbnRyeSAqLworCXsJIi4gICAgICAgICAgIiwJCQkJLyog dGhlIC4gZW50cnkgKi8KIAkJQVRUUl9ESVJFQ1RPUlksCQkJCS8qIGZpbGUg YXR0cmlidXRlICovCiAJCTAsCQkJCQkvKiByZXNlcnZlZCAqLwogCQkwLCB7 IDAsIDAgfSwgeyAwLCAwIH0sCQkJLyogY3JlYXRlIHRpbWUgJiBkYXRlICov CkBAIC0xMjk3LDcgKzEyOTcsNyBAQAogCQl7IDAsIDAgfSwJCQkJLyogc3Rh cnRjbHVzdGVyICovCiAJCXsgMCwgMCwgMCwgMCB9CQkJCS8qIGZpbGVzaXpl ICovCiAJfSwKLQl7CSIuLiAgICAgICIsICIgICAiLAkJCS8qIHRoZSAuLiBl 
bnRyeSAqLworCXsJIi4uICAgICAgICAgIiwJCQkJLyogdGhlIC4uIGVudHJ5 ICovCiAJCUFUVFJfRElSRUNUT1JZLAkJCQkvKiBmaWxlIGF0dHJpYnV0ZSAq LwogCQkwLAkJCQkJLyogcmVzZXJ2ZWQgKi8KIAkJMCwgeyAwLCAwIH0sIHsg MCwgMCB9LAkJCS8qIGNyZWF0ZSB0aW1lICYgZGF0ZSAqLwpAQCAtMTcyOSw3 ICsxNzI5LDcgQEAKIAkJCX0gZWxzZQogCQkJCWRpcmJ1Zi5kX2ZpbGVubyA9 ICh1aW50MzJfdClmaWxlbm87CiAKLQkJCWlmIChjaGtzdW0gIT0gd2luQ2hr c3VtKGRlbnRwKSkgeworCQkJaWYgKGNoa3N1bSAhPSB3aW5DaGtzdW0oZGVu dHAtPmRlTmFtZSkpIHsKIAkJCQlkaXJidWYuZF9uYW1sZW4gPSBkb3MydW5p eGZuKGRlbnRwLT5kZU5hbWUsCiAJCQkJICAgICh1X2NoYXIgKilkaXJidWYu ZF9uYW1lLAogCQkJCSAgICBkZW50cC0+ZGVMb3dlckNhc2UgfAo= --0-2114050911-1263760781=:30275-- From owner-freebsd-fs@FreeBSD.ORG Sun Jan 17 20:43:51 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 31A911065670 for ; Sun, 17 Jan 2010 20:43:51 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id DEF118FC12 for ; Sun, 17 Jan 2010 20:43:50 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEACUFU0uDaFvK/2dsb2JhbADTcYQyBA X-IronPort-AV: E=Sophos;i="4.49,292,1262581200"; d="scan'208";a="61770910" Received: from fraser.cs.uoguelph.ca ([131.104.91.202]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 17 Jan 2010 15:43:50 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by fraser.cs.uoguelph.ca (Postfix) with ESMTP id 17E7A109C2C6; Sun, 17 Jan 2010 15:43:50 -0500 (EST) X-Virus-Scanned: amavisd-new at fraser.cs.uoguelph.ca Received: from fraser.cs.uoguelph.ca ([127.0.0.1]) by localhost (fraser.cs.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id s5NfkQwLjAdU; Sun, 17 Jan 2010 15:43:49 -0500 (EST) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by fraser.cs.uoguelph.ca (Postfix) with ESMTP id 9DE97109C2C2; Sun, 
17 Jan 2010 15:43:49 -0500 (EST) Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id o0HKsCF01815; Sun, 17 Jan 2010 15:54:12 -0500 (EST) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Sun, 17 Jan 2010 15:54:12 -0500 (EST) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: "Zane C.B." In-Reply-To: <20100116235857.6f126ad0@vixen42.vulpes> Message-ID: References: <20100116235857.6f126ad0@vixen42.vulpes> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org Subject: Re: odd NFS issues X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Jan 2010 20:43:51 -0000 On Sat, 16 Jan 2010, Zane C.B. wrote: > Regardless of if I am mounting a NFS export remotely or locally, I > get the error below if I do a ls or any thing. > > Any suggestions? > > Below is a example of what happens if I try it locally. > > # mount_nfs -o mntudp,ro 192.168.15.2:/arc /mnt/ Oh, and it's just a stab in the dark, but I think that "ro" is handled by the generic mount command and that might be confusing it so that it doesn't recognize "mntudp" and is trying to use TCP. (which became the default somewhere along the way) You might try: mount -t nfs -o mntudp,ro ... instead? 
rick From owner-freebsd-fs@FreeBSD.ORG Sun Jan 17 22:36:31 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5BAAF106568F for ; Sun, 17 Jan 2010 22:36:31 +0000 (UTC) (envelope-from v.velox@vvelox.net) Received: from vulpes.vvelox.net (vulpes.vvelox.net [99.69.115.42]) by mx1.freebsd.org (Postfix) with ESMTP id 171C48FC1D for ; Sun, 17 Jan 2010 22:36:30 +0000 (UTC) Received: from vixen42.vulpes (unknown [192.168.14.1]) (Authenticated sender: v.velox) by vulpes.vvelox.net (Postfix) with ESMTP id B755DB849; Sun, 17 Jan 2010 16:35:54 -0600 (CST) Date: Sun, 17 Jan 2010 16:36:08 -0600 From: "Zane C.B." To: Rick Macklem Message-ID: <20100117163608.4867dacf@vixen42.vulpes> In-Reply-To: References: <20100116235857.6f126ad0@vixen42.vulpes> X-Mailer: Claws Mail 3.7.3 (GTK+ 2.18.5; i386-portbld-freebsd7.2) Mime-Version: 1.0 Content-Type: multipart/signed; micalg=PGP-SHA1; boundary="Sig_/Sd2bElY/0DdfZhKeUWGsQU2"; protocol="application/pgp-signature" Cc: freebsd-fs@FreeBSD.org Subject: Re: odd NFS issues X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Jan 2010 22:36:31 -0000 --Sig_/Sd2bElY/0DdfZhKeUWGsQU2 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable On Sun, 17 Jan 2010 13:39:28 -0500 (EST) Rick Macklem wrote: >=20 >=20 > On Sat, 16 Jan 2010, Zane C.B. wrote: >=20 > > Regardless of if I am mounting a NFS export remotely or locally, I > > get the error below if I do a ls or any thing. > > > > Any suggestions? > > > 1 - Did anything get logged in /var/log/messages? > 2 - What kind of file system is /arc? > (Most tested is done with ufs/ffs and some use zfs. Some file > system types don't know how to do an NFS export and some > probably haven't been tested...) 
>=20 > Beyond that, doing a packet capture and looking at it in Wireshark > might tell you what's going on. In regards to question one, I am not seeing an thing in there. In regards to question two, it is UFS2. Hmm... I am seeing odd issues with on a tcpdump as below. Rebooting does not help nor does doing a umount -f on the location I am mounting it to. 16:33:58.397823 IP (tos 0x0, ttl 64, id 53470, offset 0, flags [none], prot= o UDP (17), length 68) 127.0.0.1.1263873696 > 127.0.0.1.2049: 40 null 16:33:58.397857 IP (tos 0x0, ttl 64, id 53471, offset 0, flags [none], prot= o UDP (17), length 52) 127.0.0.1.2049 > 127.0.0.1.1263873696: reply ok 24 n= ull 16:33:58.398364 IP (tos 0x0, ttl 64, id 53476, offset 0, flags [none], prot= o UDP (17), length 128) 127.0.0.1.1447583170 > 127.0.0.1.2049: 100 fsinfo [= |nfs] 16:33:58.398392 IP (tos 0x0, ttl 64, id 53477, offset 0, flags [none], prot= o UDP (17), length 60) 127.0.0.1.2049 > 127.0.0.1.1447583170: reply ok 32 f= sinfo ERROR: Stale NFS file handle POST: 16:33:58.398422 IP (tos 0x0, ttl 64, id 53478, offset 0, flags [none], prot= o UDP (17), length 128) 127.0.0.1.1447583171 > 127.0.0.1.2049: 100 fsinfo [= |nfs] 16:33:58.398444 IP (tos 0x0, ttl 64, id 53479, offset 0, flags [none], prot= o UDP (17), length 60) 127.0.0.1.2049 > 127.0.0.1.1447583171: reply ok 32 f= sinfo ERROR: Stale NFS file handle POST: 16:33:58.398463 IP (tos 0x0, ttl 64, id 53480, offset 0, flags [none], prot= o UDP (17), length 128) 127.0.0.1.1447583172 > 127.0.0.1.2049: 100 fsstat [= |nfs] 16:33:58.398484 IP (tos 0x0, ttl 64, id 53481, offset 0, flags [none], prot= o UDP (17), length 60) 127.0.0.1.2049 > 127.0.0.1.1447583172: reply ok 32 f= sstat ERROR: Input/output error POST: 16:33:58.398507 IP (tos 0x0, ttl 64, id 53482, offset 0, flags [none], prot= o UDP (17), length 128) 127.0.0.1.1447583173 > 127.0.0.1.2049: 100 fsinfo [= |nfs] 16:33:58.398527 IP (tos 0x0, ttl 64, id 53483, offset 0, flags [none], prot= o UDP (17), length 60) 127.0.0.1.2049 
> 127.0.0.1.1447583173: reply ok 32 f= sinfo ERROR: Stale NFS file handle POST: --Sig_/Sd2bElY/0DdfZhKeUWGsQU2 Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAktTkOAACgkQqrJJy0yxYQALZgCfaWIyWVJfUvMKqRjhpFmXfaI9 nR8An3X2XPOUgFMON4GORlYQv5Gk7Q9b =9uhS -----END PGP SIGNATURE----- --Sig_/Sd2bElY/0DdfZhKeUWGsQU2-- From owner-freebsd-fs@FreeBSD.ORG Mon Jan 18 00:20:24 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1B70A106568B for ; Mon, 18 Jan 2010 00:20:24 +0000 (UTC) (envelope-from v.velox@vvelox.net) Received: from vulpes.vvelox.net (vulpes.vvelox.net [99.69.115.42]) by mx1.freebsd.org (Postfix) with ESMTP id D2E1C8FC18 for ; Mon, 18 Jan 2010 00:20:23 +0000 (UTC) Received: from vixen42.vulpes (unknown [192.168.14.1]) (Authenticated sender: v.velox) by vulpes.vvelox.net (Postfix) with ESMTP id 202E9B848; Sun, 17 Jan 2010 18:19:47 -0600 (CST) Date: Sun, 17 Jan 2010 18:20:06 -0600 From: Vulpes Velox Message-ID: <20100117182006.7adc12d8@vixen42.vulpes> In-Reply-To: <20100117163608.4867dacf@vixen42.vulpes> References: <20100116235857.6f126ad0@vixen42.vulpes> <20100117163608.4867dacf@vixen42.vulpes> X-Mailer: Claws Mail 3.7.3 (GTK+ 2.18.5; i386-portbld-freebsd7.2) Mime-Version: 1.0 Content-Type: multipart/signed; micalg=PGP-SHA1; boundary="Sig_/PSQMiKjl6pHBhDs5IZxmq3i"; protocol="application/pgp-signature" Cc: freebsd-fs@FreeBSD.org Subject: Re: odd NFS issues (fixed) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jan 2010 00:20:24 -0000 --Sig_/PSQMiKjl6pHBhDs5IZxmq3i Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable On 
Sun, 17 Jan 2010 16:36:08 -0600 "Zane C.B." wrote:
> On Sun, 17 Jan 2010 13:39:28 -0500 (EST)
> Rick Macklem wrote:
>
> >
> >
> > On Sat, 16 Jan 2010, Zane C.B. wrote:
> >
> > > Regardless of if I am mounting a NFS export remotely or
> > > locally, I get the error below if I do a ls or anything.
> > >
> > > Any suggestions?
> > >
> > 1 - Did anything get logged in /var/log/messages?
> > 2 - What kind of file system is /arc?
> >     (Most testing is done with ufs/ffs and some use zfs. Some file
> >     system types don't know how to do an NFS export and some
> >     probably haven't been tested...)
> >
> > Beyond that, doing a packet capture and looking at it in Wireshark
> > might tell you what's going on.
>
> In regards to question one, I am not seeing anything in there.
>
> In regards to question two, it is UFS2.
>
> Hmm... I am seeing odd issues in a tcpdump, as below. Rebooting
> does not help nor does doing a umount -f on the location I am
> mounting it to.
>
>
I've resolved the issue. I rebuilt the world and that solved it.
--Sig_/PSQMiKjl6pHBhDs5IZxmq3i Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAktTqToACgkQqrJJy0yxYQDp4gCeLyx9ZntF9sRjSfjiW1HNIoh+ N9gAn1UPgHH17QB5MH+Bl3siDwwTJLTO =8rKh -----END PGP SIGNATURE----- --Sig_/PSQMiKjl6pHBhDs5IZxmq3i-- From owner-freebsd-fs@FreeBSD.ORG Mon Jan 18 02:52:28 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9EF1A106566B; Mon, 18 Jan 2010 02:52:28 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 760EF8FC12; Mon, 18 Jan 2010 02:52:28 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id o0I2qSTQ082617; Mon, 18 Jan 2010 02:52:28 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id o0I2qSu1082613; Mon, 18 Jan 2010 02:52:28 GMT (envelope-from linimon) Date: Mon, 18 Jan 2010 02:52:28 GMT Message-Id: <201001180252.o0I2qSu1082613@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/142924: [ext2fs] [patch] Small cleanup for the inode struct in ext2fs (based on UFS) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jan 2010 02:52:28 -0000 Old Synopsis: Small cleanup for the inode struct in ext2fs (based on UFS) New Synopsis: [ext2fs] [patch] Small cleanup for the inode struct in ext2fs (based on UFS) Responsible-Changed-From-To: freebsd-bugs->freebsd-fs 
Responsible-Changed-By: linimon Responsible-Changed-When: Mon Jan 18 02:52:09 UTC 2010 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=142924 From owner-freebsd-fs@FreeBSD.ORG Mon Jan 18 11:06:56 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 98AD4106566B for ; Mon, 18 Jan 2010 11:06:56 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 6D7928FC29 for ; Mon, 18 Jan 2010 11:06:56 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id o0IB6uUn047525 for ; Mon, 18 Jan 2010 11:06:56 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id o0IB6tmB047523 for freebsd-fs@FreeBSD.org; Mon, 18 Jan 2010 11:06:55 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 18 Jan 2010 11:06:55 GMT Message-Id: <201001181106.o0IB6tmB047523@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jan 2010 11:06:56 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/142924  fs         [ext2fs] [patch] Small cleanup for the inode struct in
o kern/142914  fs         [zfs] ZFS performance degradation over time
o kern/142878  fs         [zfs] [vfs] lock order reversal
o kern/142872  fs         [zfs] ZFS ZVOL Lockmgr Deadlock
o kern/142597  fs         [ext2fs] ext2fs does not work on filesystems with real
o kern/142594  fs         [zfs] Modification time reset to 1 Jan 1970 after fsyn
o kern/142558  fs         [msdosfs] patch] Minor updates to fs/msdosfs headers (
o kern/142489  fs         [zfs] [lor] allproc/zfs LOR
o kern/142466  fs         Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142401  fs         [ntfs] [patch] Minor updates to NTFS from NetBSD
o kern/142306  fs         [zfs] [panic] ZFS drive (from OSX Leopard) causes two
o kern/142271  fs         [zfs] [patch] race condition on zpool create
o kern/142068  fs         [ufs] BSD labels are got deleted spontaneously
o kern/141950  fs         [unionfs] [lor] ufs/unionfs(/ufs)
o kern/141897  fs         [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141718  fs         [zfs] [panic] kernel panic when 'zfs rename' is used o
o kern/141685  fs         [zfs] zfs corruption on adaptec 5805 raid controller
o kern/141463  fs         [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305  fs         [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141257  fs         [gvinum] No puedo crear RAID5 por SW con gvinum
o kern/141235  fs         [disklabel] 8.0 no longer provides /dev entries for al
o kern/141177  fs         [zfs] fsync() on FIFO causes panic() on zfs
o kern/141091  fs         [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086  fs         [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010  fs         [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888  fs         [zfs] boot fail from zfs root while the pool resilveri
o kern/140682  fs         [netgraph] [panic] random panic in netgraph
o kern/140661  fs         [zfs] /boot/loader fails to work on a GPT/ZFS-only sys
o kern/140640  fs         [zfs] snapshot crash
o kern/140433  fs         [zfs] [panic] panic while replaying ZIL after crash
o kern/140134  fs         [msdosfs] write and fsck destroy filesystem integrity
o kern/140068  fs         [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725  fs         [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715  fs         [zfs] vfs.numvnodes leak on busy zfs
o bin/139651   fs         [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597  fs         [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564  fs         [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407  fs         [smbfs] [panic] smb mount causes system crash if remot
o kern/139363  fs         [nfs] diskless root nfs mount from non FreeBSD server
o kern/138790  fs         [zfs] ZFS ceases caching when mem demand is high
o kern/138524  fs         [msdosfs] disks and usb flashes/cards with Russian lab
o kern/138421  fs         [ufs] [patch] remove UFS label limitations
o kern/138202  fs         mount_msdosfs(1) see only 2Gb
o kern/138109  fs         [extfs] [patch] Minor cleanups to the sys/gnu/fs/ext2f
f kern/137037  fs         [zfs] [hang] zfs rollback on root causes FreeBSD to fr
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
o kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/135594  fs         [zfs] Single dataset unresponsive with Samba
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133980  fs         [panic] [ffs] panic: ffs_valloc: dup alloc
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614  fs         [panic] panic: ffs_truncate: read-only filesystem
o kern/133174  fs         [msdosfs] [patch] msdosfs must support utf-encoded int
f kern/133150  fs         [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132597  fs         [tmpfs] [panic] tmpfs-related panic while interrupting
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
o kern/131995  fs         [nfs] Failure to mount NFSv4 server
o kern/131441  fs         [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor" on the mount poin
o kern/130979  fs         [smbfs] [panic] boot/kernel/smbfs.ko
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130229  fs         [iconv] usermount fails on fs that need iconv
o kern/130210  fs         [nullfs] Error by check nullfs
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/129059  fs         [zfs] [patch] ZFS bootloader whitelistable via WITHOUT
f kern/128829  fs         smbd(8) causes periodic panic on 7-RELEASE
o kern/127659  fs         [tmpfs] tmpfs memory leak
o kern/127420  fs         [gjournal] [panic] Journal overflow on gmirrored gjour
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
s kern/125738  fs         [zfs] [request] SHA256 acceleration in ZFS
p kern/124621  fs         [ext3] [patch] Cannot mount ext2fs partition
f bin/124424   fs         [zfs] zfs(8): zfs list -r shows strange snapshots' siz
o kern/123939  fs         [msdosfs] corrupts new files
o kern/122380  fs         [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121898   fs         [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121779   fs         [ufs] snapinfo(8) (and related tools?) only work for t
o bin/121366   fs         [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
f kern/120991  fs         [panic] [fs] [snapshot] System crashes when manipulati
o kern/120483  fs         [ntfs] [patch] NTFS filesystem locking changes
o kern/120482  fs         [ntfs] [patch] Sync style changes between NetBSD and F
f kern/119735  fs         [zfs] geli + ZFS + samba starting on boot panics 7.0-B
o kern/118912  fs         [2tb] disk sizing/geometry problem with large array
o kern/118713  fs         [minidump] [patch] Display media size required for a k
o bin/118249   fs         mv(1): moving a directory changes its mtime
o kern/118107  fs         [ntfs] [panic] Kernel panic when accessing a file at N
o bin/117315   fs         [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314  fs         [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158  fs         [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980   fs         [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o kern/116913  fs         [ffs] [panic] ffs_blkfree: freeing free block
p kern/116608  fs         [msdosfs] [patch] msdosfs fails to check mount options
o kern/116583  fs         [ffs] [hang] System freezes for short time when using
o kern/116170  fs         [panic] Kernel panic when mounting /tmp
o kern/115645  fs         [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex
o bin/115361   fs         [zfs] mount(8) gets into a state where it won't set/un
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o kern/113852  fs         [smbfs] smbfs does not properly implement DFS referral
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843  fs         [msdosfs] Long Names of files are incorrectly created
o kern/111782  fs         [ufs] dump(8) fails horribly for large filesystems
s bin/111146   fs         [2tb] fsck(8) fails on 6T filesystem
o kern/109024  fs         [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not
o kern/109010  fs         [msdosfs] can't mv directory within fat32 file system
o bin/107829   fs         [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106030  fs         [ufs] [panic] panic in ufs from geom when a dead disk
o kern/104406  fs         [ufs] Processes get stuck in "ufs" state under persist
o kern/104133  fs         [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035  fs         [ntfs] Directories in NTFS mounted disc images appear
o kern/101324  fs         [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290   fs         [ntfs] mount_ntfs ignorant of cluster sizes
o kern/97377   fs         [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222   fs         [iso9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849   fs         [ufs] rename on UFS filesystem is not atomic
o kern/94769   fs         [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733   fs         [smbfs] smbfs may cause double unlock
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs         [ffs] [hang] Filling a filesystem while creating a sna
f kern/91568   fs         [ufs] [panic] writing to UFS/softupdates DVD media in
o kern/91134   fs         [smbfs] [patch] Preserve access and modification time
a kern/90815   fs         [smbfs] [patch] SMBFS with character conversions somet
o kern/88657   fs         [smbfs] windows client hang when browsing a samba shar
o kern/88266   fs         [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o kern/87859   fs         [smbfs] System reboot while umount smbfs.
o kern/86587   fs         [msdosfs] rm -r /PATH fails with lots of small files
o kern/85326   fs         [smbfs] [panic] saving a file via samba to an overquot
o kern/84589   fs         [2TB] 5.4-STABLE unresponsive during background fsck 2
o kern/80088   fs         [smbfs] Incorrect file time setting on NTFS mounted vi
o kern/73484   fs         [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019    fs         [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774   fs         [ntfs] NTFS cannot "see" files on a WinXP filesystem
o kern/68978   fs         [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920   fs         [nwfs] Mounted Netware filesystem behaves strange
o kern/65901   fs         [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503   fs         [smbfs] mount_smbfs does not work as non-root
o kern/55617   fs         [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685   fs         [hang] Unbounded inode allocation causes kernel to loc
o kern/51583   fs         [nullfs] [patch] allow to work with devices and socket
o kern/36566   fs         [smbfs] System reboot with dead smb mount and umount
o kern/18874   fs         [2TB] 32bit NFS servers export wrong negative values t

163 problems total.
From owner-freebsd-fs@FreeBSD.ORG Mon Jan 18 20:20:05 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DB4241065679 for ; Mon, 18 Jan 2010 20:20:05 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id C56F78FC08 for ; Mon, 18 Jan 2010 20:20:05 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id o0IKK5BM026104 for ; Mon, 18 Jan 2010 20:20:05 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id o0IKK5Eu026067; Mon, 18 Jan 2010 20:20:05 GMT (envelope-from gnats) Date: Mon, 18 Jan 2010 20:20:05 GMT Message-Id: <201001182020.o0IKK5Eu026067@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Ivan Voras Cc: Subject: Re: kern/142914: [zfs] ZFS performance degradation over time X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Ivan Voras List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jan 2010 20:20:05 -0000 The following reply was made to PR kern/142914; it has been noted by GNATS. From: Ivan Voras To: bug-followup@FreeBSD.org, miks.mikelsons@gmail.com Cc: Subject: Re: kern/142914: [zfs] ZFS performance degradation over time Date: Mon, 18 Jan 2010 10:52:04 +0100 This is a multi-part message in MIME format. --------------020105090205050006040605 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit RE: "used value differs": Have you checked with "du" the status "before" and "after" to rule out that something is actually allocating a big file? 
There is a similar problem reported here: http://permalink.gmane.org/gmane.os.freebsd.stable/66780 --------------020105090205050006040605 Content-Type: text/x-vcard; charset=utf-8; name="ivoras.vcf" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="ivoras.vcf" YmVnaW46dmNhcmQNCmZuOkl2YW4gVm9yYXMNCm46Vm9yYXM7SXZhbg0Kb3JnOlVuaXZlcnNp dHkgb2YgWmFncmViO0ZhY3VsdHkgb2YgZWxlY3RyaWNhbCBlbmdpbmVlcmluZyBhbmQgY29t cHV0aW5nDQphZHI6OztVbnNrYSAzO1phZ3JlYjs7MTAwMDA7Q3JvYXRpYQ0KZW1haWw7aW50 ZXJuZXQ6aXZvcmFzQGZlci5ocg0KdGl0bGU6SW50ZXJuZXQgc2VydmljZXMgYXJjaGl0ZWN0 DQp0ZWw7d29yazorMzg1IDEgNjEyOSA2NjANCngtbW96aWxsYS1odG1sOkZBTFNFDQp2ZXJz aW9uOjIuMQ0KZW5kOnZjYXJkDQoNCg== --------------020105090205050006040605-- From owner-freebsd-fs@FreeBSD.ORG Tue Jan 19 07:58:37 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 695D1106566B for ; Tue, 19 Jan 2010 07:58:37 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-bw0-f209.google.com (mail-bw0-f209.google.com [209.85.218.209]) by mx1.freebsd.org (Postfix) with ESMTP id EA2208FC14 for ; Tue, 19 Jan 2010 07:58:36 +0000 (UTC) Received: by bwz1 with SMTP id 1so1291033bwz.13 for ; Mon, 18 Jan 2010 23:58:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:to:subject:references :organization:from:date:in-reply-to:message-id:user-agent :mime-version:content-type; bh=q3N6dq+7CCW6pXUw1lWl8D6W76wJXtmBa/Pos4f/ETU=; b=Ndoi5gr51/D5WGcjQd+Pq24weYYkc/fJnle5M4aLfOKrv/rjUObTsjAtae7ZT9qgI9 4qeKdzvxMXFt2CWYVy461t5Ie+P4XZY2MuBlxxLUZilXHrbecIxFvGR9z5bc9NonMqVU b9R5blRtAGGKK9y3mefJKoLTj7bKdY1D7F5aw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=to:subject:references:organization:from:date:in-reply-to:message-id :user-agent:mime-version:content-type; 
b=j7hnVAJbwgLT7F3yLa5R59aIeitIJC6pL++UF4zfn86AV9LLBvgwx2secB5QPgRx0/ Fnl3jH+Oj7LKBbxT+VZbXZc6rOtud9MlCQCvpWqN9Yi5276dzNzVGrE0sJleOS3Nm/u2 S0K+XOwVVni/1FY8LWshgQ1Gd4/otbqY9/waQ= Received: by 10.204.30.208 with SMTP id v16mr4059308bkc.18.1263887915454; Mon, 18 Jan 2010 23:58:35 -0800 (PST) Received: from localhost (ms.singlescrowd.net [80.85.90.67]) by mx.google.com with ESMTPS id 15sm1706259bwz.0.2010.01.18.23.58.33 (version=TLSv1/SSLv3 cipher=RC4-MD5); Mon, 18 Jan 2010 23:58:34 -0800 (PST) To: freebsd-fs@FreeBSD.org References: <86ocl272mb.fsf@kopusha.onet> <86tyuqnz9x.fsf@zhuzha.ua1> Organization: TOA Ukraine From: Mikolaj Golub Date: Tue, 19 Jan 2010 09:58:32 +0200 In-Reply-To: <86tyuqnz9x.fsf@zhuzha.ua1> (Mikolaj Golub's message of "Wed\, 13 Jan 2010 11\:13\:14 +0200") Message-ID: <86zl4awmon.fsf@zhuzha.ua1> User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Subject: Re: FreeBSD NFS client/Linux NFS server issue X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jan 2010 07:58:37 -0000 On Wed, 13 Jan 2010 11:13:14 +0200 Mikolaj Golub wrote: > On Sun, 10 Jan 2010 11:03:56 +0200 Mikolaj Golub wrote: > So because it was appending to the file every php write call caused the > sequence of the following rpc: ACCESS - READ - WRITE - COMMIT. And trying to > flush the next line of the log it got stuck after READ call (the next should > be WRITE call but client never did it). > > The same thing is for other log file written by othe php process. 
The last rpc
> for this file:
>
> 30990 18:02:05.050063 172.30.10.54 172.30.10.83 NFS V3 READ Call (Reply In 31068), FH:0x532fa29d Offset:131072 Len:2686
> 31068 18:02:05.062801 172.30.10.83 172.30.10.54 NFS V3 READ Reply (Call In 30990) Len:2685
>
> A bit later there were several successful COMMIT calls (when php processes
> were closing other files I think). And other NFS activity was observed -- our
> nagios checks and other applications, which were just checking the presence and
> status of certain files, were running successfully and in tcpdump there are
> successful readdir/access/lookup/fstat calls. The df utility did not hang then
> either.
>
> Later when our engineer tried to access the mounted folder with mc the
> process locked acquiring nfs vn_lock held by php script (td=0xc6bf4690):

Analyzing the logs of our php scripts, we have found cases where a process
(or two simultaneously) got stuck writing to NFS and was later "unfrozen"
by another, newly started php process when it wrote to this NFS share (to
some other log file).

We have a tcpdump for such a case and it looks like the following:

1) ACCESS - READ - WRITE - COMMIT sequences while the php process is
   writing to the log file.

2) Then at some moment this stops after a READ rpc call and a successful
   reply.

3) After this, successful readdir/access/lookup/fstat calls are observed
   from our other utilities, which just check the presence of some files.

4) A new php process starts and writes to some other log file (successful
   ACCESS - READ - WRITE - COMMIT sequences). After this, writing to the
   first file continues too (starting from the WRITE rpc, so there are no
   retransmits).

As a workaround we installed cron scripts that just write to some file
every 2 minutes. We have been running this for 3 days and there have been
no incidents since then, but we will only be able to say whether this has
really helped after running it for a week or more.
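
The cron workaround described above could be sketched roughly as follows. This is only an illustration: the function name nfs_keepalive, the script path, the mount point /mnt/share and the file name .nfs_keepalive are hypothetical placeholders that do not come from the report, and the sketch assumes (as the observations above suggest, but do not prove) that appending to any file on the affected mount is enough to get the stalled writer going again.

```shell
#!/bin/sh
# Hypothetical keepalive writer for an NFS mount (a sketch, not the
# script actually used in the report).

nfs_keepalive() {
    # $1: directory on the NFS mount (placeholder, e.g. /mnt/share)
    keepalive_file="$1/.nfs_keepalive"
    # Append rather than truncate, to mimic the log-appending writers.
    date '+%Y-%m-%d %H:%M:%S keepalive' >> "$keepalive_file"
}

# Example crontab(5) entry, every 2 minutes (the path is an assumption):
# */2 * * * * /usr/local/sbin/nfs_keepalive.sh /mnt/share

# Only act when a directory argument is given on the command line.
if [ "$#" -ge 1 ]; then
    nfs_keepalive "$1"
fi
```

The file grows by one short line per run, so it would need to be truncated or rotated from time to time.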
Also we are upgrading one of our servers, where the problem has been observed most frequently to 7.2). Actually we have many FreeBSD7.1 hosts with NFS mounts but the problem has been observed only on 3 of them and currently we don't know a way to reproduce it. -- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Tue Jan 19 08:03:02 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A9AAF1065693; Tue, 19 Jan 2010 08:03:02 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-bw0-f209.google.com (mail-bw0-f209.google.com [209.85.218.209]) by mx1.freebsd.org (Postfix) with ESMTP id 077588FC14; Tue, 19 Jan 2010 08:03:01 +0000 (UTC) Received: by bwz1 with SMTP id 1so1292638bwz.13 for ; Tue, 19 Jan 2010 00:03:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:to:cc:subject:references :organization:from:date:in-reply-to:message-id:user-agent :mime-version:content-type:content-transfer-encoding; bh=D7b1832tx6M4Fywxyifrqf+sABYej7JZs9JCYUNZnVk=; b=hFIcAapmcxMF6lGqBTXdoK1jvhptwpw24G+1A8gTCUbJVobbm5VGahKjr5Xolq3cbu 0L/TLgHrQUf9/T11gpYlt66qeIToYqt+D789KfYEG2BTWD2Tikes3KENVdQR0qeOb7to erFcKjd/fHlLTLEpolv7LZuS6T48tN5u8dpgs= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=to:cc:subject:references:organization:from:date:in-reply-to :message-id:user-agent:mime-version:content-type :content-transfer-encoding; b=P0kBO18/bYHo8XUVNNXqQoOEMoWoxo7lcqr0ai2N5YmNzp5/NZEYuamJc3ji168cq/ gVo2K5K6Hh1eptRTGi2YiV0xK12LYa8dzeEa/n5RnZytMeTYA0ufBGk6Zq2YDPmIue+d oTIBPwzOeNjJ4FqgVjNcCAk9v+Xt5JpJnF0Js= Received: by 10.204.32.72 with SMTP id b8mr2583084bkd.203.1263888180556; Tue, 19 Jan 2010 00:03:00 -0800 (PST) Received: from localhost (ms.singlescrowd.net [80.85.90.67]) by mx.google.com with ESMTPS id 16sm883583bwz.11.2010.01.19.00.02.59 (version=TLSv1/SSLv3 
cipher=RC4-MD5); Tue, 19 Jan 2010 00:02:59 -0800 (PST) To: freebsd-fs@FreeBSD.org References: <86ocl272mb.fsf@kopusha.onet> <86tyuqnz9x.fsf@zhuzha.ua1> <86zl4awmon.fsf@zhuzha.ua1> Organization: TOA Ukraine From: Mikolaj Golub Date: Tue, 19 Jan 2010 10:02:57 +0200 In-Reply-To: <86zl4awmon.fsf@zhuzha.ua1> (Mikolaj Golub's message of "Tue\, 19 Jan 2010 09\:58\:32 +0200") Message-ID: <86vdeywmha.fsf@zhuzha.ua1> User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 8bit Cc: freebsd-stable@FreeBSD.org Subject: Re: FreeBSD NFS client/Linux NFS server issue X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jan 2010 08:03:02 -0000

Hi,

In this thread I have posted to freebsd-fs@ several messages describing our
problem with freebsd7.1 nfs clients. As new info has appeared over time, and
having it all spread over several messages might be a bit confusing, I want
to summarise here what we see and know. I am also cc'ing freebsd-stable@,
hoping to draw more attention to this problem, as it looks very interesting
and challenging to me :-)

I have found on the Internet that other people have observed a similar
problem with a FreeBSD6.2 client:
http://forums.freebsd.org/showthread.php?t=1697

So, on some of our freebsd7.1 nfs clients (and it looks like we have had a
similar case with 6.3), which have several nfs mounts to the same CentOS 5.3
NFS server (mount options: rw,-3,-T,-s,-i,-r=32768,-w=32768,-o=noinet6), at
some moment the access to one of the NFS mounts gets stuck, while access to
the other mounts works ok.

In all the cases we have observed so far, the first process to get stuck was
a php script (or two) that was (were) writing to a log file (appending).
In tcpdump we see that every write to the file causes the following sequence
of rpcs: ACCESS - READ - WRITE - COMMIT. At some moment this stops after a
READ rpc call and a successful reply. After this, tcpdump shows successful
readdir/access/lookup/fstat calls from our other utilities, which just check
the presence of some files, and they work ok (df also works). The php process
in this state is in bo_wwait, invalidating the buffer cache [1].

If at this time we try accessing the share with mc, it hangs acquiring the
vn_lock held by the php process [2], and after this any operation on this NFS
share hangs (df hangs too).

If instead some other process is started that writes to some other file on
this share (append), then the first process "unfreezes" too (starting from
the WRITE rpc, so there are no retransmits).

With my limited knowledge of this complicated kernel subsystem I have the
following hypothesis about what is going on. In some nfs_write() call it does
successful ACCESS - READ rpcs but for some reason does not issue the WRITE to
flush the dirty buffer to the server (it aborts somewhere, or maybe in
bdwrite(), which calls bd_wakeup(), and bd_wakeup actually considers that we
don't have enough dirty buffers?). But it looks like at this stage the buffer
is unlinked from the bufqueues [3], so when bufdaemon runs it does not flush
the buffer. The next write() call to this file causes the process to get
stuck invalidating the dirty buffer. The buffer is accessible by nfsiod via
the nmp structure [3], and when the next process writes to another file,
nfsiod is started and flushes this dirty buffer.

[1]: Stuck php process:

(kgdb) bt
#0  sched_switch (td=0xc839e000, newtd=Variable "newtd" is not available.
) at /usr/src/sys/kern/sched_ule.c:1944
#1  0xc07cabe6 in mi_switch (flags=Variable "flags" is not available.
) at /usr/src/sys/kern/kern_synch.c:440
#2  0xc07f42fb in sleepq_switch (wchan=Variable "wchan" is not available.
) at /usr/src/sys/kern/subr_sleepqueue.c:497
#3  0xc07f460c in sleepq_catch_signals (wchan=0xc90c9ee8) at /usr/src/sys/kern/subr_sleepqueue.c:417
#4  0xc07f4ebd in sleepq_wait_sig (wchan=0xc90c9ee8) at /usr/src/sys/kern/subr_sleepqueue.c:594
#5  0xc07cb047 in _sleep (ident=0xc90c9ee8, lock=0xc90c9e8c, priority=333, wmesg=0xc0b731ed "bo_wwait", timo=0) at /usr/src/sys/kern/kern_synch.c:224
#6  0xc0827295 in bufobj_wwait (bo=0xc90c9ec4, slpflag=256, timeo=0) at /usr/src/sys/kern/vfs_bio.c:3870
#7  0xc0966307 in nfs_flush (vp=0xc90c9e04, waitfor=1, td=0xc839e000, commit=1) at /usr/src/sys/nfsclient/nfs_vnops.c:2989
#8  0xc09667ce in nfs_fsync (ap=0xed3c38ec) at /usr/src/sys/nfsclient/nfs_vnops.c:2725
#9  0xc0aee5d2 in VOP_FSYNC_APV (vop=0xc0c2b920, a=0xed3c38ec) at vnode_if.c:1007
#10 0xc0827864 in bufsync (bo=0xc90c9ec4, waitfor=1, td=0xc839e000) at vnode_if.h:538
#11 0xc083f354 in bufobj_invalbuf (bo=0xc90c9ec4, flags=1, td=0xc839e000, slpflag=256, slptimeo=0) at /usr/src/sys/kern/vfs_subr.c:1066
#12 0xc083f6e2 in vinvalbuf (vp=0xc90c9e04, flags=1, td=0xc839e000, slpflag=256, slptimeo=0) at /usr/src/sys/kern/vfs_subr.c:1142
#13 0xc094f216 in nfs_vinvalbuf (vp=0xc90c9e04, flags=Variable "flags" is not available.
) at /usr/src/sys/nfsclient/nfs_bio.c:1326
#14 0xc0951825 in nfs_write (ap=0xed3c3bc4) at /usr/src/sys/nfsclient/nfs_bio.c:918
#15 0xc0aef956 in VOP_WRITE_APV (vop=0xc0c2b920, a=0xed3c3bc4) at vnode_if.c:691
#16 0xc0850097 in vn_write (fp=0xc9969b48, uio=0xed3c3c60, active_cred=0xcb901600, flags=0, td=0xc839e000) at vnode_if.h:373
#17 0xc07f9d17 in dofilewrite (td=0xc839e000, fd=6, fp=0xc9969b48, auio=0xed3c3c60, offset=-1, flags=0) at file.h:256
#18 0xc07f9ff8 in kern_writev (td=0xc839e000, fd=6, auio=0xed3c3c60) at /usr/src/sys/kern/sys_generic.c:401
#19 0xc07fa06f in write (td=0xc839e000, uap=0xed3c3cfc) at /usr/src/sys/kern/sys_generic.c:317
#20 0xc0ad9c75 in syscall (frame=0xed3c3d38) at /usr/src/sys/i386/i386/trap.c:1090
#21 0xc0ac01b0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#22 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

[2] mc process stuck acquiring the _vn_lock held by the above php thread:

    at /usr/src/sys/kern/sched_ule.c:1944
* 178 Thread 100340 (PID=40443: mc)  sched_switch (td=0xc9810af0, newtd=Variable "newtd" is not availabl
.
(kgdb) thr 178
[Switching to thread 178 (Thread 100340)]#0  sched_switch (td=0xc9810af0, newtd=Variable "newtd" is not available.
) at /usr/src/sys/kern/sched_ule.c:1944
1944            cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=0xc9810af0, newtd=Variable "newtd" is not available.
) at /usr/src/sys/kern/sched_ule.c:1944
#1  0xc07cabe6 in mi_switch (flags=Variable "flags" is not available.
) at /usr/src/sys/kern/kern_synch.c:440
#2  0xc07f42fb in sleepq_switch (wchan=Variable "wchan" is not available.
) at /usr/src/sys/kern/subr_sleepqueue.c:497
#3  0xc07f4946 in sleepq_wait (wchan=0xc90c9e5c) at /usr/src/sys/kern/subr_sleepqueue.c:580
#4  0xc07cb056 in _sleep (ident=0xc90c9e5c, lock=0xc0c77d18, priority=80, wmesg=0xc0b80b92 "nfs", timo=0) at /usr/src/sys/kern/kern_synch.c:226
#5  0xc07adf5a in acquire (lkpp=0xed56b7f0, extflags=Variable "extflags" is not available.
) at /usr/src/sys/kern/kern_lock.c:151
#6  0xc07ae84c in _lockmgr (lkp=0xc90c9e5c, flags=8194, interlkp=0xc90c9e8c, td=0xc9810af0, file=0xc0b74aeb "/usr/src/sys/kern/vfs_subr.c", line=2061) at /usr/src/sys/kern/kern_lock.c:384
#7  0xc0832470 in vop_stdlock (ap=0xed56b840) at /usr/src/sys/kern/vfs_default.c:305
#8  0xc0aef4f6 in VOP_LOCK1_APV (vop=0xc0c1d5c0, a=0xed56b840) at vnode_if.c:1618
#9  0xc084ed86 in _vn_lock (vp=0xc90c9e04, flags=8194, td=0xc9810af0, file=0xc0b74aeb "/usr/src/sys/kern/vfs_subr.c", line=2061) at vnode_if.h:851
#10 0xc0841d84 in vget (vp=0xc90c9e04, flags=8194, td=0xc9810af0) at /usr/src/sys/kern/vfs_subr.c:2061
#11 0xc08355b3 in vfs_hash_get (mp=0xc6b472cc, hash=3326873010, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/vfs_hash.c:81
#12 0xc09534d4 in nfs_nget (mntp=0xc6b472cc, fhp=0xc97be078, fhsize=20, npp=0xed56b9f0, flags=2) at /usr/src/sys/nfsclient/nfs_node.c:120
#13 0xc0964a05 in nfs_lookup (ap=0xed56ba84) at /usr/src/sys/nfsclient/nfs_vnops.c:947
#14 0xc0aefbe6 in VOP_LOOKUP_APV (vop=0xc0c2b920, a=0xed56ba84) at vnode_if.c:99
#15 0xc0836841 in lookup (ndp=0xed56bb48) at vnode_if.h:57
#16 0xc083756f in namei (ndp=0xed56bb48) at /usr/src/sys/kern/vfs_lookup.c:219
#17 0xc0844fef in kern_lstat (td=0xc9810af0, path=0x48611280 , pathseg=UIO_USERSPACE, sbp=0xed56bc18) at /usr/src/sys/kern/vfs_syscalls.c:2169
#18 0xc08451af in lstat (td=0xc9810af0, uap=0xed56bcfc) at /usr/src/sys/kern/vfs_syscalls.c:2152
#19 0xc0ad9c75 in syscall (frame=0xed56bd38) at /usr/src/sys/i386/i386/trap.c:1090
#20 0xc0ac01b0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#21 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) fr 6
#6  0xc07ae84c in _lockmgr (lkp=0xc90c9e5c, flags=8194, interlkp=0xc90c9e8c, td=0xc9810af0, file=0xc0b74aeb "/usr/src/sys/kern/vfs_subr.c", line=2061) at /usr/src/sys/kern/kern_lock.c:384
384             error = acquire(&lkp, extflags, (LK_HAVE_EXCL | LK_WANT_EXCL), &contested, &waitstart);
(kgdb) p *lkp
$2 = {lk_object = {lo_name = 0xc0b80b92 "nfs", lo_type = 0xc0b80b92 "nfs",
    lo_flags = 70844416, lo_witness_data = {lod_list = {stqe_next = 0x0},
      lod_witness = 0x0}}, lk_interlock = 0xc0c77d18, lk_flags = 33816640,
  lk_sharecount = 0, lk_waitcount = 1, lk_exclusivecount = 1, lk_prio = 80,
  lk_timo = 51, lk_lockholder = 0xc839e000, lk_newlock = 0x0}

[3] struct nfsmount of the "problem" share:

(kgdb) p *nmp
$4 = {nm_mtx = {lock_object = {lo_name = 0xc0b808ee "NFSmount lock",
      lo_type = 0xc0b808ee "NFSmount lock", lo_flags = 16973824,
      lo_witness_data = {lod_list = { stqe_next = 0x0}, lod_witness = 0x0}},
    mtx_lock = 4, mtx_recurse = 0}, nm_flag = 35399, nm_state = 1310720,
  nm_mountp = 0xc6b472cc, nm_numgrps = 16,
  nm_fh = "\001\000\000\000\000\223\000\000\001@\003\n", '\0' , nm_fhsize = 12,
  nm_rpcclnt = {rc_flag = 0, rc_wsize = 0, rc_rsize = 0, rc_name = 0x0,
    rc_so = 0x0, rc_sotype = 0, rc_soproto = 0, rc_soflags = 0, rc_timeo = 0,
    rc_retry = 0, rc_srtt = {0, 0, 0, 0}, rc_sdrtt = {0, 0, 0, 0}, rc_sent = 0,
    rc_cwnd = 0, rc_timeouts = 0, rc_deadthresh = 0, rc_authtype = 0,
    rc_auth = 0x0, rc_prog = 0x0, rc_proctlen = 0, rc_proct = 0x0},
  nm_so = 0xc6e81d00, nm_sotype = 1, nm_soproto = 0, nm_soflags = 44,
  nm_nam = 0xc6948640, nm_timeo = 6000, nm_retry = 2, nm_srtt = 
{15, 15, 31, 52}, nm_sdrtt = {3, 3, 15, 15}, nm_sent = 0, nm_cwnd = 4096, nm_timeouts = 0, nm_deadthresh = 9, nm_rsize = 32768, nm_wsize = 32768, nm_readdirsize = 4096, nm_readahead = 1, nm_wcommitsize = 1177026, nm_acdirmin = 30, nm_acdirmax = 60, nm_acregmin = 3, nm_acregmax = 60, nm_verf = "Jë¾W\000\004oí", nm_bufq = {tqh_first = 0xda82dc70, tqh_last = 0xda8058e0}, nm_bufqlen = 2, nm_bufqwant = 0, nm_bufqiods = 1, nm_maxfilesize = 1099511627775, nm_rpcops = 0xc0c2b5bc, nm_tprintf_initial_delay = 12, nm_tprintf_delay = 30, nm_nfstcpstate = { rpcresid = 0, flags = 1, sock_send_inprog = 0}, nm_hostname = "172.30.10.92\000/var/www/app31", '\0' , nm_clientid = 0, nm_fsid = { val = {0, 0}}, nm_lease_time = 0, nm_last_renewal = 0} buffers on it: (kgdb) p *nmp->nm_bufq.tqh_first $7 = {b_bufobj = 0xc7324960, b_bcount = 31565, b_caller1 = 0x0, b_data = 0xde581000 " valid_lines:", ' ' , "1341\n invalid_lines:", ' ' , "1556\n total_lines:", ' ' , "2897\n\n Error summary:\n Inactive pr"..., b_error = 0, b_iocmd = 2 '\002', b_ioflags = 0 '\0', b_iooffset = 196608, b_resid = 0, b_iodone = 0, b_blkno = 384, b_offset = 196608, b_bobufs = {tqe_next = 0x0, tqe_prev = 0xc7324964}, b_left = 0x0, b_right = 0x0, b_vflags = 0, b_freelist = { tqe_next = 0xda805894, tqe_prev = 0xc725d3c0}, b_qindex = 0, b_flags = 536870948, b_xflags = 2 '\002', b_lock = {lk_object = {lo_name = 0xc0b73635 "bufwait", lo_type = 0xc0b73635 "bufwait", lo_flags = 70844416, lo_witness_data = {lod_list = { stqe_next = 0x0}, lod_witness = 0x0}}, lk_interlock = 0xc0c77b50, lk_flags = 262144, lk_sharecount = 0, lk_waitcount = 0, lk_exclusivecount = 1, lk_prio = 80, lk_timo = 0, lk_lockholder = 0xfffffffe, lk_newlock = 0x0}, b_bufsize = 31744, b_runningbufspace = 0, b_kvabase = 0xde581000 " valid_lines:", ' ' , "1341\n invalid_lines:", ' ' , "1556\n total_lines:", ' ' , "2897\n\n Error summary:\n Inactive pr"..., b_kvasize = 32768, b_lblkno = 6, b_vp = 0xc73248a0, b_dirtyoff = 31512, b_dirtyend = 31565, b_rcred = 
0x0, b_wcred = 0xcebec400, b_saveaddr = 0xde581000, b_pager = { pg_reqpage = 0}, b_cluster = {cluster_head = {tqh_first = 0xda917ec8, tqh_last = 0xda888e94}, cluster_entry = {tqe_next = 0xda917ec8, tqe_prev = 0xda888e94}}, b_pages = {0xc3726e90, 0xc448dca8, 0xc2a55b98, 0xc3bf1a28, 0xc3467ff0, 0xc3299600, 0xc28db130, 0xc2301398, 0x0 }, b_npages = 8, b_dep = {lh_first = 0x0}, b_fsprivate1 = 0x0, b_fsprivate2 = 0x0, b_fsprivate3 = 0x0, b_pin_count = 0}

These are entries from our log file.

Note that b_qindex is 0, but bufqueues[0] is empty:

(kgdb) p bufqueues[0]
$8 = {tqh_first = 0x0, tqh_last = 0xc0c83e20}

Also, doesn't it look strange that lk_lockholder of b_lock points to an invalid location (0xfffffffe)?

-- Mikolaj Golub

From owner-freebsd-fs@FreeBSD.ORG Tue Jan 19 08:35:22 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2241910656A5; Tue, 19 Jan 2010 08:35:22 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-bw0-f209.google.com (mail-bw0-f209.google.com [209.85.218.209]) by mx1.freebsd.org (Postfix) with ESMTP id 77A218FC19; Tue, 19 Jan 2010 08:35:21 +0000 (UTC) Received: by bwz1 with SMTP id 1so1306363bwz.13 for ; Tue, 19 Jan 2010 00:35:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:to:cc:subject:references :organization:from:date:in-reply-to:message-id:user-agent :mime-version:content-type; bh=KErM9rmOfOmkhqcOLR2SyM37a2Q37/DZeNxXlPiKhsE=; b=d/OLllVB7/2495q1R8WGZazu7bdqOP9VlZAP45Efs/J365zGwNw9XZdrLSCSMEbAQj o/ixWJWtvpAjOd5IjRdUGxc6ywWUUCrsIk07zeqaNMuehf0a8r8WYdSMBIYy9oIp26jc 7/7GheWiUuzon3OYnSh+kYHjdOSb0nn4gZs2U= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=to:cc:subject:references:organization:from:date:in-reply-to :message-id:user-agent:mime-version:content-type;
b=MQxAHA6GgFGq1KefSlYUXPPFj+O/z82n19TO1P0LCnKp/xqajJD763ZLqmuyUEU3yD G8Uj0OM1fCffFpz9pgCfDJuJKGlmw4WPSNqGTSPVvkImmbrYo6zLbUVNB6rEAG8izcqQ IoHCXQEmVusPgtafJKvZXw1ZeRsvL4zejRw4g= Received: by 10.204.10.18 with SMTP id n18mr4092509bkn.152.1263890120189; Tue, 19 Jan 2010 00:35:20 -0800 (PST) Received: from localhost (ms.singlescrowd.net [80.85.90.67]) by mx.google.com with ESMTPS id 14sm1055390bwz.1.2010.01.19.00.35.19 (version=TLSv1/SSLv3 cipher=RC4-MD5); Tue, 19 Jan 2010 00:35:19 -0800 (PST) To: freebsd-fs@FreeBSD.org References: <86ocl272mb.fsf@kopusha.onet> <86tyuqnz9x.fsf@zhuzha.ua1> <86zl4awmon.fsf@zhuzha.ua1> <86vdeywmha.fsf@zhuzha.ua1> Organization: TOA Ukraine From: Mikolaj Golub Date: Tue, 19 Jan 2010 10:35:17 +0200 In-Reply-To: <86vdeywmha.fsf@zhuzha.ua1> (Mikolaj Golub's message of "Tue\, 19 Jan 2010 10\:02\:57 +0200") Message-ID: <86r5pmwkze.fsf@zhuzha.ua1> User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: freebsd-stable@FreeBSD.org Subject: Re: FreeBSD NFS client/Linux NFS server issue X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jan 2010 08:35:22 -0000 On Tue, 19 Jan 2010 10:02:57 +0200 Mikolaj Golub wrote: > I have found in the Internet that other people have been observed the similar > problem with FreeBSD6.2 client: > > http://forums.freebsd.org/showthread.php?t=1697 Reading this through carefully it looks like the guy did not experience the problem (gotten stuck processes). He just described the behaviour of freebsd client when appending the file. 
-- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Wed Jan 20 12:30:13 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 17AB21065679 for ; Wed, 20 Jan 2010 12:30:13 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id E3BA28FC1A for ; Wed, 20 Jan 2010 12:30:12 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id o0KCUC61033812 for ; Wed, 20 Jan 2010 12:30:12 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id o0KCUC3Z033809; Wed, 20 Jan 2010 12:30:12 GMT (envelope-from gnats) Date: Wed, 20 Jan 2010 12:30:12 GMT Message-Id: <201001201230.o0KCUC3Z033809@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Buganini Cc: Subject: Re: kern/142271: [zfs] [patch] race condition on zpool create X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Buganini List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jan 2010 12:30:13 -0000 The following reply was made to PR kern/142271; it has been noted by GNATS. 
From: Buganini To: bug-followup@FreeBSD.org, torindel@gmail.com Cc: Subject: Re: kern/142271: [zfs] [patch] race condition on zpool create Date: Wed, 20 Jan 2010 20:24:12 +0800 this patch also works for me on today RELENG_8 From owner-freebsd-fs@FreeBSD.ORG Wed Jan 20 19:47:51 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 497441065695; Wed, 20 Jan 2010 19:47:51 +0000 (UTC) (envelope-from jh@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 208BA8FC19; Wed, 20 Jan 2010 19:47:51 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id o0KJlpPj022904; Wed, 20 Jan 2010 19:47:51 GMT (envelope-from jh@freefall.freebsd.org) Received: (from jh@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id o0KJloXd022900; Wed, 20 Jan 2010 19:47:50 GMT (envelope-from jh) Date: Wed, 20 Jan 2010 19:47:50 GMT Message-Id: <201001201947.o0KJloXd022900@freefall.freebsd.org> To: bf2006a@yahoo.com, jh@FreeBSD.org, freebsd-fs@FreeBSD.org, jh@FreeBSD.org From: jh@FreeBSD.org Cc: Subject: Re: kern/132597: [tmpfs] [panic] tmpfs-related panic while interrupting a port build on tmpfs WRKDIR X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jan 2010 19:47:51 -0000 Synopsis: [tmpfs] [panic] tmpfs-related panic while interrupting a port build on tmpfs WRKDIR State-Changed-From-To: open->feedback State-Changed-By: jh State-Changed-When: Wed Jan 20 19:43:57 UTC 2010 State-Changed-Why: This looks like a duplicate of kern/122038 which has been patched. Can you still reproduce this after r197953? 
Responsible-Changed-From-To: freebsd-fs->jh Responsible-Changed-By: jh Responsible-Changed-When: Wed Jan 20 19:43:57 UTC 2010 Responsible-Changed-Why: Track. http://www.freebsd.org/cgi/query-pr.cgi?pr=132597 From owner-freebsd-fs@FreeBSD.ORG Wed Jan 20 22:40:11 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BE5C21065693 for ; Wed, 20 Jan 2010 22:40:11 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 95A098FC1C for ; Wed, 20 Jan 2010 22:40:11 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id o0KMeBS5095748 for ; Wed, 20 Jan 2010 22:40:11 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id o0KMeBXv095743; Wed, 20 Jan 2010 22:40:11 GMT (envelope-from gnats) Date: Wed, 20 Jan 2010 22:40:11 GMT Message-Id: <201001202240.o0KMeBXv095743@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Henrik Wist Cc: Subject: Re: kern/142271: [zfs] [patch] race condition on zpool create X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Henrik Wist List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jan 2010 22:40:11 -0000 The following reply was made to PR kern/142271; it has been noted by GNATS. From: Henrik Wist To: bug-followup@FreeBSD.org, torindel@gmail.com Cc: Subject: Re: kern/142271: [zfs] [patch] race condition on zpool create Date: Wed, 20 Jan 2010 23:15:50 +0100 I can confirm this issue on 7.2-STABLE and I'm happy to report that the patch fixes this on 7.2-STABLE as well. 
From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 06:47:48 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 87E99106566B for ; Thu, 21 Jan 2010 06:47:48 +0000 (UTC) (envelope-from ndenev@gmail.com) Received: from mail-fx0-f218.google.com (mail-fx0-f218.google.com [209.85.220.218]) by mx1.freebsd.org (Postfix) with ESMTP id 1B6228FC16 for ; Thu, 21 Jan 2010 06:47:47 +0000 (UTC) Received: by fxm10 with SMTP id 10so2783441fxm.34 for ; Wed, 20 Jan 2010 22:47:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:content-type :content-transfer-encoding:subject:date:message-id:to:mime-version :x-mailer; bh=F5HSqAYc/ug/tgK5wUcgB0HecywasDDam998e7N5kzM=; b=gyfqxvMN9iPJ/evYmy+/3Mf25caAPQcvU9taHbrOwvW10xDMvln+k73IY5EosJRYg8 vaDTubrdlcmmr6/rlSBIIh+1S/cTVQ5GgyNtz55DIe0THuiBJeIazJoUPPJFO9W/nB4/ vPRHGQIar4qJoh2juFAFy0vLRrm+GKU1lCNcs= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:content-type:content-transfer-encoding:subject:date:message-id :to:mime-version:x-mailer; b=JAMxoWc5XOShioZLLXEbpdLc6has1rdi/wSTXtdbPYoadhg/f805FOyxXkgWtd2B2Z xwF3iqA2RMzwLuo6WBaQb5MV1yBknp/nOXHAvTfXbyRBngHfuOoy9Hxf1wV6aA6ucECx gLDay1srdNOHtLqhXCOA+hq87WzlVJpaOf9xo= Received: by 10.223.3.67 with SMTP id 3mr1019579fam.25.1264056467043; Wed, 20 Jan 2010 22:47:47 -0800 (PST) Received: from mbp-gige.totalterror.net (93-152-151-19.ddns.onlinedirect.bg [93.152.151.19]) by mx.google.com with ESMTPS id 18sm889209fks.4.2010.01.20.22.47.46 (version=TLSv1/SSLv3 cipher=RC4-MD5); Wed, 20 Jan 2010 22:47:46 -0800 (PST) From: Nikolay Denev Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Date: Thu, 21 Jan 2010 08:47:43 +0200 Message-Id: <871ACE35-EC52-4573-B623-C1AEF609A8D3@gmail.com> To: freebsd-fs@freebsd.org Mime-Version: 1.0 (Apple Message framework v1077) 
X-Mailer: Apple Mail (2.1077) Subject: ZFS zpool replace issues (if vfs.zfs.debug is not set) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 06:47:48 -0000

Hello,

I have a raidz1 pool of 4 drives (ad4, ad5, ad6 and ad7) with GPT partitions like this:

nas# gpart show ad6
=>        34  1953525101  ad6  GPT  (932G)
          34         128    1  freebsd-boot  (64K)
         162         350       - free -  (175K)
         512      400000    2  freebsd-swap  (195M)
      400512  1953124623    3  freebsd-zfs  (931G)

The freebsd-swap partition used to be a UFS partition for /boot, before gptzfsboot was able to boot from raidz pools.

I then decided to get rid of these 195MB partitions by offlining the devices in my pool one by one, deleting the swap and zfs partitions, creating one new zfs partition that uses all the available space, doing a zpool replace, and waiting for the resilver.

So what I did was:

zpool offline zfs ad4p3   (so I can modify the disk with gpart)
gpart delete -i2 ad4
gpart delete -i3 ad4
gpart add -b 512 -s 1953524623 -t freebsd-zfs ad4

and then when I tried:

zpool replace zfs ad4p3 ad4p2

I got a "permission denied" error.

Strangely, when I enabled the sysctl "vfs.zfs.debug" and tried again, the zpool replace succeeded. This has happened on all of the drives I've replaced so far.

The other, probably unrelated, issue is that when I started replacing one of the vdevs in the pool, another vdev suddenly changed its name from adXp3 to gptid/64149f82-44f0-11de-ae31-001ff3fc24c1. This seems really strange, especially because I was not touching this particular vdev...
-- Regards, Nikolay Denev From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 07:39:53 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F1200106566B for ; Thu, 21 Jan 2010 07:39:53 +0000 (UTC) (envelope-from nveeser@gmail.com) Received: from mail-pw0-f44.google.com (mail-pw0-f44.google.com [209.85.160.44]) by mx1.freebsd.org (Postfix) with ESMTP id CD6188FC0C for ; Thu, 21 Jan 2010 07:39:53 +0000 (UTC) Received: by pwi15 with SMTP id 15so4033605pwi.3 for ; Wed, 20 Jan 2010 23:39:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type; bh=SJClpWtPaWK+CqywnZIRF/LijcleppgG5yT2t9Vyntc=; b=Dv/mRw326Wpaxh20N1xc8xt/My0jl/gvHzH7LRcYscx2gjvyULK3FbOyWeC35keHia pzsqhResnKgKfMMxUX8nHOZ2Hf5qHIDUa+S8ooFM3OHhhu/ujHiwhdE9bipvHu72vtj3 lAS1WMhcVAx98oepldoZzCNZoS7rDnm6vUgDI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=YIR4QRWslPGu7RdIWnMZvkgzsqKHR+/MSBNBi+9lw8auodvawLZ5BY3gpNZB8vVzg6 zmPa8j3CHhzC5Li/KzLHAWHqmYgm1oLMGSPfkoxql2xSxp/PkI4pshmKIhiDbJIZobaG aDn6ATm3TPHJ7aZ3iMVQkiB4Cp0coCVJN6xmo= MIME-Version: 1.0 Received: by 10.141.90.9 with SMTP id s9mr790204rvl.26.1264058135959; Wed, 20 Jan 2010 23:15:35 -0800 (PST) Date: Wed, 20 Jan 2010 23:15:35 -0800 Message-ID: <674b92e91001202315x7d835b22wc52029cd3159bcd6@mail.gmail.com> From: Nicholas Veeser To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: ZFS Kernel Panic during long copy from zpool -> zpool X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 07:39:54 -0000 Don't know if this is a know issue. 
Hopefully this is the right place I am running 8.0 stable. I have two zpools 1. 4 250G drives (version 13) (Sil3124 SATA) 2. 4 500G drives (version 14) (Sil3114 SATA) I am copying data from 1 -> 2 running rsync for 1-8 hours, I get a kernel panic. Basically it fails an assertion somewhere. Next time it happens, I will get exact details of where the assertion is. I am also trying to make sure I get an actual kernel dump, if that is useful. What other information, investigation, etc would be useful? How should I proceed? Nicholas From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 09:44:47 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 04990106568B for ; Thu, 21 Jan 2010 09:44:47 +0000 (UTC) (envelope-from noamscai.ml@gmail.com) Received: from mail-iw0-f171.google.com (mail-iw0-f171.google.com [209.85.223.171]) by mx1.freebsd.org (Postfix) with ESMTP id BF72D8FC08 for ; Thu, 21 Jan 2010 09:44:46 +0000 (UTC) Received: by iwn1 with SMTP id 1so4556778iwn.28 for ; Thu, 21 Jan 2010 01:44:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:content-type; bh=1le2x7OmHOUJ9qVMC8V/OBxo5GSOcc+DJpRi8pBif78=; b=p/IkUV+3rzR/Eiy8+UcrA4vlskpSPeUOpa1TfbcqhigdC3v96AgChJ++5oB7fShzR5 HiqWXxRowTQBXbD370spk2GEjjkWqae8xAPif5sVn+4LiiE+TYmLBQ7HHAcqndgGKcgs EYfX7w7nlI2U+hkaH2+NXjQxXlNqlrNKjEcF0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; b=Pyz9GIS3DtLBex9vGdpsfcYCst0PpXmXOP5lJ5RgdV59lEqSZZEFrUQG01axi719EB UqruBkNCUysF4OvYSYdJIGZTyjky+ACXps93yGOA7TfC09DUMaXn31NWX7tRmwDw2P6c TpeoboImx7Pz9BGP1z/fe9v9bcDy/A1OY6B1E= MIME-Version: 1.0 Received: by 10.231.146.129 with SMTP id h1mr1195021ibv.71.1264067085426; Thu, 21 Jan 2010 01:44:45 -0800 
(PST) In-Reply-To: <4c44ea791001210129r33f51731p31040e2cc690180a@mail.gmail.com> References: <4c44ea791001210129r33f51731p31040e2cc690180a@mail.gmail.com> Date: Thu, 21 Jan 2010 12:44:45 +0300 Message-ID: <4c44ea791001210144n4e91ca8aq9acae53416b87b8a@mail.gmail.com> From: Ivan Borodin To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Re: Manual partitioning for newbies X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 09:44:47 -0000 911 sectors Noamscai@home $ From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 09:56:45 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EF3E8106566C for ; Thu, 21 Jan 2010 09:56:45 +0000 (UTC) (envelope-from noamscai.ml@gmail.com) Received: from mail-iw0-f171.google.com (mail-iw0-f171.google.com [209.85.223.171]) by mx1.freebsd.org (Postfix) with ESMTP id B62C78FC12 for ; Thu, 21 Jan 2010 09:56:45 +0000 (UTC) Received: by iwn1 with SMTP id 1so4562620iwn.28 for ; Thu, 21 Jan 2010 01:56:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type; bh=oypq3HGyclnv/Etuc/512Wz/xfJ6DQwP+J26lN4Nkb8=; b=WEL9n9k9fcPDoPBXBl5u/RvUKaMuoY7JLz2YPn5N9qukCZszh0HGMvYb85qJsle0N/ PkBG5Pw8VjZGpf217luw49pZB2t8OH/223oHZc042siB/GbU0XCg6BvCX0A0jhGnsqLo kFa75kFJl0KkBJLGjBiupQCt3muG2Ne+NwdPA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=PUVGECvwbqIRs3ZN8v7eTt4J3Gs/rOFrcS5FA0ArNpQAeC2rar4LNY10py8cUv62k7 cv0rKNVDWB4lAZP0DO83vUUBMTuVpPADY9Qmt4bF2Hln9bivXhgNsjSCPEKOOl0q2b94 MiTERkeeJUNzK9MBjGr5vREd96KLRW3/hA8bo= 
MIME-Version: 1.0 Received: by 10.231.146.134 with SMTP id h6mr411052ibv.16.1264066254180; Thu, 21 Jan 2010 01:30:54 -0800 (PST) Date: Thu, 21 Jan 2010 12:30:54 +0300 Message-ID: <4c44ea791001210130j75e220b2p7ffd5731d940aa1@mail.gmail.com> From: Ivan Borodin To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Manual partitioning for newbies X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 09:56:46 -0000

Greetings to the respected residents.

Excuse me, I haven't found a mailing list for newbies... I'm experiencing difficulties. Could anybody help?

It's about partitioning using fdisk -f and bsdlabel. I'm trying to assemble a sh script for housekeeping purposes that installs FreeBSD onto a given target. I don't even know what the question is... My critical lack of experience in the subject stops me from gathering the information myself.

Suppose one wants to generate an fdisk config file; then he needs these magic numbers describing the disk (USB storage, md, ...) geometry. There's some mess with this geometry because of legacy problems, as far as I can see, and because of my own misunderstanding: sysinstall offers some geometry, then "recalculates" it, and if you don't let it, the new system can be born unbootable; disk sizes returned by atacontrol differ from those returned by fdisk; there are no visible means to determine flash drive parameters (excuse me, I don't know what a Google query about it should look like); there is this magic offset of 63 for the first slice, and the mysterious "raw" partition that you don't edit...

Can someone help with this? I need to create one config for a layout with, say, a 50M FAT slice and FreeBSD on all the remaining space, and another with FreeBSD using the entire disk.
As for bsdlabel + config: if I have the size of the slice, everything further is a matter of arithmetic and appending a raw partition to the table. Do the exact partition sizes need to be calculated somehow depending on the fs, or can they be arbitrary and approximate?

And one more waypoint... How can I get the geometry parameters for ATA, USB and md devices?

(c) Lame English Int. (Reading, at least, will be a bit easier, yeah...)

------------------
[root@ ~]# fdisk -s /dev/ad5
/dev/ad5: 19383 cyl 16 hd 63 sec
Part        Start        Size Type Flags
   1:          63    19538001 0xa5 0x80

[root@ ~]# atacontrol cap ad5 | grep lba
lba supported         19538975 sectors
lba48 not supported

Seems like sysinstall threw away 974 sectors while partitioning..

Noamasca@home$

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 11:36:38 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 25DC5106566B for ; Thu, 21 Jan 2010 11:36:38 +0000 (UTC) (envelope-from dimitry@andric.com) Received: from tensor.andric.com (cl-327.ede-01.nl.sixxs.net [IPv6:2001:7b8:2ff:146::2]) by mx1.freebsd.org (Postfix) with ESMTP id E21DC8FC08 for ; Thu, 21 Jan 2010 11:36:37 +0000 (UTC) Received: from [IPv6:2001:7b8:3a7:0:d495:647d:1ab9:3128] (unknown [IPv6:2001:7b8:3a7:0:d495:647d:1ab9:3128]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by tensor.andric.com (Postfix) with ESMTPSA id A24745C59; Thu, 21 Jan 2010 12:36:36 +0100 (CET) Message-ID: <4B583C44.5060107@andric.com> Date: Thu, 21 Jan 2010 12:36:36 +0100 From: Dimitry Andric User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.2pre) Gecko/20100115 Lanikai/3.1a1pre MIME-Version: 1.0 To: Ivan Borodin References: <4c44ea791001210130j75e220b2p7ffd5731d940aa1@mail.gmail.com> In-Reply-To: <4c44ea791001210130j75e220b2p7ffd5731d940aa1@mail.gmail.com> Content-Type: text/plain; charset=KOI8-R;
format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: Manual partitioning for newbies X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 11:36:38 -0000 On 2010-01-21 10:30, Ivan Borodin wrote: > Excuse me, haven't found a mailing list for newbies... That would be freebsd-questions@, AFAIK. > [root@ ~]# fdisk -s /dev/ad5 > /dev/ad5: 19383 cyl 16 hd 63 sec > Part Start Size Type Flags > 1: 63 19538001 0xa5 0x80 > > [root@ ~]# atacontrol cap ad5 | grep lba > lba supported 19538975 sectors > lba48 not supported > > Seems like sysinstall threw away 974 sectors while partitioning.. This is because partitions are aligned to cylinders by default. In your case, the disk says it has 19383 cylinders, 16 heads and 63 sectors per track, so that is 19538064 sectors total. In some cases, disks advertise more LBA sectors than they would seem to have if you calculate cyl * hd * sec. This is because the 'geometry' is entirely fake, and sometimes doesn't fit the real number of available sectors. So in your case, there are 911 sectors slack space at the end. Not the end of the world. :) Btw, if 'dangerously dedicated' partitions are still supported in your version of sysinstall, you might be able to use the whole LBA capacity for your slices. 
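
The sector arithmetic in the reply above can be checked with a short sketch. This is only an illustration: the variable names are made up here, and the figures are the ones from the fdisk and atacontrol output quoted in this thread. It assumes the slice runs from the 63-sector offset to the end of the last full cylinder, which is what the reported slice size suggests.

```python
# Figures taken from the fdisk/atacontrol output quoted in the thread.
cylinders, heads, sectors_per_track = 19383, 16, 63
lba_sectors = 19538975   # "lba supported" value reported by atacontrol
slice_start = 63         # offset of the first slice (one track)

# Capacity implied by the (fake) CHS geometry.
chs_sectors = cylinders * heads * sectors_per_track

# Sectors past the last full cylinder: the "slack" at the end of the disk.
slack = lba_sectors - chs_sectors

# Slice size if it ends on the last cylinder boundary (assumption, see above).
slice_size = chs_sectors - slice_start

# Everything not covered by the slice: the leading offset plus the slack.
unused = lba_sectors - slice_size

print(chs_sectors, slack, slice_size, unused)
```

So the 974 "missing" sectors in the original question appear to be the 63-sector offset in front of the slice plus the 911 sectors of slack behind the last cylinder boundary.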
From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 13:05:10 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 72BF1106566B for ; Thu, 21 Jan 2010 13:05:10 +0000 (UTC) (envelope-from noamscai.ml@gmail.com) Received: from mail-iw0-f171.google.com (mail-iw0-f171.google.com [209.85.223.171]) by mx1.freebsd.org (Postfix) with ESMTP id 38E628FC0C for ; Thu, 21 Jan 2010 13:05:09 +0000 (UTC) Received: by iwn1 with SMTP id 1so4668845iwn.28 for ; Thu, 21 Jan 2010 05:05:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:content-type; bh=l8nFCK/YYfA7Z6iA4QzXsc2MBQwEZVba46Ng87ish4Q=; b=TrwK7bTk3KR5PlukFEKyhtihW9apOi1v1IWHiuw0o1fOcC9mvowmZ9ceGX1T36G506 GTe2KCmnWDzq3YO/upsgeEsCxDSbyG+RRcObkF9nefsH86V9iVwjHzliJm+YjQ/MugHU zy0rlzaYZdqxJdtYUrtg1lnP/9vNphGINsmqQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; b=PDattiaBdkMmt0PJ06mo4wx5uUQA2cXD9mwZAmTYyRQuDfhEeLuGUbwtjenMbp1ZbQ ttaUx/pS3ICG/Tkujh+Zc3SG0jkWp2QQqN3669s71vcf5gNzmspiB4HOzi+4Lz+vEdXl SqOAA/2wW2V9myso8YvU4Dx8fuF+o3jXVgTfk= MIME-Version: 1.0 Received: by 10.231.40.216 with SMTP id l24mr2258169ibe.40.1264079109439; Thu, 21 Jan 2010 05:05:09 -0800 (PST) In-Reply-To: <4c44ea791001210504q10c79256y6b3a6245ad89414@mail.gmail.com> References: <4c44ea791001210130j75e220b2p7ffd5731d940aa1@mail.gmail.com> <4B583C44.5060107@andric.com> <4c44ea791001210504q10c79256y6b3a6245ad89414@mail.gmail.com> Date: Thu, 21 Jan 2010 16:05:09 +0300 Message-ID: <4c44ea791001210505r268be444wda2828ab09a374d7@mail.gmail.com> From: Ivan Borodin To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Re: Manual partitioning for newbies 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 13:05:10 -0000

---------- Forwarded message ----------
From: Ivan Borodin
Date: Thu, Jan 21, 2010 at 4:04 PM
Subject: Re: Manual partitioning for newbies
To: Dimitry Andric

> This is because partitions are aligned to cylinders by default. In your
> case, the disk says it has 19383 cylinders, 16 heads and 63 sectors per
> track, so that is 19538064 sectors total.
>
> In some cases, disks advertise more LBA sectors than they would seem to
> have if you calculate cyl * hd * sec. This is because the 'geometry' is
> entirely fake, and sometimes doesn't fit the real number of available
> sectors.

So, for housekeeping tasks it's enough to rely on atacontrol's info about sectors and heads, calculate the whole capacity as $lba/sec/hd*sec*hd, leave the 63-sector offset, and feel free with further partitioning, because from then on it's no more than numbers.

And in the case of flat storage like USB drives and md devices, one can simply set 255 heads and 63 sectors, or any other values, since that's much easier than calculating values that would fit the size better.

...and if I decide to dig deeper, I'll find out that today the real geometry of ATA disks is hidden behind their controllers, and the head and sector counts they report to the rest of the world are just a way to communicate with the BIOS and filesystem drivers in the traditional way...

Do I get this all right?
noamscai@home From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 16:02:11 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CE2C01065676 for ; Thu, 21 Jan 2010 16:02:11 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id 7FCCA8FC16 for ; Thu, 21 Jan 2010 16:02:11 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.13.8+Sun/8.13.8) with ESMTP id o0LG2AHF024190; Thu, 21 Jan 2010 10:02:10 -0600 (CST) Date: Thu, 21 Jan 2010 10:02:10 -0600 (CST) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: Nicholas Veeser In-Reply-To: <674b92e91001202315x7d835b22wc52029cd3159bcd6@mail.gmail.com> Message-ID: References: <674b92e91001202315x7d835b22wc52029cd3159bcd6@mail.gmail.com> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Thu, 21 Jan 2010 10:02:10 -0600 (CST) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS Kernel Panic during long copy from zpool -> zpool X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 16:02:11 -0000 On Wed, 20 Jan 2010, Nicholas Veeser wrote: > What other information, investigation, etc would be useful? > How should I proceed? I would make sure that both pools are ok by doing a 'zpool scrub' on them. 
Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 18:25:13 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 76D7D106568F for ; Thu, 21 Jan 2010 18:25:13 +0000 (UTC) (envelope-from doug@polands.org) Received: from hrndva-omtalb.mail.rr.com (hrndva-omtalb.mail.rr.com [71.74.56.122]) by mx1.freebsd.org (Postfix) with ESMTP id 32DAE8FC16 for ; Thu, 21 Jan 2010 18:25:12 +0000 (UTC) Received: from hrndva-omtalb.mail.rr.com ([10.128.143.53]) by hrndva-qmta04.mail.rr.com with ESMTP id <20100121180645097.VVRG22884@hrndva-qmta04.mail.rr.com> for ; Thu, 21 Jan 2010 18:06:45 +0000 X-Authority-Analysis: v=1.0 c=1 a=zE3mbNJvjOkA:10 a=370Rr2IVWakicQaKVlIA:9 a=-8jeenipcSc8JWl-pzubFp3jUpQA:4 X-Cloudmark-Score: 0 X-Originating-IP: 75.87.219.217 Received: from [75.87.219.217] ([75.87.219.217:52446] helo=haran.polands.org) by hrndva-oedge03.mail.rr.com (envelope-from ) (ecelerity 2.2.2.39 r()) with ESMTP id 9E/D4-05903-F67985B4; Thu, 21 Jan 2010 18:05:35 +0000 Received: from [172.16.1.37] (sichem-wifi.polands.org [172.16.1.37]) by haran.polands.org (8.14.3/8.14.3) with ESMTP id o0LI5YGC026970 for ; Thu, 21 Jan 2010 12:05:34 -0600 (CST) (envelope-from doug@polands.org) Message-ID: <4B58976E.1020402@polands.org> Date: Thu, 21 Jan 2010 12:05:34 -0600 From: Doug Poland User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.1.5) Gecko/20091204 Thunderbird/3.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 18:25:13 -0000

Hello,

I've got an 8.0-STABLE (amd64) box with 4GB RAM. The machine is running off a 6-disk RAIDZ1 booting from GPT. The box consistently panics on unixbench's fsdisk program.

I have been gathering some metrics in an attempt to isolate the parameters that are significant, but I admit I do not really understand all the relationships. At this point, I have nothing set in /boot/loader.conf.

Here are some of the values I was recording within seconds of the panic:

# dmesg | grep memory
real memory  = 4294967296 (4096 MB)
avail memory = 3961372672 (3777 MB)

kstat.zfs.misc.arcstats.size: 308522248
vfs.zfs.arc_max: 829480960
vfs.zfs.arc_meta_limit: 207370240
vfs.zfs.arc_meta_used: 165575944
vfs.zfs.arc_min: 103685120
vm.kmem_size: 1327169536
vm.kmem_size_max: 329853485875

# vmstat -m | egrep 'InUse|solaris'
         Type InUse MemUse HighUse
      solaris 491349 1316172K       -

% zpool status
  pool: bethesda
 state: ONLINE
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        bethesda       ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            gpt/disk0  ONLINE       0     0     0
            gpt/disk1  ONLINE       0     0     0
            gpt/disk2  ONLINE       0     0     0
            gpt/disk3  ONLINE       0     0     0
            gpt/disk4  ONLINE       0     0     0
            gpt/disk5  ONLINE       0     0     0

errors: No known data errors

My concern is that if I can panic the box with a simple file system benchmark, what will happen when I rsync files across a 1GB LAN connection? I am very willing to run any number of tests and tweak ZFS as necessary. Please advise.
-- Regards, Doug From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 18:34:16 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 57E231065672 for ; Thu, 21 Jan 2010 18:34:16 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id EC1128FC18 for ; Thu, 21 Jan 2010 18:34:15 +0000 (UTC) Received: from [10.8.0.2] (remotevpn [10.8.0.2]) by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o0LIYDW0096132 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Thu, 21 Jan 2010 10:34:14 -0800 (PST) (envelope-from mj@feral.com) Message-ID: <4B589E25.3010608@feral.com> Date: Thu, 21 Jan 2010 10:34:13 -0800 From: Matthew Jacob Organization: Feral Software User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.5) Gecko/20091209 Fedora/3.0-3.fc11 Thunderbird/3.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <4B58976E.1020402@polands.org> In-Reply-To: <4B58976E.1020402@polands.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.2.3 (ns1.feral.com [10.8.0.1]); Thu, 21 Jan 2010 10:34:14 -0800 (PST) Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: mj@feral.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 18:34:16 -0000 Can you get a stack traceback? > Hello, > > I've got an 8.0-STABLE (amd64) box with 4GB RAM. The machine is > running off a 6-disk RAIDZ1 booting from GPT. The box consistently > panics on unixbench's fsdisk program. 
> > I have been gathering some metrics in an attempt to isolate the > parameters that are significant, but I admit I do not really > understand all the relationships. At this point, I have nothing set > in /boot/loader.conf > > Here are some of the values I was recording within seconds of the panic: > > # dmesg | grep memory > real memory = 4294967296 (4096 MB) > avail memory = 3961372672 (3777 MB) > > kstat.zfs.misc.arcstats.size: 308522248 > vfs.zfs.arc_max: 829480960 > vfs.zfs.arc_meta_limit: 207370240 > vfs.zfs.arc_meta_used: 165575944 > vfs.zfs.arc_min: 103685120 > vm.kmem_size: 1327169536 > vm.kmem_size_max: 329853485875 > > # vmstat -m | egrep 'InUse|solaris' > Type InUse MemUse HighUse > solaris 491349 1316172K - > > % zpool status > pool: bethesda > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > bethesda ONLINE 0 0 0 > raidz1 ONLINE 0 0 0 > gpt/disk0 ONLINE 0 0 0 > gpt/disk1 ONLINE 0 0 0 > gpt/disk2 ONLINE 0 0 0 > gpt/disk3 ONLINE 0 0 0 > gpt/disk4 ONLINE 0 0 0 > gpt/disk5 ONLINE 0 0 0 > > errors: No known data errors > > My concern is if I can panic the box with a simple file system > benchmark, what will happen when I rysnc files across a 1GB LAN > connection? I am very willing to run any number of tests and tweek > ZFS as necessary. Please advise. 
> > From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 19:06:04 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E1C12106568D for ; Thu, 21 Jan 2010 19:06:04 +0000 (UTC) (envelope-from mcdouga9@egr.msu.edu) Received: from mx.egr.msu.edu (surfnturf.egr.msu.edu [35.9.37.164]) by mx1.freebsd.org (Postfix) with ESMTP id B34488FC08 for ; Thu, 21 Jan 2010 19:06:04 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mx.egr.msu.edu (Postfix) with ESMTP id 3FE6CEE47D; Thu, 21 Jan 2010 13:43:54 -0500 (EST) X-Virus-Scanned: amavisd-new at egr.msu.edu Received: from mx.egr.msu.edu ([127.0.0.1]) by localhost (surfnturf.egr.msu.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id TMYxvJwxKnoD; Thu, 21 Jan 2010 13:43:54 -0500 (EST) Received: from [35.9.44.65] (daemon.egr.msu.edu [35.9.44.65]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: mcdouga9) by mx.egr.msu.edu (Postfix) with ESMTPSA id F3E50EE47A; Thu, 21 Jan 2010 13:43:53 -0500 (EST) Message-ID: <4B58A069.8000802@egr.msu.edu> Date: Thu, 21 Jan 2010 13:43:53 -0500 From: Adam McDougall User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.5) Gecko/20100103 Thunderbird/3.0 MIME-Version: 1.0 To: doug@polands.org References: <4B58976E.1020402@polands.org> In-Reply-To: <4B58976E.1020402@polands.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 19:06:05 -0000 Put this in /boot/loader.conf: vm.kmem_size="20G" It is intentionally higher than your amount of ram. 
On 01/21/10 13:05, Doug Poland wrote: > Hello, > > I've got an 8.0-STABLE (amd64) box with 4GB RAM. The machine is running > off a 6-disk RAIDZ1 booting from GPT. The box consistently panics on > unixbench's fsdisk program. > > I have been gathering some metrics in an attempt to isolate the > parameters that are significant, but I admit I do not really understand > all the relationships. At this point, I have nothing set in > /boot/loader.conf > > Here are some of the values I was recording within seconds of the panic: > > # dmesg | grep memory > real memory = 4294967296 (4096 MB) > avail memory = 3961372672 (3777 MB) > > kstat.zfs.misc.arcstats.size: 308522248 > vfs.zfs.arc_max: 829480960 > vfs.zfs.arc_meta_limit: 207370240 > vfs.zfs.arc_meta_used: 165575944 > vfs.zfs.arc_min: 103685120 > vm.kmem_size: 1327169536 > vm.kmem_size_max: 329853485875 > > # vmstat -m | egrep 'InUse|solaris' > Type InUse MemUse HighUse > solaris 491349 1316172K - > > % zpool status > pool: bethesda > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > bethesda ONLINE 0 0 0 > raidz1 ONLINE 0 0 0 > gpt/disk0 ONLINE 0 0 0 > gpt/disk1 ONLINE 0 0 0 > gpt/disk2 ONLINE 0 0 0 > gpt/disk3 ONLINE 0 0 0 > gpt/disk4 ONLINE 0 0 0 > gpt/disk5 ONLINE 0 0 0 > > errors: No known data errors > > My concern is if I can panic the box with a simple file system > benchmark, what will happen when I rysnc files across a 1GB LAN > connection? I am very willing to run any number of tests and tweek ZFS > as necessary. Please advise. 
> > From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 19:39:41 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2F004106566B for ; Thu, 21 Jan 2010 19:39:40 +0000 (UTC) (envelope-from doug@polands.org) Received: from hrndva-omtalb.mail.rr.com (hrndva-omtalb.mail.rr.com [71.74.56.125]) by mx1.freebsd.org (Postfix) with ESMTP id A58CF8FC19 for ; Thu, 21 Jan 2010 19:39:40 +0000 (UTC) X-Authority-Analysis: v=1.0 c=1 a=GSN_Y9T6cv4A:10 a=MWEe3nmaG1t0ArqGX8AA:9 a=OUZZS9D8PIBbVN6dIZn3FrZrN-QA:4 X-Cloudmark-Score: 0 X-Originating-IP: 75.87.219.217 Received: from [75.87.219.217] ([75.87.219.217:56018] helo=haran.polands.org) by hrndva-oedge02.mail.rr.com (envelope-from ) (ecelerity 2.2.2.39 r()) with ESMTP id 2A/2F-11553-B7DA85B4; Thu, 21 Jan 2010 19:39:39 +0000 Received: from [172.16.1.37] (sichem-wifi.polands.org [172.16.1.37]) by haran.polands.org (8.14.3/8.14.3) with ESMTP id o0LJdYtt027262; Thu, 21 Jan 2010 13:39:35 -0600 (CST) (envelope-from doug@polands.org) Message-ID: <4B58AD76.6000707@polands.org> Date: Thu, 21 Jan 2010 13:39:34 -0600 From: Doug Poland User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.1.5) Gecko/20091204 Thunderbird/3.0 MIME-Version: 1.0 To: mj@feral.com References: <4B58976E.1020402@polands.org> <4B589E25.3010608@feral.com> In-Reply-To: <4B589E25.3010608@feral.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 19:39:41 -0000 On 2010-01-21 12:34, Matthew Jacob wrote: > Can you get a stack traceback? 
> I am willing, but would need pointers to docs on how to make this happen. -- Regards, Doug From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 20:43:54 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 68444106566C for ; Thu, 21 Jan 2010 20:43:54 +0000 (UTC) (envelope-from gcorcoran@rcn.com) Received: from smtp02.lnh.mail.rcn.net (smtp02.lnh.mail.rcn.net [207.172.157.102]) by mx1.freebsd.org (Postfix) with ESMTP id 1ABA88FC08 for ; Thu, 21 Jan 2010 20:43:53 +0000 (UTC) Received: from mr08.lnh.mail.rcn.net ([207.172.157.28]) by smtp02.lnh.mail.rcn.net with ESMTP; 21 Jan 2010 15:43:53 -0500 Received: from smtp01.lnh.mail.rcn.net (smtp01.lnh.mail.rcn.net [207.172.4.11]) by mr08.lnh.mail.rcn.net (MOS 3.10.7-GA) with ESMTP id LJH05590; Thu, 21 Jan 2010 15:43:50 -0500 (EST) X-Auth-ID: gcorcoran Received: from 216-164-180-100.c3-0.tlg-ubr8.atw-tlg.pa.cable.rcn.com (HELO [10.56.78.161]) ([216.164.180.100]) by smtp01.lnh.mail.rcn.net with ESMTP; 21 Jan 2010 15:43:47 -0500 Message-ID: <4B58BD2D.30803@rcn.com> Date: Thu, 21 Jan 2010 15:46:37 -0500 From: Gary Corcoran User-Agent: Thunderbird 2.0.0.23 (Windows/20090812) MIME-Version: 1.0 To: Adam McDougall References: <4B58976E.1020402@polands.org> <4B58A069.8000802@egr.msu.edu> In-Reply-To: <4B58A069.8000802@egr.msu.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Junkmail-Whitelist: YES (by domain whitelist at mr08.lnh.mail.rcn.net) Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 20:43:54 -0000 Adam McDougall wrote: > Put this in /boot/loader.conf: > vm.kmem_size="20G" > > It is intentionally higher than 
your amount of ram. Would you mind explaining... 1) why this fixes the kmem_map too small problem ? 2) why it should be larger than the amount of RAM, and by how much ? Thanks, Gary > On 01/21/10 13:05, Doug Poland wrote: >> Hello, >> >> I've got an 8.0-STABLE (amd64) box with 4GB RAM. The machine is running >> off a 6-disk RAIDZ1 booting from GPT. The box consistently panics on >> unixbench's fsdisk program. >> >> I have been gathering some metrics in an attempt to isolate the >> parameters that are significant, but I admit I do not really understand >> all the relationships. At this point, I have nothing set in >> /boot/loader.conf >> >> Here are some of the values I was recording within seconds of the panic: >> >> # dmesg | grep memory >> real memory = 4294967296 (4096 MB) >> avail memory = 3961372672 (3777 MB) >> >> kstat.zfs.misc.arcstats.size: 308522248 >> vfs.zfs.arc_max: 829480960 >> vfs.zfs.arc_meta_limit: 207370240 >> vfs.zfs.arc_meta_used: 165575944 >> vfs.zfs.arc_min: 103685120 >> vm.kmem_size: 1327169536 >> vm.kmem_size_max: 329853485875 >> >> # vmstat -m | egrep 'InUse|solaris' >> Type InUse MemUse HighUse >> solaris 491349 1316172K - >> >> % zpool status >> pool: bethesda >> state: ONLINE >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> bethesda ONLINE 0 0 0 >> raidz1 ONLINE 0 0 0 >> gpt/disk0 ONLINE 0 0 0 >> gpt/disk1 ONLINE 0 0 0 >> gpt/disk2 ONLINE 0 0 0 >> gpt/disk3 ONLINE 0 0 0 >> gpt/disk4 ONLINE 0 0 0 >> gpt/disk5 ONLINE 0 0 0 >> >> errors: No known data errors >> >> My concern is if I can panic the box with a simple file system >> benchmark, what will happen when I rysnc files across a 1GB LAN >> connection? I am very willing to run any number of tests and tweek ZFS >> as necessary. Please advise. 
>> From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 22:21:37 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 33FBD106568B for ; Thu, 21 Jan 2010 22:21:37 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-iw0-f171.google.com (mail-iw0-f171.google.com [209.85.223.171]) by mx1.freebsd.org (Postfix) with ESMTP id C9FE58FC14 for ; Thu, 21 Jan 2010 22:21:36 +0000 (UTC) Received: by iwn1 with SMTP id 1so462106iwn.28 for ; Thu, 21 Jan 2010 14:21:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type; bh=FN1/oI683hZiYEsRodLCu4ZS1+4HIjRo2uYMdzy57E4=; b=OYVZ93sU9qEW9vaXnKUzPo2N50JqorSMD9Em96VDB7mRf7L5qnTHgqSorZafIY3zhl 03kgvX0v4xV7MF4Xi8fBiA2NKhny+16jt82HCKcBOAdF6c4Ko0rBWeOZqFDA/Q6vqOwD pxgLmyT0nSoXR9zu3osdq4VqwFgAQHpWSSe6Y= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=Q7ALshnkrQN8Bj1TOqFpoSNMQzdk0AYm0EosduYPphO1peN/Qvk2UX2fYiEFDCuTHo ZUSolXv/mX+mxQxRRfVfhYzgQdpbIMs2QbKumEo9al6Z1Xs9kpnJtHEPwc6vaG7TbcCO eay1zHvpOpjHs9AjhV53zOKStcjEIrM53JR7s= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.231.147.210 with SMTP id m18mr3365368ibv.48.1264112494583; Thu, 21 Jan 2010 14:21:34 -0800 (PST) In-Reply-To: <4B58BD2D.30803@rcn.com> References: <4B58976E.1020402@polands.org> <4B58A069.8000802@egr.msu.edu> <4B58BD2D.30803@rcn.com> Date: Thu, 21 Jan 2010 14:21:34 -0800 X-Google-Sender-Auth: 53b91750d8fa3c0a Message-ID: From: Artem Belevich To: Gary Corcoran Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org 
X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 22:21:37 -0000

On Thu, Jan 21, 2010 at 12:46 PM, Gary Corcoran wrote:
> Adam McDougall wrote:
>>
>> Put this in /boot/loader.conf:
>> vm.kmem_size="20G"
>>
>> It is intentionally higher than your amount of ram.
>
> Would you mind explaining...
> 1) why this fixes the kmem_map too small problem ?

Because it explicitly makes kmem_map larger.

> 2) why it should be larger than the amount of RAM, and by how much ?

ZFS needs access to a lot of memory for the ARC, and it allocates and frees memory fairly randomly. That raises two issues.

The first issue is that the kernel is by default fairly conservative about its memory needs. vm.kmem_size, which limits the address space for in-kernel memory allocations, is by default set to a fairly low value which works reasonably well in most cases. However, for ZFS it needs to be bumped up to allow large amounts of memory to be allocated by ZFS.

The second problem is memory fragmentation. If you set vm.kmem_size == physical memory size, over time you may end up in a situation where there is plenty of physical memory available, but there is no single contiguous block of address space to map that memory into. The FreeBSD allocator is pretty good at avoiding fragmentation, but you still need more address space than the amount of memory that could potentially be allocated. I'd say that vm.kmem_size should be a few multiples of the amount of memory that you're planning to allocate.
Just my $0.02 --Artem From owner-freebsd-fs@FreeBSD.ORG Thu Jan 21 22:41:05 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4ECC51065670 for ; Thu, 21 Jan 2010 22:41:05 +0000 (UTC) (envelope-from mcdouga9@egr.msu.edu) Received: from mx.egr.msu.edu (surfnturf.egr.msu.edu [35.9.37.164]) by mx1.freebsd.org (Postfix) with ESMTP id 1F6D48FC15 for ; Thu, 21 Jan 2010 22:41:04 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mx.egr.msu.edu (Postfix) with ESMTP id EF6B3EE879; Thu, 21 Jan 2010 17:27:31 -0500 (EST) X-Virus-Scanned: amavisd-new at egr.msu.edu Received: from mx.egr.msu.edu ([127.0.0.1]) by localhost (surfnturf.egr.msu.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cAaGVpmm-sbq; Thu, 21 Jan 2010 17:27:31 -0500 (EST) Received: from [35.9.44.65] (daemon.egr.msu.edu [35.9.44.65]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: mcdouga9) by mx.egr.msu.edu (Postfix) with ESMTPSA id ACB18EE873; Thu, 21 Jan 2010 17:27:31 -0500 (EST) Message-ID: <4B58D4D3.80009@egr.msu.edu> Date: Thu, 21 Jan 2010 17:27:31 -0500 From: Adam McDougall User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.5) Gecko/20100103 Thunderbird/3.0 MIME-Version: 1.0 To: Artem Belevich References: <4B58976E.1020402@polands.org> <4B58A069.8000802@egr.msu.edu> <4B58BD2D.30803@rcn.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jan 2010 22:41:05 -0000 On 01/21/10 17:21, Artem Belevich wrote: > On Thu, Jan 21, 2010 at 
12:46 PM, Gary Corcoran wrote: >> Adam McDougall wrote: >>> >>> Put this in /boot/loader.conf: >>> vm.kmem_size="20G" >>> >>> It is intentionally higher than your amount of ram. >> >> Would you mind explaining... >> 1) why this fixes the kmem_map too small problem ? > > Because it explicitly makes kmem_map larger. > >> 2) why it should be larger than the amount of RAM, and by how much ? > > ZFS needs access to a lot of memory for ARC and it allocates/frees > memory fairly randomly. That raises two issues. > > First issue is that kernel is by default fairly conservative about its > memory needs. vm.kmem_size which limits address space for in-kernel > memory allocations is by default set to a fairly low value which works > reasonably well in most of the cases. However, for ZFS it needs to be > bumped up allow large amounts of memory to be allocated by ZFS. > > Second problem is memory fragmentation. If you set vm.kmem_size == > physical memory size, over time you may end up with a situation when > there is plenty of physical memory available, but there is no single > contiguous block of address space to map that memory into. FreeBSD > allocator is pretty good about avoiding fragmentation but you still do > need more address space than the amount of memory that could > potentially be allocated. I'd say that vm.kmem_size should be few > multiples of amounts of memory that you're planning to allocate. > > Just my $0.02 > > --Artem > Exactly what I would have said, thanks :) I'd imagine the kmem_size could be much larger still, closer to kmem_size_max, but I just picked 20G as a default for my servers that have 8G or less and I haven't seen an out of kmem panic for as long as I could raise kmem_size sufficiently high (a change was made around 6 months ago). kmem_size doesn't seem to "grow" (much?) towards kmem_size_max, it is what it is, and you need to make sure it is big enough for your needs. I have systems with just one gig and they run fine. 
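For reference, the setting discussed in this thread would look roughly like the fragment below. The 20G figure is simply the value Adam picked for his own amd64 servers with 8G of RAM or less, not a tested recommendation for any other machine, and loader tunables only take effect at the next boot.

```
# /boot/loader.conf
# Value suggested in this thread; intentionally larger than physical RAM
# so that kmem_map has spare address space to absorb fragmentation.
vm.kmem_size="20G"

# After rebooting, the effective values can be checked with:
#   sysctl vm.kmem_size vm.kmem_size_max
```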
From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 04:28:47 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2A4081065692 for ; Fri, 22 Jan 2010 04:28:47 +0000 (UTC) (envelope-from djp@polands.org) Received: from hrndva-omtalb.mail.rr.com (hrndva-omtalb.mail.rr.com [71.74.56.122]) by mx1.freebsd.org (Postfix) with ESMTP id D54D38FC16 for ; Fri, 22 Jan 2010 04:28:46 +0000 (UTC) X-Authority-Analysis: v=1.0 c=1 a=GSN_Y9T6cv4A:10 a=OA2lqS22AAAA:8 a=VDDltm6BRT3w7GqL0QUA:9 a=rtW85IiFBK7f9twzNHwA:7 a=x49OsMu7hDFgt7HixzlrjwBuJWQA:4 a=ZZAfTtC2Ym4A:10 a=azDIrACOat2JRc_j:21 a=TT84m8gFpZP1GIKG:21 X-Cloudmark-Score: 0 X-Originating-IP: 75.87.219.217 Received: from [75.87.219.217] ([75.87.219.217:52983] helo=haran.polands.org) by hrndva-oedge03.mail.rr.com (envelope-from ) (ecelerity 2.2.2.39 r()) with ESMTP id EA/9C-05903-D79295B4; Fri, 22 Jan 2010 04:28:45 +0000 Received: from moab.polands.org (moab.polands.org [172.16.1.8]) by haran.polands.org (8.14.3/8.14.3) with ESMTP id o0M4Si9U028661; Thu, 21 Jan 2010 22:28:44 -0600 (CST) (envelope-from djp@polands.org) Received: from moab.polands.org (localhost [127.0.0.1]) by moab.polands.org (8.14.3/8.14.3) with ESMTP id o0M4SiH8008964; Thu, 21 Jan 2010 22:28:44 -0600 (CST) (envelope-from djp@moab.polands.org) Received: (from djp@localhost) by moab.polands.org (8.14.3/8.14.3/Submit) id o0M4ShMH008963; Thu, 21 Jan 2010 22:28:43 -0600 (CST) (envelope-from djp) Date: Thu, 21 Jan 2010 22:28:43 -0600 From: Doug Poland To: Adam McDougall Message-ID: <20100122042843.GA8858@polands.org> References: <4B58976E.1020402@polands.org> <4B58A069.8000802@egr.msu.edu> <4B58BD2D.30803@rcn.com> <4B58D4D3.80009@egr.msu.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4B58D4D3.80009@egr.msu.edu> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable 
ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 04:28:47 -0000 On Thu, Jan 21, 2010 at 05:27:31PM -0500, Adam McDougall wrote: > On 01/21/10 17:21, Artem Belevich wrote: > >On Thu, Jan 21, 2010 at 12:46 PM, Gary Corcoran wrote: > >>Adam McDougall wrote: > >>> > >>>Put this in /boot/loader.conf: > >>>vm.kmem_size="20G" > >>> > >>>It is intentionally higher than your amount of ram. > >> > >>Would you mind explaining... > >>1) why this fixes the kmem_map too small problem ? > > > >Because it explicitly makes kmem_map larger. > > > >>2) why it should be larger than the amount of RAM, and by how much ? > > > >ZFS needs access to a lot of memory for ARC and it allocates/frees > >memory fairly randomly. That raises two issues. > > > >First issue is that kernel is by default fairly conservative about > >its memory needs. vm.kmem_size which limits address space for > >in-kernel memory allocations is by default set to a fairly low value > >which works reasonably well in most of the cases. However, for ZFS it > >needs to be bumped up allow large amounts of memory to be allocated > >by ZFS. > > > >Second problem is memory fragmentation. If you set vm.kmem_size == > >physical memory size, over time you may end up with a situation when > >there is plenty of physical memory available, but there is no single > >contiguous block of address space to map that memory into. FreeBSD > >allocator is pretty good about avoiding fragmentation but you still > >do need more address space than the amount of memory that could > >potentially be allocated. I'd say that vm.kmem_size should be few > >multiples of amounts of memory that you're planning to allocate. 
> >
> >Just my $0.02
> >
>
> Exactly what I would have said, thanks :) I'd imagine the kmem_size
> could be much larger still, closer to kmem_size_max, but I just picked
> 20G as a default for my servers that have 8G or less and I haven't
> seen an out of kmem panic for as long as I could raise kmem_size
> sufficiently high (a change was made around 6 months ago). kmem_size
> doesn't seem to "grow" (much?) towards kmem_size_max, it is what it
> is, and you need to make sure it is big enough for your needs. I have
> systems with just one gig and they run fine.
>

Interesting discussion :) I added vm.kmem_size="20G" to /boot/loader.conf per your instructions. This time, it didn't panic at the same point in the test; however, it appears the filesystem is "hanging".

On the fsdisk test, I hit ^T and get:
cmd: fsdisk 37066 [zio->io_cv)] 245.62r 0.12u 25.10s 0

My various metrics are still running, but anything that needs the filesystem appears "stuck".

The memory usage of the item "solaris" (vmstat -m | grep solaris) spiked at 3334781952 (3180.30 MiB).

# zpool iostat 2, a ^T shows:
load: 0.00 cmd: zpool 934 [tx->tx_quiesce_done_cv)] 2052.45r 0.06u 0.39s 0% 0k

# vmstat -v | grep solaris to disk every second and it's hung at:
load: 0.00 cmd: sh 38551 [zfs] 909.85r 0.00u 0.00s 0% 16k

Any suggestions?
-- Regards, Doug From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 06:09:20 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C04151065670 for ; Fri, 22 Jan 2010 06:09:20 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-iw0-f174.google.com (mail-iw0-f174.google.com [209.85.223.174]) by mx1.freebsd.org (Postfix) with ESMTP id 831BB8FC14 for ; Fri, 22 Jan 2010 06:09:20 +0000 (UTC) Received: by iwn4 with SMTP id 4so130021iwn.27 for ; Thu, 21 Jan 2010 22:09:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=BC8lzL27/haxcmoEmRUIhTTL8UYM8FQ/2IuAOA6ajMM=; b=lPUimAo/X15DEwYUfBAEOF3AuWCcc9VX26+BVaQ9ZpRpropQsphteEcj1H/DR+ei+B TMi5VNoUo5qeNCNt9GAEHY/wYhHErCOUaHr/1WOqdWd0STDzrPHxIWLyVrKsZItmILTg uqs9HGgTZvC4p5zhT4PPktLo5cGUnBH8xLKiI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=RMXP1/++lD52WSo+XQ1nqP9MsJK3GVgjuR91v42Ueok5GVOOBU11teqi90KpJBPHro /1ZB8HyqaCc+h6YU4c4Ps09s4rE/lMKQYVXAvbSRUhQqIeGjr7TNUivqmkdcIlsUC6OV BkuE36sIDlE+PxWlnTTQwS/38hlc6LK68A3xw= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.231.151.207 with SMTP id d15mr3966363ibw.44.1264140559712; Thu, 21 Jan 2010 22:09:19 -0800 (PST) In-Reply-To: <20100122042843.GA8858@polands.org> References: <4B58976E.1020402@polands.org> <4B58A069.8000802@egr.msu.edu> <4B58BD2D.30803@rcn.com> <4B58D4D3.80009@egr.msu.edu> <20100122042843.GA8858@polands.org> Date: Thu, 21 Jan 2010 22:09:19 -0800 X-Google-Sender-Auth: 6e8cd2d6384ba288 Message-ID: From: Artem Belevich To: Doug Poland Content-Type: text/plain; 
charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 06:09:20 -0000

Next step would be to set vfs.zfs.arc_max to a value that's somewhat
below your physical memory size. Let's say - 1G or so. On my box with
8GB of RAM I set these values:

vfs.zfs.arc_max="6500M"
vfs.zfs.arc_min="4G"

You didn't mention how recent is your kernel. There were quite a few
bugfixes committed to -8. Make sure your kernel is r201987 or newer.

--Artem

On Thu, Jan 21, 2010 at 8:28 PM, Doug Poland wrote:
> Interesting discussion :) I added vm.kmem_size="20G" to
> /boot/loader.conf per your instructions. This time, it didn't panic at
> the same point in the test, however, it appears the filesystem is
> "hanging".
>
> On the fdisk test, I hit T and get:
> cmd: fsdisk 37066 [zio->io_cv)] 245.62r 0.12u 25.10s 0
>
> My various metrics are still running, but anything that needs the
> filesystem appears "stuck".
>
> The memory usage of the item "solaris" (vmstat -m | grep solaris) spiked
> at 3334781952 (3180.30 MiB).
>
> # zpool iostat 2, a T shows:
> load: 0.00 cmd: zpool 934 [tx->tx_quiesce_done_cv)] 2052.45r 0.06u 0.39s 0% 0k
>
> # vmstat -v | grep solaris to disk every second and it's
> hung at:
> load: 0.00 cmd: sh 38551 [zfs] 909.85r 0.00u 0.00s 0% 16k
>
> Any suggestions!
>
> --
> Regards,
> Doug
>
From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 07:20:42 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A201E106566B for ; Fri, 22 Jan 2010 07:20:42 +0000 (UTC) (envelope-from jh@FreeBSD.org) Received: from gw01.mail.saunalahti.fi (gw01.mail.saunalahti.fi [195.197.172.115]) by mx1.freebsd.org (Postfix) with ESMTP id 620C08FC14 for ; Fri, 22 Jan 2010 07:20:42 +0000 (UTC) Received: from a91-153-117-195.elisa-laajakaista.fi (a91-153-117-195.elisa-laajakaista.fi [91.153.117.195]) by gw01.mail.saunalahti.fi (Postfix) with SMTP id 451B31513FA; Fri, 22 Jan 2010 09:20:38 +0200 (EET) Date: Fri, 22 Jan 2010 09:20:38 +0200 From: Jaakko Heinonen To: freebsd-fs@FreeBSD.org Message-ID: <20100122072038.GA977@a91-153-117-195.elisa-laajakaista.fi> References: <201001080757.o087vhrr009799@svn.freebsd.org> <20100109051536.R57595@delplex.bde.org> <20100108214821.GA985@a91-153-117-195.elisa-laajakaista.fi> <20100110181132.D1354@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100110181132.D1354@besplex.bde.org> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: Subject: tmpfs maximum file size limit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 07:20:42 -0000

Unless I am missing something, the tmpfs maximum file size limit is
useless because it is set to the total amount of memory in the system
including swap ((cnt.v_page_count + get_swpgtotal()) * PAGE_SIZE). In
addition, it's wrong because it's set at mount time and swap space may
be added or removed after the mount.

So I propose adding a new mount option to make the limit configurable
at mount time and by default setting it to UINT64_MAX ("no limit").
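The two behaviours contrasted above can be sketched in a few lines of C. This is an illustrative userland model only, not the tmpfs code: old_maxfilesize(), new_maxfilesize() and their parameters are hypothetical names, and the page counts passed in are made-up inputs standing in for cnt.v_page_count and get_swpgtotal().

```c
#include <stdint.h>

/*
 * Behaviour described in the message: the limit is derived from RAM plus
 * swap once, at mount time, and is not updated afterwards.
 */
static uint64_t
old_maxfilesize(uint64_t ram_pages, uint64_t swap_pages, uint64_t page_size)
{
    return ((ram_pages + swap_pages) * page_size);
}

/*
 * Proposed behaviour: use the "maxfilesize" mount option when one is given,
 * otherwise fall back to UINT64_MAX ("no limit").
 */
static uint64_t
new_maxfilesize(int have_option, uint64_t option_value)
{
    return (have_option ? option_value : UINT64_MAX);
}
```

With 4096-byte pages, 1048576 pages of RAM and no swap, old_maxfilesize() gives 4 GiB; if swap is added after the mount, that stored value no longer matches the system, which is the second objection raised in the message.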
--- Add "maxfilesize" mount option for tmpfs to allow specifying the maximum file size limit. Default is UINT64_MAX when the option is not specified. Use tmpfs_mem_info() rather than get_swpgtotal() in tmpfs_mount() to check if there is enough memory available. Remove now unused get_swpgtotal(). The patch: http://people.freebsd.org/~jh/patches/tmpfs-maxfilesize.diff -- Jaakko From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 11:35:47 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 982BC1065693; Fri, 22 Jan 2010 11:35:47 +0000 (UTC) (envelope-from gleb.kurtsou@gmail.com) Received: from mail-ew0-f226.google.com (mail-ew0-f226.google.com [209.85.219.226]) by mx1.freebsd.org (Postfix) with ESMTP id BFFAB8FC0A; Fri, 22 Jan 2010 11:35:46 +0000 (UTC) Received: by ewy26 with SMTP id 26so44723ewy.3 for ; Fri, 22 Jan 2010 03:35:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=u0GHiDwcnR0fmxC1Bco75XhJwbkbkgZmfBo044WOj8E=; b=PT2WMU4IDMY+Bp8Psd9M72mrelmRd4xSQduFlflMv9HSjvGwbGNYYF3/wh5K7+7fTK SOs741Vw+AuPWmGtEitM6ocOlIyavcif+/dD7RagRGrNYLinnFSuHaR/WD/ZyznuLjEg 38flYNYABjBNthVcL0mZh2aZ1GZgzYm8cHyvo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=QdxHbInNDTSxBDazaVQHIIK2WSqn/76Q4hfe4Vo3SWD4AFpw/y5WTETMmelCHmG1vw E9PEMiMmcatxgaLCngChYWuno+oEgZD8h0NV1rJZaxKakAvIOGfv21Wmu0UTftgs+Spx 04OZj+DeVJIT90haFPWQGbnjwJTmI74K59crQ= Received: by 10.213.111.15 with SMTP id q15mr2625730ebp.86.1264160145855; Fri, 22 Jan 2010 03:35:45 -0800 (PST) Received: from localhost (lan-78-157-90-54.vln.skynet.lt [78.157.90.54]) by mx.google.com with ESMTPS 
id 5sm2785209eyh.0.2010.01.22.03.35.44 (version=TLSv1/SSLv3 cipher=RC4-MD5); Fri, 22 Jan 2010 03:35:45 -0800 (PST) Date: Fri, 22 Jan 2010 13:35:42 +0200 From: Gleb Kurtsou To: Jaakko Heinonen Message-ID: <20100122113542.GA1662@tops> References: <201001080757.o087vhrr009799@svn.freebsd.org> <20100109051536.R57595@delplex.bde.org> <20100108214821.GA985@a91-153-117-195.elisa-laajakaista.fi> <20100110181132.D1354@besplex.bde.org> <20100122072038.GA977@a91-153-117-195.elisa-laajakaista.fi> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20100122072038.GA977@a91-153-117-195.elisa-laajakaista.fi> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@FreeBSD.org Subject: Re: tmpfs maximum file size limit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 11:35:47 -0000 On (22/01/2010 09:20), Jaakko Heinonen wrote: > > Unless I am missing something tmpfs maximum file size limit useless > because it is set to the total amount of memory in the system including > swap ((cnt.v_page_count + get_swpgtotal()) * PAGE_SIZE). In addition, > it's wrong because it's set at mount time and swap space may be added or > removed after the mount. > > So I propose adding a new mount mount option to make the limit > configurable at mount time and by default setting it to UINT64_MAX ("no > limit"). Yes, that's why I removed it in the first place. I was using INT_MAX, but UINT64_MAX would be even better. I like the patch. Thanks, Gleb. > --- > > Add "maxfilesize" mount option for tmpfs to allow specifying the > maximum file size limit. Default is UINT64_MAX when the option is > not specified. > > Use tmpfs_mem_info() rather than get_swpgtotal() in tmpfs_mount() to > check if there is enough memory available. > > Remove now unused get_swpgtotal(). 
> > The patch: > > http://people.freebsd.org/~jh/patches/tmpfs-maxfilesize.diff > > -- > Jaakko From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 13:22:13 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 90CA3106568B for ; Fri, 22 Jan 2010 13:22:13 +0000 (UTC) (envelope-from doug@polands.org) Received: from hrndva-omtalb.mail.rr.com (hrndva-omtalb.mail.rr.com [71.74.56.122]) by mx1.freebsd.org (Postfix) with ESMTP id 488A08FC15 for ; Fri, 22 Jan 2010 13:22:12 +0000 (UTC) X-Authority-Analysis: v=1.0 c=1 a=GSN_Y9T6cv4A:10 a=5oGaD+IacabtnYYqBVNCkQ==:17 a=bqq2Vc5EAAAA:8 a=5zptC--39CfEUlsaC4YA:9 a=-c6Oll0x8xvuYqk6BTcA:7 a=hCRTB8BdqILmrSuMg6odhxlswz4A:4 a=5ERLOmoKdHQA:10 X-Cloudmark-Score: 0 X-Originating-IP: 75.87.219.217 Received: from [75.87.219.217] ([75.87.219.217:57808] helo=haran.polands.org) by hrndva-oedge04.mail.rr.com (envelope-from ) (ecelerity 2.2.2.39 r()) with ESMTP id 71/C8-10659-486A95B4; Fri, 22 Jan 2010 13:22:12 +0000 Received: from email.polands.org (ammon.polands.org [172.16.1.7]) by haran.polands.org (8.14.3/8.14.3) with ESMTP id o0MDMBUC030444; Fri, 22 Jan 2010 07:22:11 -0600 (CST) (envelope-from doug@polands.org) Received: from 209.103.214.34 (SquirrelMail authenticated user djp) by email.polands.org with HTTP; Fri, 22 Jan 2010 07:22:11 -0600 Message-ID: In-Reply-To: References: <4B58976E.1020402@polands.org> <4B58A069.8000802@egr.msu.edu> <4B58BD2D.30803@rcn.com> <4B58D4D3.80009@egr.msu.edu> <20100122042843.GA8858@polands.org> Date: Fri, 22 Jan 2010 07:22:11 -0600 From: "Doug Poland" To: "Artem Belevich" User-Agent: SquirrelMail/1.4.20-RC2 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 
Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 13:22:13 -0000

On Fri, January 22, 2010 00:09, Artem Belevich wrote:
> On Thu, Jan 21, 2010 at 8:28 PM, Doug Poland wrote:
>> Interesting discussion :) I added vm.kmem_size="20G" to
>> /boot/loader.conf per your instructions. This time, it didn't
>> panic at the same point in the test, however, it appears the
>> filesystem is "hanging".
>>
>> On the fdisk test, I hit T and get:
>> cmd: fsdisk 37066 [zio->io_cv)] 245.62r 0.12u 25.10s 0
>>
>> My various metrics are still running, but anything that needs the
>> filesystem appears "stuck".
>>
>> The memory usage of the item "solaris" (vmstat -m | grep solaris)
>> spiked at 3334781952 (3180.30 MiB).
>>
>> # zpool iostat 2, a T shows:
>> load: 0.00 cmd: zpool 934 [tx->tx_quiesce_done_cv)] 2052.45r 0.06u 0.39s 0% 0k
>>
>> # vmstat -v | grep solaris to disk every second and it's hung at:
>> load: 0.00 cmd: sh 38551 [zfs] 909.85r 0.00u 0.00s 0% 16k
>>
>> Any suggestions!
>>
> Next step would be to set vfs.zfs.arc_max to a value that's somewhat
> below your physical memory size. Let's say - 1G or so. On my box with
> 8GB of RAM I set these values:
>
> vfs.zfs.arc_max="6500M"
> vfs.zfs.arc_min="4G"
>

OK, I'll give that a shot...

> You didn't mention how recent is your kernel. There were quite a few
> bugfixes committed to -8. Make sure your kernel is r201987 or newer.
>

Right now I'm running a stock 8.0-RELEASE. I had tried it on -STABLE
a couple of days ago with similar results. I'll get the box back up
to -STABLE again. Do you advise to upgrade the pool to v14 for this
testing?
-- Regards, Doug From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 13:31:02 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7B4A61065670; Fri, 22 Jan 2010 13:31:02 +0000 (UTC) (envelope-from jh@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 53E7E8FC0A; Fri, 22 Jan 2010 13:31:02 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id o0MDV2Rd091685; Fri, 22 Jan 2010 13:31:02 GMT (envelope-from jh@freefall.freebsd.org) Received: (from jh@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id o0MDV2XQ091666; Fri, 22 Jan 2010 13:31:02 GMT (envelope-from jh) Date: Fri, 22 Jan 2010 13:31:02 GMT Message-Id: <201001221331.o0MDV2XQ091666@freefall.freebsd.org> To: a134qaed@gmail.com, jh@FreeBSD.org, freebsd-fs@FreeBSD.org, jh@FreeBSD.org From: jh@FreeBSD.org Cc: Subject: Re: kern/127659: [tmpfs] tmpfs memory leak X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 13:31:02 -0000 Synopsis: [tmpfs] tmpfs memory leak State-Changed-From-To: open->feedback State-Changed-By: jh State-Changed-When: Fri Jan 22 13:24:41 UTC 2010 State-Changed-Why: Did you try Yoshihiro Ota's suggestion? Do you still see this problem? Responsible-Changed-From-To: freebsd-fs->jh Responsible-Changed-By: jh Responsible-Changed-When: Fri Jan 22 13:24:41 UTC 2010 Responsible-Changed-Why: Track. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=127659 From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 16:00:27 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 81C1910656F4; Fri, 22 Jan 2010 16:00:27 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-bw0-f213.google.com (mail-bw0-f213.google.com [209.85.218.213]) by mx1.freebsd.org (Postfix) with ESMTP id AD9BF8FC0C; Fri, 22 Jan 2010 16:00:26 +0000 (UTC) Received: by bwz5 with SMTP id 5so1223792bwz.3 for ; Fri, 22 Jan 2010 08:00:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:to:subject:references :organization:from:date:in-reply-to:message-id:user-agent :mime-version:content-type:content-transfer-encoding; bh=lR54PkQR1F2FsB9R46HknPl0T/g9NX1AZEceUOxri2w=; b=XoaJ0km04Yw873HEo+DH1UEOAtkQsZD/R/5cw79O/D1hlP3IajIYwIHJWMBGeDTtlo 1xoSwdUM4Oakzsn1kJulNdFrNWCisF9VzCKu/UtUBC9mVoOm65OY/qhTaDi2KEJkMDDu ewMXxBUxhP/DVIWQugFCF3pCq63Kd79icyc6U= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=to:subject:references:organization:from:date:in-reply-to:message-id :user-agent:mime-version:content-type:content-transfer-encoding; b=SCfXG+141KYQ4AFIvg2W9BWFKMqJDuvaoAZkxyi314oDPNn0/NpJM4YmgeitzSY36J j6H/s0VtcsnaYDwZIp/xtlynIF2eJeSl25VmbnZ55SR4E/5eHvB6/5Dx/0h8JTrh0ouZ 7JwCf1PWgKfvWKx0mj29T1Dfe+CSKd8GA2c50= Received: by 10.204.144.86 with SMTP id y22mr1757061bku.43.1264176025495; Fri, 22 Jan 2010 08:00:25 -0800 (PST) Received: from localhost (ms.singlescrowd.net [80.85.90.67]) by mx.google.com with ESMTPS id 15sm1029159bwz.4.2010.01.22.08.00.22 (version=TLSv1/SSLv3 cipher=RC4-MD5); Fri, 22 Jan 2010 08:00:23 -0800 (PST) To: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org References: <86ocl272mb.fsf@kopusha.onet> <86tyuqnz9x.fsf@zhuzha.ua1> <86zl4awmon.fsf@zhuzha.ua1> <86vdeywmha.fsf@zhuzha.ua1> 
Organization: TOA Ukraine From: Mikolaj Golub Date: Fri, 22 Jan 2010 18:00:21 +0200 In-Reply-To: <86vdeywmha.fsf@zhuzha.ua1> (Mikolaj Golub's message of "Tue\, 19 Jan 2010 10\:02\:57 +0200") Message-ID: <86vdeuuo2y.fsf@zhuzha.ua1> User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 8bit Cc: Subject: Re: FreeBSD NFS client/Linux NFS server issue X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 16:00:27 -0000 On Tue, 19 Jan 2010 10:02:57 +0200 Mikolaj Golub wrote: > So, on some of our freebsd7.1 nfs clients (and it looks like we have had > similar case with 6.3), which have several nfs mounts to the same CentOS 5.3 > NFS server (mount options: rw,-3,-T,-s,-i,-r=32768,-w=32768,-o=noinet6), at > some moment the access to one of the NFS mount gets stuck, while the access to > the other mounts works ok. > > In all cases we have been observed so far the first gotten stuck process was > php script (or two) that was (were) writing to logs file (appending). In > tcpdump we see that every write to the file causes the sequence of the > following rpc: ACCESS - READ - WRITE - COMMIT. And at some moment this stops > after READ rpc call and successful reply. > > After this in tcpdump successful readdir/access/lookup/fstat calls are > observed from our other utilities, which just check the presence of some files > and they work ok (df also works). The php process at this state is in bo_wwait > invalidating buffer cache [1]. > > If at this time we try accessing the share with mc then it hangs acquiring the > vn_lock held by php process [2] and after this any operations with this NFS > share hang (df hangs too). 
> > If instead some other process is started that writes to some other file on > this share (append) then the first process "unfreezes" too (starting from > WRITE rpc, so there is no any retransmits). So it looks for me that the problem here is that eventually problem nfsmount ends up in this state: (kgdb) p *nmp $1 = {nm_mtx = {lock_object = {lo_name = 0xc0b808ee "NFSmount lock", lo_type = 0xc0b808ee "NFSmount lock", lo_flags = 16973824, lo_witness_data = {lod_list = { stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 0}, nm_flag = 35399, nm_state = 1310720, nm_mountp = 0xc6b472cc, nm_numgrps = 16, nm_fh = "\001\000\000\000\000\223\000\000\001@\003\n", '\0' , nm_fhsize = 12, nm_rpcclnt = {rc_flag = 0, rc_wsize = 0, rc_rsize = 0, rc_name = 0x0, rc_so = 0x0, rc_sotype = 0, rc_soproto = 0, rc_soflags = 0, rc_timeo = 0, rc_retry = 0, rc_srtt = {0, 0, 0, 0}, rc_sdrtt = {0, 0, 0, 0}, rc_sent = 0, rc_cwnd = 0, rc_timeouts = 0, rc_deadthresh = 0, rc_authtype = 0, rc_auth = 0x0, rc_prog = 0x0, rc_proctlen = 0, rc_proct = 0x0}, nm_so = 0xc6e81d00, nm_sotype = 1, nm_soproto = 0, nm_soflags = 44, nm_nam = 0xc6948640, nm_timeo = 6000, nm_retry = 2, nm_srtt = {15, 15, 31, 52}, nm_sdrtt = {3, 3, 15, 15}, nm_sent = 0, nm_cwnd = 4096, nm_timeouts = 0, nm_deadthresh = 9, nm_rsize = 32768, nm_wsize = 32768, nm_readdirsize = 4096, nm_readahead = 1, nm_wcommitsize = 1177026, nm_acdirmin = 30, nm_acdirmax = 60, nm_acregmin = 3, nm_acregmax = 60, nm_verf = "Jë¾W\000\004oí", nm_bufq = {tqh_first = 0xda82dc70, tqh_last = 0xda8058e0}, nm_bufqlen = 2, nm_bufqwant = 0, nm_bufqiods = 1, nm_maxfilesize = 1099511627775, nm_rpcops = 0xc0c2b5bc, nm_tprintf_initial_delay = 12, nm_tprintf_delay = 30, nm_nfstcpstate = { rpcresid = 0, flags = 1, sock_send_inprog = 0}, nm_hostname = "172.30.10.92\000/var/www/app31", '\0' , nm_clientid = 0, nm_fsid = { val = {0, 0}}, nm_lease_time = 0, nm_last_renewal = 0} We have nonempty nm_bufq, nm_bufqiods = 1, but actually there is no nfsiod 
thread running for this mount, which is wrong: nm_bufq will not be
emptied until some other process starts writing to the nfsmount and
starts an nfsiod thread for this mount.

Reviewing the code to see how this could happen, I see the following
path. Could someone confirm or refute it?

In nfs_bio.c:nfs_asyncio() we have:

    mtx_lock(&nfs_iod_mtx);
    ...
    /*
     * Find a free iod to process this request.
     */
    for (iod = 0; iod < nfs_numasync; iod++)
        if (nfs_iodwant[iod]) {
            gotiod = TRUE;
            break;
        }

    /*
     * Try to create one if none are free.
     */
    if (!gotiod) {
        iod = nfs_nfsiodnew();
        if (iod != -1)
            gotiod = TRUE;
    }

Consider the situation when a new nfsiod is created.

nfs_nfsiod.c:nfs_nfsiodnew() unlocks nfs_iod_mtx before creating the
nfssvc_iod thread:

    mtx_unlock(&nfs_iod_mtx);
    error = kthread_create(nfssvc_iod, nfs_asyncdaemon + i, NULL, RFHIGHPID,
        0, "nfsiod %d", newiod);
    mtx_lock(&nfs_iod_mtx);

And nfs_nfsiod.c:nfssvc_iod() does the following:

    mtx_lock(&nfs_iod_mtx);
    ...
    nfs_iodwant[myiod] = curthread->td_proc;
    nfs_iodmount[myiod] = NULL;
    ...
    error = msleep(&nfs_iodwant[myiod], &nfs_iod_mtx, PWAIT | PCATCH,
        "-", timo);

Suppose that at this moment another nfs_asyncio() request for another
nfsmount arrives and that thread locks nfs_iod_mtx. It will then find
nfs_iodwant[iod] set in the "for" loop and will use that iod. When the
first thread actually returns from nfs_nfsiodnew(), it will insert its
buffer into nmp->nm_bufq, but the nfsiod will be processing the other
nmp.

It looks like the fix for this situation would be to check
nfs_iodwant[iod] after nfs_nfsiodnew():

--- nfs_bio.c.orig 2010-01-22 15:38:02.000000000 +0000
+++ nfs_bio.c 2010-01-22 15:39:58.000000000 +0000
@@ -1385,7 +1385,7 @@ again:
      */
     if (!gotiod) {
         iod = nfs_nfsiodnew();
-        if (iod != -1)
+        if ((iod != -1) && (nfs_iodwant[iod] == NULL))
             gotiod = TRUE;
     }

The scenario described here could be our case.
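The interleaving described above can be replayed in a small single-threaded C model. This is a sketch under stated assumptions, not kernel code: struct mount, iodserving, hand_off() and replay_race() are hypothetical names, and hand_off() is only an assumption about what nfs_asyncio() does once gotiod is set, since the message does not quote that part.

```c
#include <assert.h>
#include <stddef.h>

#define MAXIOD 4

struct mount {              /* simplified stand-in for struct nfsmount */
    int bufqlen;            /* buffers queued for async I/O */
    int bufqiods;           /* iods the mount believes are serving it */
};

static int iodwant[MAXIOD];              /* nonzero: iod is idle, waiting for work */
static struct mount *iodmount[MAXIOD];   /* mount handed to each iod */
static struct mount *iodserving[MAXIOD]; /* mount the iod picked up when it woke */
static int numasync;

/* The "find a free iod" loop from the quoted nfs_asyncio() fragment. */
static int
find_idle_iod(void)
{
    for (int iod = 0; iod < numasync; iod++)
        if (iodwant[iod])
            return (iod);
    return (-1);
}

/* Assumed hand-off once gotiod is set: claim the iod and queue one buffer. */
static void
hand_off(struct mount *mp, int iod)
{
    assert(iod >= 0 && iod < MAXIOD);
    iodwant[iod] = 0;
    iodmount[iod] = mp;
    mp->bufqiods++;
    mp->bufqlen++;
}

/* Replay the interleaving; returns how many iods really work for mount a. */
static int
replay_race(struct mount *a, struct mount *b)
{
    int iod, serving = 0;

    /* Start from a clean state. */
    numasync = 0;
    a->bufqlen = a->bufqiods = 0;
    b->bufqlen = b->bufqiods = 0;
    for (int i = 0; i < MAXIOD; i++) {
        iodwant[i] = 0;
        iodmount[i] = NULL;
        iodserving[i] = NULL;
    }

    /* 1. Thread A finds no idle iod and creates iod 0, dropping the lock. */
    assert(find_idle_iod() == -1);
    iod = numasync++;

    /* 2. The new iod thread runs first and marks itself idle. */
    iodwant[iod] = 1;
    iodmount[iod] = NULL;

    /* 3. Thread B takes the lock, sees the idle iod and claims it. */
    hand_off(b, find_idle_iod());
    iodserving[iod] = iodmount[iod];    /* the iod wakes up and starts on B */

    /* 4. Thread A resumes: iod != -1, so it proceeds as if the iod were free. */
    hand_off(a, iod);

    for (int i = 0; i < numasync; i++)
        if (iodserving[i] == a)
            serving++;
    return (serving);
}
```

In this model replay_race() returns 0 while mount A is left with bufqlen == 1 and bufqiods == 1, which has the same shape as the kgdb dump: a non-empty queue, one iod accounted for, and no iod actually working on the mount.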
We have 7 nfs mounts on the problem host. And by cront at the same time one or two scripts for every mount were started. So we had something like this in top (cron tasks started at 23:02): last pid: 64884; load averages: 0.28, 0.34, 0.24 up 0+22:15:41 23:02:04 300 processes: 6 running, 259 sleeping, 1 stopped, 17 zombie, 17 waiting CPU: 10.2% user, 0.0% nice, 7.6% system, 1.0% interrupt, 81.2% idle Mem: 174M Active, 2470M Inact, 221M Wired, 136M Cache, 112M Buf, 251M Free Swap: 8192M Total, 8192M Free 64793 app12 -1 0 23352K 11980K nfsreq 0 0:00 1.07% php 64789 app16 -1 0 21304K 11084K nfsreq 0 0:00 0.98% php 64784 app16 -1 0 19256K 9696K nfsreq 2 0:00 0.88% php 64768 app20 -1 0 19256K 9300K nfsreq 0 0:00 0.78% php 64759 app20 -1 0 18232K 8888K nfsreq 1 0:00 0.78% php 64722 app31 -1 0 20280K 9956K nfsreq 0 0:00 0.68% php 64781 app18 -1 0 19256K 9412K nfsreq 3 0:00 0.68% php 64778 app26 -1 0 18232K 8840K nfsreq 1 0:00 0.68% php 64800 app8 -1 0 18232K 8664K nfsreq 3 0:00 0.68% php 64728 app31 -1 0 18232K 8752K nfsreq 0 0:00 0.59% php 64795 app18 -1 0 18232K 8676K nfsreq 1 0:00 0.59% php 64777 app22 -1 0 18232K 8984K nfsreq 0 0:00 0.49% php 2342 app31 -4 0 22236K 7780K nfs 1 0:13 0.00% icoms_agent_cox215 58920 root 8 - 0K 8K - 2 0:08 0.00% nfsiod 0 2334 app31 -4 0 18908K 6356K nfs 1 0:05 0.00% icoms_agent_cox001 64297 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 1 64298 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 2 64303 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 3 64874 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 12 64870 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 9 64866 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 4 64873 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 11 64867 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 5 64869 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 8 64872 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 10 64868 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 7 64871 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 6 last pid: 64967; load averages: 0.42, 0.37, 0.25 up 0+22:15:46 23:02:09 295 processes: 7 running, 251 sleeping, 1 stopped, 19 
zombie, 17 waiting CPU: 69.1% user, 0.0% nice, 8.3% system, 1.5% interrupt, 21.1% idle Mem: 376M Active, 2488M Inact, 226M Wired, 124M Cache, 106M Buf, 37M Free Swap: 8192M Total, 8192M Free 64793 app12 99 0 86840K 59968K CPU3 3 0:02 16.55% php 64768 app20 -1 0 57144K 38424K nfsreq 1 0:02 15.19% php 64722 app31 99 0 61240K 41228K CPU0 0 0:02 15.19% php 64781 app18 -1 0 54072K 35612K nfsreq 2 0:02 13.67% php 64789 app16 -1 0 48952K 31660K nfsreq 3 0:01 10.60% php 64777 app22 -1 0 43832K 27876K nfsreq 0 0:01 9.86% php 64784 app16 -1 0 45880K 29648K nfsreq 0 0:01 9.77% php 64759 app20 -7 0 36664K 22792K bo_wwa 0 0:01 8.25% php 64800 app8 -7 0 24376K 13596K bo_wwa 1 0:01 2.39% php 64795 app18 -7 0 23352K 12788K bo_wwa 3 0:00 1.37% php 58920 root -1 - 0K 8K nfsreq 2 0:08 0.00% nfsiod 0 64303 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 3 64866 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 4 64297 root -1 - 0K 8K nfsreq 2 0:00 0.00% nfsiod 1 64298 root -1 - 0K 8K nfsreq 3 0:00 0.00% nfsiod 2 64873 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 11 64868 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 7 64867 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 5 64871 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 6 64947 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 14 64950 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 17 64870 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 9 64869 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 8 64949 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 16 64874 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 12 64872 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 10 64952 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 19 64948 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 15 64951 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 18 64946 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 13 last pid: 64968; load averages: 0.54, 0.39, 0.26 up 0+22:15:51 23:02:14 289 processes: 7 running, 243 sleeping, 1 stopped, 21 zombie, 17 waiting CPU: 28.7% user, 0.0% nice, 8.8% system, 1.1% interrupt, 61.4% idle Mem: 404M Active, 2503M Inact, 224M Wired, 83M Cache, 107M Buf, 37M Free Swap: 8192M Total, 8192M Free 64793 app12 -1 
0 148M 106M nfsreq 1 0:07 41.55% php 64722 app31 -7 0 61240K 41232K bo_wwa 1 0:03 14.26% php 64768 app20 -7 0 57144K 38424K bo_wwa 3 0:03 13.67% php 64781 app18 -7 0 54072K 35612K bo_wwa 0 0:02 11.18% php 64789 app16 -7 0 48952K 31660K bo_wwa 0 0:02 7.96% php 64784 app16 -7 0 45880K 29648K bo_wwa 0 0:02 7.76% php 64777 app22 -7 0 43832K 27876K bo_wwa 0 0:01 6.40% php 64759 app20 -7 0 36664K 22792K bo_wwa 0 0:01 4.79% php 58920 root -1 - 0K 8K nfsreq 2 0:08 0.00% nfsiod 0 64867 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 5 64873 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 11 64303 root -1 - 0K 8K nfsreq 0 0:00 0.00% nfsiod 3 64866 root -1 - 0K 8K nfsreq 1 0:00 0.00% nfsiod 4 64297 root -1 - 0K 8K nfsreq 0 0:00 0.00% nfsiod 1 64298 root -1 - 0K 8K nfsreq 3 0:00 0.00% nfsiod 2 64871 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 6 64868 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 7 64869 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 8 64947 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 14 64872 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 10 64874 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 12 64870 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 9 64949 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 16 64950 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 17 64948 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 15 64951 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 18 64946 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 13 64952 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 19 last pid: 64969; load averages: 0.50, 0.39, 0.25 up 0+22:15:56 23:02:19 269 processes: 6 running, 219 sleeping, 1 stopped, 26 zombie, 17 waiting CPU: 11.9% user, 0.0% nice, 5.8% system, 0.8% interrupt, 81.5% idle Mem: 264M Active, 2504M Inact, 232M Wired, 83M Cache, 112M Buf, 169M Free Swap: 8192M Total, 8192M Free 64793 app12 -1 0 148M 106M nfsreq 3 0:08 33.69% php 64789 app16 -7 0 48952K 31660K bo_wwa 0 0:02 4.98% php 64784 app16 -7 0 45880K 29648K bo_wwa 0 0:02 4.88% php 58920 root 8 - 0K 8K - 1 0:08 0.20% nfsiod 0 64867 root 8 - 0K 8K - 2 0:00 0.10% nfsiod 5 64303 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 3 64297 root 8 - 0K 8K 
- 3 0:00 0.00% nfsiod 1 64873 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 11 64866 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 4 64871 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 6 64298 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 2 64868 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 7 64869 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 8 64947 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 14 64872 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 10 64874 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 12 64870 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 9 64949 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 16 64950 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 17 64948 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 15 64951 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 18 64946 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 13 64952 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 19 last pid: 64970; load averages: 0.46, 0.38, 0.25 up 0+22:16:02 23:02:25 263 processes: 5 running, 212 sleeping, 1 stopped, 28 zombie, 17 waiting CPU: 8.7% user, 0.0% nice, 3.1% system, 0.3% interrupt, 87.9% idle Mem: 160M Active, 2502M Inact, 232M Wired, 83M Cache, 112M Buf, 274M Free Swap: 8192M Total, 8192M Free 64789 app16 -7 0 48952K 31660K bo_wwa 0 0:02 3.27% php 64784 app16 -7 0 45880K 29648K bo_wwa 0 0:02 3.17% php 58920 root 8 - 0K 8K - 1 0:08 0.00% nfsiod 0 64867 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 5 64303 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 3 64297 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 1 64873 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 11 64866 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 4 64871 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 6 64298 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 2 64868 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 7 64869 root 8 - 0K 8K - 0 0:00 0.00% nfsiod 8 64947 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 14 64872 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 10 64874 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 12 64870 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 9 64949 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 16 64950 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 17 64948 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 15 64951 root 8 - 0K 8K - 3 0:00 0.00% nfsiod 18 
64946 root 8 - 0K 8K - 2 0:00 0.00% nfsiod 13 64952 root 8 - 0K 8K - 1 0:00 0.00% nfsiod 19 And this two php processes were hanged until 23:05 another process started that write to another file on this nfs mount. -- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 16:33:55 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5A1691065672 for ; Fri, 22 Jan 2010 16:33:55 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-iw0-f198.google.com (mail-iw0-f198.google.com [209.85.223.198]) by mx1.freebsd.org (Postfix) with ESMTP id 1B6BA8FC13 for ; Fri, 22 Jan 2010 16:33:54 +0000 (UTC) Received: by iwn36 with SMTP id 36so1137644iwn.3 for ; Fri, 22 Jan 2010 08:33:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=2j2/tn7vEhphMn0ne76R7C8QDSm2qEyy/vlOkS8C6LI=; b=ihOoiqOsAuwmQclrBKKikyNA+2Mcstk95td8Xp5SRhICDyNgUlJhmXL3C9RgnW32qd TWnDNmfuuckyQ7MxRlfXHNkjGw7cj8OQ5cM5cfrXooWw7nZNOQ13Po1khqpy8mbjE1MM znMqq20FT4SxR5ylUol7jh/HOWAP5FnEhC0Ko= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=irAH3sctQ/jGPjEgUGeNO+Kw3aEZmDlHPS8OwTOWQon6wo5uhMSWYMZ+L+c8ociE+Z l+YE2pBRxnvhsAu4DmeqZDcny5/NJoR/XQc1emeRX2ht9jHqTzuivX9N6h0j8ZTKd3U2 qnknqinanvYrA5lkpDrm/ScG6rK5Q6c1VWDE4= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.231.146.2 with SMTP id f2mr5133899ibv.23.1264178034399; Fri, 22 Jan 2010 08:33:54 -0800 (PST) In-Reply-To: References: <4B58976E.1020402@polands.org> <4B58A069.8000802@egr.msu.edu> <4B58BD2D.30803@rcn.com> <4B58D4D3.80009@egr.msu.edu> 
<20100122042843.GA8858@polands.org> Date: Fri, 22 Jan 2010 08:33:54 -0800 X-Google-Sender-Auth: b1c002b8b4f64297 Message-ID: From: Artem Belevich To: Doug Poland Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 16:33:55 -0000

> Right now I'm running a stock 8.0-RELEASE. I had tried it on -STABLE
> a couple of days ago with similar results. I'll get the box back up
> to -STABLE again. Do you advise to upgrade the pool to v14 for this
> testing?

v14 is pretty much just a version bump without much usable
functionality change on FreeBSD. I don't think upgrading to it will
have any practical impact. Given that pool version upgrade is a
one-way process, I'd wait until v14 features are really needed. In any
case it would be prudent to change only one variable at a time during
experiments.
--Artem From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 16:39:03 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1F70C10656DB; Fri, 22 Jan 2010 16:39:03 +0000 (UTC) (envelope-from jh@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id C6B838FC2F; Fri, 22 Jan 2010 16:39:02 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id o0MGd2uP047686; Fri, 22 Jan 2010 16:39:02 GMT (envelope-from jh@freefall.freebsd.org) Received: (from jh@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id o0MGd2W6047682; Fri, 22 Jan 2010 16:39:02 GMT (envelope-from jh) Date: Fri, 22 Jan 2010 16:39:02 GMT Message-Id: <201001221639.o0MGd2W6047682@freefall.freebsd.org> To: giffunip@tutopia.com, jh@FreeBSD.org, freebsd-fs@FreeBSD.org From: jh@FreeBSD.org Cc: Subject: Re: kern/138109: [extfs] [patch] Minor cleanups to the sys/gnu/fs/ext2fs based on BSD Lite2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 16:39:03 -0000 Synopsis: [extfs] [patch] Minor cleanups to the sys/gnu/fs/ext2fs based on BSD Lite2 State-Changed-From-To: open->closed State-Changed-By: jh State-Changed-When: Fri Jan 22 16:39:01 UTC 2010 State-Changed-Why: Submitted changes were included in r202283. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=138109 From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 18:09:45 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6E839106566C for ; Fri, 22 Jan 2010 18:09:45 +0000 (UTC) (envelope-from doug@polands.org) Received: from hrndva-omtalb.mail.rr.com (hrndva-omtalb.mail.rr.com [71.74.56.125]) by mx1.freebsd.org (Postfix) with ESMTP id 096858FC0A for ; Fri, 22 Jan 2010 18:09:44 +0000 (UTC) X-Authority-Analysis: v=1.0 c=1 a=GSN_Y9T6cv4A:10 a=5oGaD+IacabtnYYqBVNCkQ==:17 a=bqq2Vc5EAAAA:8 a=flJLQrt4EQDhqcY5gHkA:9 a=aOOHSjQ4rLumRQA88O0A:7 a=ZuPtNL6XV-s6dh0S0N5OpwVYNu0A:4 a=5ERLOmoKdHQA:10 X-Cloudmark-Score: 0 X-Originating-IP: 75.87.219.217 Received: from [75.87.219.217] ([75.87.219.217:50571] helo=haran.polands.org) by hrndva-oedge02.mail.rr.com (envelope-from ) (ecelerity 2.2.2.39 r()) with ESMTP id 47/4F-11553-7E9E95B4; Fri, 22 Jan 2010 18:09:44 +0000 Received: from email.polands.org (ammon.polands.org [172.16.1.7]) by haran.polands.org (8.14.3/8.14.3) with ESMTP id o0MI9hEF031285; Fri, 22 Jan 2010 12:09:43 -0600 (CST) (envelope-from doug@polands.org) Received: from 209.103.214.34 (SquirrelMail authenticated user djp) by email.polands.org with HTTP; Fri, 22 Jan 2010 12:09:43 -0600 Message-ID: In-Reply-To: References: <4B58976E.1020402@polands.org> <4B58A069.8000802@egr.msu.edu> <4B58BD2D.30803@rcn.com> <4B58D4D3.80009@egr.msu.edu> <20100122042843.GA8858@polands.org> Date: Fri, 22 Jan 2010 12:09:43 -0600 From: "Doug Poland" To: "Artem Belevich" User-Agent: SquirrelMail/1.4.20-RC2 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 18:09:45 -0000 On Fri, January 22, 2010 07:22, Doug Poland wrote: > On Fri, January 22, 2010 00:09, Artem Belevich wrote: >> On Thu, Jan 21, 2010 at 8:28 PM, Doug Poland >> wrote: >>> Interesting discussion :) I added vm.kmem_size="20G" to >>> /boot/loader.conf per your instructions. This time, it didn't >>> panic at the same point in the test, however, it appears the >>> filesystem is "hanging". >>> >>> On the fdisk test, I hit ^T and get: >>> cmd: fsdisk 37066 [zio->io_cv)] 245.62r 0.12u 25.10s 0 >>> >>> My various metrics are still running, but anything that needs the >>> filesystem appears "stuck". >>> >>> The memory usage of the item "solaris" (vmstat -m | grep solaris) >>> spiked at 3334781952 (3180.30 MiB). >>> >>> # zpool iostat 2, a ^T shows: >>> load: 0.00 cmd: zpool 934 [tx->tx_quiesce_done_cv)] 2052.45r 0.06u 0.39s 0% 0k >>> >>> # vmstat -v | grep solaris to disk every second and it's hung at: >>> load: 0.00 cmd: sh 38551 [zfs] 909.85r 0.00u 0.00s 0% 16k >>> >>> Any suggestions? >>> >> Next step would be to set vfs.zfs.arc_max to a value that's somewhat >> below your physical memory size. Let's say - 1G or so. On my box with >> 8GB of RAM I set these values: >> >> vfs.zfs.arc_max="6500M" >> vfs.zfs.arc_min="4G" >> > OK, I'll give that a shot... > panic: kmem_malloc(131072): kmem_map too small: 3593236480 total allocated # cat /boot/loader.conf vfs.root.mountfrom="zfs:bethesda" vfs.zfs.arc_max="1G" vm.kmem_size="20G" zfs_load="YES" OK, what do we do next? 
:) -- Regards, Doug From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 19:27:16 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D75D91065679; Fri, 22 Jan 2010 19:27:16 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 705A38FC1D; Fri, 22 Jan 2010 19:27:16 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEAA+LWUuDaFvJ/2dsb2JhbADZL4IzggkE X-IronPort-AV: E=Sophos;i="4.49,325,1262581200"; d="scan'208";a="62589991" Received: from ganges.cs.uoguelph.ca ([131.104.91.201]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 22 Jan 2010 14:27:14 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by ganges.cs.uoguelph.ca (Postfix) with ESMTP id A1DD8FB808E; Fri, 22 Jan 2010 14:27:14 -0500 (EST) X-Virus-Scanned: amavisd-new at ganges.cs.uoguelph.ca Received: from ganges.cs.uoguelph.ca ([127.0.0.1]) by localhost (ganges.cs.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ZntHnpHgZ2wX; Fri, 22 Jan 2010 14:27:13 -0500 (EST) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by ganges.cs.uoguelph.ca (Postfix) with ESMTP id 5C806FB8063; Fri, 22 Jan 2010 14:27:13 -0500 (EST) Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id o0MJbmD06729; Fri, 22 Jan 2010 14:37:48 -0500 (EST) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Fri, 22 Jan 2010 14:37:48 -0500 (EST) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: Mikolaj Golub In-Reply-To: <86vdeuuo2y.fsf@zhuzha.ua1> Message-ID: References: <86ocl272mb.fsf@kopusha.onet> <86tyuqnz9x.fsf@zhuzha.ua1> <86zl4awmon.fsf@zhuzha.ua1> <86vdeywmha.fsf@zhuzha.ua1> <86vdeuuo2y.fsf@zhuzha.ua1> MIME-Version: 1.0 
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org Subject: Re: FreeBSD NFS client/Linux NFS server issue X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 19:27:16 -0000 On Fri, 22 Jan 2010, Mikolaj Golub wrote: > > We have nonempty nm_bufq, nm_bufqiods = 1, but actually there is no nfsiod > thread running for this mount, which is wrong -- nm_bufq will not be emptied until > some other process starts writing to the nfsmount and starts an nfsiod thread for > this mount. > > Reviewing the code to see how this could happen, I see the following path. Could someone > confirm or refute this? > > in nfs_bio.c:nfs_asyncio() we have: > > 1363 mtx_lock(&nfs_iod_mtx); > ... > 1374 /* > 1375 * Find a free iod to process this request. > 1376 */ > 1377 for (iod = 0; iod < nfs_numasync; iod++) > 1378 if (nfs_iodwant[iod]) { > 1379 gotiod = TRUE; > 1380 break; > 1381 } > 1382 > 1383 /* > 1384 * Try to create one if none are free. > 1385 */ > 1386 if (!gotiod) { > 1387 iod = nfs_nfsiodnew(); > 1388 if (iod != -1) > 1389 gotiod = TRUE; > 1390 } > > Let's consider the situation when a new nfsiod is created. > > nfs_nfsiod.c:nfs_nfsiodnew() before creating nfssvc_iod thread unlocks nfs_iod_mtx: > > 179 mtx_unlock(&nfs_iod_mtx); > 180 error = kthread_create(nfssvc_iod, nfs_asyncdaemon + i, NULL, RFHIGHPID, > 181 0, "nfsiod %d", newiod); > 182 mtx_lock(&nfs_iod_mtx); > > > And nfs_nfsiod.c:nfssvc_iod() does the following: > > 226 mtx_lock(&nfs_iod_mtx); > ... > 238 nfs_iodwant[myiod] = curthread->td_proc; > 239 nfs_iodmount[myiod] = NULL; > ... > 244 error = msleep(&nfs_iodwant[myiod], &nfs_iod_mtx, PWAIT | PCATCH, > 245 "-", timo); > > Suppose that at this moment another nfs_asyncio() request for another nfsmount > arrives and that thread has locked nfs_iod_mtx. 
Then this thread will find > nfs_iodwant[iod] in the "for" loop and will use it. When the first thread actually > has returned from nfs_nfsiodnew() it will insert the buffer into nmp->nm_bufq but > the nfsiod will process the other nmp. > Ok, good catch, I think you've found the problem (or at least a race that might have caused it). > It looks like the fix for this situation would be to check nfs_iodwant[iod] > after nfs_nfsiodnew(): > > --- nfs_bio.c.orig 2010-01-22 15:38:02.000000000 +0000 > +++ nfs_bio.c 2010-01-22 15:39:58.000000000 +0000 > @@ -1385,7 +1385,7 @@ again: > */ > if (!gotiod) { > iod = nfs_nfsiodnew(); > - if (iod != -1) > + if ((iod != -1) && (nfs_iodwant[iod] == NULL)) > gotiod = TRUE; > } > Unfortunately, I don't think the above fixes the problem. If another thread that called nfs_asyncio() has "stolen" this "iod", it will have set nfs_iodwant[iod] to NULL (set non-NULL at #238) and it will remain NULL until the other thread is done with it. If you instead make it: if (iod != -1 && nfs_iodwant[iod] != NULL) gotiod = TRUE; then I think it fixes your scenario above, but will break for the case where the mtx_lock(&nfs_iod_mtx) call in nfs_nfsiodnew() (#182) wins out over the one near the beginning of nfssvc_iod() (#226), since in that case, nfs_iodwant[iod] will still be NULL because it hasn't yet been set by nfssvc_iod() (#238). There should probably be some sort of 3-way handshake between the code in nfs_asyncio() after calling nfs_nfsiodnew() and the code near the beginning of nfssvc_iod(), but I think the following somewhat cheesy fix might do the trick: if (!gotiod) { iod = nfs_nfsiodnew(); if (iod != -1) { if (nfs_iodwant[iod] == NULL) { /* * Either another thread has acquired this * iod or I acquired the nfs_iod_mtx mutex * before the new iod thread did in * nfssvc_iod(). To be safe, go back and * try again after allowing another thread * to acquire the nfs_iod_mtx mutex. 
*/ mtx_unlock(&nfs_iod_mtx); /* * So long as mtx_lock() implements some * sort of fairness, nfssvc_iod() should * get nfs_iod_mtx here and set * nfs_iodwant[iod] != NULL for the case * where the iod has not been "stolen" by * another thread for a different mount * point. */ mtx_lock(&nfs_iod_mtx); goto again; } gotiod = TRUE; } } Does anyone else have a better solution? (Mikolaj, could you by any chance test this? You can test yours, but I think it breaks.) rick From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 20:11:48 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C57A3106566B for ; Fri, 22 Jan 2010 20:11:48 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-iw0-f198.google.com (mail-iw0-f198.google.com [209.85.223.198]) by mx1.freebsd.org (Postfix) with ESMTP id 8548D8FC14 for ; Fri, 22 Jan 2010 20:11:48 +0000 (UTC) Received: by iwn36 with SMTP id 36so1306785iwn.3 for ; Fri, 22 Jan 2010 12:11:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=j3BI5kzaWarBNgg3UJNNDlyot/unjm7SYz2wMaVAmdI=; b=ltqwNZat3WPc2SwdutzReHgXqAAIaKkXZl2wBypvkYyAU+Wxee9/Fo2EgWYt4vMs3D sUGHjH8+Uenh+kq2b3fgb5OWrt2Y31yE/rKLwLyqiwya2h4rRsa+8jwBNbSjwBRCr20X 5VtI1J+QlXgR8fdcYNXPDJPKiIKn3XfdyXQss= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=fV9VO38wNofaoYJCcM2DgiF+t3cwK6TJX6DWQdstIIGeiyxuN2mdyYb2s2uu3E6ZZ/ wy2adiLPkH6ketM74eNPrdYjgZ3qR+QL/4P9KHhD89uWF2alS6sc/418qPSu/FfJlcKE M7kGwIdLduCg5WK/9Jt9J7+aaytBJnuLDc31w= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.231.59.7 with SMTP id 
j7mr1677464ibh.12.1264191107683; Fri, 22 Jan 2010 12:11:47 -0800 (PST) In-Reply-To: References: <4B58976E.1020402@polands.org> <4B58A069.8000802@egr.msu.edu> <4B58BD2D.30803@rcn.com> <4B58D4D3.80009@egr.msu.edu> <20100122042843.GA8858@polands.org> Date: Fri, 22 Jan 2010 12:11:47 -0800 X-Google-Sender-Auth: ca6a295921f642fc Message-ID: From: Artem Belevich To: Doug Poland Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 20:11:48 -0000 What do the following sysctls show on your box? hw.physmem vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max --Artem On Fri, Jan 22, 2010 at 10:09 AM, Doug Poland wrote: > > On Fri, January 22, 2010 07:22, Doug Poland wrote: >> On Fri, January 22, 2010 00:09, Artem Belevich wrote: >>> On Thu, Jan 21, 2010 at 8:28 PM, Doug Poland >>> wrote: >>>> Interesting discussion :) I added vm.kmem_size="20G" to >>>> /boot/loader.conf per your instructions. This time, it didn't >>>> panic at the same point in the test, however, it appears the >>>> filesystem is "hanging". >>>> >>>> On the fdisk test, I hit ^T and get: >>>> cmd: fsdisk 37066 [zio->io_cv)] 245.62r 0.12u 25.10s 0 >>>> >>>> My various metrics are still running, but anything that needs the >>>> filesystem appears "stuck". >>>> >>>> The memory usage of the item "solaris" (vmstat -m | grep solaris) >>>> spiked at 3334781952 (3180.30 MiB). >>>> >>>> # zpool iostat 2, a ^T shows: >>>> load: 0.00 cmd: zpool 934 [tx->tx_quiesce_done_cv)] 2052.45r 0.06u 0.39s 0% 0k >>>> >>>> # vmstat -v | grep solaris to disk every second and it's hung at: >>>> load: 0.00 cmd: sh 38551 [zfs] 909.85r 0.00u 0.00s 0% 16k >>>> >>>> Any suggestions? 
>>>> >>> Next step would be to set vfs.zfs.arc_max to a value that's somewhat >>> below your physical memory size. Let's say - 1G or so. On my box with >>> 8GB of RAM I set these values: >>> >>> vfs.zfs.arc_max="6500M" >>> vfs.zfs.arc_min="4G" >>> >> OK, I'll give that a shot... >> > > panic: kmem_malloc(131072): kmem_map too small: 3593236480 total > allocated > > # cat /boot/loader.conf > vfs.root.mountfrom="zfs:bethesda" > vfs.zfs.arc_max="1G" > vm.kmem_size="20G" > zfs_load="YES" > > OK, what do we do next? :) > > > -- > Regards, > Doug > > From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 20:37:54 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1C5651065670; Fri, 22 Jan 2010 20:37:54 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from fg-out-1718.google.com (fg-out-1718.google.com [72.14.220.152]) by mx1.freebsd.org (Postfix) with ESMTP id 7658D8FC0C; Fri, 22 Jan 2010 20:37:53 +0000 (UTC) Received: by fg-out-1718.google.com with SMTP id 16so146326fgg.13 for ; Fri, 22 Jan 2010 12:37:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:to:cc:subject:references :organization:from:date:in-reply-to:message-id:user-agent :mime-version:content-type; bh=9bup00Tj3jlLs7R1jBd1+fha+HAjNnLMHba4LLKaQOo=; b=c71xc628ay2UbQsnNl1saJQal006lIYguiU71hIgIjqtaYa7iYALBLCCXAxmkZDWGr 8saI09bPD0MR16Axxgd06Guk5G5MyrfmpWJeqhbZWrJqiPo6GJR+Xh0fLMGiZ2kQmuTt W6O6hZDlwvz4QmQOvBhu6a1+8fwsBtp2pQjP8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=to:cc:subject:references:organization:from:date:in-reply-to :message-id:user-agent:mime-version:content-type; b=H7YV26eeHmGdectzYBZ1m8wsrNbOEY1XyvpFuWPn2YApDKTAahYEwukuWU8oDdlp1q JlMte3n8cYHp1OIufDfvtKODgm4MCcfQxt6UCLUqQWqADifiBA4jpalb9FWdSoq6DAMV YNzY2d0iGzEgA2aGVviUc3RFgzvr2KFIV92kE= Received: by 10.103.50.15 
with SMTP id c15mr1840004muk.35.1264192672286; Fri, 22 Jan 2010 12:37:52 -0800 (PST) Received: from localhost ([95.69.162.7]) by mx.google.com with ESMTPS id e10sm10542198muf.26.2010.01.22.12.37.50 (version=TLSv1/SSLv3 cipher=RC4-MD5); Fri, 22 Jan 2010 12:37:51 -0800 (PST) To: Rick Macklem References: <86ocl272mb.fsf@kopusha.onet> <86tyuqnz9x.fsf@zhuzha.ua1> <86zl4awmon.fsf@zhuzha.ua1> <86vdeywmha.fsf@zhuzha.ua1> <86vdeuuo2y.fsf@zhuzha.ua1> Organization: TOA Ukraine From: Mikolaj Golub Date: Fri, 22 Jan 2010 22:37:49 +0200 In-Reply-To: (Rick Macklem's message of "Fri\, 22 Jan 2010 14\:37\:48 -0500 \(EST\)") Message-ID: <86my05x4de.fsf@kopusha.onet> User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org Subject: Re: FreeBSD NFS client/Linux NFS server issue X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 20:37:54 -0000 On Fri, 22 Jan 2010 14:37:48 -0500 (EST) Rick Macklem wrote: >> --- nfs_bio.c.orig 2010-01-22 15:38:02.000000000 +0000 >> +++ nfs_bio.c 2010-01-22 15:39:58.000000000 +0000 >> @@ -1385,7 +1385,7 @@ again: >> */ >> if (!gotiod) { >> iod = nfs_nfsiodnew(); >> - if (iod != -1) >> + if ((iod != -1) && (nfs_iodwant[iod] == NULL)) >> gotiod = TRUE; >> } >> > > Unfortunately, I don't think the above fixes the problem. > If another thread that called nfs_asyncio() has "stolen" the this "iod", > it will have set nfs_iodwant[iod] == NULL (set non-NULL at #238) > and it will remain NULL until the other thread is done with it. I see. I have missed this. Thanks. 
> > There should probably be some sort of 3-way handshake between > the code in nfs_asyncio() after calling nfs_nfsiodnew() and the > code near the beginning of nfssvc_iod(), but I think the following > somewhat cheesy fix might do the trick: > > if (!gotiod) { > iod = nfs_nfsiodnew(); > if (iod != -1) { > if (nfs_iodwant[iod] == NULL) { > /* > * Either another thread has acquired this > * iod or I acquired the nfs_iod_mtx mutex > * before the new iod thread did in > * nfssvc_iod(). To be safe, go back and > * try again after allowing another thread > * to acquire the nfs_iod_mtx mutex. > */ > mtx_unlock(&nfs_iod_mtx); > /* > * So long as mtx_lock() implements some > * sort of fairness, nfssvc_iod() should > * get nfs_iod_mtx here and set > * nfs_iodwant[iod] != NULL for the case > * where the iod has not been "stolen" by > * another thread for a different mount > * point. > */ > mtx_lock(&nfs_iod_mtx); > goto again; > } > gotiod = TRUE; > } > } > > Does anyone else have a better solution? > (Mikolaj, could you by any chance test this? You can test yours, but I > think it breaks.) Unfortunately, we have observed this only on our production servers. A week ago we made some configuration changes as a workaround: we reconfigured cron not to run the scripts simultaneously, and added cron scripts that just periodically write a line to a file on the NFS share (to "unlock" it if it is locked). We have not observed problems since then, and we would not like to experiment in production. If I manage to produce a good test case in a test environment I will be able to test the patch, but I am not sure... 
-- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 22:02:27 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9A701106566C for ; Fri, 22 Jan 2010 22:02:27 +0000 (UTC) (envelope-from doug@polands.org) Received: from hrndva-omtalb.mail.rr.com (hrndva-omtalb.mail.rr.com [71.74.56.124]) by mx1.freebsd.org (Postfix) with ESMTP id 1E39D8FC0C for ; Fri, 22 Jan 2010 22:02:26 +0000 (UTC) X-Authority-Analysis: v=1.0 c=1 a=GSN_Y9T6cv4A:10 a=5oGaD+IacabtnYYqBVNCkQ==:17 a=q7QkwurUVjC4X0QMqFMA:9 a=xClCA9ryF_EwLtfQFsJJNsjW7kIA:4 X-Cloudmark-Score: 0 X-Originating-IP: 75.87.219.217 Received: from [75.87.219.217] ([75.87.219.217:53352] helo=haran.polands.org) by hrndva-oedge02.mail.rr.com (envelope-from ) (ecelerity 2.2.2.39 r()) with ESMTP id C4/38-11553-0702A5B4; Fri, 22 Jan 2010 22:02:25 +0000 Received: from email.polands.org (ammon.polands.org [172.16.1.7]) by haran.polands.org (8.14.3/8.14.3) with ESMTP id o0MM2NVs031937; Fri, 22 Jan 2010 16:02:23 -0600 (CST) (envelope-from doug@polands.org) Received: from 209.103.214.34 (SquirrelMail authenticated user djp) by email.polands.org with HTTP; Fri, 22 Jan 2010 16:02:24 -0600 Message-ID: <1308c71eec426200d4c34b926bba8806.squirrel@email.polands.org> In-Reply-To: References: <4B58976E.1020402@polands.org> <4B58A069.8000802@egr.msu.edu> <4B58BD2D.30803@rcn.com> <4B58D4D3.80009@egr.msu.edu> <20100122042843.GA8858@polands.org> Date: Fri, 22 Jan 2010 16:02:24 -0600 From: "Doug Poland" To: "Artem Belevich" User-Agent: SquirrelMail/1.4.20-RC2 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 22:02:27 -0000 On Fri, January 22, 2010 14:11, Artem Belevich wrote: >> >> panic: kmem_malloc(131072): kmem_map too small: 3593236480 total >> allocated >> >> # cat /boot/loader.conf >> vfs.root.mountfrom="zfs:bethesda" >> vfs.zfs.arc_max="1G" >> vm.kmem_size="20G" >> zfs_load="YES" >> >> OK, what do we do next? :) >> > > What do following sysctls show on your box? > > hw.physmem: > vm.kmem_size > vm.kmem_size_max > vfs.zfs.arc_max > % sysctl hw.physmem vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max hw.physmem:4102688768 vm.kmem_size: 2147483648 vm.kmem_size_max: 329853485875 vfs.zfs.arc_max: 1073741824 -- Regards, Doug From owner-freebsd-fs@FreeBSD.ORG Fri Jan 22 22:02:36 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A77C01065672; Fri, 22 Jan 2010 22:02:36 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 3BE198FC19; Fri, 22 Jan 2010 22:02:35 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEAHWvWUuDaFvI/2dsb2JhbADYfIQ8BA X-IronPort-AV: E=Sophos;i="4.49,326,1262581200"; d="scan'208";a="62614338" Received: from darling.cs.uoguelph.ca ([131.104.91.200]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 22 Jan 2010 17:02:35 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by darling.cs.uoguelph.ca (Postfix) with ESMTP id 565B394016A; Fri, 22 Jan 2010 17:02:35 -0500 (EST) X-Virus-Scanned: amavisd-new at darling.cs.uoguelph.ca Received: from darling.cs.uoguelph.ca ([127.0.0.1]) by localhost (darling.cs.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id BqvrPt5TaKse; Fri, 22 Jan 2010 17:02:33 -0500 (EST) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by 
darling.cs.uoguelph.ca (Postfix) with ESMTP id DC38F940062; Fri, 22 Jan 2010 17:02:33 -0500 (EST) Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id o0MMD9o07073; Fri, 22 Jan 2010 17:13:09 -0500 (EST) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Fri, 22 Jan 2010 17:13:09 -0500 (EST) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: Mikolaj Golub In-Reply-To: Message-ID: References: <86ocl272mb.fsf@kopusha.onet> <86tyuqnz9x.fsf@zhuzha.ua1> <86zl4awmon.fsf@zhuzha.ua1> <86vdeywmha.fsf@zhuzha.ua1> <86vdeuuo2y.fsf@zhuzha.ua1> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org Subject: Re: FreeBSD NFS client/Linux NFS server issue X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2010 22:02:36 -0000 On Fri, 22 Jan 2010, Rick Macklem wrote: > > There should probably be some sort of 3 way handshake between > the code in nfs_asyncio() after calling nfs_nfsnewiod() and the > code near the beginning of nfssvc_iod(), but I think the following > somewhat cheesy fix might do the trick: > [stuff deleted] I know it's a little weird to reply to my own posting, but I think this might be a reasonable patch (I have only tested it for a few minutes at this point). I basically redefined nfs_iodwant[] as a tri-state variable (although it was a struct proc *, it was only tested NULL/non-NULL). 0 - was NULL 1 - was non-NULL -1 - just created by nfs_asyncio() and will be used by it I'll keep testing it, but hopefully someone else can test and/or review it... rick ps: Mikolaj, I'm a sysadmin so I understand the problems with production systems, but if you do get a chance to test it somehow, that would be great. 
pss: This is against -current, but hopefully stable/7 can be patched about the same. --- patch for nfsiod race against -current --- --- nfsclient/nfs.h.sav 2010-01-22 16:21:53.000000000 -0500 +++ nfsclient/nfs.h 2010-01-22 16:22:04.000000000 -0500 @@ -252,7 +252,7 @@ int nfs_commit(struct vnode *vp, u_quad_t offset, int cnt, struct ucred *cred, struct thread *td); int nfs_readdirrpc(struct vnode *, struct uio *, struct ucred *); -int nfs_nfsiodnew(void); +int nfs_nfsiodnew(int); int nfs_asyncio(struct nfsmount *, struct buf *, struct ucred *, struct thread *); int nfs_doio(struct vnode *, struct buf *, struct ucred *, struct thread *); void nfs_doio_directwrite (struct buf *); --- nfsclient/nfsnode.h.sav 2010-01-22 14:56:34.000000000 -0500 +++ nfsclient/nfsnode.h 2010-01-22 14:56:52.000000000 -0500 @@ -180,7 +180,7 @@ * Queue head for nfsiod's */ extern TAILQ_HEAD(nfs_bufq, buf) nfs_bufq; -extern struct proc *nfs_iodwant[NFS_MAXASYNCDAEMON]; +extern int nfs_iodwant[NFS_MAXASYNCDAEMON]; extern struct nfsmount *nfs_iodmount[NFS_MAXASYNCDAEMON]; #if defined(_KERNEL) --- nfsclient/nfs_bio.c.sav 2010-01-22 14:57:28.000000000 -0500 +++ nfsclient/nfs_bio.c 2010-01-22 16:17:24.000000000 -0500 @@ -1377,7 +1377,7 @@ * Find a free iod to process this request. */ for (iod = 0; iod < nfs_numasync; iod++) - if (nfs_iodwant[iod]) { + if (nfs_iodwant[iod] > 0) { gotiod = TRUE; break; } @@ -1386,7 +1386,7 @@ * Try to create one if none are free. */ if (!gotiod) { - iod = nfs_nfsiodnew(); + iod = nfs_nfsiodnew(1); if (iod != -1) gotiod = TRUE; } @@ -1398,7 +1398,7 @@ */ NFS_DPF(ASYNCIO, ("nfs_asyncio: waking iod %d for mount %p\n", iod, nmp)); - nfs_iodwant[iod] = NULL; + nfs_iodwant[iod] = 0; nfs_iodmount[iod] = nmp; nmp->nm_bufqiods++; wakeup(&nfs_iodwant[iod]); --- nfsclient/nfs_nfsiod.c.sav 2010-01-22 14:57:28.000000000 -0500 +++ nfsclient/nfs_nfsiod.c 2010-01-22 16:32:31.000000000 -0500 @@ -113,7 +113,7 @@ * than the new minimum, create some more. 
*/ for (i = nfs_iodmin - nfs_numasync; i > 0; i--) - nfs_nfsiodnew(); + nfs_nfsiodnew(0); out: mtx_unlock(&nfs_iod_mtx); return (0); @@ -147,7 +147,7 @@ */ iod = nfs_numasync - 1; for (i = 0; i < nfs_numasync - nfs_iodmax; i++) { - if (nfs_iodwant[iod]) + if (nfs_iodwant[iod] > 0) wakeup(&nfs_iodwant[iod]); iod--; } @@ -160,7 +160,7 @@ "Max number of nfsiod kthreads"); int -nfs_nfsiodnew(void) +nfs_nfsiodnew(int set_iodwant) { int error, i; int newiod; @@ -176,12 +176,17 @@ } if (newiod == -1) return (-1); + if (set_iodwant > 0) + nfs_iodwant[i] = -1; mtx_unlock(&nfs_iod_mtx); error = kproc_create(nfssvc_iod, nfs_asyncdaemon + i, NULL, RFHIGHPID, 0, "nfsiod %d", newiod); mtx_lock(&nfs_iod_mtx); - if (error) + if (error) { + if (set_iodwant > 0) + nfs_iodwant[i] = 0; return (-1); + } nfs_numasync++; return (newiod); } @@ -199,7 +204,7 @@ nfs_iodmin = NFS_MAXASYNCDAEMON; for (i = 0; i < nfs_iodmin; i++) { - error = nfs_nfsiodnew(); + error = nfs_nfsiodnew(0); if (error == -1) panic("nfsiod_setup: nfs_nfsiodnew failed"); } @@ -236,7 +241,8 @@ goto finish; if (nmp) nmp->nm_bufqiods--; - nfs_iodwant[myiod] = curthread->td_proc; + if (nfs_iodwant[myiod] == 0) + nfs_iodwant[myiod] = 1; nfs_iodmount[myiod] = NULL; /* * Always keep at least nfs_iodmin kthreads. @@ -303,7 +309,7 @@ nfs_asyncdaemon[myiod] = 0; if (nmp) nmp->nm_bufqiods--; - nfs_iodwant[myiod] = NULL; + nfs_iodwant[myiod] = 0; nfs_iodmount[myiod] = NULL; /* Someone may be waiting for the last nfsiod to terminate. 
*/ if (--nfs_numasync == 0) --- nfsclient/nfs_subs.c.sav 2010-01-22 14:57:28.000000000 -0500 +++ nfsclient/nfs_subs.c 2010-01-22 16:35:10.000000000 -0500 @@ -347,7 +347,7 @@ nfs_ticks = 1; /* Ensure async daemons disabled */ for (i = 0; i < NFS_MAXASYNCDAEMON; i++) { - nfs_iodwant[i] = NULL; + nfs_iodwant[i] = 0; nfs_iodmount[i] = NULL; } nfs_nhinit(); /* Init the nfsnode table */ @@ -375,7 +375,7 @@ mtx_lock(&nfs_iod_mtx); nfs_iodmax = 0; for (i = 0; i < nfs_numasync; i++) - if (nfs_iodwant[i]) + if (nfs_iodwant[i] > 0) wakeup(&nfs_iodwant[i]); /* The last nfsiod to exit will wake us up when nfs_numasync hits 0 */ while (nfs_numasync) --- nfsclient/nfs_vnops.c.sav 2010-01-22 14:57:28.000000000 -0500 +++ nfsclient/nfs_vnops.c 2010-01-22 15:01:38.000000000 -0500 @@ -212,7 +212,7 @@ * Global variables */ struct mtx nfs_iod_mtx; -struct proc *nfs_iodwant[NFS_MAXASYNCDAEMON]; +int nfs_iodwant[NFS_MAXASYNCDAEMON]; struct nfsmount *nfs_iodmount[NFS_MAXASYNCDAEMON]; int nfs_numasync = 0; vop_advlock_t *nfs_advlock_p = nfs_dolock; From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 02:24:39 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 06395106568D for ; Sat, 23 Jan 2010 02:24:39 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-yx0-f171.google.com (mail-yx0-f171.google.com [209.85.210.171]) by mx1.freebsd.org (Postfix) with ESMTP id AFC638FC14 for ; Sat, 23 Jan 2010 02:24:38 +0000 (UTC) Received: by yxe1 with SMTP id 1so1552301yxe.3 for ; Fri, 22 Jan 2010 18:24:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type; bh=Gx+rmDDGxSqgq6mzW+GwC5r5aYsSyREkaxpmeR2WhZ4=; b=pmKeAeYFw2wY7rBYx/eSXDTb1gjxWd+ndEE8ZiOE23sxH9eM3iThqkgCcBo4HVyGg6 
OinfSKFkziI8ukHOm30GeO4z9ak6HKPi/2Ltvy1NJpEY7CEqdqegCRn3kkAR6bXEi+EM n0QneHGsZd0XAX24kgSpEBt7AX4xx3aCKl0MI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=oa6ICefPtR9CBQJ4GqhcSWIw0LtGsWLI/bTu4HkJ9cxN/TK9Ee+g+w7nGzDLN+xF3U ZrBLUfxybIOa9sVbjo8Bmp2G+iirUen10CardPCVp5E/d55Y1sNJ1QQlMA6u/SR/Uaub ESfKS4YTaGUTuIOR6DII3hoQMLd1Yr3w2upJI= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.91.17.25 with SMTP id u25mr3440954agi.68.1264213477703; Fri, 22 Jan 2010 18:24:37 -0800 (PST) In-Reply-To: <1308c71eec426200d4c34b926bba8806.squirrel@email.polands.org> References: <4B58976E.1020402@polands.org> <4B58BD2D.30803@rcn.com> <4B58D4D3.80009@egr.msu.edu> <20100122042843.GA8858@polands.org> <1308c71eec426200d4c34b926bba8806.squirrel@email.polands.org> Date: Fri, 22 Jan 2010 18:24:36 -0800 X-Google-Sender-Auth: b12ac356b4ebea0e Message-ID: From: Artem Belevich To: Doug Poland Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 02:24:39 -0000

> % sysctl hw.physmem vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max
>
> hw.physmem:4102688768
> vm.kmem_size: 2147483648

Here's your problem -- kmem_size is for some reason only 2G.

Argh! I ran into that before. The code in sys/kern/kern_malloc.c
intentionally limits kmem_size to twice the physical memory size:

        /*
         * Limit kmem virtual size to twice the physical memory.
         * This allows for kmem map sparseness, but limits the size
         * to something sane. Be careful to not overflow the 32bit
         * ints while doing the check.
         */
        if (((vm_kmem_size / 2) / PAGE_SIZE) > cnt.v_page_count)
                vm_kmem_size = 2 * cnt.v_page_count * PAGE_SIZE;

So, either comment out these lines or just set vm.kmem_size to
slightly below 8G.

--Artem

From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 02:33:31 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C492810656B0 for ; Sat, 23 Jan 2010 02:33:31 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-gx0-f218.google.com (mail-gx0-f218.google.com [209.85.217.218]) by mx1.freebsd.org (Postfix) with ESMTP id 41D228FC0A for ; Sat, 23 Jan 2010 02:33:30 +0000 (UTC) Received: by gxk10 with SMTP id 10so1566726gxk.3 for ; Fri, 22 Jan 2010 18:33:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=F3ejol1Y1kdrODL4eODHKh1QoYC90FKB8qIhH5IyNGw=; b=k+a5PfOWRbAop/knUwy59JdDudNwhgMQOJt/dGXrHCRfeoGw+7cbWI6TOm7bXHSsUp iYCt3ibz36PTLY8eXF3+hqh68+Wl9F88J8ef8Emd0wHGcbTLZteGRqNbRg3MecRjWq9C +TzSL2pJKFCoS4oWV5UXzeO9Z8atvt+y/70n4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=XzRsOCeU7JTP4v/iln6IF9NseyW8vyrfPsOArS8AH6LcS+OR+tBCKKsDQI+TLjX9qR IHxh+iXc8YUdAuMCdlhOU7bV0U3/iaTI/7G0nXy7xFqpthTlUnh7lVNmOwOaDjMUWSCR gCnW4ZCQcoDA0VZuIya9sNI52WF60u1k3zRM8= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.90.37.14 with SMTP id k14mr3469915agk.53.1264214010284; Fri, 22 Jan 2010 18:33:30 -0800 (PST) In-Reply-To: References: <4B58976E.1020402@polands.org> <4B58D4D3.80009@egr.msu.edu> <20100122042843.GA8858@polands.org> <1308c71eec426200d4c34b926bba8806.squirrel@email.polands.org> Date: Fri, 22
Jan 2010 18:33:29 -0800 X-Google-Sender-Auth: cd6c1080d9e31b03 Message-ID: From: Artem Belevich To: Doug Poland Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 02:33:31 -0000

Ignore my previous email. Something else is probably at play here. If
I were right, then you should have ended up with vm.kmem_size=8G.
However, in your case it's 2G. Beats me why.

You may want to get to the console prompt and check whether loader did
set the values correctly before it boots the kernel. If it didn't,
then there may be something wrong with your /boot/loader.conf.
Unfortunately whatever errors loader prints are immediately erased by
the boot menu, so it's hard to see what exactly is the problem.

--Artem

On Fri, Jan 22, 2010 at 6:24 PM, Artem Belevich wrote:
>> % sysctl hw.physmem vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max
>>
>> hw.physmem:4102688768
>> vm.kmem_size: 2147483648
>
> Here's your problem -- kmem_size is for some reason only 2G.
>
> Argh! I ran into that before. The code in sys/kern/kern_malloc.c
> intentionally limits kmem_size to twice the physical memory size:
>
>        /*
>         * Limit kmem virtual size to twice the physical memory.
>         * This allows for kmem map sparseness, but limits the size
>         * to something sane. Be careful to not overflow the 32bit
>         * ints while doing the check.
>         */
>        if (((vm_kmem_size / 2) / PAGE_SIZE) > cnt.v_page_count)
>                vm_kmem_size = 2 * cnt.v_page_count * PAGE_SIZE;
>
> So, either comment out these lines or just set vm.kmem_size to
> slightly below 8G.
> > --Artem > From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 06:23:39 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 50D49106568D for ; Sat, 23 Jan 2010 06:23:39 +0000 (UTC) (envelope-from rincebrain@gmail.com) Received: from fg-out-1718.google.com (fg-out-1718.google.com [72.14.220.152]) by mx1.freebsd.org (Postfix) with ESMTP id E01FC8FC08 for ; Sat, 23 Jan 2010 06:23:38 +0000 (UTC) Received: by fg-out-1718.google.com with SMTP id 19so204762fgg.13 for ; Fri, 22 Jan 2010 22:23:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type; bh=L4f3s3U3HAIFML8g8K8RNJbdV8NJoOHz2M0AepMHgVg=; b=Mc82bZTGM5dJk+tiLk8qU5wj5qDKZNfdT5y7B8M3ql9mxo6/15paUSnKqrX6/cNhto 5YR8HMOp6dCZyAFSpu/3PYFCDCEU5xIZAPNgIf2aEi05sJlCJiL+p7yiXHfIo9dX2KMc HlICjy06xfz/P8w0l33xDa7GwL0swVySA/37I= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=Rg0sOqoWbWCQ/a41aYLa4zUbb/TtJnVG2pHPXXjEtrA94rJsnvJ2td/gbKh+FM5GZ0 azIIXna8QCp87yVawyYN1P08a5M/XK0BFRs+kgEUheh859MfZzQRQX4bZXx2Ufui9XKc 4eQs4SFm8t8n3CkM39+rWfY3CH0N/BXui7UVI= MIME-Version: 1.0 Received: by 10.239.187.139 with SMTP id l11mr450105hbh.96.1264227817800; Fri, 22 Jan 2010 22:23:37 -0800 (PST) Date: Sat, 23 Jan 2010 01:23:37 -0500 Message-ID: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> From: Rich To: freebsd-fs Content-Type: text/plain; charset=ISO-8859-1 Subject: Errors on a file on a zpool: How to remove? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 06:23:39 -0000 Hey world, I've got a series of files in a non-redundant zpool which all report Input/Output Error on attempting to manipulate them in any way - stat, read, rm, anything. Whenever anything is attempted, the following style of thing is printed to /var/log/messages: Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da4 offset=1231402180608 size=8192 Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da5 offset=446136819712 size=8192 Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da2 offset=320393101312 size=8192 Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da5 offset=446136819712 size=8192 Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da2 offset=320393101312 size=8192 Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da4 offset=1231402180608 size=8192 Jan 23 01:22:35 manticore root: ZFS: zpool I/O failure, zpool=rigatoni error=86 What can I do? I really would like to just purge all of these files from orbit, since I can recreate them, but I can't seem to delete them, and deleting the pool is a really inconvenient option, as I have other data on it. I'm running 8.0-RELEASE stock on amd64. Thanks! 
- Rich From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 08:11:42 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A7BFF106566B for ; Sat, 23 Jan 2010 08:11:42 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-iw0-f198.google.com (mail-iw0-f198.google.com [209.85.223.198]) by mx1.freebsd.org (Postfix) with ESMTP id 6FAC38FC19 for ; Sat, 23 Jan 2010 08:11:41 +0000 (UTC) Received: by iwn36 with SMTP id 36so1626871iwn.3 for ; Sat, 23 Jan 2010 00:11:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type; bh=cy6hGdaO7YzZNTX5MkbxlpzKuyEsT7jngmssI90bbfA=; b=rYF6CEkG2alU5q9SAsQ0tHgjjRZSPYMS+4RHE3y88ePncgqR8upGkvU5A8JGMDgqgB v8j1/octQa2JMM5bJG4kmqwcXmix05dfD2VZwKVEV3MZyFjx5fcShTQPtraDF6LLsqrz XGGeQFGAEHIP6ZTnmfv+M++SCJW7voODBX0nQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=NtU/ksyjusfnurBJDxAkNmcO9tJNdS2Vu8EnPvrjFSNcGVJkRcAMo8hEI5TGBHWvjV KCmKEVc9zlIRa5LzsPShP5hXDxbFx86IRtHK/cHEB+iQ9I1Kw8KJnHoo2D2EDBULftQs 2WYzwkpT9yfizvbP1DNmbBTckl435vkfDNhjM= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.231.154.197 with SMTP id p5mr3448856ibw.28.1264234301404; Sat, 23 Jan 2010 00:11:41 -0800 (PST) In-Reply-To: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> Date: Sat, 23 Jan 2010 00:11:41 -0800 X-Google-Sender-Auth: f6c5d204a720992d Message-ID: From: Artem Belevich To: Rich Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs Subject: Re: Errors on a file on a zpool: How to remove? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 08:11:42 -0000 The directory that those files are in may be corrupted. What does zpool status -v show? You may want to scrub the pool if you haven't done so yet. That would help to find all corrupted files. When plain files are corrupted, you should be able to remove them. You may also try to set atime=off on the filesystem to avoid filesystem updates on reads. Some time back when I had zpool corruption I've found no way to remove corrupted directory that still had some files in it. In the end I had to rebuild the pool. BTW, given that your pool did get corrupted, perhaps it might be a good idea to start moving your data somewhere else rather than worry about how to remove corrupted files. If corruption is due to bad hardware, bad files would just keep popping up. --Artem On Fri, Jan 22, 2010 at 10:23 PM, Rich wrote: > Hey world, > I've got a series of files in a non-redundant zpool which all report > Input/Output Error on attempting to manipulate them in any way - stat, > read, rm, anything. 
> > Whenever anything is attempted, the following style of thing is > printed to /var/log/messages: > Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni > path=/dev/da4 offset=1231402180608 size=8192 > Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni > path=/dev/da5 offset=446136819712 size=8192 > Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni > path=/dev/da2 offset=320393101312 size=8192 > Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni > path=/dev/da5 offset=446136819712 size=8192 > Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni > path=/dev/da2 offset=320393101312 size=8192 > Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni > path=/dev/da4 offset=1231402180608 size=8192 > Jan 23 01:22:35 manticore root: ZFS: zpool I/O failure, zpool=rigatoni error=86 > > What can I do? I really would like to just purge all of these files > from orbit, since I can recreate them, but I can't seem to delete > them, and deleting the pool is a really inconvenient option, as I have > other data on it. > > I'm running 8.0-RELEASE stock on amd64. > > Thanks! 
> > - Rich > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 08:14:08 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1E16D1065672 for ; Sat, 23 Jan 2010 08:14:08 +0000 (UTC) (envelope-from rincebrain@gmail.com) Received: from mail-fx0-f218.google.com (mail-fx0-f218.google.com [209.85.220.218]) by mx1.freebsd.org (Postfix) with ESMTP id A91238FC14 for ; Sat, 23 Jan 2010 08:14:07 +0000 (UTC) Received: by fxm10 with SMTP id 10so201744fxm.14 for ; Sat, 23 Jan 2010 00:14:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=7lVsUlf5FQc1L0ROTBiXAm2ZlBwCw6+O2CZhMxUBxKc=; b=aVLBF/nueahLUSaKUqTRQ+Ms/31qKh4RF/XPsJFFroQt2thZFAAOxXQW4Fb2PJxumJ F6MXRcVzRAbqbX9F07aJ7KtKEGeHGLOXN9U6T0ADxo+BCM3503j5iL4z99EZkChSwxjL s+LAOX3NLW+ZSw/g18wqstaJ69x0mvySKc01Y= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=BLXY3CoZHeqctUkZ8BEaczlY8RmOj2yekpsazJ/MhPMDhXtWG8ZNlW4bpyk7gTWlCy 7nk2rgqbyA0Rs7nmuasTPwLyfFWnrdQgHvP+uT/06hSOgK61DyvyDFd7cWuTBSE/VWXl wVCh8Ijue765UOdk2OPNCgmVFTKsExmG8opE4= MIME-Version: 1.0 Received: by 10.239.185.195 with SMTP id d3mr421549hbh.184.1264234446382; Sat, 23 Jan 2010 00:14:06 -0800 (PST) In-Reply-To: References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> Date: Sat, 23 Jan 2010 03:14:06 -0500 Message-ID: <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> From: Rich To: Artem Belevich Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs Subject: 
Re: Errors on a file on a zpool: How to remove? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 08:14:08 -0000

I already diagnosed the bad hardware - one of the two sticks of RAM
had gone bad, and fails memtest in the other machine.

  pool: rigatoni
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 15h28m with 1 errors on Thu Jan 21 18:09:25 2010
config:

        NAME        STATE     READ WRITE CKSUM
        rigatoni    ONLINE       0     0     1
          da4       ONLINE       0     0     2
          da5       ONLINE       0     0     2
          da7       ONLINE       0     0     0
          da6       ONLINE       0     0     0
          da2       ONLINE       0     0     2

errors: Permanent errors have been detected in the following files:

        rigatoni/mirrors:<0x0>

Scrubbing repeatedly does nothing to remove the note about that error,
and I'd rather like to avoid trying to recreate a 7TB pool.

- Rich

On Sat, Jan 23, 2010 at 3:11 AM, Artem Belevich wrote:
> The directory that those files are in may be corrupted. What does
> zpool status -v show?
>
> You may want to scrub the pool if you haven't done so yet. That would
> help to find all corrupted files.
>
> When plain files are corrupted, you should be able to remove them. You
> may also try to set atime=off on the filesystem to avoid filesystem
> updates on reads.
> Some time back when I had zpool corruption I've found no way to remove
> corrupted directory that still had some files in it. In the end I had
> to rebuild the pool.
>
> BTW, given that your pool did get corrupted, perhaps it might be a
> good idea to start moving your data somewhere else rather than worry
> about how to remove corrupted files. If corruption is due to bad
> hardware, bad files would just keep popping up.
> > --Artem > > > > On Fri, Jan 22, 2010 at 10:23 PM, Rich wrote: >> Hey world, >> I've got a series of files in a non-redundant zpool which all report >> Input/Output Error on attempting to manipulate them in any way - stat, >> read, rm, anything. >> >> Whenever anything is attempted, the following style of thing is >> printed to /var/log/messages: >> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >> path=/dev/da4 offset=1231402180608 size=8192 >> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >> path=/dev/da5 offset=446136819712 size=8192 >> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >> path=/dev/da2 offset=320393101312 size=8192 >> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >> path=/dev/da5 offset=446136819712 size=8192 >> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >> path=/dev/da2 offset=320393101312 size=8192 >> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >> path=/dev/da4 offset=1231402180608 size=8192 >> Jan 23 01:22:35 manticore root: ZFS: zpool I/O failure, zpool=rigatoni error=86 >> >> What can I do? I really would like to just purge all of these files >> from orbit, since I can recreate them, but I can't seem to delete >> them, and deleting the pool is a really inconvenient option, as I have >> other data on it. >> >> I'm running 8.0-RELEASE stock on amd64. >> >> Thanks! >> >> - Rich >> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >> > -- Todo homem morre, mas nem todo homem vive. 
-- William Wallace From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 09:31:47 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6F1F0106566C for ; Sat, 23 Jan 2010 09:31:47 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-iw0-f198.google.com (mail-iw0-f198.google.com [209.85.223.198]) by mx1.freebsd.org (Postfix) with ESMTP id 344D98FC1B for ; Sat, 23 Jan 2010 09:31:46 +0000 (UTC) Received: by iwn36 with SMTP id 36so1647893iwn.3 for ; Sat, 23 Jan 2010 01:31:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=E6QdVfgLvpLQ37em2x7WU+4C6O/NV7NwYVecaR9JNBg=; b=QSWiI4KdlDoZQMtDpV/YxdhXEVJgFA8aPbrKqc3/p0slid0cSqHCMqmvVARdZKijK6 FgblXPkT2jzc28rQVa5vwzkVfY4iF+TwOqyp1J+aSx5lh1sAoh4QiKeZXMEXLz9OljID MfxU3QomU178yiqsBGbQiE42FL19St8gUQe0c= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=TPGs+HOWsJTm5ZqNlnFKAXjAjSqYuyTo0GgJWVBwvROs5lc9gD2UQ377fLBcrJZ3XK VrxXvUTbVwvfE4cq1OykZgk4iRZWV+nXcQFDnWID3dbIJs6JE8vKV8UQ98xjwAd6wg1c 6bw1vxvgc2WAiO8wHEky0cC362s7cJ9SVDYo0= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.231.151.212 with SMTP id d20mr3749261ibw.53.1264239105819; Sat, 23 Jan 2010 01:31:45 -0800 (PST) In-Reply-To: <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> Date: Sat, 23 Jan 2010 01:31:45 -0800 X-Google-Sender-Auth: ab5a8eea5f50cf13 Message-ID: From: Artem Belevich To: Rich Content-Type: text/plain; charset=ISO-8859-1 
Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs Subject: Re: Errors on a file on a zpool: How to remove? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 09:31:47 -0000

> errors: Permanent errors have been detected in the following files:
>        rigatoni/mirrors:<0x0>

This looks similar to what I had. My wild guess would be that some
metadata got corrupted.

If you really want to fix it, you may need to get close and personal
with zdb and ZFS on-disk format spec here:
http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf

Following video and PDF may help a bit:
http://video.google.com/videoplay?docid=2325724487196148104#
http://www.osdevcon.org/2008/files/osdevcon2008-max.pdf

If all you want is to avoid nagging errors you can try going up the
tree until you can do something with directory. Then move it somewhere
where it would not get in the way, "chmod 000" it and perhaps even do
"setflags schg" on it to prevent anyone from descending into directory
with bad files.

--Artem

On Sat, Jan 23, 2010 at 12:14 AM, Rich wrote:
> I already diagnosed the bad hardware - one of the two sticks of RAM
> had gone bad, and fails memtest in the other machine.
>
>   pool: rigatoni
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed after 15h28m with 1 errors on Thu Jan 21 18:09:25 2010
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         rigatoni    ONLINE       0     0     1
>           da4       ONLINE       0     0     2
>           da5       ONLINE       0     0     2
>           da7       ONLINE       0     0     0
>           da6       ONLINE       0     0     0
>           da2       ONLINE       0     0     2
>
> errors: Permanent errors have been detected in the following files:
>
>         rigatoni/mirrors:<0x0>
>
> Scrubbing repeatedly does nothing to remove the note about that error,
> and I'd rather like to avoid trying to recreate a 7TB pool.
>
> - Rich
>
> On Sat, Jan 23, 2010 at 3:11 AM, Artem Belevich wrote:
>> The directory that those files are in may be corrupted. What does
>> zpool status -v show?
>>
>> You may want to scrub the pool if you haven't done so yet. That would
>> help to find all corrupted files.
>>
>> When plain files are corrupted, you should be able to remove them. You
>> may also try to set atime=off on the filesystem to avoid filesystem
>> updates on reads.
>> Some time back when I had zpool corruption I've found no way to remove
>> corrupted directory that still had some files in it. In the end I had
>> to rebuild the pool.
>>
>> BTW, given that your pool did get corrupted, perhaps it might be a
>> good idea to start moving your data somewhere else rather than worry
>> about how to remove corrupted files. If corruption is due to bad
>> hardware, bad files would just keep popping up.
>>
>> --Artem
>>
>>
>>
>> On Fri, Jan 22, 2010 at 10:23 PM, Rich wrote:
>>> Hey world,
>>> I've got a series of files in a non-redundant zpool which all report
>>> Input/Output Error on attempting to manipulate them in any way - stat,
>>> read, rm, anything.
>>>
>>> Whenever anything is attempted, the following style of thing is
>>> printed to /var/log/messages:
>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni
>>> path=/dev/da4 offset=1231402180608 size=8192
>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni
>>> path=/dev/da5 offset=446136819712 size=8192
>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni
>>> path=/dev/da2 offset=320393101312 size=8192
>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni
>>> path=/dev/da5 offset=446136819712 size=8192
>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni
>>> path=/dev/da2 offset=320393101312 size=8192
>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni
>>> path=/dev/da4 offset=1231402180608 size=8192
>>> Jan 23 01:22:35 manticore root: ZFS: zpool I/O failure, zpool=rigatoni error=86
>>>
>>> What can I do? I really would like to just purge all of these files
>>> from orbit, since I can recreate them, but I can't seem to delete
>>> them, and deleting the pool is a really inconvenient option, as I have
>>> other data on it.
>>>
>>> I'm running 8.0-RELEASE stock on amd64.
>>>
>>> Thanks!
>>>
>>> - Rich
>>> _______________________________________________
>>> freebsd-fs@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>>>
>>
>
>
>
> --
>
> Todo homem morre, mas nem todo homem vive.
-- William Wallace > From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 09:36:16 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D75C41065670 for ; Sat, 23 Jan 2010 09:36:16 +0000 (UTC) (envelope-from rincebrain@gmail.com) Received: from fg-out-1718.google.com (fg-out-1718.google.com [72.14.220.152]) by mx1.freebsd.org (Postfix) with ESMTP id 655748FC15 for ; Sat, 23 Jan 2010 09:36:16 +0000 (UTC) Received: by fg-out-1718.google.com with SMTP id 16so291653fgg.13 for ; Sat, 23 Jan 2010 01:36:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=KZqoQo7FQC3QecpskEdQ4UeW+xJijtoDaBqIyRbCTq0=; b=v632g0vaMbZyb00eZiTkD8R5sN4l1AlkYdLel66jGOkDf2D3VZeK7kT33CNNCdA88D NzWoSuErBV27s/x4WdSobIV0t3B/Fq4Dq1BBNuqZQks4pwEB0zP0q52t8kEUTx4/OYQK j4kgETIihsbabDnvP1BwMH74JFs5aUSi7dEXI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=YCxDZdJk7UMh0jvZOehwABF5uuiR4UiK7yZglVSZf+Ct1PEzUdbbtZH8HZwc8bAz8X SmKx+nnQHyv3uhRnB8RcBJDkWn1/W5ACFNiYFIKYSSRL+lx6MsEO4D1nGCob8jiWPa6u EST5Uy9tTAujxt8ozd8ztHZ67I4p/pULsPoHQ= MIME-Version: 1.0 Received: by 10.239.133.196 with SMTP id 4mr418490hbw.99.1264239375046; Sat, 23 Jan 2010 01:36:15 -0800 (PST) In-Reply-To: References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> Date: Sat, 23 Jan 2010 04:36:15 -0500 Message-ID: <5da0588e1001230136j52442acfw306dccaa889af7bb@mail.gmail.com> From: Rich To: Artem Belevich Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs Subject: Re: Errors on a file on a zpool: How to 
remove? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 09:36:16 -0000 I guess I'm asking for opinions on whether this behavior being irreparable, to the point of not even being able to nuke the relevant FS blocks from orbit because they're corrupt without using zdb, is something that should be reported as a bug or enhancement request. I know the files are invalid at this point, and I'm okay with sacrificing them - I just want the errors to go away. It knows the checksums on the relevant data are invalid, I'd be okay with just rewriting the blocks so that everything thinks they're unused blocks and the files/directories are unlisted... ...but short of using zdb and rewriting them myself, I can't, and that seems ridiculous. For a filesystem that insists it never needs fsck, telling someone to resort to the filesystem's debug layer to remove damaged files is horrifying. - Rich On Sat, Jan 23, 2010 at 4:31 AM, Artem Belevich wrote: >> errors: Permanent errors have been detected in the following files: >> =A0 =A0 =A0 =A0rigatoni/mirrors:<0x0> > > This looks similar to what I had. My wild guess would be that some > metadata got corrupted. > > If you really want to fix it, you may need to get close and personal > with zdb and =A0ZFS on-disk format spec here: > http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskfo= rmat0822.pdf > > Following video and PDF may help a bit: > http://video.google.com/videoplay?docid=3D2325724487196148104# > http://www.osdevcon.org/2008/files/osdevcon2008-max.pdf > > If all you want is to avoid nagging errors you can try going up the > tree until you can do something with directory. 
Then move it somewhere > where it would not get in the way, "chmod 000" it and perhaps even do > "setflags schg" on it to prevent anyone from descending into directory > with bad files. > > --Artem > > > On Sat, Jan 23, 2010 at 12:14 AM, Rich wrote: >> I already diagnosed the bad hardware - one of the two sticks of RAM >> had gone bad, and fails memtest in the other machine. >> >> =A0pool: rigatoni >> =A0state: ONLINE >> status: One or more devices has experienced an error resulting in data >> =A0 =A0 =A0 =A0corruption. =A0Applications may be affected. >> action: Restore the file in question if possible. =A0Otherwise restore t= he >> =A0 =A0 =A0 =A0entire pool from backup. >> =A0 see: http://www.sun.com/msg/ZFS-8000-8A >> =A0scrub: scrub completed after 15h28m with 1 errors on Thu Jan 21 18:09= :25 2010 >> config: >> >> =A0 =A0 =A0 =A0NAME =A0 =A0 =A0 =A0STATE =A0 =A0 READ WRITE CKSUM >> =A0 =A0 =A0 =A0rigatoni =A0 =A0ONLINE =A0 =A0 =A0 0 =A0 =A0 0 =A0 =A0 1 >> =A0 =A0 =A0 =A0 =A0da4 =A0 =A0 =A0 ONLINE =A0 =A0 =A0 0 =A0 =A0 0 =A0 = =A0 2 >> =A0 =A0 =A0 =A0 =A0da5 =A0 =A0 =A0 ONLINE =A0 =A0 =A0 0 =A0 =A0 0 =A0 = =A0 2 >> =A0 =A0 =A0 =A0 =A0da7 =A0 =A0 =A0 ONLINE =A0 =A0 =A0 0 =A0 =A0 0 =A0 = =A0 0 >> =A0 =A0 =A0 =A0 =A0da6 =A0 =A0 =A0 ONLINE =A0 =A0 =A0 0 =A0 =A0 0 =A0 = =A0 0 >> =A0 =A0 =A0 =A0 =A0da2 =A0 =A0 =A0 ONLINE =A0 =A0 =A0 0 =A0 =A0 0 =A0 = =A0 2 >> >> errors: Permanent errors have been detected in the following files: >> >> =A0 =A0 =A0 =A0rigatoni/mirrors:<0x0> >> >> Scrubbing repeatedly does nothing to remove the note about that error, >> and I'd rather like to avoid trying to recreate a 7TB pool. >> >> - Rich >> >> On Sat, Jan 23, 2010 at 3:11 AM, Artem Belevich wrote: >>> The directory that those files are in may be corrupted. What does >>> zpool status -v show? >>> >>> You may want to scrub the pool if you haven't done so yet. That would >>> help to find all corrupted files. >>> >>> When plain files are corrupted, you should be able to remove them. 
You >>> may also try to set atime=3Doff on the filesystem to avoid filesystem >>> updates on reads. >>> Some time back when I had zpool corruption I've found no way to remove >>> corrupted directory that still had some files in it. In the end I had >>> to rebuild the pool. >>> >>> BTW, given that your pool did get corrupted, perhaps it might be a >>> good idea to start moving your data somewhere else rather than worry >>> about how to remove corrupted files. If corruption is due to bad >>> hardware, bad files would just keep popping up. >>> >>> --Artem >>> >>> >>> >>> On Fri, Jan 22, 2010 at 10:23 PM, Rich wrote: >>>> Hey world, >>>> I've got a series of files in a non-redundant zpool which all report >>>> Input/Output Error on attempting to manipulate them in any way - stat, >>>> read, rm, anything. >>>> >>>> Whenever anything is attempted, the following style of thing is >>>> printed to /var/log/messages: >>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=3Drigato= ni >>>> path=3D/dev/da4 offset=3D1231402180608 size=3D8192 >>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=3Drigato= ni >>>> path=3D/dev/da5 offset=3D446136819712 size=3D8192 >>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=3Drigato= ni >>>> path=3D/dev/da2 offset=3D320393101312 size=3D8192 >>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=3Drigato= ni >>>> path=3D/dev/da5 offset=3D446136819712 size=3D8192 >>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=3Drigato= ni >>>> path=3D/dev/da2 offset=3D320393101312 size=3D8192 >>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=3Drigato= ni >>>> path=3D/dev/da4 offset=3D1231402180608 size=3D8192 >>>> Jan 23 01:22:35 manticore root: ZFS: zpool I/O failure, zpool=3Drigato= ni error=3D86 >>>> >>>> What can I do? 
I really would like to just purge all of these files >>>> from orbit, since I can recreate them, but I can't seem to delete >>>> them, and deleting the pool is a really inconvenient option, as I have >>>> other data on it. >>>> >>>> I'm running 8.0-RELEASE stock on amd64. >>>> >>>> Thanks! >>>> >>>> - Rich >>>> _______________________________________________ >>>> freebsd-fs@freebsd.org mailing list >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >>>> >>> >> >> >> >> -- >> >> Every man dies, but not every man lives. -- William Wallace >> > -- Yow! Am I in Milwaukee? From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 09:57:07 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5C6001065672 for ; Sat, 23 Jan 2010 09:57:07 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-iw0-f198.google.com (mail-iw0-f198.google.com [209.85.223.198]) by mx1.freebsd.org (Postfix) with ESMTP id 1F8F68FC17 for ; Sat, 23 Jan 2010 09:57:06 +0000 (UTC) Received: by iwn36 with SMTP id 36so1654480iwn.3 for ; Sat, 23 Jan 2010 01:57:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=Dbqdnk9dTYNAOUsRhYoCZa2YhXkKqiH92iNiBy3h+jM=; b=nBGw86FZEObhJp/CWQo8S+g1MGcj08MNbE45wy5SOodWK6isHossimpXTuk2yB14hG JOqsqVMVYS1MzosDmp/PQNm/t+U+lFDQgT0HUawAGVZlJ311oK5w/8cfav8BIJe2zsAB iw1t2i7xFtv8dV97kzD29zNwW3wHlIsH6XiHY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding;
b=fqDEcebsePZnOdVBHxJlUL/TKs/FtO3ymPnZ8UH2FZ6fhjEkE1xyHFsCPKr2McqwRG BfKwSKUdOQvZfLsRW3RifLSsTr2nqmNzLwWCOMBTynKM+894HsK/qPqKsWoNsmq7UaM9 F+2mlKJSiV5kfNQwiYJzHW9qcR1iNNMgr1cwg= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.231.143.148 with SMTP id v20mr550918ibu.14.1264240626013; Sat, 23 Jan 2010 01:57:06 -0800 (PST) In-Reply-To: <5da0588e1001230136j52442acfw306dccaa889af7bb@mail.gmail.com> References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> <5da0588e1001230136j52442acfw306dccaa889af7bb@mail.gmail.com> Date: Sat, 23 Jan 2010 01:57:05 -0800 X-Google-Sender-Auth: e07966d0bd577faa Message-ID: From: Artem Belevich To: Rich Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs Subject: Re: Errors on a file on a zpool: How to remove? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 09:57:07 -0000 I guess it's a feature. Without redundancy all ZFS can do is report corruption. If corruption is in the file data, it can be removed along with the file. If corruption is in directory or in metadata there's little ZFS can do because that particular portion of filesystem is inconsistent. The best that could be done is to keep those portions read-only and try not to crash right away. With 7TB of data with no redundancy of any kind you are asking for trouble even with all hardware functioning within spec.
The math is rather scary: http://queue.acm.org/detail.cfm?id=1670144 --Artem On Sat, Jan 23, 2010 at 1:36 AM, Rich wrote: > I guess I'm asking for opinions on whether this behavior being > irreparable, to the point of not even being able to nuke the relevant > FS blocks from orbit because they're corrupt without using zdb, is > something that should be reported as a bug or enhancement request. > > I know the files are invalid at this point, and I'm okay with > sacrificing them - I just want the errors to go away. It knows the > checksums on the relevant data are invalid, I'd be okay with just > rewriting the blocks so that everything thinks they're unused blocks > and the files/directories are unlisted... > > ...but short of using zdb and rewriting them myself, I can't, and that > seems ridiculous. > > For a filesystem that insists it never needs fsck, telling someone to > resort to the filesystem's debug layer to remove damaged files is > horrifying. > > - Rich > > On Sat, Jan 23, 2010 at 4:31 AM, Artem Belevich wrote: >>> errors: Permanent errors have been detected in the following files: >>>         rigatoni/mirrors:<0x0> >> >> This looks similar to what I had. My wild guess would be that some >> metadata got corrupted. >> >> If you really want to fix it, you may need to get close and personal >> with zdb and ZFS on-disk format spec here: >> http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf >> >> Following video and PDF may help a bit: >> http://video.google.com/videoplay?docid=2325724487196148104# >> http://www.osdevcon.org/2008/files/osdevcon2008-max.pdf >> >> If all you want is to avoid nagging errors you can try going up the >> tree until you can do something with directory. Then move it somewhere >> where it would not get in the way, "chmod 000" it and perhaps even do >> "setflags schg" on it to prevent anyone from descending into directory >> with bad files.
>> >> --Artem >> >> >> On Sat, Jan 23, 2010 at 12:14 AM, Rich wrote: >>> I already diagnosed the bad hardware - one of the two sticks of RAM >>> had gone bad, and fails memtest in the other machine. >>> >>>   pool: rigatoni >>>  state: ONLINE >>> status: One or more devices has experienced an error resulting in data >>>         corruption.  Applications may be affected. >>> action: Restore the file in question if possible.  Otherwise restore the >>>         entire pool from backup. >>>    see: http://www.sun.com/msg/ZFS-8000-8A >>>  scrub: scrub completed after 15h28m with 1 errors on Thu Jan 21 18:09:25 2010 >>> config: >>> >>>         NAME        STATE     READ WRITE CKSUM >>>         rigatoni    ONLINE       0     0     1 >>>           da4       ONLINE       0     0     2 >>>           da5       ONLINE       0     0     2 >>>           da7       ONLINE       0     0     0 >>>           da6       ONLINE       0     0     0 >>>           da2       ONLINE       0     0     2 >>> >>> errors: Permanent errors have been detected in the following files: >>> >>>         rigatoni/mirrors:<0x0> >>> >>> Scrubbing repeatedly does nothing to remove the note about that error, >>> and I'd rather like to avoid trying to recreate a 7TB pool. >>> >>> - Rich >>> >>> On Sat, Jan 23, 2010 at 3:11 AM, Artem Belevich wrote: >>>> The directory that those files are in may be corrupted. What does >>>> zpool status -v show? >>>> >>>> You may want to scrub the pool if you haven't done so yet. That would >>>> help to find all corrupted files. >>>> >>>> When plain files are corrupted, you should be able to remove them. You >>>> may also try to set atime=off on the filesystem to avoid filesystem >>>> updates on reads.
>>>> Some time back when I had zpool corruption I've found no way to remove >>>> corrupted directory that still had some files in it. In the end I had >>>> to rebuild the pool. >>>> >>>> BTW, given that your pool did get corrupted, perhaps it might be a >>>> good idea to start moving your data somewhere else rather than worry >>>> about how to remove corrupted files. If corruption is due to bad >>>> hardware, bad files would just keep popping up. >>>> >>>> --Artem >>>> >>>> >>>> >>>> On Fri, Jan 22, 2010 at 10:23 PM, Rich wrote: >>>>> Hey world, >>>>> I've got a series of files in a non-redundant zpool which all report >>>>> Input/Output Error on attempting to manipulate them in any way - stat, >>>>> read, rm, anything. >>>>> >>>>> Whenever anything is attempted, the following style of thing is >>>>> printed to /var/log/messages: >>>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >>>>> path=/dev/da4 offset=1231402180608 size=8192 >>>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >>>>> path=/dev/da5 offset=446136819712 size=8192 >>>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >>>>> path=/dev/da2 offset=320393101312 size=8192 >>>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >>>>> path=/dev/da5 offset=446136819712 size=8192 >>>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >>>>> path=/dev/da2 offset=320393101312 size=8192 >>>>> Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni >>>>> path=/dev/da4 offset=1231402180608 size=8192 >>>>> Jan 23 01:22:35 manticore root: ZFS: zpool I/O failure, zpool=rigatoni error=86 >>>>> >>>>> What can I do?
I really would like to just purge all of these files >>>>> from orbit, since I can recreate them, but I can't seem to delete >>>>> them, and deleting the pool is a really inconvenient option, as I have >>>>> other data on it. >>>>> >>>>> I'm running 8.0-RELEASE stock on amd64. >>>>> >>>>> Thanks! >>>>> >>>>> - Rich >>>>> _______________________________________________ >>>>> freebsd-fs@freebsd.org mailing list >>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >>>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >>>>> >>>> >>> >>> >>> >>> -- >>> >>> Every man dies, but not every man lives. -- William Wallace >>> >> > > > > -- > > Yow! Am I in Milwaukee? > From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 10:05:39 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3BC87106566C for ; Sat, 23 Jan 2010 10:05:39 +0000 (UTC) (envelope-from rincebrain@gmail.com) Received: from fg-out-1718.google.com (fg-out-1718.google.com [72.14.220.155]) by mx1.freebsd.org (Postfix) with ESMTP id C467B8FC0C for ; Sat, 23 Jan 2010 10:05:38 +0000 (UTC) Received: by fg-out-1718.google.com with SMTP id 16so296690fgg.13 for ; Sat, 23 Jan 2010 02:05:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=xhhzeFejFgUpsMRJE6ap5peUfbpc3NRIlyowAVJPOL4=; b=RueLfTLRxiV49QCOg2g7pe96xDyJRNQnMlq5+iTbM7ZjFnhk5pZyXA7juIyOf3qkBm CV734FXAKH9Khh9PEf0HxiGmVIGl1tFI1D8T1Y4DwQE0lnYfvNF8NqoftXIBSTmUVZeh sJLG9Vir3F7l/cgUq7E/rWMdjo+T7aF+vUIpo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=AiNr6utpb/Exd6rR3GNttvaUxLVMnR2/uausabhfc0D9hk73Tw/16rSfKcWX/7FEYK
2yfL/HJwmX0SXLekyooQLmogKn4/+v2nIqow1KRAwYwh/QwU+owKQXKPz2tobIj3HJLl Rbjazem8ZI8W3/Fzq8MXT7HEpLGjqI86xOGgg= MIME-Version: 1.0 Received: by 10.239.188.82 with SMTP id o18mr464021hbh.129.1264241137470; Sat, 23 Jan 2010 02:05:37 -0800 (PST) In-Reply-To: References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> <5da0588e1001230136j52442acfw306dccaa889af7bb@mail.gmail.com> Date: Sat, 23 Jan 2010 05:05:37 -0500 Message-ID: <5da0588e1001230205y43f359dh68268a9c59b6ce53@mail.gmail.com> From: Rich To: Artem Belevich Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs Subject: Re: Errors on a file on a zpool: How to remove? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 10:05:39 -0000 On Sat, Jan 23, 2010 at 4:57 AM, Artem Belevich wrote: > I guess it's a feature. Without redundancy all ZFS can do is report > corruption. If corruption is in the file data, it can be removed along > with the file. If corruption is in directory or in metadata there's > little ZFS can do because that particular portion of filesystem is > inconsistent. The best that could be done is to keep those portions > read-only and try not to crash right away. > > With 7TB of data with no redundancy of any kind you are asking for > trouble even with all hardware functioning within spec. I'm well aware, thank you. The data is all retrievable from other sources, which is why redundancy wasn't a large concern in this particular pool, just speed, and knowledge of when bits flip. Of course, when all ZFS claims to do is raise EIO on corruption, and requires you to nuke the pool from orbit in order to get it to stop, rather than isolating and removing the offending segments, it's rather frustrating.
- Rich From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 12:17:55 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 32FAB106568D for ; Sat, 23 Jan 2010 12:17:55 +0000 (UTC) (envelope-from andrew@modulus.org) Received: from email.octopus.com.au (email.octopus.com.au [122.100.2.232]) by mx1.freebsd.org (Postfix) with ESMTP id E81A48FC13 for ; Sat, 23 Jan 2010 12:17:54 +0000 (UTC) Received: by email.octopus.com.au (Postfix, from userid 1002) id AFC645CB8E9; Sat, 23 Jan 2010 22:53:45 +1100 (EST) X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on email.octopus.com.au X-Spam-Level: * X-Spam-Status: No, score=1.9 required=10.0 tests=ALL_TRUSTED, FH_DATE_PAST_20XX autolearn=no version=3.2.3 Received: from [10.20.30.101] (60.218.233.220.static.exetel.com.au [220.233.218.60]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: admin@email.octopus.com.au) by email.octopus.com.au (Postfix) with ESMTP id 523225CB91A; Sat, 23 Jan 2010 22:53:41 +1100 (EST) Message-ID: <4B5AE8D7.9000103@modulus.org> Date: Sat, 23 Jan 2010 23:17:27 +1100 From: Andrew Snow User-Agent: Thunderbird 1.5.0.9 (Windows/20061207) MIME-Version: 1.0 To: Rich , freebsd-fs@freebsd.org References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> In-Reply-To: <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Re: Errors on a file on a zpool: How to remove? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 12:17:55 -0000 Rich wrote: > Scrubbing repeatedly does nothing to remove the note about that error, > and I'd rather like to avoid trying to recreate a 7TB pool. If you have bad hardware, it's quite possible for ZFS to get itself into a state that it cannot repair. The claim about "never needs fsck" only applies if the hardware is doing what is expected of it. This especially goes for a pool with zero redundancy, like yours. It's pretty good that ZFS can report the checksum failures, where other filesystems wouldn't even know something's wrong until they start returning garbled data or crash the whole kernel. I presume you have already tried "zpool clear"? - Andrew From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 12:43:43 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B7D92106568D for ; Sat, 23 Jan 2010 12:43:43 +0000 (UTC) (envelope-from rincebrain@gmail.com) Received: from mail-fx0-f226.google.com (mail-fx0-f226.google.com [209.85.220.226]) by mx1.freebsd.org (Postfix) with ESMTP id 4B9738FC27 for ; Sat, 23 Jan 2010 12:43:42 +0000 (UTC) Received: by fxm26 with SMTP id 26so2020671fxm.13 for ; Sat, 23 Jan 2010 04:43:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=3qCGqSqYnHZhYELkbzBFumqRelk727xHoFqaLUpp+qY=; b=Hc7Uc5piY2GciRCy6BtsgO+cIAhczEdlB1L/znffSE0ROOUWsVg8IRbta2c9fv09tU M5O2XMscjRLeF09HJBlXA9hqvu0tHTIe6L2QF+FZ9+Mbd8114/KOgS2wZZwAMewTKBO+ R3l6cNHDSQC4URgMoS4RiB2Lyg++w/hMUP+H4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com;
s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=qXGMjALopFkychQIhBrovpkxRXfTcWhqIPQzAjpC8VBBkUlzLMSYQPBOOh+LMULVVq Un5nypKUTjS+yvtsLKlJk4QiI0MvdGZYkdzbmIrvZC8LviHOf0gFz8LuIQttKlM0pBya uwri1N9W6wsVkpruLomstSjzyQD1/4Ik/778M= MIME-Version: 1.0 Received: by 10.239.187.139 with SMTP id l11mr480113hbh.96.1264250615907; Sat, 23 Jan 2010 04:43:35 -0800 (PST) In-Reply-To: <4B5AE8D7.9000103@modulus.org> References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> <4B5AE8D7.9000103@modulus.org> Date: Sat, 23 Jan 2010 07:43:35 -0500 Message-ID: <5da0588e1001230443r1fee3b45o906690bc0115bb4e@mail.gmail.com> From: Rich To: Andrew Snow Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: Errors on a file on a zpool: How to remove? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 12:43:43 -0000 zpool clear always clears the checksum column whenever I run it. Then, as soon as I touch those files again, or run a scrub, the checksum error numbers tick up on those three disks, and those entries appear in /var/log/messages. - Rich On Sat, Jan 23, 2010 at 7:17 AM, Andrew Snow wrote: > Rich wrote: >> >> Scrubbing repeatedly does nothing to remove the note about that error, >> and I'd rather like to avoid trying to recreate a 7TB pool. > > If you have bad hardware, it's quite possible for ZFS to get itself into a > state that it cannot repair. The claim about "never needs fsck" only > applies if the hardware is doing what is expected of it. This especially > goes for a pool with zero redundancy, like yours.
> > It's pretty good that ZFS can report the checksum failures, where other > filesystems wouldn't even know something's wrong until they start returning > garbled data or crash the whole kernel. > > I presume you have already tried "zpool clear"? > > - Andrew > > -- Life is a yo-yo, and mankind ties knots in the string. From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 12:48:51 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3D144106566B for ; Sat, 23 Jan 2010 12:48:51 +0000 (UTC) (envelope-from andrew@modulus.org) Received: from email.octopus.com.au (email.octopus.com.au [122.100.2.232]) by mx1.freebsd.org (Postfix) with ESMTP id F0A0E8FC12 for ; Sat, 23 Jan 2010 12:48:50 +0000 (UTC) Received: by email.octopus.com.au (Postfix, from userid 1002) id 7A74F5CB90E; Sat, 23 Jan 2010 23:24:42 +1100 (EST) X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on email.octopus.com.au X-Spam-Level: * X-Spam-Status: No, score=1.9 required=10.0 tests=ALL_TRUSTED, FH_DATE_PAST_20XX autolearn=no version=3.2.3 Received: from [10.20.30.101] (60.218.233.220.static.exetel.com.au [220.233.218.60]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: admin@email.octopus.com.au) by email.octopus.com.au (Postfix) with ESMTP id 3CB1F5CB8E7; Sat, 23 Jan 2010 23:24:38 +1100 (EST) Message-ID: <4B5AF018.7070503@modulus.org> Date: Sat, 23 Jan 2010 23:48:24 +1100 From: Andrew Snow User-Agent: Thunderbird 1.5.0.9 (Windows/20061207) MIME-Version: 1.0 To: Rich , freebsd-fs@freebsd.org References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> <4B5AE8D7.9000103@modulus.org> <5da0588e1001230443r1fee3b45o906690bc0115bb4e@mail.gmail.com> In-Reply-To: <5da0588e1001230443r1fee3b45o906690bc0115bb4e@mail.gmail.com> Content-Type:
text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Re: Errors on a file on a zpool: How to remove? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 12:48:51 -0000 Rich wrote: > zpool clear always clears the checksum column whenever I run it. > > Then, as soon as I touch those files again, or run a scrub, the > checksum error numbers tick up on those three disks, and those entries > appear in /var/log/messages. That is the normal behaviour if there are no additional copies of the data to go from (via mirroring or RAIDZ): it sees that the file has blocks with incorrect checksums, but it won't take action as there's no way to know if the file data is corrupt or the checksum value is wrong. You might be able to clear it by renaming the file and copying it back in place, and thus the new file will not have any bad checksums (but likely will contain corrupt data). 
- Andrew From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 13:21:13 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6A733106568B for ; Sat, 23 Jan 2010 13:21:13 +0000 (UTC) (envelope-from rincebrain@gmail.com) Received: from mail-fx0-f226.google.com (mail-fx0-f226.google.com [209.85.220.226]) by mx1.freebsd.org (Postfix) with ESMTP id F3F928FC0C for ; Sat, 23 Jan 2010 13:21:12 +0000 (UTC) Received: by fxm26 with SMTP id 26so2036917fxm.13 for ; Sat, 23 Jan 2010 05:21:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=793qY4YtoybjGE0rTtox0sEmk3IMbd+D+w3FdLeM6Es=; b=RzTIAC8mpJZfcLguiSpM0PG4feyMvzMtylc2vdPFp8mlHNMkLzxc3tU7aCbY+9xZ1Z INQido6ZVuV6W5N+ULmoDYip7BQUQ7y1br5/8cOWhjHcTe+c9lnPBzXYTH8I3g/mlRrm MyRwujyMND6Rh3tyMTSElZMfYEdJukUlohjJM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=sUpi6m9heaj6OMpPJrJo5MR6T5+8OGpzhUn+dF/tMHHqQMCj+a1ruRTBWOSmmJpsk4 dvnGst+272CTMYeChRPP7pmiWMbWofNBrP5ugBBE9DGyn+I90FIg6VL/AQ6Z5Zz7KUZF QEa859EV68mAWZmzI65SG+ttuaZsqH1WEUllQ= MIME-Version: 1.0 Received: by 10.239.165.72 with SMTP id w8mr466488hbd.8.1264252871285; Sat, 23 Jan 2010 05:21:11 -0800 (PST) In-Reply-To: <4B5AF018.7070503@modulus.org> References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> <4B5AE8D7.9000103@modulus.org> <5da0588e1001230443r1fee3b45o906690bc0115bb4e@mail.gmail.com> <4B5AF018.7070503@modulus.org> Date: Sat, 23 Jan 2010 08:21:11 -0500 Message-ID: <5da0588e1001230521h3cbef0b9m9781bb7eb6ff92c@mail.gmail.com> From: Rich To: Andrew Snow Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org 
Subject: Re: Errors on a file on a zpool: How to remove? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 13:21:13 -0000 I can't; any operations on the file yield EIEIO...err, EIO, input/output error. - Rich On Sat, Jan 23, 2010 at 7:48 AM, Andrew Snow wrote: > Rich wrote: >> >> zpool clear always clears the checksum column whenever I run it. >> >> Then, as soon as I touch those files again, or run a scrub, the >> checksum error numbers tick up on those three disks, and those entries >> appear in /var/log/messages. > > That is the normal behaviour if there are no additional copies of the data > to go from (via mirroring or RAIDZ): it sees that the file has blocks with > incorrect checksums, but it won't take action as there's no way to know if > the file data is corrupt or the checksum value is wrong. > > You might be able to clear it by renaming the file and copying it back in > place, and thus the new file will not have any bad checksums (but likely > will contain corrupt data). > > - Andrew > -- You are an insult to my intelligence! I demand that you log off immediately. 
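To see how many distinct blocks are actually involved, the checksum-mismatch lines quoted earlier in this thread can be tallied per device and offset. What follows is only a minimal sketch, assuming the message format shown in those /var/log/messages excerpts; the function name and the regular expression are illustrative and not part of any ZFS tool.

```python
import re
from collections import Counter

# Matches "ZFS: checksum mismatch" lines in the form quoted in this thread.
# \s+ is used between fields because the mail client wrapped the lines.
MISMATCH = re.compile(
    r"ZFS: checksum mismatch, zpool=(\S+)\s+path=(\S+)\s+offset=(\d+)\s+size=(\d+)"
)

def tally_mismatches(log_text):
    """Count how often each (device, offset) pair is reported."""
    counts = Counter()
    for _pool, path, offset, _size in MISMATCH.findall(log_text):
        counts[(path, int(offset))] += 1
    return counts

# The six mismatch lines from the original report.
sample = """\
Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da4 offset=1231402180608 size=8192
Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da5 offset=446136819712 size=8192
Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da2 offset=320393101312 size=8192
Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da5 offset=446136819712 size=8192
Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da2 offset=320393101312 size=8192
Jan 23 01:22:34 manticore root: ZFS: checksum mismatch, zpool=rigatoni path=/dev/da4 offset=1231402180608 size=8192
"""

for (path, offset), n in sorted(tally_mismatches(sample).items()):
    print(path, offset, n)
```

On those six lines this gives three distinct (device, offset) pairs, each reported twice, which would be consistent with the same few blocks being re-read rather than new damage appearing.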
From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 14:35:02 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5256C106566B for ; Sat, 23 Jan 2010 14:35:02 +0000 (UTC) (envelope-from doug@polands.org) Received: from hrndva-omtalb.mail.rr.com (hrndva-omtalb.mail.rr.com [71.74.56.125]) by mx1.freebsd.org (Postfix) with ESMTP id 084338FC12 for ; Sat, 23 Jan 2010 14:35:01 +0000 (UTC) X-Authority-Analysis: v=1.0 c=1 a=GSN_Y9T6cv4A:10 a=QNyOmuW7hx_2i3oR08AA:9 a=gD7rSQgncZlKzIFUK0rFvdNyP0IA:4 X-Cloudmark-Score: 0 X-Originating-IP: 75.87.219.217 Received: from [75.87.219.217] ([75.87.219.217:55929] helo=haran.polands.org) by hrndva-oedge03.mail.rr.com (envelope-from ) (ecelerity 2.2.2.39 r()) with ESMTP id 7B/FB-19161-4190B5B4; Sat, 23 Jan 2010 14:35:01 +0000 Received: from [172.16.1.37] (sichem-wifi.polands.org [172.16.1.37]) by haran.polands.org (8.14.3/8.14.3) with ESMTP id o0NEYxkO034898; Sat, 23 Jan 2010 08:35:00 -0600 (CST) (envelope-from doug@polands.org) Message-ID: <4B5B0913.3050107@polands.org> Date: Sat, 23 Jan 2010 08:34:59 -0600 From: Doug Poland User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.1.7) Gecko/20100111 Thunderbird/3.0.1 MIME-Version: 1.0 To: Artem Belevich References: <4B58976E.1020402@polands.org> <4B58D4D3.80009@egr.msu.edu> <20100122042843.GA8858@polands.org> <1308c71eec426200d4c34b926bba8806.squirrel@email.polands.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 14:35:02 -0000 On 2010-01-22 20:33, Artem Belevich wrote: > > > Ignore my previous email. 
Something else is probably at play here. If > I were right, then you should have ended up with vm.kmem_size=8G. > However, in your case it's 2G. Beats me why. > I apologize, when I captured the sysctls, I was running with different values in /boot/loader.conf than what we were discussing. While I was waiting for a response, I was experimenting with a recommendation to run kmem_size = 1/2 phy RAM and arc_max 1/2 of kmem_size. This is what I have now: # cat /boot/loader.conf vfs.root.mountfrom="zfs:bethesda" vfs.zfs.arc_max="1G" vm.kmem_size="20G" zfs_load="YES" # sysctl hw.physmem vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max hw.physmem: 4102688768 vm.kmem_size: 3668041728 vm.kmem_size_max: 329853485875 vfs.zfs.arc_max: 1073741824 # unixbench fstime fsbuffer fsdisk panic: kmem_malloc(131072): kmem_map too small: 3593236480 total allocated -- Regards, Doug From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 14:51:09 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C0365106566C; Sat, 23 Jan 2010 14:51:09 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-fx0-f226.google.com (mail-fx0-f226.google.com [209.85.220.226]) by mx1.freebsd.org (Postfix) with ESMTP id 272D98FC20; Sat, 23 Jan 2010 14:51:08 +0000 (UTC) Received: by fxm26 with SMTP id 26so2076519fxm.13 for ; Sat, 23 Jan 2010 06:51:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:to:cc:subject:references :organization:from:date:in-reply-to:message-id:user-agent :mime-version:content-type; bh=2oX4HIw7PMEf2Z+H5a4yCLpdVvoFVr1p8PUmtvc0GUw=; b=xEYj8RmyTBMzkSI+jI7MlTcQOhnBWAXIhFDlOwUzf2XGIUsMY9FMVFBWHNLnGBbEap NTodRVr6uuvOkaIYRozC6T1qxRYp0kxMBiIFputEAbZ2U+2dotwizK8Cci0otHTwy91Q e4dnStFZcODMeNI8TeFqFg77dILppI71ErAw4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; 
h=to:cc:subject:references:organization:from:date:in-reply-to :message-id:user-agent:mime-version:content-type; b=atBgxoyZ9VkvH18svZeKRIq++6BvutO6xJ+9wL4dmk97L871SNP0jloG5KlDBOsklH +SuTercGMmseD2RkL7Nb2G1nR7Z2+adBPXy5Mt9BGzZGI2qZc5jctagXxVSeUv80kQeK jeTe0gxb74xIjOYT9RHltHyNfkICajN4Xz86Y= Received: by 10.102.17.15 with SMTP id 15mr2210217muq.133.1264258267836; Sat, 23 Jan 2010 06:51:07 -0800 (PST) Received: from localhost (vpn-193-138-135-114.customer.onet.com.ua [193.138.135.114]) by mx.google.com with ESMTPS id b9sm13212343mug.9.2010.01.23.06.51.06 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sat, 23 Jan 2010 06:51:06 -0800 (PST) To: Rick Macklem References: <86ocl272mb.fsf@kopusha.onet> <86tyuqnz9x.fsf@zhuzha.ua1> <86zl4awmon.fsf@zhuzha.ua1> <86vdeywmha.fsf@zhuzha.ua1> <86vdeuuo2y.fsf@zhuzha.ua1> Organization: TOA Ukraine From: Mikolaj Golub Date: Sat, 23 Jan 2010 16:51:04 +0200 In-Reply-To: (Rick Macklem's message of "Fri\, 22 Jan 2010 17\:13\:09 -0500 \(EST\)") Message-ID: <86pr50vprb.fsf@kopusha.onet> User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org Subject: Re: FreeBSD NFS client/Linux NFS server issue X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 14:51:09 -0000 On Fri, 22 Jan 2010 17:13:09 -0500 (EST) Rick Macklem wrote: > On Fri, 22 Jan 2010, Rick Macklem wrote: > >> >> There should probably be some sort of 3 way handshake between >> the code in nfs_asyncio() after calling nfs_nfsnewiod() and the >> code near the beginning of nfssvc_iod(), but I think the following >> somewhat cheesy fix might do the trick: >> > [stuff deleted] > I know it's a little weird to reply to my own posting, but I think > this might be a reasonable patch (I have only tested it for a few > minutes at 
this point). > > I basically redefined nfs_iodwant[] as a tri-state variable (although > it was a struct proc *, it was only tested NULL/non-NULL). > 0 - was NULL > 1 - was non-NULL > -1 - just created by nfs_asyncio() and will be used by it > > I'll keep testing it, but hopefully someone else can test and/or > review it... rick I applied your patch to FreeBSD8.0 (the box I get on weekend :-), mounted 10 shares, set vfs.nfs.iodmaxidle=10 (to have nfsiod creation more frequently) and have been running tests for 4 hours -- just to check the patch does not break anything. No issues have been detected. It would be very nice to have this patch committed. -- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 16:54:10 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 97E441065670 for ; Sat, 23 Jan 2010 16:54:10 +0000 (UTC) (envelope-from mcdouga9@egr.msu.edu) Received: from mx.egr.msu.edu (surfnturf.egr.msu.edu [35.9.37.164]) by mx1.freebsd.org (Postfix) with ESMTP id 642088FC13 for ; Sat, 23 Jan 2010 16:54:09 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mx.egr.msu.edu (Postfix) with ESMTP id 3A433101114; Sat, 23 Jan 2010 11:54:09 -0500 (EST) X-Virus-Scanned: amavisd-new at egr.msu.edu Received: from mx.egr.msu.edu ([127.0.0.1]) by localhost (surfnturf.egr.msu.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id o8vSi8DY1qeI; Sat, 23 Jan 2010 11:54:09 -0500 (EST) Received: from localhost (daemon.egr.msu.edu [35.9.44.65]) by mx.egr.msu.edu (Postfix) with ESMTP id E4BAA101110; Sat, 23 Jan 2010 11:54:08 -0500 (EST) Received: by localhost (Postfix, from userid 21281) id D451E8895C; Sat, 23 Jan 2010 11:54:08 -0500 (EST) Date: Sat, 23 Jan 2010 11:54:08 -0500 From: Adam McDougall To: Artem Belevich Message-ID: <20100123165408.GR86054@egr.msu.edu> References: <4B58BD2D.30803@rcn.com> <4B58D4D3.80009@egr.msu.edu> 
<20100122042843.GA8858@polands.org> <1308c71eec426200d4c34b926bba8806.squirrel@email.polands.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@freebsd.org Subject: Re: Repeatable ZFS "kmem map too small" panic on 8.0-STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 16:54:10 -0000

On Fri, Jan 22, 2010 at 06:24:36PM -0800, Artem Belevich wrote:

  > % sysctl hw.physmem vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max
  >
  > hw.physmem: 4102688768
  > vm.kmem_size: 2147483648

  Here's your problem -- kmem_size is for some reason only 2G.

  Argh! I ran into that before. The code in sys/kern/kern_malloc.c
  intentionally limits kmem_size to twice the physical memory size:

          /*
           * Limit kmem virtual size to twice the physical memory.
           * This allows for kmem map sparseness, but limits the size
           * to something sane. Be careful to not overflow the 32bit
           * ints while doing the check.
           */
          if (((vm_kmem_size / 2) / PAGE_SIZE) > cnt.v_page_count)
                  vm_kmem_size = 2 * cnt.v_page_count * PAGE_SIZE;

  So, either comment out these lines or just set vm.kmem_size to
  slightly below 8G.

  --Artem

That works for me and a friend of mine; I didn't realize that my
kmem_size was being capped. I tested it on an 8.0-prerel server with 8G
of RAM and kmem_size set to 20G in the loader, but the resulting
kmem_size was less than 4G before the patch. Now it is 20G. So I'm not
sure the cap is calculated properly?
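
One possible explanation for the sub-4G result reported above, offered as a sketch rather than a confirmed diagnosis: if the page count is a 32-bit unsigned value, the product 2 * cnt.v_page_count * PAGE_SIZE is evaluated in 32 bits and wraps before it is assigned to the wider vm_kmem_size. The function names, the 4096-byte page size, and the page count used in the note below are assumptions made for illustration; none of this is the kernel's actual code or the committed fix.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's PAGE_SIZE; 4096 is assumed. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * The clamp as quoted in the thread, assuming a 32-bit page count.
 * The product is truncated to 32 bits, so it can wrap around before
 * it is stored in the 64-bit result.
 */
uint64_t
clamp_kmem_32bit(uint64_t kmem_size, uint32_t page_count)
{
	assert(SKETCH_PAGE_SIZE != 0);
	if (((kmem_size / 2) / SKETCH_PAGE_SIZE) > page_count)
		kmem_size = (uint32_t)(2u * page_count * SKETCH_PAGE_SIZE);
	return (kmem_size);
}

/*
 * The same clamp with the page count widened to 64 bits before the
 * multiplication, so the cap really is twice the physical memory.
 */
uint64_t
clamp_kmem_64bit(uint64_t kmem_size, uint32_t page_count)
{
	assert(SKETCH_PAGE_SIZE != 0);
	if (((kmem_size / 2) / SKETCH_PAGE_SIZE) > page_count)
		kmem_size = 2 * (uint64_t)page_count * SKETCH_PAGE_SIZE;
	return (kmem_size);
}
```

With a hypothetical count of 2,000,000 pages (a little under 8G of RAM) and a 20G request, the 32-bit form returns 3,499,098,112 bytes, which is below 4G, while the widened form returns 16,384,000,000 bytes. Note that even the widened clamp would still cap a 20G request at twice physical memory; only removing the check leaves it at 20G.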
From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 21:22:00 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C9C3D1065679 for ; Sat, 23 Jan 2010 21:22:00 +0000 (UTC) (envelope-from morganw@chemikals.org) Received: from warped.bluecherry.net (unknown [IPv6:2001:440:eeee:fffb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 547D48FC15 for ; Sat, 23 Jan 2010 21:22:00 +0000 (UTC) Received: from volatile.chemikals.org (adsl-67-214-156.shv.bellsouth.net [98.67.214.156]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by warped.bluecherry.net (Postfix) with ESMTPSA id 1E2D3907DD3F; Sat, 23 Jan 2010 15:21:58 -0600 (CST) Received: from localhost (morganw@localhost [127.0.0.1]) by volatile.chemikals.org (8.14.3/8.14.3) with ESMTP id o0NLLt0e001997; Sat, 23 Jan 2010 15:21:56 -0600 (CST) (envelope-from morganw@chemikals.org) Date: Sat, 23 Jan 2010 15:21:55 -0600 (CST) From: Wes Morgan X-X-Sender: morganw@volatile To: Rich In-Reply-To: <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> Message-ID: References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: clamav-milter 0.95.3 at warped X-Virus-Status: Clean Cc: freebsd-fs Subject: Re: Errors on a file on a zpool: How to remove? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 21:22:00 -0000 On Sat, 23 Jan 2010, Rich wrote: > I already diagnosed the bad hardware - one of the two sticks of RAM > had gone bad, and fails memtest in the other machine. 
> > pool: rigatoni > state: ONLINE > status: One or more devices has experienced an error resulting in data > corruption. Applications may be affected. > action: Restore the file in question if possible. Otherwise restore the > entire pool from backup. > see: http://www.sun.com/msg/ZFS-8000-8A > scrub: scrub completed after 15h28m with 1 errors on Thu Jan 21 18:09:25 2010 > config: > > NAME STATE READ WRITE CKSUM > rigatoni ONLINE 0 0 1 > da4 ONLINE 0 0 2 > da5 ONLINE 0 0 2 > da7 ONLINE 0 0 0 > da6 ONLINE 0 0 0 > da2 ONLINE 0 0 2 > > errors: Permanent errors have been detected in the following files: > > rigatoni/mirrors:<0x0> Can you post your entire pool filesystem structure? That message above looks like an unreferenced block or corrupted metadata rather than an actual file. Also, if it's part of a snapshot, you simply have to destroy the snapshot. I had a pool become corrupted due to bad memory, and all of the files were still able to be manipulated. The only time EIO popped up was on the specific block that had a checksum error. 
From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 22:15:13 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 78119106566C for ; Sat, 23 Jan 2010 22:15:13 +0000 (UTC) (envelope-from rincebrain@gmail.com) Received: from mail-fx0-f226.google.com (mail-fx0-f226.google.com [209.85.220.226]) by mx1.freebsd.org (Postfix) with ESMTP id 088188FC19 for ; Sat, 23 Jan 2010 22:15:12 +0000 (UTC) Received: by fxm26 with SMTP id 26so2278019fxm.13 for ; Sat, 23 Jan 2010 14:15:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=kdq88iTthHfVPYnhGJUJqfQtFPVHYV6rh1MYaHVDMmE=; b=mOPj28SrQwN85z+/k6oF61otJsrXNAa33EaYI5ubjjsx4qSHgIO5J2XEKGQeC12KRy u5Mi2SxdGxJU7kWOmQOkl4cl0ElPRHLyRligNQeBfmsEh0jIhRgHJnD3iBGuM3dVk39l yAECo+UTyjmtHeBM1qC3Pd8AbCIvDTeXL4duM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=AUTagrvKu3uXYkrrixjjOKkWkFqmjwtZUihdIG8CAoqLk5N03kIjoaDDy4yT3xxE5r aF3TuKZ8exWXe3VE76PNWP0DdIRr4Bt/D4bvrimQEFowm9zaEdv8s4ulsySaHwrO+TzM xgnEFwQfHDMzXwEd3IEZbMH+WGrUVD8nU8APw= MIME-Version: 1.0 Received: by 10.239.166.7 with SMTP id z7mr486585hbd.23.1264284911799; Sat, 23 Jan 2010 14:15:11 -0800 (PST) In-Reply-To: References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> Date: Sat, 23 Jan 2010 17:15:11 -0500 Message-ID: <5da0588e1001231415t403f29ceq6e8dcd16edb4a28@mail.gmail.com> From: Rich To: Wes Morgan Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs Subject: Re: Errors on a file on a zpool: How to remove? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 22:15:13 -0000

On Sat, Jan 23, 2010 at 4:21 PM, Wes Morgan wrote:
> On Sat, 23 Jan 2010, Rich wrote:
>
>> I already diagnosed the bad hardware - one of the two sticks of RAM
>> had gone bad, and fails memtest in the other machine.
>>
>>   pool: rigatoni
>>  state: ONLINE
>> status: One or more devices has experienced an error resulting in data
>>       corruption.  Applications may be affected.
>> action: Restore the file in question if possible.  Otherwise restore the
>>       entire pool from backup.
>>    see: http://www.sun.com/msg/ZFS-8000-8A
>>  scrub: scrub completed after 15h28m with 1 errors on Thu Jan 21 18:09:25 2010
>> config:
>>
>>       NAME        STATE     READ WRITE CKSUM
>>       rigatoni    ONLINE       0     0     1
>>         da4       ONLINE       0     0     2
>>         da5       ONLINE       0     0     2
>>         da7       ONLINE       0     0     0
>>         da6       ONLINE       0     0     0
>>         da2       ONLINE       0     0     2
>>
>> errors: Permanent errors have been detected in the following files:
>>
>>         rigatoni/mirrors:<0x0>
>
> Can you post your entire pool filesystem structure? That message above
> looks like an unreferenced block or corrupted metadata rather than an
> actual file. Also, if it's part of a snapshot, you simply have to destroy
> the snapshot.
>
> I had a pool become corrupted due to bad memory, and all of the files were
> still able to be manipulated. The only time EIO popped up was on the
> specific block that had a checksum error.
# zfs list -r -t all rigatoni NAME USED AVAIL REFER MOUNTPOINT rigatoni 5.73T 984G 19K /rigatoni rigatoni/logs_bitch 269M 984G 269M /rigatoni/logs_bitch rigatoni/mirrors 5.73T 984G 5.73T /mirrors No snapshots here. :/ EIO only pops up on the files I mentioned above - everything else in those directories, including renaming that directory, is fine. - Rich From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 23:19:10 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 18AFF106566C; Sat, 23 Jan 2010 23:19:10 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id A49648FC1D; Sat, 23 Jan 2010 23:19:09 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEAIQSW0uDaFvJ/2dsb2JhbADWFoQ7BA X-IronPort-AV: E=Sophos;i="4.49,331,1262581200"; d="scan'208";a="62710664" Received: from ganges.cs.uoguelph.ca ([131.104.91.201]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 23 Jan 2010 18:19:08 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by ganges.cs.uoguelph.ca (Postfix) with ESMTP id BAFFCFB8063; Sat, 23 Jan 2010 18:19:08 -0500 (EST) X-Virus-Scanned: amavisd-new at ganges.cs.uoguelph.ca Received: from ganges.cs.uoguelph.ca ([127.0.0.1]) by localhost (ganges.cs.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id DuIa2GmUAKfw; Sat, 23 Jan 2010 18:19:08 -0500 (EST) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by ganges.cs.uoguelph.ca (Postfix) with ESMTP id ECAAAFB8059; Sat, 23 Jan 2010 18:19:07 -0500 (EST) Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id o0NNTjp24249; Sat, 23 Jan 2010 18:29:45 -0500 (EST) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Sat, 23 
Jan 2010 18:29:45 -0500 (EST) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: Mikolaj Golub In-Reply-To: <86pr50vprb.fsf@kopusha.onet> Message-ID: References: <86ocl272mb.fsf@kopusha.onet> <86tyuqnz9x.fsf@zhuzha.ua1> <86zl4awmon.fsf@zhuzha.ua1> <86vdeywmha.fsf@zhuzha.ua1> <86vdeuuo2y.fsf@zhuzha.ua1> <86pr50vprb.fsf@kopusha.onet> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org Subject: Re: FreeBSD NFS client/Linux NFS server issue X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 23:19:10 -0000 On Sat, 23 Jan 2010, Mikolaj Golub wrote: > > I applied your patch to FreeBSD8.0 (the box I get on weekend :-), mounted 10 > shares, set vfs.nfs.iodmaxidle=10 (to have nfsiod creation more frequently) > and have been running tests for 4 hours -- just to check the patch does not > break anything. No issues have been detected. > > It would be very nice to have this patch committed. > Thanks for doing the testing and good work w.r.t. figuring out the race. (I'll admit I didn't really have a clue what was causing your problem.) 
rick From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 23:34:46 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7ACB81065679 for ; Sat, 23 Jan 2010 23:34:46 +0000 (UTC) (envelope-from morganw@chemikals.org) Received: from warped.bluecherry.net (unknown [IPv6:2001:440:eeee:fffb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 0147D8FC1C for ; Sat, 23 Jan 2010 23:34:46 +0000 (UTC) Received: from volatile.chemikals.org (adsl-67-214-156.shv.bellsouth.net [98.67.214.156]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by warped.bluecherry.net (Postfix) with ESMTPSA id E856590A7B00; Sat, 23 Jan 2010 17:34:44 -0600 (CST) Received: from localhost (morganw@localhost [127.0.0.1]) by volatile.chemikals.org (8.14.3/8.14.3) with ESMTP id o0NNYfeM004748; Sat, 23 Jan 2010 17:34:41 -0600 (CST) (envelope-from morganw@chemikals.org) Date: Sat, 23 Jan 2010 17:34:41 -0600 (CST) From: Wes Morgan X-X-Sender: morganw@volatile To: Rich In-Reply-To: <5da0588e1001231415t403f29ceq6e8dcd16edb4a28@mail.gmail.com> Message-ID: References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> <5da0588e1001231415t403f29ceq6e8dcd16edb4a28@mail.gmail.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="3224958491-771498596-1264289681=:2160" X-Virus-Scanned: clamav-milter 0.95.3 at warped X-Virus-Status: Clean Cc: freebsd-fs Subject: Re: Errors on a file on a zpool: How to remove? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 23:34:46 -0000 This message is in MIME format. 
The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools.

--3224958491-771498596-1264289681=:2160
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT

On Sat, 23 Jan 2010, Rich wrote:

> On Sat, Jan 23, 2010 at 4:21 PM, Wes Morgan wrote:
> > On Sat, 23 Jan 2010, Rich wrote:
> >
> >> I already diagnosed the bad hardware - one of the two sticks of RAM
> >> had gone bad, and fails memtest in the other machine.
> >>
> >>   pool: rigatoni
> >>  state: ONLINE
> >> status: One or more devices has experienced an error resulting in data
> >>       corruption.  Applications may be affected.
> >> action: Restore the file in question if possible.  Otherwise restore the
> >>       entire pool from backup.
> >>    see: http://www.sun.com/msg/ZFS-8000-8A
> >>  scrub: scrub completed after 15h28m with 1 errors on Thu Jan 21 18:09:25 2010
> >> config:
> >>
> >>       NAME        STATE     READ WRITE CKSUM
> >>       rigatoni    ONLINE       0     0     1
> >>         da4       ONLINE       0     0     2
> >>         da5       ONLINE       0     0     2
> >>         da7       ONLINE       0     0     0
> >>         da6       ONLINE       0     0     0
> >>         da2       ONLINE       0     0     2
> >>
> >> errors: Permanent errors have been detected in the following files:
> >>
> >>         rigatoni/mirrors:<0x0>
> >
> > Can you post your entire pool filesystem structure? That message above
> > looks like an unreferenced block or corrupted metadata rather than an
> > actual file. Also, if it's part of a snapshot, you simply have to destroy
> > the snapshot.
> >
> > I had a pool become corrupted due to bad memory, and all of the files were
> > still able to be manipulated. The only time EIO popped up was on the
> > specific block that had a checksum error.
> > # zfs list -r -t all rigatoni > NAME USED AVAIL REFER MOUNTPOINT > rigatoni 5.73T 984G 19K /rigatoni > rigatoni/logs_bitch 269M 984G 269M /rigatoni/logs_bitch > rigatoni/mirrors 5.73T 984G 5.73T /mirrors > > No snapshots here. :/ > > EIO only pops up on the files I mentioned above - everything else in > those directories, including renaming that directory, is fine. I must have missed it, what files is it showing besides the <0x0> address? Or do you have a file named "<0x0>"? --3224958491-771498596-1264289681=:2160-- From owner-freebsd-fs@FreeBSD.ORG Sat Jan 23 23:41:03 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 125E5106566C for ; Sat, 23 Jan 2010 23:41:03 +0000 (UTC) (envelope-from rincebrain@gmail.com) Received: from mail-pz0-f202.google.com (mail-pz0-f202.google.com [209.85.222.202]) by mx1.freebsd.org (Postfix) with ESMTP id D7A2A8FC12 for ; Sat, 23 Jan 2010 23:41:02 +0000 (UTC) Received: by pzk40 with SMTP id 40so57041pzk.7 for ; Sat, 23 Jan 2010 15:41:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=0igOfbDDwl/3v+4Bo1waq7lirkXLiDcZJ6mCl2+8kE4=; b=EcdAhQVL5IPE3y8ytcXY6mDQtFOfTGxB0udqRCPQ4uPeBKP51F4vZjitCNuwq5xzM1 b5xXaUiQf/RlJ8pSDnvGSS3PK7Tu4ixSx7V85e5bwiWHmqsdl7wzTsKlD0ZTdRB9v4Xq /VQWcUr9JYhvofz9wN7DCk5BntC/e6avEErIw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=J84KjEkAIW0EISUW77OLFZ+mWO0WzB9Xj58MLOLgtwVgdpr0NA5N/Lwe8Dxud1S3U/ J4CLe9ge6bAehEQJDbJgjmqOJsQT0jSgHrEAmrI7edxXpxBv7L/dZ47jGeO+och5m228 XjYXzFvZ2RSziT7cvITcsKLjUX3CsFrc5FKDk= MIME-Version: 1.0 Received: by 10.115.87.4 with SMTP id 
p4mr3239701wal.202.1264290062143; Sat, 23 Jan 2010 15:41:02 -0800 (PST) In-Reply-To: References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com> <5da0588e1001230014k1b8a32f8v42046497265429ed@mail.gmail.com> <5da0588e1001231415t403f29ceq6e8dcd16edb4a28@mail.gmail.com> Date: Sat, 23 Jan 2010 18:41:02 -0500 Message-ID: <5da0588e1001231541l246769eao410c5ea6ccca0de4@mail.gmail.com> From: Rich To: Wes Morgan Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs Subject: Re: Errors on a file on a zpool: How to remove? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jan 2010 23:41:03 -0000

I have no files named 0x0.

I have a number of files for which any operation (stat, mv, rm) returns
EIO; each attempt makes the checksum error count on three of the disks
in that pool tick up, and /var/log/messages reports what I reported in
my initial post.

(I discovered this because FreeBSD's daily
check-for-setuid-bits-in-strange-places find command reported EIO on
some files.)

My original post in this thread is about how to resolve this.

- Rich

On Sat, Jan 23, 2010 at 6:34 PM, Wes Morgan wrote:
> On Sat, 23 Jan 2010, Rich wrote:
>
>> On Sat, Jan 23, 2010 at 4:21 PM, Wes Morgan wrote:
>> > On Sat, 23 Jan 2010, Rich wrote:
>> >
>> >> I already diagnosed the bad hardware - one of the two sticks of RAM
>> >> had gone bad, and fails memtest in the other machine.
>> >>
>> >>   pool: rigatoni
>> >>  state: ONLINE
>> >> status: One or more devices has experienced an error resulting in data
>> >>       corruption.  Applications may be affected.
>> >> action: Restore the file in question if possible.  Otherwise restore the
>> >>       entire pool from backup.
>> >>    see: http://www.sun.com/msg/ZFS-8000-8A
>> >>  scrub: scrub completed after 15h28m with 1 errors on Thu Jan 21 18:09:25 2010
>> >> config:
>> >>
>> >>       NAME        STATE     READ WRITE CKSUM
>> >>       rigatoni    ONLINE       0     0     1
>> >>         da4       ONLINE       0     0     2
>> >>         da5       ONLINE       0     0     2
>> >>         da7       ONLINE       0     0     0
>> >>         da6       ONLINE       0     0     0
>> >>         da2       ONLINE       0     0     2
>> >>
>> >> errors: Permanent errors have been detected in the following files:
>> >>
>> >>         rigatoni/mirrors:<0x0>
>> >
>> > Can you post your entire pool filesystem structure? That message above
>> > looks like an unreferenced block or corrupted metadata rather than an
>> > actual file. Also, if it's part of a snapshot, you simply have to destroy
>> > the snapshot.
>> >
>> > I had a pool become corrupted due to bad memory, and all of the files were
>> > still able to be manipulated. The only time EIO popped up was on the
>> > specific block that had a checksum error.
>>
>> # zfs list -r -t all rigatoni
>> NAME                  USED  AVAIL  REFER  MOUNTPOINT
>> rigatoni             5.73T   984G    19K  /rigatoni
>> rigatoni/logs_bitch   269M   984G   269M  /rigatoni/logs_bitch
>> rigatoni/mirrors     5.73T   984G  5.73T  /mirrors
>>
>> No snapshots here. :/
>>
>> EIO only pops up on the files I mentioned above - everything else in
>> those directories, including renaming that directory, is fine.
>
> I must have missed it, what files is it showing besides the <0x0> address?
> Or do you have a file named "<0x0>"?

-- 
Life is a yo-yo, and mankind ties knots in the string.