From owner-freebsd-fs@FreeBSD.ORG Sun Oct 9 19:30:18 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 61FF4106566B for ; Sun, 9 Oct 2011 19:30:18 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 382E78FC0A for ; Sun, 9 Oct 2011 19:30:18 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p99JUInL044449 for ; Sun, 9 Oct 2011 19:30:18 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p99JUHKQ044439; Sun, 9 Oct 2011 19:30:18 GMT (envelope-from gnats) Date: Sun, 9 Oct 2011 19:30:18 GMT Message-Id: <201110091930.p99JUHKQ044439@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Garrett Cooper Cc: Subject: Re: kern/134491: [zfs] Hot spares are rather cold... X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Garrett Cooper List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 09 Oct 2011 19:30:18 -0000 The following reply was made to PR kern/134491; it has been noted by GNATS. From: Garrett Cooper To: bug-followup@FreeBSD.org, michel.bouissou@bioclinica.com Cc: Subject: Re: kern/134491: [zfs] Hot spares are rather cold... Date: Sun, 9 Oct 2011 12:20:09 -0700 delphij and I have verified that this issue is resolved on the zfsd svn branch, but that branch hasn't been merged back to CURRENT; it contains a number of changes to geom and a handful of changes to zfs. I'll leave it to the reader to determine where between geom and zfs things are getting hung up. 
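For anyone who wants to check the spare behavior themselves, a rough sketch using md(4)-backed vdevs (device names, sizes, and the pool name are examples only, not from the PR):

```sh
# mdconfig -a -t swap -s 128m    (run four times to get md0..md3)
# zpool create test mirror md0 md1 spare md2
# zpool offline test md1         (or pull a real disk)
# zpool status test
```

Without zfsd running, the spare stays AVAIL instead of being attached automatically, which is the "rather cold" behavior this PR is about; on the zfsd branch it should kick in on its own.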
Thanks, -Garrett From owner-freebsd-fs@FreeBSD.ORG Sun Oct 9 19:30:24 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CF09F10656E5 for ; Sun, 9 Oct 2011 19:30:24 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id A48768FC1A for ; Sun, 9 Oct 2011 19:30:24 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p99JUOqh045011 for ; Sun, 9 Oct 2011 19:30:24 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p99JUO6e044994; Sun, 9 Oct 2011 19:30:24 GMT (envelope-from gnats) Date: Sun, 9 Oct 2011 19:30:24 GMT Message-Id: <201110091930.p99JUO6e044994@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Garrett Cooper Cc: Subject: Re: kern/154447: [zfs] [panic] Occasional panics - solaris assert somewhere in ZFS code X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Garrett Cooper List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 09 Oct 2011 19:30:24 -0000 The following reply was made to PR kern/154447; it has been noted by GNATS. From: Garrett Cooper To: bug-followup@FreeBSD.org, andriys@gmail.com Cc: Subject: Re: kern/154447: [zfs] [panic] Occasional panics - solaris assert somewhere in ZFS code Date: Sun, 9 Oct 2011 12:24:09 -0700 Hi, Could you please upgrade to 9.x and try to reproduce the issue? It may have been resolved by ZFS v28. If the issue persists, please drop me a line so I can set up a debug session with you and we can isolate the issue further. 
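If it does reproduce on 9.x, a crash dump will make that debug session much quicker. A minimal setup sketch (stock paths; savecore(8) runs automatically at boot when a dump is present):

```sh
# In /etc/rc.conf:
dumpdev="AUTO"

# After the next panic and reboot, savecore(8) writes the dump into
# /var/crash; it can then be inspected with:
# kgdb /boot/kernel/kernel /var/crash/vmcore.0
```

A kernel built with debug symbols makes the backtrace from kgdb far more useful for pinning down which solaris assert fires.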
Thanks, -Garrett From owner-freebsd-fs@FreeBSD.ORG Sun Oct 9 19:40:14 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 416041065675 for ; Sun, 9 Oct 2011 19:40:14 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 30E618FC14 for ; Sun, 9 Oct 2011 19:40:14 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p99JeEBK094657 for ; Sun, 9 Oct 2011 19:40:14 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p99JeDjf094646; Sun, 9 Oct 2011 19:40:13 GMT (envelope-from gnats) Date: Sun, 9 Oct 2011 19:40:13 GMT Message-Id: <201110091940.p99JeDjf094646@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Garrett Cooper Cc: Subject: Re: kern/147790: [zfs] zfs set acl(mode|inherit) fails on existing zfs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Garrett Cooper List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 09 Oct 2011 19:40:14 -0000 The following reply was made to PR kern/147790; it has been noted by GNATS. 
From: Garrett Cooper To: bug-followup@FreeBSD.org, rs@bytecamp.net Cc: Subject: Re: kern/147790: [zfs] zfs set acl(mode|inherit) fails on existing zfs Date: Sun, 9 Oct 2011 12:30:43 -0700 This works with ZFS v28:

# zfs set aclmode=groupmask tank
# zfs get aclmode tank
# zfs create tank/foo
# zfs get aclmode tank/foo
NAME      PROPERTY  VALUE      SOURCE
tank/foo  aclmode   groupmask  inherited from tank
# zfs set aclmode=passthrough tank/foo
# zfs get aclmode tank/foo
NAME      PROPERTY  VALUE        SOURCE
tank/foo  aclmode   passthrough  local

Please try upgrading to 9.0; you may need to upgrade your zpool via 'zpool upgrade -a' (highly unlikely though). HTH, -Garrett From owner-freebsd-fs@FreeBSD.ORG Sun Oct 9 19:40:19 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B97A31065672 for ; Sun, 9 Oct 2011 19:40:19 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 8FE268FC15 for ; Sun, 9 Oct 2011 19:40:19 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p99JeJDc095039 for ; Sun, 9 Oct 2011 19:40:19 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p99JeJIc095036; Sun, 9 Oct 2011 19:40:19 GMT (envelope-from gnats) Date: Sun, 9 Oct 2011 19:40:19 GMT Message-Id: <201110091940.p99JeJIc095036@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Garrett Cooper Cc: Subject: Re: kern/146528: [zfs] Severe memory leak in ZFS on i386 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Garrett Cooper List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 09 Oct 2011 19:40:19 -0000 The following reply was made 
to PR kern/146528; it has been noted by GNATS. From: Garrett Cooper To: bug-followup@FreeBSD.org, EdwinGuy@GMail.com Cc: Subject: Re: kern/146528: [zfs] Severe memory leak in ZFS on i386 Date: Sun, 9 Oct 2011 12:34:00 -0700 Could you please try upgrading to 8.2-STABLE or 9.0 and see if the issue persists with ZFS v28? -Garrett From owner-freebsd-fs@FreeBSD.ORG Sun Oct 9 22:22:05 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 35203106566B; Sun, 9 Oct 2011 22:22:05 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id BBC138FC08; Sun, 9 Oct 2011 22:22:04 +0000 (UTC) Received: by qadz30 with SMTP id z30so4765105qad.13 for ; Sun, 09 Oct 2011 15:22:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=+DqWLv4NTv+DBOQxUsTb6aYdckU2dy2WqPqsKCOBYT8=; b=bXVQM0Jd/IN8hzq3fW0g4DW51selrZ87DX0GYjFXmX5OKqymwkBzf+4gaKmO9HZkdz z9Hn06vdk8eKSNMHULG95d7zN8niI6O1RWbk5J96hNWuh3vHQezXQ1z7OoLgexC/MgF/ IF3CpkJbI94AIjkqcT8G+r2XD2NAsHbed1Bvc= MIME-Version: 1.0 Received: by 10.224.173.69 with SMTP id o5mr8661216qaz.7.1318198923899; Sun, 09 Oct 2011 15:22:03 -0700 (PDT) Received: by 10.224.74.82 with HTTP; Sun, 9 Oct 2011 15:22:03 -0700 (PDT) In-Reply-To: <201110012137.p91Lb6FI093841@chez.mckusick.com> References: <201110012137.p91Lb6FI093841@chez.mckusick.com> Date: Sun, 9 Oct 2011 15:22:03 -0700 Message-ID: From: Garrett Cooper To: Kirk McKusick Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: Attilio Rao , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 09 Oct 2011 22:22:05 -0000 On Sat, Oct 1, 2011 at 2:37 PM, Kirk McKusick wrote:
>> Date: Sat, 1 Oct 2011 12:44:04 -0700
>> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
>> From: Garrett Cooper
>> To: Attilio Rao
>> Cc: Kostik Belousov ,
>>         Kirk McKusick , freebsd-fs@freebsd.org,
>>         Xin LI
>>
>> Ok. Now that I know this is the direction you guys want to go, I'll
>> start testing the change.
>
> Thanks for throwing some testing at this. Please test my latest
> proposed change (included below so you do not have to dig through
> earlier email) as I believe that it has the least likelihood of
> problems and is what I am currently proposing to put in.

I apologize for not getting this done sooner. It passes a smoke test
with the following filesystems:

nfs
nullfs
smbfs
unionfs
ufs
zfs

I'll be running more extensive stress tests soon, but it looks like a
good step forward. Thanks! 
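Per filesystem, the smoke test amounts to something like the following (mount point and device names here are placeholders; the network filesystems were tested against their respective servers):

```sh
# dd if=/dev/zero of=/mnt/ufs/dirty bs=64k count=16
# rm /mnt/ufs/dirty
# umount /mnt/ufs        (must succeed without running sync(8) by hand first)
# fsck -n /dev/ada0p2    (the unmounted filesystem should come back clean)
```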
-Garrett From owner-freebsd-fs@FreeBSD.ORG Sun Oct 9 22:37:23 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8D9E1106566B; Sun, 9 Oct 2011 22:37:23 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id 18FA48FC0A; Sun, 9 Oct 2011 22:37:22 +0000 (UTC) Received: by qadz30 with SMTP id z30so4768289qad.13 for ; Sun, 09 Oct 2011 15:37:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=9Bm7B/IAsFjy+OHzAjyFWQ/OR82Yhkb2CZY9WPy/8m0=; b=pk6LlpzNCEaizxdfJhoj4XjqZS9lsQfdsVlC2PlSrqbSxtQrIM6A4aX/IgGC4LoK3C CTEGC0Vg7DcuOTbZsAvqjnxfudSbBacG29ghVtq65M7q3punvpYVrTrum1UzFWKDxhrx iUuyunHwpOneA1sB30OkSRyQKkSEYz1GGCuRM= MIME-Version: 1.0 Received: by 10.224.176.143 with SMTP id be15mr853040qab.33.1318199842346; Sun, 09 Oct 2011 15:37:22 -0700 (PDT) Received: by 10.224.74.82 with HTTP; Sun, 9 Oct 2011 15:37:22 -0700 (PDT) In-Reply-To: References: <201110012137.p91Lb6FI093841@chez.mckusick.com> Date: Sun, 9 Oct 2011 15:37:22 -0700 Message-ID: From: Garrett Cooper To: Kirk McKusick Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: Attilio Rao , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 09 Oct 2011 22:37:23 -0000 On Sun, Oct 9, 2011 at 3:22 PM, Garrett Cooper wrote:
> On Sat, Oct 1, 2011 at 2:37 PM, Kirk McKusick wrote:
>>> Date: Sat, 1 Oct 2011 12:44:04 -0700
>>> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
>>> From: Garrett Cooper
>>> To: Attilio Rao
>>> Cc: Kostik Belousov ,
>>>         Kirk McKusick , freebsd-fs@freebsd.org,
>>>         Xin LI
>>>
>>> Ok. Now that I know this is the direction you guys want to go, I'll
>>> start testing the change.
>>
>> Thanks for throwing some testing at this. Please test my latest
>> proposed change (included below so you do not have to dig through
>> earlier email) as I believe that it has the least likelihood of
>> problems and is what I am currently proposing to put in.
>
> I apologize for not getting this done sooner. It passes a smoke test
> with the following filesystems:
>
> nfs
> nullfs
> smbfs
> unionfs
> ufs
> zfs
>
> I'll be running more extensive stress tests soon, but it looks like a
> good step forward.

Forgot to note: my FreeNAS nanobsd builds no longer fail with the
attached patch after I remove my sync hacks :). 
-Garrett From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 04:54:03 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C971A1065672; Mon, 10 Oct 2011 04:54:03 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id A1A748FC13; Mon, 10 Oct 2011 04:54:03 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9A4s3rT053129; Mon, 10 Oct 2011 04:54:03 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9A4s3sp053125; Mon, 10 Oct 2011 04:54:03 GMT (envelope-from linimon) Date: Mon, 10 Oct 2011 04:54:03 GMT Message-Id: <201110100454.p9A4s3sp053125@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/161112: [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 04:54:03 -0000 Old Synopsis: filesystem LOR in FreeBSD 9.0-BETA3 New Synopsis: [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3 Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Oct 10 04:53:48 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=161112 From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 04:57:41 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 20BC6106567C; Mon, 10 Oct 2011 04:57:41 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id ECF608FC14; Mon, 10 Oct 2011 04:57:40 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9A4veTl053427; Mon, 10 Oct 2011 04:57:40 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9A4veCB053423; Mon, 10 Oct 2011 04:57:40 GMT (envelope-from linimon) Date: Mon, 10 Oct 2011 04:57:40 GMT Message-Id: <201110100457.p9A4veCB053423@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/161169: [zfs] [panic] ZFS causes kernel panic in dbuf_dirty X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 04:57:41 -0000 Old Synopsis: ZFS causes kernel panic in dbuf_dirty New Synopsis: [zfs] [panic] ZFS causes kernel panic in dbuf_dirty Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Oct 10 04:57:28 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=161169 From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 05:05:09 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 570FF1065678; Mon, 10 Oct 2011 05:05:09 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 2F3898FC15; Mon, 10 Oct 2011 05:05:09 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9A558t2067448; Mon, 10 Oct 2011 05:05:08 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9A558nS067444; Mon, 10 Oct 2011 05:05:08 GMT (envelope-from linimon) Date: Mon, 10 Oct 2011 05:05:08 GMT Message-Id: <201110100505.p9A558nS067444@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/161280: [zfs] Stack overflow in gptzfsboot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 05:05:09 -0000 Old Synopsis: Stack overflow in gptzfsboot New Synopsis: [zfs] Stack overflow in gptzfsboot Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Oct 10 05:04:39 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=161280 From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 05:10:22 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C14621065670; Mon, 10 Oct 2011 05:10:22 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 990288FC16; Mon, 10 Oct 2011 05:10:22 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9A5AMFB068350; Mon, 10 Oct 2011 05:10:22 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9A5AMxp068340; Mon, 10 Oct 2011 05:10:22 GMT (envelope-from linimon) Date: Mon, 10 Oct 2011 05:10:22 GMT Message-Id: <201110100510.p9A5AMxp068340@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/161424: [nullfs] __getcwd() calls fail when used on nullfs mount X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 05:10:22 -0000 Old Synopsis: __getcwd() calls fail when used on nullfs mount New Synopsis: [nullfs] __getcwd() calls fail when used on nullfs mount Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Oct 10 05:10:06 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=161424 From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 05:11:46 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B960D1065672; Mon, 10 Oct 2011 05:11:46 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 90E228FC08; Mon, 10 Oct 2011 05:11:46 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9A5BkWh076295; Mon, 10 Oct 2011 05:11:46 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9A5BkAT076291; Mon, 10 Oct 2011 05:11:46 GMT (envelope-from linimon) Date: Mon, 10 Oct 2011 05:11:46 GMT Message-Id: <201110100511.p9A5BkAT076291@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/161438: [zfs] [panic] recursed on non-recursive spa_namespace_lock when creating zpool from zvol X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 05:11:46 -0000 Synopsis: [zfs] [panic] recursed on non-recursive spa_namespace_lock when creating zpool from zvol Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Oct 10 05:11:35 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=161438 From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 07:40:11 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A388B106566C for ; Mon, 10 Oct 2011 07:40:11 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 7A7CC8FC0C for ; Mon, 10 Oct 2011 07:40:11 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9A7eBVL034000 for ; Mon, 10 Oct 2011 07:40:11 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9A7eBuh033999; Mon, 10 Oct 2011 07:40:11 GMT (envelope-from gnats) Date: Mon, 10 Oct 2011 07:40:11 GMT Message-Id: <201110100740.p9A7eBuh033999@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Andriy Gapon Cc: Subject: Re: kern/161280: [zfs] Stack overflow in gptzfsboot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Andriy Gapon List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 07:40:11 -0000 The following reply was made to PR kern/161280; it has been noted by GNATS. From: Andriy Gapon To: bug-followup@FreeBSD.org, dorionpatrick@gmail.com Cc: Subject: Re: kern/161280: [zfs] Stack overflow in gptzfsboot Date: Mon, 10 Oct 2011 10:38:10 +0300 I believe that the problem happens in (zfs)loader, not (gpt)(zfs)boot code. You shot yourself in the foot by adding /boot/loader.conf to loader_conf_files, which caused recursion when parsing loader.conf. 
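In other words, a /boot/loader.conf along these lines triggers the recursion: the default loader_conf_files list in /boot/defaults/loader.conf already names /boot/loader.conf, so re-adding it makes the loader re-read the same file on every pass until the small boot-time stack overflows. An illustrative (broken) config, not taken from the PR:

```sh
# /boot/loader.conf -- BROKEN: lists itself, so the loader re-parses
# this file over and over until the boot-time stack overflows.
loader_conf_files="/boot/device.hints /boot/loader.conf"

# If you need to chain an extra file, use loader.conf.local instead;
# it is already on the default list:
# loader_conf_files="/boot/device.hints /boot/loader.conf.local"
```

See loader.conf(5) for the exact semantics of loader_conf_files.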
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 08:00:34 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F116D106564A for ; Mon, 10 Oct 2011 08:00:34 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id C72AB8FC1C for ; Mon, 10 Oct 2011 08:00:34 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9A80Y3r052235 for ; Mon, 10 Oct 2011 08:00:34 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9A80Y9S052229; Mon, 10 Oct 2011 08:00:34 GMT (envelope-from gnats) Date: Mon, 10 Oct 2011 08:00:34 GMT Message-Id: <201110100800.p9A80Y9S052229@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Martin Matuska Cc: Subject: Re: kern/147790: [zfs] zfs set acl(mode|inherit) fails on existing zfs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Martin Matuska List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 08:00:35 -0000 The following reply was made to PR kern/147790; it has been noted by GNATS. From: Martin Matuska To: bug-followup@FreeBSD.org, rs@bytecamp.net Cc: Subject: Re: kern/147790: [zfs] zfs set acl(mode|inherit) fails on existing zfs Date: Mon, 10 Oct 2011 09:55:35 +0200 If you need this property, you have to upgrade to one of the following: 8-STABLE revision 224564 (Aug 01, 2011) or newer 9-STABLE HEAD revision 224174 (Jul 18, 2011) or newer FreeBSD 9.0 and 8.3 will have ZFS v28 and support this property. 
-- Martin Matuska FreeBSD committer http://blog.vx.sk From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 08:40:08 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A2163106566B for ; Mon, 10 Oct 2011 08:40:08 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 743F48FC1C for ; Mon, 10 Oct 2011 08:40:08 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9A8e8Wa093371 for ; Mon, 10 Oct 2011 08:40:08 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9A8e8PC093370; Mon, 10 Oct 2011 08:40:08 GMT (envelope-from gnats) Date: Mon, 10 Oct 2011 08:40:08 GMT Message-Id: <201110100840.p9A8e8PC093370@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Robert Schulze Cc: Subject: Re: kern/147790: [zfs] zfs set acl(mode|inherit) fails on existing zfs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Robert Schulze List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 08:40:08 -0000 The following reply was made to PR kern/147790; it has been noted by GNATS. From: Robert Schulze To: Garrett Cooper Cc: bug-followup@FreeBSD.org Subject: Re: kern/147790: [zfs] zfs set acl(mode|inherit) fails on existing zfs Date: Mon, 10 Oct 2011 10:04:33 +0200 Hi, On 09.10.2011 21:30, Garrett Cooper wrote: > This works with ZFS v28: > well, then this can be closed. I wonder why nobody cared for this one in the meantime. 
with kind regards, Robert Schulze From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 09:35:47 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 85C8A1065677; Mon, 10 Oct 2011 09:35:47 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 1E2D98FC1A; Mon, 10 Oct 2011 09:35:46 +0000 (UTC) Received: from alf.home (alf.kiev.zoral.com.ua [10.1.1.177]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p9A9Zh7w034469 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 10 Oct 2011 12:35:43 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from alf.home (kostik@localhost [127.0.0.1]) by alf.home (8.14.5/8.14.5) with ESMTP id p9A9Zh9K018157; Mon, 10 Oct 2011 12:35:43 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by alf.home (8.14.5/8.14.5/Submit) id p9A9ZhdU018156; Mon, 10 Oct 2011 12:35:43 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: alf.home: kostik set sender to kostikbel@gmail.com using -f Date: Mon, 10 Oct 2011 12:35:43 +0300 From: Kostik Belousov To: Garrett Cooper Message-ID: <20111010093543.GV1511@deviant.kiev.zoral.com.ua> References: <201110012137.p91Lb6FI093841@chez.mckusick.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ipBCFTroSgA6x0qa" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: Kirk McKusick , Attilio Rao , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to 
force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 09:35:47 -0000 --ipBCFTroSgA6x0qa Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Oct 09, 2011 at 03:37:22PM -0700, Garrett Cooper wrote:
> On Sun, Oct 9, 2011 at 3:22 PM, Garrett Cooper wrote:
> > On Sat, Oct 1, 2011 at 2:37 PM, Kirk McKusick wrote:
> >>> Date: Sat, 1 Oct 2011 12:44:04 -0700
> >>> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
> >>> From: Garrett Cooper
> >>> To: Attilio Rao
> >>> Cc: Kostik Belousov ,
> >>>         Kirk McKusick , freebsd-fs@freebsd.org,
> >>>         Xin LI
> >>>
> >>> Ok. Now that I know this is the direction you guys want to go, I'll
> >>> start testing the change.
> >>
> >> Thanks for throwing some testing at this. Please test my latest
> >> proposed change (included below so you do not have to dig through
> >> earlier email) as I believe that it has the least likelihood of
> >> problems and is what I am currently proposing to put in.
> >
> > I apologize for not getting this done sooner. It passes a smoke test
> > with the following filesystems:
> >
> > nfs
> > nullfs
> > smbfs
> > unionfs
> > ufs
> > zfs
> >
> > I'll be running more extensive stress tests soon, but it looks like a
> > good step forward.
>
> Forgot to note: my FreeNAS nanobsd builds no longer fail with the
> attached patch after I remove my sync hacks :).

The real case to test is an NFS mount that is wedged due to a
hung/unresponsive NFS server. I strongly suspect that the patch could
introduce an unkillable hung unmount process. 
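A rough way to exercise that case (the host name, export path, and ipfw rule number are placeholders):

```sh
# mount -t nfs -o intr nfsserver:/export /mnt/nfs
# ipfw add 100 deny ip from any to nfsserver    (simulate a dead server)
# umount /mnt/nfs      (watch whether this wedges in an unkillable state)
# umount -f /mnt/nfs   (a forced unmount should still get through)
```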
--ipBCFTroSgA6x0qa Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iEYEARECAAYFAk6SvG8ACgkQC3+MBN1Mb4j7egCg0LlstdvWLwP1Db5TcdLzjhtv Lo4AoI1OFjZffYs1Fmc8RJGy9fz7AeLF =SVoo -----END PGP SIGNATURE----- --ipBCFTroSgA6x0qa-- From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 11:07:07 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0ADA11065702 for ; Mon, 10 Oct 2011 11:07:07 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id ECF2D8FC1B for ; Mon, 10 Oct 2011 11:07:06 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9AB76o9032372 for ; Mon, 10 Oct 2011 11:07:06 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9AB7693032370 for freebsd-fs@FreeBSD.org; Mon, 10 Oct 2011 11:07:06 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 10 Oct 2011 11:07:06 GMT Message-Id: <201110101107.p9AB7693032370@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 11:07:07 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. 
These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/161438 fs [zfs] [panic] recursed on non-recursive spa_namespace_ o kern/161424 fs [nullfs] __getcwd() calls fail when used on nullfs mou o kern/161280 fs [zfs] Stack overflow in gptzfsboot o kern/161169 fs [zfs] [panic] ZFS causes kernel panic in dbuf_dirty o kern/161112 fs [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3 o kern/160893 fs [zfs] [panic] 9.0-BETA2 kernel panic o kern/160860 fs Random UFS root filesystem corruption with SU+J [regre o kern/160801 fs [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o o kern/160790 fs [fusefs] [panic] VPUTX: negative ref count with FUSE o kern/160777 fs [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo o kern/160706 fs [zfs] zfs bootloader fails when a non-root vdev exists o kern/160591 fs [zfs] Fail to boot on zfs root with degraded raidz2 [r o kern/160410 fs [smbfs] [hang] smbfs hangs when transferring large fil o kern/160283 fs [zfs] [patch] 'zfs list' does abort in make_dataset_ha o kern/159971 fs [ffs] [panic] panic with soft updates journaling durin o kern/159930 fs [ufs] [panic] kernel core o kern/159418 fs [tmpfs] [panic] tmpfs kernel panic: recursing on non r o kern/159402 fs [zfs][loader] symlinks cause I/O errors o kern/159357 fs [zfs] ZFS MAXNAMELEN macro has confusing name (off-by- o kern/159356 fs [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s o kern/159351 fs [nfs] [patch] - divide by zero in mountnfs() o kern/159251 fs [zfs] [request]: add FLETCHER4 as DEDUP hash option o kern/159233 fs [ext2fs] [patch] fs/ext2fs: finish reallocblk implemen o kern/159232 fs [ext2fs] [patch] fs/ext2fs: merge ext2_readwrite into o kern/159077 fs [zfs] Can't cd .. 
with latest zfs version o kern/159048 fs [smbfs] smb mount corrupts large files o kern/159045 fs [zfs] [hang] ZFS scrub freezes system o kern/158839 fs [zfs] ZFS Bootloader Fails if there is a Dead Disk o kern/158802 fs [amd] amd(8) ICMP storm and unkillable process. o kern/158711 fs [ffs] [panic] panic in ffs_blkfree and ffs_valloc o kern/158231 fs [nullfs] panic on unmounting nullfs mounted over ufs o f kern/157929 fs [nfs] NFS slow read o kern/157722 fs [geli] unable to newfs a geli encrypted partition o kern/157399 fs [zfs] trouble with: mdconfig force delete && zfs strip o kern/157179 fs [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and o kern/156781 fs [zfs] zfs is losing the snapshot directory, p kern/156545 fs [ufs] mv could break UFS on SMP systems o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes o kern/156168 fs [nfs] [panic] Kernel panic under concurrent access ove o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current o kern/155587 fs [zfs] [panic] kernel panic with zfs o kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors o bin/155104 fs [zfs][patch] use /dev prefix by default when importing o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN o kern/154828 fs [msdosfs] Unable to create directories on external USB o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1 o kern/154447 fs [zfs] [panic] Occasional panics - solaris assert somew p kern/154228 fs [md] md getting stuck in wdrain state o kern/153996 fs [zfs] zfs root mount error while kernel is not located o kern/153847 fs [nfs] [panic] Kernel panic from incorrect m_free in nf o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u o kern/153716 fs [zfs] zpool scrub time remaining is incorrect o 
kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions o kern/153520 fs [zfs] Boot from GPT ZFS root on HP BL460c G1 unstable o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol o kern/153351 fs [zfs] locking directories/files in ZFS o bin/153258 fs [patch][zfs] creating ZVOLs requires `refreservation' s kern/153173 fs [zfs] booting from a gzip-compressed dataset doesn't w o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small p kern/152488 fs [tmpfs] [patch] mtime of file updated when only inode o kern/152022 fs [nfs] nfs service hangs with linux client [regression] o kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory o kern/151905 fs [zfs] page fault under load in /sbin/zfs o kern/151845 fs [smbfs] [patch] smbfs should be upgraded to support Un o bin/151713 fs [patch] Bug in growfs(8) with respect to 32-bit overfl o kern/151648 fs [zfs] disk wait bug o kern/151629 fs [fs] [patch] Skip empty directory entries during name o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate o kern/151251 fs [ufs] Can not create files on filesystem with heavy us o kern/151226 fs [zfs] can't delete zfs snapshot o kern/151111 fs [zfs] vnodes leakage during zfs unmount o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64 o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n o kern/150207 fs zpool(1): zpool import -d /dev tries to open weird dev o kern/149208 fs mksnap_ffs(8) hang/deadlock o kern/149173 fs [patch] [zfs] make OpenSolaris installa o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities o kern/149013 fs [zfs] 
[patch] make ZFS makefiles use the libraries fro o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE o kern/148204 fs [nfs] UDP NFS causes overload o kern/148138 fs [zfs] zfs raidz pool commands freeze o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different " o kern/147790 fs [zfs] zfs set acl(mode|inherit) fails on existing zfs o kern/147560 fs [zfs] [boot] Booting 8.1-PRERELEASE raidz system take o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly o kern/146786 fs [zfs] zpool import hangs with checksum errors o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl o kern/146528 fs [zfs] Severe memory leak in ZFS on i386 o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an o bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0 o kern/145189 fs [nfs] nfsd performs abysmally under load o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi o kern/144416 fs [panic] Kernel panic on online filesystem optimization s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code o kern/143825 fs [nfs] [panic] Kernel panic on NFS client o bin/143572 fs 
[zfs] zpool(1): [patch] The verbose output from iostat o kern/143212 fs [nfs] NFSv4 client strange work ... o kern/143184 fs [zfs] [lor] zfs/bufwait LOR o kern/142878 fs [zfs] [vfs] lock order reversal o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real o kern/142489 fs [zfs] [lor] allproc/zfs LOR o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two o kern/142068 fs [ufs] BSD labels are got deleted spontaneously o kern/141897 fs [msdosfs] [panic] Kernel panic. msdofs: file name leng o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues ( o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2 o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS- o kern/140640 fs [zfs] snapshot crash o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs p bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139597 fs [patch] [tmpfs] tmpfs initializes va_gen but doesn't u o kern/139564 fs [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/138662 fs [panic] ffs_blkfree: freeing free block o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on 
NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/133174 fs [msdosfs] [patch] msdosfs must support multibyte inter o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130210 fs [nullfs] Error by check nullfs f kern/130133 fs [panic] [zfs] 'kmem_map too small' caused by make clea o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs f kern/127375 fs [zfs] If vm.kmem_size_max>"1073741823" then write spee o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero o kern/127029 fs [panic] mount(8): trying to mount a write protected zi f kern/126703 fs [panic] [zfs] _mtx_lock_sleep: recursed on non-recursi o kern/126287 fs [ufs] 
[panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files f sparc/123566 fs [zfs] zpool import issue: EOVERFLOW o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F f kern/120210 fs [zfs] [panic] reboot after panic: solaris assert: arc_ o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118126 fs [nfs] [patch] Poor NFS server write performance o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] 
[request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] [iconv] mount_msdosfs: msdosfs_iconv: Operat o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88555 fs [panic] 
ffs_blkfree: freeing free frag on AMD 64 o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 256 problems total. 
From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 14:33:16 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 009A0106566C for ; Mon, 10 Oct 2011 14:33:16 +0000 (UTC) (envelope-from patpro@patpro.net) Received: from rack.patpro.net (rack.patpro.net [193.30.227.216]) by mx1.freebsd.org (Postfix) with ESMTP id B7EEF8FC13 for ; Mon, 10 Oct 2011 14:33:15 +0000 (UTC) Received: from rack.patpro.net (localhost [127.0.0.1]) by rack.patpro.net (Postfix) with ESMTP id 753581CC038 for ; Mon, 10 Oct 2011 16:33:13 +0200 (CEST) X-Virus-Scanned: amavisd-new at patpro.net Received: from amavis-at-patpro.net ([127.0.0.1]) by rack.patpro.net (rack.patpro.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id c5P3gMPEvCt1 for ; Mon, 10 Oct 2011 16:33:12 +0200 (CEST) Received: from [127.0.0.1] (localhost [127.0.0.1]) by rack.patpro.net (Postfix) with ESMTP for ; Mon, 10 Oct 2011 16:33:12 +0200 (CEST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Apple Message framework v1084) From: Patrick Proniewski In-Reply-To: <20110915120007.F41FF10656E1@hub.freebsd.org> Date: Mon, 10 Oct 2011 16:33:11 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <4B8C8026-1E12-4C32-88E3-9B34A3E58A91@patpro.net> References: <20110915120007.F41FF10656E1@hub.freebsd.org> To: freebsd-fs@freebsd.org X-Mailer: Apple Mail (2.1084) Subject: measuring IO asynchronously X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 14:33:16 -0000 Hello, I would like to monitor the storage on various FreeBSD servers, especially I/Os per second. Is there any way to gather statistics about I/O via asynchronous requests, let's say, for example, using a munin plugin? 
`iostat -w 1` and `zpool iostat tank 1` are interesting, but not usable asynchronously. regards, patpro From owner-freebsd-fs@FreeBSD.ORG Mon Oct 10 18:05:58 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5734E106566C for ; Mon, 10 Oct 2011 18:05:58 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id D8E3B8FC0C for ; Mon, 10 Oct 2011 18:05:57 +0000 (UTC) Received: by eyd10 with SMTP id 10so1738864eyd.13 for ; Mon, 10 Oct 2011 11:05:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=from:to:cc:subject:references:x-comment-to:sender:date:in-reply-to :message-id:user-agent:mime-version:content-type; bh=ss33lIHlxplWFklxAi3+w6VTIGji3mvcx4F85KfQ4lM=; b=XQpTneEdoq9NLGeA9Czb4AjjS5R06phKofu+RTg7rIWW0UujJIBH1YT9M7eUHL7Znf 4ckcbt5QzR7thrI7LLuITWsWV7ImtNA0CheBcMCcnScF9iXP26AYyP2Db5AO4YIx2Gfg 0jZKNxn4WqAob+5jjVyLj7FZKUL0GqLyFFPoU= Received: by 10.14.8.1 with SMTP id 1mr1685367eeq.208.1318269956714; Mon, 10 Oct 2011 11:05:56 -0700 (PDT) Received: from localhost ([95.69.173.122]) by mx.google.com with ESMTPS id a10sm41711075een.6.2011.10.10.11.05.53 (version=TLSv1/SSLv3 cipher=OTHER); Mon, 10 Oct 2011 11:05:53 -0700 (PDT) From: Mikolaj Golub To: Patrick Proniewski References: <20110915120007.F41FF10656E1@hub.freebsd.org> <4B8C8026-1E12-4C32-88E3-9B34A3E58A91@patpro.net> X-Comment-To: Patrick Proniewski Sender: Mikolaj Golub Date: Mon, 10 Oct 2011 21:05:51 +0300 In-Reply-To: <4B8C8026-1E12-4C32-88E3-9B34A3E58A91@patpro.net> (Patrick Proniewski's message of "Mon, 10 Oct 2011 16:33:11 +0200") Message-ID: <86ipnwg1s0.fsf@kopusha.home.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: freebsd-fs@freebsd.org 
Subject: Re: measuring IO asynchronously X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Oct 2011 18:05:58 -0000 On Mon, 10 Oct 2011 16:33:11 +0200 Patrick Proniewski wrote: PP> Hello, PP> I would like to monitor the storage on various FreeBSD servers, PP> especially I/O per seconds. Is there any way to gather statistics PP> about I/O via asynchronous request, lets say, for example, using a PP> munin plugin? `iostat -w 1` and `zpool iostat tank 1` are PP> interesting, but not useable asynchronously. I use for this a simple program that I wrote some time ago. It uses devstat(9) kernel interface and outputs counters, like below kopusha:~% devstat ada0 ada0: 1339552256 bytes read 858508800 bytes written 0 bytes freed 26711 reads 52207 writes 0 frees 32 other duration: 329 15518804455250054608/2^64 sec reads 116 18223026934881081924/2^64 sec writes 0 0/2^64 sec frees 296 15557434685545187896/2^64 sec busy time 5 18134849003398009609/2^64 sec creation time 512 block size tags sent: 78950 simple 0 ordered 0 head of queue supported statistics measurements flags: 0 device type: 32 devstat list insert priority: 272 You can find it in ports (sysutils/devstat). 
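[Editorial note: the counter-style devstat output above lends itself to exactly the asynchronous, munin-style sampling the original poster asked about — record cumulative counters on each run and derive per-second rates from the difference between two snapshots. A minimal sketch of that delta arithmetic follows; the dict field names are hypothetical stand-ins for parsed devstat output, not an actual devstat(3) API.]

```python
# Sketch: derive IOPS from two cumulative devstat-style counter snapshots,
# the way a munin plugin would between two invocations. The "reads"/"writes"
# keys are hypothetical stand-ins for parsed `devstat ada0` output.

def iops(prev, curr, interval):
    """Return (reads/s, writes/s) given two cumulative counter snapshots
    and the elapsed time between them, in seconds."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    return ((curr["reads"] - prev["reads"]) / interval,
            (curr["writes"] - prev["writes"]) / interval)

# Example: counters sampled 60 seconds apart.
prev = {"reads": 26711, "writes": 52207}
curr = {"reads": 27311, "writes": 53407}
print(iops(prev, curr, 60.0))  # -> (10.0, 20.0)
```

[A real plugin would persist the previous snapshot (munin itself handles this for DERIVE/COUNTER data types, so the plugin can simply emit the raw cumulative counters).]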
-- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 02:13:01 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D17E6106566B; Tue, 11 Oct 2011 02:13:01 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id 59E238FC15; Tue, 11 Oct 2011 02:13:01 +0000 (UTC) Received: by qadz30 with SMTP id z30so5814432qad.13 for ; Mon, 10 Oct 2011 19:13:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=JZWmbECoIkXLEDS3bE4fyMD1sG50GLC7KIb+Ebyqka8=; b=NgIzUjTsrK48S9n5JtMMpjA3a3ZE6rgH+3IOPnxngzCyX7So+5+IdBFZscI8yTEmN3 tugQKMAx6jfvw7usrSA5P4+rTrId93B0wlQrRZ0Y+1EwPbj3lvd93kAN1OkwmdT52sax XeY/EsfwtnwxOolYrCSUzf4oTYRB33RC2c8SU= MIME-Version: 1.0 Received: by 10.224.213.2 with SMTP id gu2mr8661079qab.85.1318299180454; Mon, 10 Oct 2011 19:13:00 -0700 (PDT) Received: by 10.224.74.82 with HTTP; Mon, 10 Oct 2011 19:12:59 -0700 (PDT) In-Reply-To: <20111010093543.GV1511@deviant.kiev.zoral.com.ua> References: <201110012137.p91Lb6FI093841@chez.mckusick.com> <20111010093543.GV1511@deviant.kiev.zoral.com.ua> Date: Mon, 10 Oct 2011 19:12:59 -0700 Message-ID: From: Garrett Cooper To: Kostik Belousov Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: Kirk McKusick , Attilio Rao , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 02:13:02 -0000 2011/10/10 Kostik Belousov : > On Sun, Oct 09, 2011 at 03:37:22PM -0700, Garrett Cooper wrote: >> On Sun, Oct 9, 2011 at 3:22 PM, Garrett Cooper wrote: >> > On Sat, Oct 1, 2011 at 2:37 PM, Kirk McKusick wrote: >> >>> Date: Sat, 1 Oct 2011 12:44:04 -0700 >> >>> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? >> >>> From: Garrett Cooper >> >>> To: Attilio Rao >> >>> Cc: Kostik Belousov , >> >>> Kirk McKusick , freebsd-fs@freebsd.org, >> >>> Xin LI >> >>> >> >>> Ok. Now that I know this is the direction you guys want to go, I'll >> >>> start testing the change. >> >> >> >> Thanks for throwing some testing at this. Please test my latest >> >> proposed change (included below so you do not have to dig through >> >> earlier email) as I believe that it has the least likelihood of >> >> problems and is what I am currently proposing to put in. >> > >> > I apologize for not getting this done sooner. It passes a smoke test >> > with the following filesystems: >> > >> > nfs >> > nullfs >> > smbfs >> > unionfs >> > ufs >> > zfs >> > >> > I'll be running more extensive stress tests soon, but it looks like a >> > good step forward. >> >> Forgot to note: my FreeNAS nanobsd builds no longer fail with the >> attached patch after I remove my sync hacks :). > > The real case to test is an NFS mount that is wedged due to > a hung/unresponsive NFS server. I strongly suspect that the patch > could introduce an unkillable hung unmount process. It blocked, but I could ^C it perfectly fine. I tested it via: Setup: 1. Started up FreeNAS 8.x image; it acquired an IP from my server with dhcp-75.local. Test 1: 1. mount -t nfs dhcp-75:/mnt/tank /mnt/nfs/ from my test workstation. 2. Paused VM. 3. 
umount /mnt/nfs (the command blocked). 4. ^C. 5. mount | grep /mnt/nfs showed nothing (it had unmounted). Test 2: 1. mount -t nfs dhcp-75:/mnt/tank /mnt/nfs/ from my test workstation (blocked). 2. Opened up another ssh session and cd'ed to /mnt/nfs. 3. Paused VM. 4. umount /mnt/nfs. It failed with EBUSY. 5. mount | grep /mnt/nfs showed that it was still mounted, as expected. So unless there are buffers still waiting to be written out to an NFS share, or other reasons that would prevent the NFS share from being fully released, I doubt the proposed behavior is really different from previous versions of FreeBSD. Thanks, -Garrett From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 07:21:11 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F408B106568F; Tue, 11 Oct 2011 07:21:10 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id CC1F88FC14; Tue, 11 Oct 2011 07:21:10 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9B7LAk3070177; Tue, 11 Oct 2011 07:21:10 GMT (envelope-from mm@freefall.freebsd.org) Received: (from mm@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9B7LAIV070170; Tue, 11 Oct 2011 07:21:10 GMT (envelope-from mm) Date: Tue, 11 Oct 2011 07:21:10 GMT Message-Id: <201110110721.p9B7LAIV070170@freefall.freebsd.org> To: rs@bytecamp.net, mm@FreeBSD.org, freebsd-fs@FreeBSD.org From: mm@FreeBSD.org Cc: Subject: Re: kern/147790: [zfs] zfs set acl(mode|inherit) fails on existing zfs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 07:21:11 -0000 Synopsis: [zfs] zfs set 
acl(mode|inherit) fails on existing zfs State-Changed-From-To: open->closed State-Changed-By: mm State-Changed-When: Tue Oct 11 07:21:10 UTC 2011 State-Changed-Why: Closed on submitter request. Thanks! http://www.freebsd.org/cgi/query-pr.cgi?pr=147790 From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 07:26:44 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C9DD7106579C; Tue, 11 Oct 2011 07:26:44 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id A227E8FC08; Tue, 11 Oct 2011 07:26:44 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9B7Qi8u072917; Tue, 11 Oct 2011 07:26:44 GMT (envelope-from mm@freefall.freebsd.org) Received: (from mm@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9B7QhEC072913; Tue, 11 Oct 2011 07:26:43 GMT (envelope-from mm) Date: Tue, 11 Oct 2011 07:26:43 GMT Message-Id: <201110110726.p9B7QhEC072913@freefall.freebsd.org> To: sm@kill-9.net, mm@FreeBSD.org, freebsd-fs@FreeBSD.org From: mm@FreeBSD.org Cc: Subject: Re: bin/121366: [zfs] [patch] Automatic disk scrubbing from periodic(8) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 07:26:44 -0000 Synopsis: [zfs] [patch] Automatic disk scrubbing from periodic(8) State-Changed-From-To: open->closed State-Changed-By: mm State-Changed-When: Tue Oct 11 07:26:43 UTC 2011 State-Changed-Why: Implemented in r209195 by netchild@ (/etc/periodic/daily/800.scrub-zfs) http://www.freebsd.org/cgi/query-pr.cgi?pr=121366 From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 07:30:14 2011 Return-Path: Delivered-To: 
freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4573A106564A for ; Tue, 11 Oct 2011 07:30:14 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 1A8AA8FC13 for ; Tue, 11 Oct 2011 07:30:14 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9B7UDU5089496 for ; Tue, 11 Oct 2011 07:30:13 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9B7UDFk089493; Tue, 11 Oct 2011 07:30:13 GMT (envelope-from gnats) Date: Tue, 11 Oct 2011 07:30:13 GMT Message-Id: <201110110730.p9B7UDFk089493@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Martin Matuska Cc: Subject: Re: bin/115361: [zfs] mount(8) gets into a state where it won' t set/unset ZFS properties (atime, exec, setuid) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Martin Matuska List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 07:30:14 -0000 The following reply was made to PR bin/115361; it has been noted by GNATS. From: Martin Matuska To: bug-followup@FreeBSD.org, swhetzel@gmail.com Cc: Subject: Re: bin/115361: [zfs] mount(8) gets into a state where it won't set/unset ZFS properties (atime, exec, setuid) Date: Tue, 11 Oct 2011 09:23:27 +0200 If there are no objections, I would like to close this PR. 
-- Martin Matuska FreeBSD committer http://blog.vx.sk From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 07:30:26 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F0AD91065706; Tue, 11 Oct 2011 07:30:26 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id C862E8FC0A; Tue, 11 Oct 2011 07:30:26 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9B7UQd0090128; Tue, 11 Oct 2011 07:30:26 GMT (envelope-from mm@freefall.freebsd.org) Received: (from mm@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9B7UQ0s090103; Tue, 11 Oct 2011 07:30:26 GMT (envelope-from mm) Date: Tue, 11 Oct 2011 07:30:26 GMT Message-Id: <201110110730.p9B7UQ0s090103@freefall.freebsd.org> To: weldon@excelsus.com, mm@FreeBSD.org, freebsd-fs@FreeBSD.org From: mm@FreeBSD.org Cc: Subject: Re: kern/120210: [zfs] [panic] reboot after panic: solaris assert: arc_buf_remove_ref X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 07:30:27 -0000 Synopsis: [zfs] [panic] reboot after panic: solaris assert: arc_buf_remove_ref State-Changed-From-To: feedback->closed State-Changed-By: mm State-Changed-When: Tue Oct 11 07:30:25 UTC 2011 State-Changed-Why: Closed on feedback timeout. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=120210 From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 07:32:40 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9EB24106566B; Tue, 11 Oct 2011 07:32:40 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 76C948FC0C; Tue, 11 Oct 2011 07:32:40 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9B7WeIH096841; Tue, 11 Oct 2011 07:32:40 GMT (envelope-from mm@freefall.freebsd.org) Received: (from mm@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9B7We5c096822; Tue, 11 Oct 2011 07:32:40 GMT (envelope-from mm) Date: Tue, 11 Oct 2011 07:32:40 GMT Message-Id: <201110110732.p9B7We5c096822@freefall.freebsd.org> To: g.veniamin@googlemail.com, mm@FreeBSD.org, freebsd-fs@FreeBSD.org From: mm@FreeBSD.org Cc: Subject: Re: kern/130133: [panic] [zfs] 'kmem_map too small' caused by make clean X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 07:32:40 -0000 Synopsis: [panic] [zfs] 'kmem_map too small' caused by make clean State-Changed-From-To: feedback->closed State-Changed-By: mm State-Changed-When: Tue Oct 11 07:32:39 UTC 2011 State-Changed-Why: Closed on feedback timeout. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=130133 From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 07:33:34 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 61391106566B; Tue, 11 Oct 2011 07:33:34 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 3922B8FC1A; Tue, 11 Oct 2011 07:33:34 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9B7XYxw098783; Tue, 11 Oct 2011 07:33:34 GMT (envelope-from mm@freefall.freebsd.org) Received: (from mm@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9B7XXnH098754; Tue, 11 Oct 2011 07:33:33 GMT (envelope-from mm) Date: Tue, 11 Oct 2011 07:33:33 GMT Message-Id: <201110110733.p9B7XXnH098754@freefall.freebsd.org> To: kevinxlinuz@163.com, mm@FreeBSD.org, freebsd-fs@FreeBSD.org From: mm@FreeBSD.org Cc: Subject: Re: kern/126703: [panic] [zfs] _mtx_lock_sleep: recursed on non-recursive mutex vnode interlock X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 07:33:34 -0000 Synopsis: [panic] [zfs] _mtx_lock_sleep: recursed on non-recursive mutex vnode interlock State-Changed-From-To: feedback->closed State-Changed-By: mm State-Changed-When: Tue Oct 11 07:33:32 UTC 2011 State-Changed-Why: Closed on feedback timeout. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=126703 From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 07:56:49 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6970E1065676; Tue, 11 Oct 2011 07:56:49 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [70.36.157.235]) by mx1.freebsd.org (Postfix) with ESMTP id 2C3D88FC16; Tue, 11 Oct 2011 07:56:49 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p9B7ul0g051037; Tue, 11 Oct 2011 00:56:47 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201110110756.p9B7ul0g051037@chez.mckusick.com> To: Garrett Cooper In-reply-to: Date: Tue, 11 Oct 2011 00:56:47 -0700 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: Attilio Rao , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 07:56:49 -0000 > Date: Mon, 10 Oct 2011 19:12:59 -0700 > From: Garrett Cooper > To: Kostik Belousov > Cc: Kirk McKusick , Attilio Rao , > Xin LI , freebsd-fs@freebsd.org > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? > > 2011/10/10 Kostik Belousov : > > > The real case to test is the NFS mount which is wedged due to > > hung/unresponsive NFS server. I have high suspect that the patch > > could introduce the unkillable hung unmount process. > > It blocked, but I could ^C it perfectly fine. I tested it via: > > Setup: > 1. 
Started up FreeNAS 8.x image; it acquired an IP from my server with > dhcp-75.local. > > Test 1: > 1. mount -t nfs dhcp-75:/mnt/tank /mnt/nfs/ from my test workstation. > 2. Paused VM. > 3. umount /mnt/nfs (the command blocked). > 4. ^C. > 5. mount | grep /mnt/nfs showed nothing (it had unmounted). > > Test 2: > 1. mount -t nfs dhcp-75:/mnt/tank /mnt/nfs/ from my test workstation (blocked). > 2. Opened up another ssh session and cd'ed to /mnt/nfs . > 3. Paused VM. > 4. umount /mnt/nfs . It failed with EBUSY. > 5. mount | grep /mnt/nfs showed that it was still mounted, as expected. > > So unless there are buffers still waiting to be written out to an > NFS share, or other reasons that would prevent the NFS share from > being fully released, I doubt the proposed behavior is really > different from previous versions of FreeBSD. > Thanks, > -Garrett Given the testing that has been done and our discussion about deadlocks, I believe that I should proceed to check in my originally proposed change. Notably the one that simply deleted the != MNT_FORCE conditional. However, there is no harm in using my revised version that releases the covered vnode before draining vfs_busy, and there might be some future case where that would be a necessary thing to do. Speak up if you think I should not proceed to check in this change. Also, let me know if you have thoughts on which version I should use. 
Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 08:02:03 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8253C106566B; Tue, 11 Oct 2011 08:02:03 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 2AA0A8FC17; Tue, 11 Oct 2011 08:02:03 +0000 (UTC) Received: by iaby12 with SMTP id y12so4472753iab.13 for ; Tue, 11 Oct 2011 01:02:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version:content-type; bh=xanLbDUk5oDTp5eE7bfNB65bUgZlNZcu5qxjDga1zYM=; b=eoKS3XH6XTHtHm5vCFD8qt/sIG53aUECITydmYQz2zp/zC88cOyJmNkaSqDYPUiY9X VbA8Tw6A3vGwTgAmRlYCAE6BRqAevqG+rsVo+BidWbvJB4RFIZiQVasa27LR46k8DcEZ 6AUGncdKB4i5pfeTwhcQ07NIui2ean7M80zlY= Received: by 10.231.63.209 with SMTP id c17mr10535659ibi.65.1318320122719; Tue, 11 Oct 2011 01:02:02 -0700 (PDT) Received: from c-24-6-49-154.hsd1.ca.comcast.net (c-24-6-49-154.hsd1.ca.comcast.net. [24.6.49.154]) by mx.google.com with ESMTPS id bu33sm22236033ibb.11.2011.10.11.01.02.00 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 11 Oct 2011 01:02:01 -0700 (PDT) Date: Tue, 11 Oct 2011 01:01:59 -0700 (PDT) From: Garrett Cooper To: Kirk McKusick In-Reply-To: <201110110756.p9B7ul0g051037@chez.mckusick.com> Message-ID: References: <201110110756.p9B7ul0g051037@chez.mckusick.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII Cc: Garrett Cooper , Xin LI , Attilio Rao , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 08:02:03 -0000 On Tue, 11 Oct 2011, Kirk McKusick wrote: >> Date: Mon, 10 Oct 2011 19:12:59 -0700 >> From: Garrett Cooper >> To: Kostik Belousov >> Cc: Kirk McKusick , Attilio Rao , >> Xin LI , freebsd-fs@freebsd.org >> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? >> >> 2011/10/10 Kostik Belousov : >> >>> The real case to test is the NFS mount which is wedged due to >>> hung/unresponsive NFS server. I have high suspect that the patch >>> could introduce the unkillable hung unmount process. >> >> It blocked, but I could ^C it perfectly fine. I tested it via: >> >> Setup: >> 1. Started up FreeNAS 8.x image; it acquired an IP from my server with >> dhcp-75.local. >> >> Test 1: >> 1. mount -t nfs dhcp-75:/mnt/tank /mnt/nfs/ from my test workstation. >> 2. Paused VM. >> 3. umount /mnt/nfs (the command blocked). >> 4. ^C. >> 5. mount | grep /mnt/nfs showed nothing (it had unmounted). >> >> Test 2: >> 1. mount -t nfs dhcp-75:/mnt/tank /mnt/nfs/ from my test workstation (blocked). >> 2. Opened up another ssh session and cd'ed to /mnt/nfs . >> 3. Paused VM. >> 4. umount /mnt/nfs . It failed with EBUSY. >> 5. mount | grep /mnt/nfs showed that it was still mounted, as expected. >> >> So unless there are buffers still waiting to be written out to an >> NFS share, or other reasons that would prevent the NFS share from >> being fully released, I doubt the proposed behavior is really >> different from previous versions of FreeBSD. >> Thanks, >> -Garrett > > Given the testing that has been done and our discussion about deadlocks, > I believe that I should proceed to check in my originally proposed change. > Notably the one that simply deleted the != MNT_FORCE conditional. 
> However, there is no harm in using my revised version
> that releases the covered vnode before draining vfs_busy, and there
> might be some future case where that would be a necessary thing to do.
>
> Speak up if you think I should not proceed to check in this change.
> Also, let me know if you have thoughts on which version I should use.

I think the final version that you provided to me should be the one that's put through long-term soak testing, because it appeared functionally sound in my soak testing over the past couple of days. I personally wasn't able to put to rest the concern that kib had about deadlocked unmounts via NFS.

Thanks!
-Garrett

Index: sys/kern/vfs_mount.c
===================================================================
--- sys/kern/vfs_mount.c	(revision 226242)
+++ sys/kern/vfs_mount.c	(working copy)
@@ -1187,6 +1187,7 @@
 	mtx_assert(&Giant, MA_OWNED);
+top:
 	if ((coveredvp = mp->mnt_vnodecovered) != NULL) {
 		mnt_gen_r = mp->mnt_gen;
 		VI_LOCK(coveredvp);
@@ -1227,21 +1228,19 @@
 	mp->mnt_kern_flag |= MNTK_UNMOUNTF;
 	error = 0;
 	if (mp->mnt_lockref) {
-		if ((flags & MNT_FORCE) == 0) {
-			mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
-			    MNTK_UNMOUNTF);
-			if (mp->mnt_kern_flag & MNTK_MWAIT) {
-				mp->mnt_kern_flag &= ~MNTK_MWAIT;
-				wakeup(mp);
-			}
-			MNT_IUNLOCK(mp);
-			if (coveredvp)
-				VOP_UNLOCK(coveredvp, 0);
-			return (EBUSY);
+		if (mp->mnt_kern_flag & MNTK_MWAIT) {
+			mp->mnt_kern_flag &= ~MNTK_MWAIT;
+			wakeup(mp);
 		}
+		if (coveredvp)
+			VOP_UNLOCK(coveredvp, 0);
 		mp->mnt_kern_flag |= MNTK_DRAINING;
 		error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
 		    "mount drain", 0);
+		mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
+		    MNTK_UNMOUNTF);
+		MNT_IUNLOCK(mp);
+		goto top;
 	}
 	MNT_IUNLOCK(mp);
 	KASSERT(mp->mnt_lockref == 0,

From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 10:09:23 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id
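The control flow of the patch above — wake any MNTK_MWAIT waiters, drop the covered vnode lock, sleep ("mount drain") until the busy-reference count reaches zero, then retry from `top:` instead of failing with EBUSY — can be modeled in userspace. Below is a rough illustrative sketch in Python, not kernel code; the `MountPoint` class and the `busy`/`unbusy`/`unmount` names are invented stand-ins for vfs_busy()/vfs_unbusy() and dounmount():

```python
import threading
import time

class MountPoint:
    """Toy model of a mount point with a busy-reference count (mnt_lockref)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._drained = threading.Condition(self._lock)
        self.lockref = 0
        self.unmounted = False

    def busy(self):
        # Like vfs_busy(): take a reference that blocks unmount.
        with self._lock:
            self.lockref += 1

    def unbusy(self):
        # Like vfs_unbusy(): drop the reference; wake a draining unmount.
        with self._lock:
            self.lockref -= 1
            if self.lockref == 0:
                self._drained.notify_all()

    def unmount(self):
        # The patched behavior: rather than returning EBUSY while
        # references remain, sleep until the count drains to zero,
        # then complete the unmount (the while-loop plays the role
        # of the "goto top" retry).
        with self._lock:
            while self.lockref != 0:
                self._drained.wait()
            self.unmounted = True


def demo():
    """A worker holds a reference briefly; unmount blocks until it drains."""
    mp = MountPoint()
    mp.busy()
    t = threading.Thread(target=lambda: (time.sleep(0.05), mp.unbusy()))
    t.start()
    mp.unmount()          # blocks until the worker calls unbusy()
    t.join()
    return mp.unmounted, mp.lockref
```

The sketch deliberately omits the wedged-NFS-server case kib raises: if the holder of the reference never calls `unbusy()`, `unmount()` here sleeps forever, which is exactly the unkillable-unmount concern being discussed.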
4F784106566C for ; Tue, 11 Oct 2011 10:09:23 +0000 (UTC) (envelope-from se@freebsd.org) Received: from nm13.bullet.mail.ne1.yahoo.com (nm13.bullet.mail.ne1.yahoo.com [98.138.90.76]) by mx1.freebsd.org (Postfix) with SMTP id 0569D8FC17 for ; Tue, 11 Oct 2011 10:09:22 +0000 (UTC) Received: from [98.138.90.52] by nm13.bullet.mail.ne1.yahoo.com with NNFMP; 11 Oct 2011 09:56:35 -0000 Received: from [98.138.226.62] by tm5.bullet.mail.ne1.yahoo.com with NNFMP; 11 Oct 2011 09:56:35 -0000 Received: from [127.0.0.1] by smtp213.mail.ne1.yahoo.com with NNFMP; 11 Oct 2011 09:56:35 -0000 X-Yahoo-Newman-Id: 203430.45212.bm@smtp213.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: 7kTvs0sVM1mZk3Ha8DovQ_LqDzX939vFQMKtc1kjYKeWm34 42WoSbu1KXqU_ZwA0oh5PKchW1gLM4jccrq13DMnkqriLCUyAWXgGgXRZaoa HAcppjWBX9goM4surFPM_whs6niOGDlml.nmc.1AGPRnYZQW.Gk9HHTFKehz tmRgdBWfcrSlcH_0cBO_O22Kp_vUK.NNeRDxkqbzbuHaoS101JPeoAVeaSr5 PvIKhAMDz5mnO5EFlp7dKYdSymoWX35Lc0gqjfCPjpItOzB2UKzfaR2cj87z W.rIUJSX5gm64kgXBoQvMgipLUK05LJvMuZvLJp2W_WqotSgEevLd_67w8R6 rYIIhsVRCklOjtDlqS54i79UWVQX17CUsveL1jgASEhy7OppBVT2YNovNHuq U3ncstI2.avydRCE- X-Yahoo-SMTP: iDf2N9.swBDAhYEh7VHfpgq0lnq. 
Received: from [192.168.119.20] (se@81.173.155.124 with plain) by smtp213.mail.ne1.yahoo.com with SMTP; 11 Oct 2011 02:56:34 -0700 PDT Message-ID: <4E9412D3.4020705@freebsd.org> Date: Tue, 11 Oct 2011 11:56:35 +0200 From: Stefan Esser User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1 MIME-Version: 1.0 To: mm@FreeBSD.org References: <201110110726.p9B7QhEC072913@freefall.freebsd.org> In-Reply-To: <201110110726.p9B7QhEC072913@freefall.freebsd.org> Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, sm@kill-9.net Subject: Re: bin/121366: [zfs] [patch] Automatic disk scrubbing from periodic(8) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 10:09:23 -0000 On 11.10.2011 09:26, mm@FreeBSD.org wrote: > Synopsis: [zfs] [patch] Automatic disk scrubbing from periodic(8) > > State-Changed-From-To: open->closed > State-Changed-By: mm > State-Changed-When: Tue Oct 11 07:26:43 UTC 2011 > State-Changed-Why: > Implemented in r209195 by netchild@ (/etc/periodic/daily/800.scrub-zfs) > > http://www.freebsd.org/cgi/query-pr.cgi?pr=121366 There is no mention of daily_scrub_zfs_enable daily_scrub_zfs_pools daily_scrub_zfs_default_threshold daily_scrub_zfs_${poolname}_threshold in /etc/defaults/periodic.conf, which we use to define configuration options for all periodic scripts. The attached patch adds the missing entries. I'm not sure about the commented out line for "daily_scrub_zfs_${poolname}_threshold", but there is precedent (in the "amd" section) and I think many users will grep for scrub_zfs in that file instead of looking into the man-page for periodic.conf. If the defaults in the patch are considered OK (they just state the defaults set in the script), I'd like to commit them to head and MFC within a week. 
Regards, STefan (Please include my address in replies, since I'm not in freebsd-fs@). PS: The following unrelated variables have no defaults defined: daily_backup_distfile_enable daily_backup_pkgdb_dbdir daily_distfile_enable I have not yet looked up their defaults, but I think they should also be defined in defaults/periodic.conf ... From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 10:23:35 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C27A1106564A; Tue, 11 Oct 2011 10:23:35 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 455608FC0C; Tue, 11 Oct 2011 10:23:34 +0000 (UTC) Received: from alf.home (alf.kiev.zoral.com.ua [10.1.1.177]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p9BANVLY060155 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 11 Oct 2011 13:23:31 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from alf.home (kostik@localhost [127.0.0.1]) by alf.home (8.14.5/8.14.5) with ESMTP id p9BANVJT023515; Tue, 11 Oct 2011 13:23:31 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by alf.home (8.14.5/8.14.5/Submit) id p9BANVx3023514; Tue, 11 Oct 2011 13:23:31 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: alf.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 11 Oct 2011 13:23:31 +0300 From: Kostik Belousov To: Kirk McKusick Message-ID: <20111011102331.GW1511@deviant.kiev.zoral.com.ua> References: <201110110756.p9B7ul0g051037@chez.mckusick.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Y2BITh8TegacRQ/d" Content-Disposition: inline In-Reply-To: <201110110756.p9B7ul0g051037@chez.mckusick.com> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 
0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: Garrett Cooper , Attilio Rao , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 10:23:35 -0000 --Y2BITh8TegacRQ/d Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Oct 11, 2011 at 12:56:47AM -0700, Kirk McKusick wrote: > > Date: Mon, 10 Oct 2011 19:12:59 -0700 > > From: Garrett Cooper > > To: Kostik Belousov > > Cc: Kirk McKusick , Attilio Rao , > > Xin LI , freebsd-fs@freebsd.org > > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? > > > > 2011/10/10 Kostik Belousov : > > > > > The real case to test is the NFS mount which is wedged due to > > > hung/unresponsive NFS server. I have high suspect that the patch > > > could introduce the unkillable hung unmount process. > > > > It blocked, but I could ^C it perfectly fine. I tested it via: > > > > Setup: > > 1. Started up FreeNAS 8.x image; it acquired an IP from my server with > > dhcp-75.local. > > > > Test 1: > > 1. mount -t nfs dhcp-75:/mnt/tank /mnt/nfs/ from my test workstation. > > 2. Paused VM. > > 3. umount /mnt/nfs (the command blocked). > > 4. ^C. > > 5. mount | grep /mnt/nfs showed nothing (it had unmounted). > > > > Test 2: > > 1. mount -t nfs dhcp-75:/mnt/tank /mnt/nfs/ from my test workstation (blocked). > > 2. Opened up another ssh session and cd'ed to /mnt/nfs . > > 3. Paused VM. > > 4. umount /mnt/nfs . It failed with EBUSY. > > 5.
mount | grep /mnt/nfs showed that it was still mounted, as expected. > > > > So unless there are buffers still waiting to be written out to an > > NFS share, or other reasons that would prevent the NFS share from > > being fully released, I doubt the proposed behavior is really > > different from previous versions of FreeBSD. > > Thanks, > > -Garrett > > Given the testing that has been done and our discussion about deadlocks, I am not sure that it was adequate. If it was not obvious, my main concern is the nfs client that busied the mount point and is waiting for the wedged server rpc response. > I believe that I should proceed to check in my originally proposed > change. Notably the one that simply deleted the != MNT_FORCE > conditional. However, there is no harm in using my revised version > that releases the covered vnode before draining vfs_busy, and there > might be some future case where that would be a necessary thing to do. What is the future case where you intend to break the order between vfs_busy() and vnode locks? > > Speak up if you think I should not proceed to check in this change. > Also, let me know if you have thoughts on which version I should use. If committing either of the two changes, I would prefer to see the minimal one, which does not unlock the covered vnode.
--Y2BITh8TegacRQ/d Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iEYEARECAAYFAk6UGSMACgkQC3+MBN1Mb4ivDQCgj68AvEPpR7R91lqUxwaangpI /pwAoNuKrFsFj7uEB86btnHHvrXKjSQZ =MY9v -----END PGP SIGNATURE----- --Y2BITh8TegacRQ/d-- From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 10:36:54 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3847B106564A for ; Tue, 11 Oct 2011 10:36:54 +0000 (UTC) (envelope-from se@freebsd.org) Received: from nm23-vm0.bullet.mail.bf1.yahoo.com (nm23-vm0.bullet.mail.bf1.yahoo.com [98.139.212.191]) by mx1.freebsd.org (Postfix) with SMTP id DFE6F8FC12 for ; Tue, 11 Oct 2011 10:36:53 +0000 (UTC) Received: from [98.139.212.151] by nm23.bullet.mail.bf1.yahoo.com with NNFMP; 11 Oct 2011 10:22:48 -0000 Received: from [98.139.213.8] by tm8.bullet.mail.bf1.yahoo.com with NNFMP; 11 Oct 2011 10:22:48 -0000 Received: from [127.0.0.1] by smtp108.mail.bf1.yahoo.com with NNFMP; 11 Oct 2011 10:22:48 -0000 X-Yahoo-Newman-Id: 810738.28270.bm@smtp108.mail.bf1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: HU7UUGoVM1lQCFLiUMuFczva6gupKGKdp6AZxDCmoBlBNnL dgvpm.EbGAi4h6ED9mn6kwMByl6SFPvrGj9qW3B3_wVEVwNybM8ox3kdlqmR tLNbGouFEbdipAUdXsIXV3eCQXblg7sxrCqP4BA8ud29yoEE9nZauPVVk.ab Bxycilk3cwz1qc7oHBVXl_3C40MsHukCoO1nwSy7U4RTkUSYv4hwpojnazvs U7vp8dUay8ehPgXgyq0rGZttQqTSQgalpTAkW7bK0X_scGP_Lrra.1ynmYG_ jzKi7jErkj5fHFLlFWoVwgO933PI5MifuvJVBl92q1aLF9M232F5xlEo.Bdr kJvK4vYLsBHqG9Lwi0euwsB.jPk4PeYch5uUOCFvzUY7D5O09h6DGkSLAwwo hL.ZosNpcQdMspfAW X-Yahoo-SMTP: iDf2N9.swBDAhYEh7VHfpgq0lnq. 
Received: from [192.168.119.20] (se@81.173.155.124 with plain) by smtp108.mail.bf1.yahoo.com with SMTP; 11 Oct 2011 03:22:48 -0700 PDT Message-ID: <4E9418F9.9030304@freebsd.org> Date: Tue, 11 Oct 2011 12:22:49 +0200 From: Stefan Esser User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1 MIME-Version: 1.0 To: mm@FreeBSD.org References: <201110110726.p9B7QhEC072913@freefall.freebsd.org> In-Reply-To: <201110110726.p9B7QhEC072913@freefall.freebsd.org> Content-Type: multipart/mixed; boundary="------------020706090101080007040807" Cc: freebsd-fs@FreeBSD.org, sm@kill-9.net Subject: [RESENT with patch] Re: bin/121366: [zfs] [patch] Automatic disk scrubbing from periodic(8) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 10:36:54 -0000 This is a multi-part message in MIME format. --------------020706090101080007040807 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit On 11.10.2011 09:26, mm@FreeBSD.org wrote: > Synopsis: [zfs] [patch] Automatic disk scrubbing from periodic(8) > > State-Changed-From-To: open->closed > State-Changed-By: mm > State-Changed-When: Tue Oct 11 07:26:43 UTC 2011 > State-Changed-Why: > Implemented in r209195 by netchild@ (/etc/periodic/daily/800.scrub-zfs) > > http://www.freebsd.org/cgi/query-pr.cgi?pr=121366 There is no mention of daily_scrub_zfs_enable daily_scrub_zfs_pools daily_scrub_zfs_default_threshold daily_scrub_zfs_${poolname}_threshold in /etc/defaults/periodic.conf, which we use to define configuration options for all periodic scripts. The attached patch adds the missing entries. 
I'm not sure about the commented out line for "daily_scrub_zfs_${poolname}_threshold", but there is precedent (in the "amd" section) and I think many users will grep for scrub_zfs in that file instead of looking into the man-page for periodic.conf. If the defaults in the patch are considered OK (they just state the defaults set in the script), I'd like to commit them to head and MFC within a week. Regards, STefan (Please include my address in replies, since I'm not in freebsd-fs@). PS: The following unrelated variables have no defaults defined: daily_backup_distfile_enable daily_backup_pkgdb_dbdir daily_distfile_enable I have not yet looked up their defaults, but I think they should also be defined in defaults/periodic.conf ... --------------020706090101080007040807 Content-Type: text/plain; name="periodic.conf.zfs-scrub.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="periodic.conf.zfs-scrub.diff" --- /usr/src/etc/defaults/periodic.conf~ 2011-03-26 19:33:15.000000000 +0100 --- /usr/src/etc/defaults/periodic.conf 2011-10-11 11:34:09.245775412 +0200 @@ -205,6 +205,12 @@ # 800.loginfail daily_status_security_loginfail_enable="YES" +# 800.scrub-zfs +daily_scrub_zfs_enable="NO" +daily_scrub_zfs_pools="" # empty string selects all pools +daily_scrub_zfs_default_threshold="30" # days between scrubs +#daily_scrub_zfs_${poolname}_threshold="30" # pool specific threshold + # 900.tcpwrap daily_status_security_tcpwrap_enable="YES" --------------020706090101080007040807-- From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 12:04:04 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1486510657B7; Tue, 11 Oct 2011 12:04:04 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 048318FC2B; Tue, 11 Oct 2011 12:03:57 
+0000 (UTC) Received: by wyj26 with SMTP id 26so10844263wyj.13 for ; Tue, 11 Oct 2011 05:03:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=uXh3kmJbzmqJfqqwus6WXgO7cV+DJqUQgW+AEv9doq4=; b=WFRwNG4hFa5cHybxbB9M5JVpQuoZe1tgsiKKzQWh25BktM5obgq8qq+EWvjsZaB9bu iWLHNFg0eD/UmgJtSZxz7jqra2C7uu3d/9yCcXGWKDhqhUM1IBu4FZCFzVsq/NouEV/t nCtUoHwThZZBFEzH3fhZ6Z0Jcj9TdwEen1Hf0= MIME-Version: 1.0 Received: by 10.216.134.201 with SMTP id s51mr747086wei.27.1318334637047; Tue, 11 Oct 2011 05:03:57 -0700 (PDT) Sender: asmrookie@gmail.com Received: by 10.216.182.3 with HTTP; Tue, 11 Oct 2011 05:03:56 -0700 (PDT) In-Reply-To: <20111011102331.GW1511@deviant.kiev.zoral.com.ua> References: <201110110756.p9B7ul0g051037@chez.mckusick.com> <20111011102331.GW1511@deviant.kiev.zoral.com.ua> Date: Tue, 11 Oct 2011 14:03:56 +0200 X-Google-Sender-Auth: ySK5opZ5kNVbDILb5RuCer8aXqI Message-ID: From: Attilio Rao To: Kostik Belousov Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: Kirk McKusick , Garrett Cooper , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 12:04:05 -0000 2011/10/11 Kostik Belousov : > On Tue, Oct 11, 2011 at 12:56:47AM -0700, Kirk McKusick wrote: >> > Date: Mon, 10 Oct 2011 19:12:59 -0700 >> > From: Garrett Cooper >> > To: Kostik Belousov >> > Cc: Kirk McKusick , Attilio Rao , >> >         Xin LI , freebsd-fs@freebsd.org >> > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
>> > >> > 2011/10/10 Kostik Belousov : >> > >> > > The real case to test is the NFS mount which is wedged due to >> > > hung/unresponsive NFS server. I have high suspect that the patch >> > > could introduce the unkillable hung unmount process. >> > >> >     It blocked, but I could ^C it perfectly fine. I tested it via: >> > >> > Setup: >> > 1. Started up FreeNAS 8.x image; it acquired an IP from my server with >> > dhcp-75.local. >> > >> > Test 1: >> > 1. mount -t nfs dhcp-75:/mnt/tank /mnt/nfs/ from my test workstation. >> > 2. Paused VM. >> > 3. umount /mnt/nfs (the command blocked). >> > 4. ^C. >> > 5. mount | grep /mnt/nfs showed nothing (it had unmounted). >> > >> > Test 2: >> > 1. mount -t nfs dhcp-75:/mnt/tank /mnt/nfs/ from my test workstation (blocked). >> > 2. Opened up another ssh session and cd'ed to /mnt/nfs . >> > 3. Paused VM. >> > 4. umount /mnt/nfs . It failed with EBUSY. >> > 5. mount | grep /mnt/nfs showed that it was still mounted, as expected. >> > >> >     So unless there are buffers still waiting to be written out to an >> > NFS share, or other reasons that would prevent the NFS share from >> > being fully released, I doubt the proposed behavior is really >> > different from previous versions of FreeBSD. >> > Thanks, >> > -Garrett >> >> Given the testing that has been done and our discussion about deadlocks, > I am not sure that it was adequate. > > If it was not obvious, my main concern is the nfs client that busied > the mount point and is waiting for the wedged server rpc response. > >> I believe that I should proceed to check in my originally proposed >> change. Notably the one that simply deleted the != MNT_FORCE >> conditional. However, there is no harm in using my revised version >> that releases the covered vnode before draining vfs_busy, and there >> might be some future case where that would be a necessary thing to do.
> What is the future case where you intend to break the order between > vfs_busy() and vnode locks? > >> >> Speak up if you think I should not proceed to check in this change. >> Also, let me know if you have thoughts on which version I should use. > > If committing either of the two changes, I would prefer to see the minimal one, > which does not unlock the covered vnode. I agree with Kostik, I don't see the point of dropping the covered vnode as long as the ordering is already set up, but I don't have objections to the 'minimal' change. Attilio --  Peace can only be achieved by understanding - A. Einstein From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 18:50:22 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C00C1106566C; Tue, 11 Oct 2011 18:50:22 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [70.36.157.235]) by mx1.freebsd.org (Postfix) with ESMTP id 8652D8FC18; Tue, 11 Oct 2011 18:50:22 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p9BIoNJf099800; Tue, 11 Oct 2011 11:50:23 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201110111850.p9BIoNJf099800@chez.mckusick.com> To: Kostik Belousov In-reply-to: <20111011102331.GW1511@deviant.kiev.zoral.com.ua> Date: Tue, 11 Oct 2011 11:50:23 -0700 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: Garrett Cooper , Attilio Rao , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 18:50:22 -0000 I have checked in the minimalist change as revision 226265. I set a 3-week MFC period. If no problems have turned up I will push it to the 8 & 9 branches (9 probably still subject to re@ approval). If anyone thinks this is the wrong MFC timeframe, please let me know. Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 23:25:45 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E4B511065670 for ; Tue, 11 Oct 2011 23:25:45 +0000 (UTC) (envelope-from freebsd@penx.com) Received: from Elmer.dco.penx.com (elmer.dco.penx.com [174.46.214.165]) by mx1.freebsd.org (Postfix) with ESMTP id B8F138FC12 for ; Tue, 11 Oct 2011 23:25:45 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by Elmer.dco.penx.com (8.14.5/8.14.4) with ESMTP id p9BNPfUE015568 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 11 Oct 2011 17:25:44 -0600 (MDT) (envelope-from freebsd@penx.com) Date: Tue, 11 Oct 2011 17:25:41 -0600 (MDT) From: Dennis Glatting X-X-Sender: dennisg@Elmer.dco.penx.com To: freebsd-fs@freebsd.org Message-ID: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII Subject: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 23:25:46 -0000 I would appreciate someone knowledgeable in ZFS point me in the right direction. I have several ZFS arrays, some using gzip for compression. 
The compressed arrays hold very large text documents (10MB->20TB) and are highly compressible. Reading the files from the compressed data sets is fast with little load. However, writing to the compressed data sets incurs substantial load, on the order of a load average from 12 to 20. My questions are: 1) Why such a heavy load on writing? 2) What kind of limiters can I put into effect to reduce load without impacting compressibility? For example, is there some variable that controls the number of parallel compression operations? I have a number of different systems. Memory is 24GB on each of the two large data systems, SSD (Revo) for cache, and a SATA II ZIL. One system is a 6 core i7 @ 3.33 GHz and the other a 4 core i7 @ 2.93 GHz. The arrays are RAIDz using cheap 2TB disks. Thanks. From owner-freebsd-fs@FreeBSD.ORG Tue Oct 11 23:59:55 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 83D141065670 for ; Tue, 11 Oct 2011 23:59:55 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-yw0-f54.google.com (mail-yw0-f54.google.com [209.85.213.54]) by mx1.freebsd.org (Postfix) with ESMTP id 43F1C8FC0A for ; Tue, 11 Oct 2011 23:59:55 +0000 (UTC) Received: by ywp17 with SMTP id 17so163302ywp.13 for ; Tue, 11 Oct 2011 16:59:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=5OjcFc6+e9qvsEkiHYK+FK0ozO96Z0IDbgHLeJbucJM=; b=Fs2hvd1Nr1Zr59quEQdlxMubNgaSPCKziFj6OvtALvopIodBSKaDN+RsYw7Jh4DDIm 4ANS7xYXvdFU90fBBBlGK4DsRugFWqWzRjSaeLdLZJkYC+HMmJYW02dpWDImgPuLX7X0 uCiCZyCV9MDJJbrFsxEU6c54NeqTYnEQiSres= MIME-Version: 1.0 Received: by 10.236.186.35 with SMTP id v23mr34072727yhm.80.1318377594595; Tue, 11 Oct 2011 16:59:54 -0700 (PDT) Sender: artemb@gmail.com Received: by 
10.236.103.33 with HTTP; Tue, 11 Oct 2011 16:59:54 -0700 (PDT) In-Reply-To: References: Date: Tue, 11 Oct 2011 16:59:54 -0700 X-Google-Sender-Auth: q-wW1xSz5kqNYqIXpw3DpxyQLP4 Message-ID: From: Artem Belevich To: Dennis Glatting Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Oct 2011 23:59:55 -0000 On Tue, Oct 11, 2011 at 4:25 PM, Dennis Glatting wrote: > I would appreciate someone knowledgeable in ZFS point me in the right > direction. > > I have several ZFS arrays, some using gzip for compression. The compressed > arrays hold very large text documents (10MB->20TB) and are highly > compressible. Reading the files from the compressed data sets is fast with > little load. However, writing to the compressed data sets incurs substantial > load on the order of a load average from 12 to 20. > > My questions are: > > 1) Why such a heavy load on writing? gzip compression is relatively slow, even gzip-1, even on a fast CPU, even with multiple cores. ZFS does per-block compression and spreads the load across multiple worker threads. Compression happens when the data is being flushed to disk. What typically happens is that the data you write gets accumulated in ARC. Every 10 seconds (the ZFSv28 default, I believe; it used to be 30) ZFS starts flushing whatever has been accumulated. Setting compression to its lowest setting (gzip-1) will help a bit. Getting the fastest CPU(s) you can afford will help, too, because you will be hard pressed to compress data fast enough to saturate a single HDD's bandwidth, never mind a multi-disk pool. Another option is to switch to lzjb compression. 
The compression ratio will be limited to ~2x, but it's pretty fast. > 2) What kind of limiters can I put into effect to reduce load > without impacting compressibility? For example, is there some > variable that controls the number of parallel compression > operations? If on average you write data faster than your CPU can compress it with the chosen compression settings, there's not much you can do. If the CPU can keep up with writes in general, then there are a few things you can do to prevent a compression rush. Tinkering with the following tunables may help: vfs.zfs.txg.timeout -- how frequently ZFS flushes data vfs.zfs.txg.write_limit_override -- limits how fast ZFS tries to write data > I have a number of different systems. Memory is 24GB on each of the two > large data systems, SSD (Revo) for cache, and a SATA II ZIL. One system is a > 6 core i7 @ 3.33 GHz and the other a 4 core i7 @ 2.93 GHz. The arrays are > RAIDz using cheap 2TB disks. For gzip-9, even a 6-core i7 may still be a bottleneck. On a similar system I have, gzip only gives me about 12 MB/s per core. 6 cores would be barely enough to keep one disk busy. 
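The per-level CPU cost Artem describes is easy to see outside ZFS with plain gzip(1). The sketch below is only an illustration of the trade-off, not a ZFS benchmark; the /tmp paths and sample size are arbitrary choices:

```shell
# Build a highly compressible text sample (~8 MB), then compress it at
# gzip's cheapest and most expensive levels and compare time and size.
yes "a highly compressible line of log-like text for this benchmark" \
  | head -c 8000000 > /tmp/sample.txt

time gzip -1 -c /tmp/sample.txt > /tmp/sample.gz1   # fast, decent ratio
time gzip -9 -c /tmp/sample.txt > /tmp/sample.gz9   # far more CPU, slightly smaller

wc -c /tmp/sample.txt /tmp/sample.gz1 /tmp/sample.gz9
```

On the ZFS side the same trade-off is selected per dataset via the compression property (gzip-1 through gzip-9), while the txg tunables above are sysctl/loader.conf knobs.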
--Artem From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 08:09:37 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0354D1065676 for ; Wed, 12 Oct 2011 08:09:37 +0000 (UTC) (envelope-from prvs=1266ac8959=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 4700E8FC15 for ; Wed, 12 Oct 2011 08:09:36 +0000 (UTC) X-MDAV-Processed: mail1.multiplay.co.uk, Wed, 12 Oct 2011 08:59:34 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 12 Oct 2011 08:59:33 +0100 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on mail1.multiplay.co.uk X-Spam-Level: X-Spam-Status: No, score=-5.0 required=6.0 tests=USER_IN_WHITELIST shortcircuit=ham autolearn=disabled version=3.2.5 Received: from r2d2 ([188.220.16.49]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50015532400.msg for ; Wed, 12 Oct 2011 08:59:33 +0100 X-MDRemoteIP: 188.220.16.49 X-Return-Path: prvs=1266ac8959=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk X-MDaemon-Deliver-To: freebsd-fs@freebsd.org Message-ID: <740CDE42E82044428B5FAA3FC73ABD32@multiplay.co.uk> From: "Steven Hartland" To: "Dennis Glatting" , References: Date: Wed, 12 Oct 2011 08:59:29 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=response Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6109 Cc: Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 08:09:37 -0000 ----- Original Message ----- From: "Dennis 
Glatting" >I would appreciate someone knowledgeable in ZFS point me in the right > direction. > > I have several ZFS arrays, some using gzip for compression. The compressed > arrays hold very large text documents (10MB->20TB) and are highly > compressible. Reading the files from the compressed data sets is fast with > little load. However, writing to the compressed data sets incurs > substantial load on the order of a load average from 12 to 20. > > My questions are: > > 1) Why such a heavy load on writing? > 2) What kind of limiters can I put into effect to reduce load > without impacting compressibility? For example, is there some > variable that controls the number of parallel compression > operations? > > I have a number of different systems. Memory is 24GB on each of the two > large data systems, SSD (Revo) for cache, and a SATA II ZIL. One system is > a 6 core i7 @ 3.33 GHz and the other a 4 core i7 @ 2.93 GHz. The arrays are > RAIDz using cheap 2TB disks. Have you tried using the alternative compression algorithms, e.g. lzjb or gzip-[1-5]? The default gzip = gzip-6. Regards Steve ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. 
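The alternatives Steven mentions can be tried per dataset with the standard zfs(8) property commands. A sketch only: the pool/dataset name (tank/docs) is a made-up example, and these commands require an existing ZFS dataset:

```shell
zfs get compression tank/docs         # show the current algorithm
zfs set compression=gzip-1 tank/docs  # gzip at its cheapest level
zfs set compression=lzjb tank/docs    # or: much faster, lower ratio
# Existing data keeps whatever compression it was written with;
# only blocks written after the change use the new setting.
```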
From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 12:03:04 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A229D106566C for ; Wed, 12 Oct 2011 12:03:04 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 5E6FF8FC25 for ; Wed, 12 Oct 2011 12:03:04 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1RDxWX-0005VN-0S for freebsd-fs@freebsd.org; Wed, 12 Oct 2011 14:03:01 +0200 Received: from dyn1206-83.wlan.ic.ac.uk ([129.31.206.83]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 12 Oct 2011 14:03:00 +0200 Received: from jtotz by dyn1206-83.wlan.ic.ac.uk with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 12 Oct 2011 14:03:00 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Johannes Totz Date: Wed, 12 Oct 2011 13:02:47 +0100 Lines: 31 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: dyn1206-83.wlan.ic.ac.uk User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1 In-Reply-To: Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 12:03:04 -0000 On 12/10/2011 00:25, Dennis Glatting wrote: > I would appreciate someone knowledgeable in ZFS point me in the right > direction. > > I have several ZFS arrays, some using gzip for compression. The > compressed arrays hold very large text documents (10MB->20TB) and are > highly compressible. 
Reading the files from the compressed data sets is > fast with little load. However, writing to the compressed data sets > incurs substantial load on the order of a load average from 12 to 20. > > My questions are: > > 1) Why such a heavy load on writing? > 2) What kind of limiters can I put into effect to reduce load > without impacting compressibility? For example, is there some > variable that controls the number of parallel compression > operations? > > I have a number of different systems. Memory is 24GB on each of the two > large data systems, SSD (Revo) for cache, and a SATA II ZIL. One system > is a 6 core i7 @ 3.33 GHz and the other a 4 core i7 @ 2.93 GHz. The > arrays are RAIDz using cheap 2TB disks. Artem gave you a pretty good explanation. I just did a simple write test yesterday: 1) 6 MB/sec for gzip, 1.36x ratio 2) 34 MB/sec for lzjb, 1.23x ratio I'll stick with lzjb. It's good enough to get rid of most of the redundancy and speed is acceptable. From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 13:30:36 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 48A91106564A; Wed, 12 Oct 2011 13:30:36 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 20FAE8FC14; Wed, 12 Oct 2011 13:30:36 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9CDUZMS006004; Wed, 12 Oct 2011 13:30:36 GMT (envelope-from jhb@freefall.freebsd.org) Received: (from jhb@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9CDUZUd005991; Wed, 12 Oct 2011 13:30:35 GMT (envelope-from jhb) Date: Wed, 12 Oct 2011 13:30:35 GMT Message-Id: <201110121330.p9CDUZUd005991@freefall.freebsd.org> To: jhb@FreeBSD.org, freebsd-amd64@FreeBSD.org, freebsd-fs@FreeBSD.org From: jhb@FreeBSD.org 
Cc: Subject: Re: kern/161493: NFS v3 directory structure update slow X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 13:30:36 -0000 Synopsis: NFS v3 directory structure update slow Responsible-Changed-From-To: freebsd-amd64->freebsd-fs Responsible-Changed-By: jhb Responsible-Changed-When: Wed Oct 12 13:30:14 UTC 2011 Responsible-Changed-Why: Move this over to fs@. http://www.freebsd.org/cgi/query-pr.cgi?pr=161493 From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 13:42:18 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B9D1B1065676; Wed, 12 Oct 2011 13:42:18 +0000 (UTC) (envelope-from patpro@patpro.net) Received: from rack.patpro.net (rack.patpro.net [193.30.227.216]) by mx1.freebsd.org (Postfix) with ESMTP id 6A0928FC22; Wed, 12 Oct 2011 13:42:18 +0000 (UTC) Received: from rack.patpro.net (localhost [127.0.0.1]) by rack.patpro.net (Postfix) with ESMTP id 1262F1CC038; Wed, 12 Oct 2011 15:42:17 +0200 (CEST) X-Virus-Scanned: amavisd-new at patpro.net Received: from amavis-at-patpro.net ([127.0.0.1]) by rack.patpro.net (rack.patpro.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id GAGxTzenhZhn; Wed, 12 Oct 2011 15:42:15 +0200 (CEST) Received: from [127.0.0.1] (localhost [127.0.0.1]) by rack.patpro.net (Postfix) with ESMTP; Wed, 12 Oct 2011 15:42:15 +0200 (CEST) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: multipart/signed; boundary=Apple-Mail-28-817318118; protocol="application/pkcs7-signature"; micalg=sha1 From: Patrick Proniewski In-Reply-To: <86ipnwg1s0.fsf@kopusha.home.net> Date: Wed, 12 Oct 2011 15:42:14 +0200 Message-Id: <94455706-B90D-4DBD-A7DE-E9A38F118D35@patpro.net> References: <20110915120007.F41FF10656E1@hub.freebsd.org> 
<4B8C8026-1E12-4C32-88E3-9B34A3E58A91@patpro.net> <86ipnwg1s0.fsf@kopusha.home.net> To: Mikolaj Golub X-Mailer: Apple Mail (2.1084) X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: measuring IO asynchronously X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 13:42:18 -0000 --Apple-Mail-28-817318118 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii On 10 oct. 2011, at 20:05, Mikolaj Golub wrote: > On Mon, 10 Oct 2011 16:33:11 +0200 Patrick Proniewski wrote: > PP> I would like to monitor the storage on various FreeBSD servers, > PP> especially I/O per second. Is there any way to gather statistics > PP> about I/O via an asynchronous request, let's say, for example, using a > PP> munin plugin? `iostat -w 1` and `zpool iostat tank 1` are > PP> interesting, but not usable asynchronously. > > I use for this a simple program that I wrote some time ago. It uses the devstat(9) > kernel interface and outputs counters, like below > > kopusha:~% devstat ada0 > ada0: > 1339552256 bytes read > ../.. > You can find it in ports (sysutils/devstat). Thank you Mikolaj, I'm going to give it a try! 
regards, Patrick= --Apple-Mail-28-817318118-- From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 15:11:30 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8100E106564A for ; Wed, 12 Oct 2011 15:11:30 +0000 (UTC) (envelope-from ler@lerctr.org) Received: from thebighonker.lerctr.org (lrosenman-1-pt.tunnel.tserv8.dal1.ipv6.he.net [IPv6:2001:470:1f0e:3ad::2]) by mx1.freebsd.org (Postfix) with ESMTP id 4A5178FC0C for ; Wed, 12 Oct 2011 15:11:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lerctr.org; s=lerami; h=Content-Transfer-Encoding:Content-Type:Subject:To:MIME-Version:From:Date:Message-ID; bh=R1zkfFtJ1Ff9BxC/gAU9xAq4hX2WrFhBP/EDxJMXqiU=; b=EJuKxdKSfto4pZijdIWec5GGlNA9ve4KfUTjgn7gRGPtjF+rLd51Pcd7tOiVCdn6U15ZDQkNJGcDOQLlFEABQAKdDq67eNSCMCSXYat36DNqy1RwaTyStVFP3kegsKs07JiLUkkdP6OLDUvny1gxoYFpn0J6LeA9rtKKnZDvD4E=; Received: from [32.97.110.60] (port=2299 helo=[9.41.58.142]) by thebighonker.lerctr.org with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1RE0Su-0003lH-LG for freebsd-fs@freebsd.org; Wed, 12 Oct 2011 10:11:29 -0500 Message-ID: <4E95AE08.7030105@lerctr.org> Date: Wed, 12 Oct 2011 10:11:04 -0500 From: Larry Rosenman Organization: LERCTR Consulting User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -2.8 (--) X-LERCTR-Spam-Score: -2.8 (--) X-Spam-Report: SpamScore (-2.8/5.0) ALL_TRUSTED=-1, BAYES_00=-1.9, SARE_SUB_OBFU_OTHER=0.135 X-LERCTR-Spam-Report: SpamScore (-2.8/5.0) ALL_TRUSTED=-1, BAYES_00=-1.9, SARE_SUB_OBFU_OTHER=0.135 Subject: AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 15:11:30 -0000 I have a root on ZFS box with 6 drives, all 400G (except one 500G) in a pool. I want to upgrade to 2T or 3T drives, but was wondering if you can mix/match while doing the drive by drive replacement. This is on 9.0-BETA3 if that matters. Thanks! -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 512-248-2683 E-Mail: ler@lerctr.org US Mail: 430 Valona Loop, Round Rock, TX 78681-3893 From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 15:59:41 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AA5AD1065677 for ; Wed, 12 Oct 2011 15:59:41 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta10.westchester.pa.mail.comcast.net (qmta10.westchester.pa.mail.comcast.net [76.96.62.17]) by mx1.freebsd.org (Postfix) with ESMTP id 6BED38FC13 for ; Wed, 12 Oct 2011 15:59:41 +0000 (UTC) Received: from omta22.westchester.pa.mail.comcast.net ([76.96.62.73]) by qmta10.westchester.pa.mail.comcast.net with comcast id jnEa1h0011ap0As5ArzhCW; Wed, 12 Oct 2011 15:59:41 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta22.westchester.pa.mail.comcast.net with comcast id jrzg1h00A1t3BNj3irzgUM; Wed, 12 Oct 2011 15:59:41 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id ADE5C102C1C; Wed, 12 Oct 2011 08:59:38 -0700 (PDT) Date: Wed, 12 Oct 2011 08:59:38 -0700 From: Jeremy Chadwick To: Larry Rosenman Message-ID: <20111012155938.GA24649@icarus.home.lan> References: <4E95AE08.7030105@lerctr.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4E95AE08.7030105@lerctr.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: AF (4096 byte 
sector) drives: Can you mix/match in a ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 15:59:41 -0000 On Wed, Oct 12, 2011 at 10:11:04AM -0500, Larry Rosenman wrote: > I have a root on ZFS box with 6 drives, all 400G (except one 500G) > in a pool. > > I want to upgrade to 2T or 3T drives, but was wondering if you can > mix/match while doing the drive by drive > replacement. > > This is on 9.0-BETA3 if that matters. This is a very good question, and opens a large can of worms. My gut feeling tells me this discussion is going to be very long. I'm going to say that no, mixing 512-byte and 4096-byte sector drives in a single vdev is a bad idea. Here's why: The procedure I've read for doing this is as follows:

ada0 = 512-byte sector disk
ada1 = 4096-byte sector disk
ada2 = 512-byte sector disk

gnop create -S 4096 ada1
zpool create mypool raidz ada0 ada1.nop ada2
zdb | grep ashift
<verify ashift is 12, not 9, which indicates 512-byte alignment>
zpool export mypool
gnop destroy ada1.nop
zpool import mypool

There's an example of this procedure here, but the author does not disclose if he's using three (3) WD20EARS drives, or if he's only using one (1) WD20EARS drive (shown as ada0 in his list). I have a feeling he's using mixed-sector-size drives, which means his performance probably sucks: http://blog.monsted.dk/?q=node/1 Here's the kicker: the "ashift" parameter -- which is what in ZFS land helps with the alignment issue -- is defined on a per-vdev basis. It's hard to explain. Look at the below zdb output for a 2-disk mirror that consists of a single vdev:

mypool:
    name: 'mypool'
    ...
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        ...
        children[0]:
            type: 'mirror'
            id: 0
            ...
            ashift: 9
            ...
            children[0]:
                type: 'disk'
                id: 0
                ...
                path: '/dev/ada1'
                phys_path: '/dev/ada1'
                ...
            children[1]:
                type: 'disk'
                id: 1
                ...
                path: '/dev/ada3'
                phys_path: '/dev/ada3'
                ...
Note where the "ashift" parameter is located in the above tree. (I imagine a pool with multiple vdevs would therefore have one ashift parameter per vdev set). Circling back to the procedure I stated above: this would result in an ashift=12 alignment for all I/O to all underlying disks. How do you think your 512-byte sector drives are going to perform when doing reads and writes? (Answer: badly) Likewise, what if you just screw the whole gnop thing and stick the drive in and treat it without alignment (e.g. ashift=9, which is the default I believe)? You'll suffer from bad write performance (up to ~30%) due to lack of proper alignment on that one drive. Meaning, that drive will effectively become a delay bottleneck for your writes to the pool. So my advice is do not mix-match 512-byte and 4096-byte sector disks in a vdev that consists of multiple disks. If your next question is "what if I just make the 4096-byte sector disk its own vdev and the 512-byte ones their own vdev?" then the answer is: don't do this if you care about your pool. E.g. a raidz1 pool with 3 512-byte sector disks in a vdev + one 4096-byte sector disk in a vdev means that if the 4096-byte sector disk dies your pool is screwed. If your next question is "what about if I had a mirror that consisted of two vdevs (ada0 + ada1, then another as ada2 + ada3), and say disk ada2 is a 4096-byte sector drive, will that hurt the entire pool or just the vdev?", I do not have an answer. If you use ZFS with a single-disk pool (e.g. zpool create blah ada1), then you should absolutely be able to use the above procedure and not run into any issues. As I finish this Email I'm certain folks will come along and tell me I'm wrong, but given the above data I don't see how that'd be the case. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 16:37:12 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 115A9106566B for ; Wed, 12 Oct 2011 16:37:12 +0000 (UTC) (envelope-from brodbd@uw.edu) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id 9E3A38FC08 for ; Wed, 12 Oct 2011 16:37:11 +0000 (UTC) Received: by eyd10 with SMTP id 10so1209161eyd.13 for ; Wed, 12 Oct 2011 09:37:10 -0700 (PDT) MIME-Version: 1.0 Received: by 10.213.108.84 with SMTP id e20mr943637ebp.82.1318437430550; Wed, 12 Oct 2011 09:37:10 -0700 (PDT) Received: by 10.213.112.130 with HTTP; Wed, 12 Oct 2011 09:37:10 -0700 (PDT) In-Reply-To: References: Date: Wed, 12 Oct 2011 09:37:10 -0700 Message-ID: From: David Brodbeck To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 16:37:12 -0000 On Wed, Oct 12, 2011 at 5:02 AM, Johannes Totz wrote: > I just did a simple write test yesterday: > > 1) 6 MB/sec for gzip, 1.36x ratio > 2) 34 MB/sec for lzjb, 1.23x ratio > > I'll stick with lzjb. It's good enough to get rid of most of the > redundancy and speed is acceptable. > That's what we use on our text-heavy filesystems on our OpenSolaris server. (We work with large text corpora, so we have hundreds of gigabytes of pure text.) My benchmarks showed the performance hit for reads is nonexistent when viewed over NFS, and the performance hit for writes is relatively small...plus we don't write to that filesystem much. 
We see about 1.5x compression overall, with a little over 2x on some datasets that are particularly compressible. -- David Brodbeck System Administrator, Linguistics University of Washington From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 16:50:26 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 23931106566B for ; Wed, 12 Oct 2011 16:50:26 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from smtp-sofia.digsys.bg (smtp-sofia.digsys.bg [193.68.3.230]) by mx1.freebsd.org (Postfix) with ESMTP id B496F8FC0C for ; Wed, 12 Oct 2011 16:50:25 +0000 (UTC) Received: from dcave.digsys.bg (dcave.digsys.bg [192.92.129.5]) (authenticated bits=0) by smtp-sofia.digsys.bg (8.14.4/8.14.4) with ESMTP id p9CGoFY4005781 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Wed, 12 Oct 2011 19:50:20 +0300 (EEST) (envelope-from daniel@digsys.bg) Message-ID: <4E95C546.70904@digsys.bg> Date: Wed, 12 Oct 2011 19:50:14 +0300 From: Daniel Kalchev User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:7.0.1) Gecko/20111007 Thunderbird/7.0.1 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <4E95AE08.7030105@lerctr.org> <20111012155938.GA24649@icarus.home.lan> In-Reply-To: <20111012155938.GA24649@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 16:50:26 -0000 On 12.10.11 18:59, Jeremy Chadwick wrote: > On Wed, Oct 12, 2011 at 10:11:04AM -0500, Larry Rosenman wrote: >> I have a root on ZFS box with 6 drives, all 400G (except one 500G) >> in a pool. 
>> >> I want to upgrade to 2T or 3T drives, but was wondering if you can >> mix/match while doing the drive by drive >> replacement. >> >> This is on 9.0-BETA3 if that matters. > This is a very good question, and opens a large can of worms. My gut > feeling tells me this discussion is going to be very long. > > I'm going to say that no, mixing 512-byte and 4096-byte sector drives in > a single vdev is a bad idea. Here's why: This was not the original question. The original question is whether replacing 512-byte sector drives in a 512-byte sector aligned zpool with 4096-byte sector drives is possible. It is possible, of course, as most 4096-byte drives today emulate 512-byte sectors and some even pretend to be 512-byte sector drives. Performance might degrade; this depends on the workload. In some cases the performance might be very bad. > > The procedure I've read for doing this is as follows: > > ada0 = 512-byte sector disk > ada1 = 4096-byte sector disk > ada2 = 512-byte sector disk > > gnop create -S 4096 ada1 > zpool create mypool raidz ada0 ada1.nop ada2 > zdb | grep ashift > <verify ashift is 12, not 9, which indicates 512-byte alignment> > zpool export mypool > gnop destroy ada1.nop > zpool import mypool It is not important which of the underlying drives will be gnop-ed. You may well gnop all of these. The point is that ZFS uses the largest sector size of any of the underlying devices to determine the ashift value. That is the "minimum write" value, or the smallest unit of data ZFS will write in an I/O. > Circling back to the procedure I stated above: this would result in an > ashift=12 alignment for all I/O to all underlying disks. How do you > think your 512-byte sector drives are going to perform when doing reads > and writes? 
(Answer: badly) The gnop trick is used not because you will ask a 512-byte sector drive to write 8 sectors with one I/O, but because you may ask a 4096-byte sector drive to write only 512 bytes -- which for the drive means it has to read 4096 bytes, modify 512 of these bytes and write back 4096 bytes. > So my advice is do not mix-match 512-byte and 4096-byte sector disks in a > vdev that consists of multiple disks. > The proper way to handle this is to create your zpool with 4096-byte alignment, that is, for the time being, by using the above gnop 'hack'. This way, you are sure to not have performance implications no matter what (512 or 4096 byte) drives you use in the vdev. There should be no implications to having one vdev with 512 byte alignment and another with 4096 byte alignment. ZFS is smart enough to issue writes as small as 512 bytes to the former and 4096 bytes to the latter, thus not creating any bottleneck. Daniel PS: I didn't say you are wrong. ;) From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 16:51:29 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A04AC1065670 for ; Wed, 12 Oct 2011 16:51:29 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta03.westchester.pa.mail.comcast.net (qmta03.westchester.pa.mail.comcast.net [76.96.62.32]) by mx1.freebsd.org (Postfix) with ESMTP id 4D3BB8FC08 for ; Wed, 12 Oct 2011 16:51:28 +0000 (UTC) Received: from omta20.westchester.pa.mail.comcast.net ([76.96.62.71]) by qmta03.westchester.pa.mail.comcast.net with comcast id jqFl1h0061YDfWL53srVcg; Wed, 12 Oct 2011 16:51:29 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta20.westchester.pa.mail.comcast.net with comcast id jsrT1h00q1t3BNj3gsrU3c; Wed, 12 Oct 2011 16:51:29 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id AA4B8102C1D; Wed, 12 Oct 2011 09:51:26 -0700 (PDT) Date: Wed, 12 Oct 2011 09:51:26 -0700 
From: Jeremy Chadwick To: freebsd-fs@freebsd.org Message-ID: <20111012165126.GA26562@icarus.home.lan> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 16:51:29 -0000 On Wed, Oct 12, 2011 at 09:37:10AM -0700, David Brodbeck wrote: > On Wed, Oct 12, 2011 at 5:02 AM, Johannes Totz wrote: > > > I just did a simple write test yesterday: > > > > 1) 6 MB/sec for gzip, 1.36x ratio > > 2) 34 MB/sec for lzjb, 1.23x ratio > > > > I'll stick with lzjb. It's good enough to get rid of most of the > > redundancy and speed is acceptable. > > > > That's what we use on our text-heavy filesystems on our OpenSolaris server. > (We work with large text corpora, so we have hundreds of gigabytes of pure > text.) My benchmarks showed the performance hit for reads is nonexistent > when viewed over NFS, and the performance hit for writes is relatively > small...plus we don't write to that filesystem much. We see about 1.5x > compression overall, with a little over 2x on some datasets that are > particularly compressible. That might be the case on OpenSolaris but the performance hit on FreeBSD RELENG_8 is very high -- enough that enabling compression (using the defaults) causes stalls when I/O occurs (easily noticeable across SSH; characters are delayed/stalled (not buffered)), etc.. The last time I tried it on RELENG_8 was right after ZFSv28 was MFC'd. If things have improved I can try again (I don't remember seeing any commits that could affect this), or if people really think changing the compression model to lzjb will help. 
Another point: I haven't tinkered with compression on our Solaris 10 machines at work so I don't know if it performs better, equal, or worse than FreeBSD or OpenSolaris. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 16:53:57 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BCE6110656D6 for ; Wed, 12 Oct 2011 16:53:57 +0000 (UTC) (envelope-from tevans.uk@googlemail.com) Received: from mail-vx0-f182.google.com (mail-vx0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id 798638FC13 for ; Wed, 12 Oct 2011 16:53:57 +0000 (UTC) Received: by vcbf13 with SMTP id f13so1077308vcb.13 for ; Wed, 12 Oct 2011 09:53:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=oEYQOKrshODihZ8d4dpyY9kXvDLYiv+iufG9qP5rgbo=; b=Zo2UQzsiWNgKhn4HRFvGCL3Rxeh8lW4JEVcKcHhCimeS8M/YMC6r8YZ5brKuU7fdlO G6zYMIzfB8cKIQSSgSrI2bhfBYUa/qgmZGwB0l4UINja0RaLSBxRnfWdseMpHAX3ArMd j6z4FE9oDXW2oLDIbF+Jc5L/ZPXln7eCyOQoU= MIME-Version: 1.0 Received: by 10.52.37.44 with SMTP id v12mr24482786vdj.53.1318436913453; Wed, 12 Oct 2011 09:28:33 -0700 (PDT) Received: by 10.52.111.201 with HTTP; Wed, 12 Oct 2011 09:28:33 -0700 (PDT) In-Reply-To: <4E95AE08.7030105@lerctr.org> References: <4E95AE08.7030105@lerctr.org> Date: Wed, 12 Oct 2011 17:28:33 +0100 Message-ID: From: Tom Evans To: Larry Rosenman Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org Subject: Re: AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 16:53:57 -0000 On Wed, Oct 12, 2011 at 4:11 PM, Larry Rosenman wrote: > I have a root on ZFS box with 6 drives, all 400G (except one 500G) in a > pool. > > I want to upgrade to 2T or 3T drives, but was wondering if you can mix/match > while doing the drive by drive > replacement. > > This is on 9.0-BETA3 if that matters. > > Thanks! > Hi Larry I'm in a similar position. I have a 2 x 6 x 1.5TB raidz system, configured a while ago when I wasn't aware enough of 4k sector drives, and so ZFS is configured to use 512 byte sectors (ashift=9). All of the drives in it were 512 byte sector drives, until one of them failed. At that point, I couldn't lay my hands on a large capacity drive that still used 512 byte sectors, so I replaced it with a 4k sector drive, made sure it was aligned correctly, and hoped for the best. The performance sucks (500MB/s reads -> 150MB/s reads!), but it 'works', all my data is safe. The solution is to make sure that all your vdevs, whether they are backed by disks that have 512 byte or 4k sectors, are created with 4k sectors (ashift=12). It won't negatively affect your older disks, and you won't end up in the position I am in, where I need to recreate the pool to fix the issue (and have 12TB of data with nowhere to put it!) 
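The ashift=9 vs ashift=12 distinction above can be made concrete with a little arithmetic. A Python sketch (the helper names are invented for illustration, not ZFS code) of what ashift encodes and why a 512-byte-aligned pool hurts on a 4k-sector drive:

```python
# What ashift encodes: the vdev's minimum block size is 1 << ashift.
def min_block(ashift):
    return 1 << ashift

def physical_sectors_rewritten(write_bytes, physical_sector):
    # how many physical sectors a drive must read-modify-write
    # to service a write of this size (ceiling division)
    return -(-write_bytes // physical_sector)

assert min_block(9) == 512      # ashift=9: 512-byte alignment
assert min_block(12) == 4096    # ashift=12: 4096-byte alignment

# An ashift=9 pool may issue 512-byte writes; on a drive with 4096-byte
# physical sectors, each such write forces a full-sector read-modify-write
# instead of one aligned 4096-byte write.
assert physical_sectors_rewritten(512, 4096) == 1
```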
Cheers Tom From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 17:29:15 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5808B106564A for ; Wed, 12 Oct 2011 17:29:15 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta15.westchester.pa.mail.comcast.net (qmta15.westchester.pa.mail.comcast.net [76.96.59.228]) by mx1.freebsd.org (Postfix) with ESMTP id 052A78FC0A for ; Wed, 12 Oct 2011 17:29:14 +0000 (UTC) Received: from omta23.westchester.pa.mail.comcast.net ([76.96.62.74]) by qmta15.westchester.pa.mail.comcast.net with comcast id jnA21h0061c6gX85FtVFTe; Wed, 12 Oct 2011 17:29:15 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta23.westchester.pa.mail.comcast.net with comcast id jtVD1h00S1t3BNj3jtVEE0; Wed, 12 Oct 2011 17:29:14 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 21C5E102C1C; Wed, 12 Oct 2011 10:29:12 -0700 (PDT) Date: Wed, 12 Oct 2011 10:29:12 -0700 From: Jeremy Chadwick To: Daniel Kalchev Message-ID: <20111012172912.GA27013@icarus.home.lan> References: <4E95AE08.7030105@lerctr.org> <20111012155938.GA24649@icarus.home.lan> <4E95C546.70904@digsys.bg> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4E95C546.70904@digsys.bg> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 17:29:15 -0000 On Wed, Oct 12, 2011 at 07:50:14PM +0300, Daniel Kalchev wrote: > > > On 12.10.11 18:59, Jeremy Chadwick wrote: > >On Wed, Oct 12, 2011 at 10:11:04AM -0500, Larry Rosenman wrote: > >>I have a root on ZFS box with 6 drives, all 400G (except one 500G) > >>in a pool. > >> > >>I want to upgrade to 2T or 3T drives, but was wondering if you can > >>mix/match while doing the drive by drive > >>replacement. > >> > >>This is on 9.0-BETA3 if that matters. > >This is a very good question, and opens a large can of worms. My gut > >feeling tells me this discussion is going to be very long. > > > >I'm going to say that no, mixing 512-byte and 4096-byte sector drives in > >a single vdev is a bad idea. Here's why: > > This was not the original question. The original question is whether > replacing 512-byte sector drives in a 512-byte sector aligned zpool > with 4096-byte sector drives is possible. > > It is possible, of course, as most 4096-byte drives today emulate > 512-byte drives and some even pretend to be 512-byte sector drives. > > Performance might degrade, this depends on the workload. In some > cases the performance might be way bad. > > > > >The procedure I've read for doing this is as follows: > > > >ada0 = 512-byte sector disk > >ada1 = 4096-byte sector disk > >ada2 = 512-byte sector disk > > > >gnop create -S 4096 ada1 > >zpool create mypool raidz ada0 ada1.nop ada2 > >zdb | grep ashift > > > 512-byte alignment> > >zpool export mypool > >gnop destroy ada1.nop > >zpool import mypool > > It is not important which of the underlying drives will be gnop-ed. > You may well gnop all of these. The point is, that ZFS uses the > largest sector size of any of the underlying devices to determine > the ashift value. 
That is the "minimum write" value, or the smallest > unit of data ZFS will write in an I/O. > > >Circling back to the procedure I stated above: this would result in an > >ashift=12 alignment for all I/O to all underlying disks. How do you > >think your 512-byte sector drives are going to perform when doing reads > >and writes? (Answer: badly) > > The gnop trick is used not because you will ask a 512-byte sector > drive to write 8 sectors with one I/O, but because you may ask an > 4096-byte sector drive to write only 512 bytes -- which for the > drive means it has to read 4096 bytes, modify 512 of these bytes and > write back 4096 bytes. If I'm reading this correctly, you're effectively stating ashift actually just defines (or helps in calculating) an LBA offset for the start of the pool-related data on that device? "ashift" seems like a badly-named term/variable for what this does, but oh well. I was always under the impression the term "ashift" stood for "align shift" and was applied to the block size of data read from a disk in a single request -- and keep reading (specifically last part of my mail). > >So my advice is do not mix-match 512-byte and 4096-byte sector disks in a > >vdev that consists of multiple disks. > > The proper way to handle this is to create your zpool with 4096-byte > alignment, that is, for the time being by using the above gnop > 'hack'. ...which brings into question why this is needed at all, meaning, why the ZFS code cannot be changed to default to an ashift value that's calculated as 12 (or equivalent) regardless of 512-byte or 4096-byte sector drives. I guess changing this would get into a discussion about whether or not it could (not would) badly impact other forms of media (CF drives, etc.), but if it's literally just a starting LBA offset adjustment value then it shouldn't matter. How was this addressed on Solaris/OpenSolaris? 
I really need to know this, mainly because we use both SSDs on Solaris 10 at my workplace, in addition to the fact that our Solaris 10 boxes are using 1TB disks and will soon (in many months to come) be upgraded to 2TBs, which almost certainly means we'll end up with 4096-byte sector drives. The last thing I need to deal with is our entire division talking about crummy I/O throughput due to our disk imaging process not forcing ashift to be 12. If I have to deal with Oracle then so be it, but I imagine someone lingering knows... :-) > This way, you are sure to not have performance implications no > matter what (512 or 4096 byte) drives you use in the vdev. > > There should be no implications to having one vdev with 512 byte > alignment and another with 4096 byte alignment. ZFS is smart enough > to issue minimum of 512 byte writes to the former and 4096 bytes to > the latter thus not creating any bottleneck. How does ZFS determine this? I was under the impression that this behaviour was determined by (or "assisted by") ashift. Surely ZFS cannot ask the underlying storage provider (e.g. GEOM on FreeBSD) what logical vs. physical sector size to use (e.g. for SATA what's returned in the ATA IDENTIFY payload), because on SSDs such as Intel SSDs *both* of those sizes are reported as 512 bytes (camcontrol identify confirms). -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 17:31:22 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 60818106566B for ; Wed, 12 Oct 2011 17:31:22 +0000 (UTC) (envelope-from ler@lerctr.org) Received: from thebighonker.lerctr.org (lrosenman-1-pt.tunnel.tserv8.dal1.ipv6.he.net [IPv6:2001:470:1f0e:3ad::2]) by mx1.freebsd.org (Postfix) with ESMTP id 0A7BE8FC12 for ; Wed, 12 Oct 2011 17:31:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lerctr.org; s=lerami; h=Content-Type:MIME-Version:References:Message-ID:In-Reply-To:Subject:cc:To:Sender:From:Date; bh=VYOLoVo3zrrRG68mw3BgtRvqhtQ2DcD/KoTzkdomfAo=; b=QhhPhRF6e/KQf1Uf6P9MXXXbMI3kzuUuiYx9M6St71Va1JgZC2Aj/9nwedpR2HXvGhaovL8e35LwbaIlOJeUIATeQ0BNgYvkh9ikfTtrUca5pvXuyVmpqk0kSbcNIdNIbpJwHw/8Do/20DVjbaZrUuyou3kITun3lV/OfaeuusY=; Received: from cpe-72-182-3-73.austin.res.rr.com ([72.182.3.73]:65231 helo=[192.168.200.4]) by thebighonker.lerctr.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1RE2eF-0005El-Ep; Wed, 12 Oct 2011 12:31:21 -0500 Date: Wed, 12 Oct 2011 12:31:16 -0500 (CDT) From: Larry Rosenman Sender: ler@lrosenman.dyndns.org To: Tom Evans In-Reply-To: Message-ID: References: <4E95AE08.7030105@lerctr.org> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Spam-Score: -2.8 (--) X-LERCTR-Spam-Score: -2.8 (--) X-Spam-Report: SpamScore (-2.8/5.0) ALL_TRUSTED=-1, BAYES_00=-1.9, SARE_SUB_OBFU_OTHER=0.135 X-LERCTR-Spam-Report: SpamScore (-2.8/5.0) ALL_TRUSTED=-1, BAYES_00=-1.9, SARE_SUB_OBFU_OTHER=0.135 Cc: freebsd-fs@freebsd.org Subject: Re: AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 17:31:22 -0000 On Wed, 12 Oct 2011, Tom Evans wrote: > On Wed, Oct 12, 2011 at 4:11 PM, Larry Rosenman wrote: >> I have a root on ZFS box with 6 drives, all 400G (except one 500G) in a >> pool. >> >> I want to upgrade to 2T or 3T drives, but was wondering if you can mix/match >> while doing the drive by drive >> replacement. >> >> This is on 9.0-BETA3 if that matters. >> >> Thanks! >> > > Hi Larry > > I'm in a similar position. I have a 2 x 6 x 1.5TB raidz system, > configured a while ago when I wasn't aware enough of 4k sector drives, > and so ZFS is configured to use 512 byte sectors (ashift=9). All of > the drives in it were 512 byte sector drives, until one of them > failed. > > At that point, I couldn't lay my hands on a large capacity drive that > still used 512 byte sectors, so I replaced it with a 4k sector drive, > made sure it was aligned correctly, and hoped for the best. The > performance sucks (500MB/s reads -> 150MB/s reads!), but it 'works', > all my data is safe. > > The solution is to make sure that all your vdevs, whether they are > backed by disks that have 512 byte or 4k sectors, are created with 4k > sectors (ashift=12). It won't negatively affect your older disks, and > you won't end up in the position I am in, where I need to recreate the > pool to fix the issue (and have 12TB of data with nowhere to put it!) > I wish I had asked this question BEFORE I made the box Root on ZFS on Saturday. 
Here's what I have:

  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 0h20m with 0 errors on Sat Oct  8 22:21:50 2011
config:

        NAME           STATE     READ WRITE CKSUM
        zroot          ONLINE       0     0     0
          raidz1-0     ONLINE       0     0     0
            gpt/disk0  ONLINE       0     0     0
            gpt/disk1  ONLINE       0     0     0
            gpt/disk2  ONLINE       0     0     0
            gpt/disk3  ONLINE       0     0     0
            gpt/disk4  ONLINE       0     0     0
            gpt/disk5  ONLINE       0     0     0

errors: No known data errors

zroot:
    version: 28
    name: 'zroot'
    state: 0
    txg: 185
    pool_guid: 6776217281607456243
    hostname: ''
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 6776217281607456243
        children[0]:
            type: 'raidz'
            id: 0
            guid: 1402298321185619698
            nparity: 1
            metaslab_array: 30
            metaslab_shift: 34
            ashift: 9
            asize: 2374730514432
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 9076139076816521807
                path: '/dev/gpt/disk0'
                phys_path: '/dev/gpt/disk0'
                whole_disk: 1
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 1302481463702775221
                path: '/dev/gpt/disk1'
                phys_path: '/dev/gpt/disk1'
                whole_disk: 1
                create_txg: 4
            children[2]:
                type: 'disk'
                id: 2
                guid: 15500000621616879018
                path: '/dev/gpt/disk2'
                phys_path: '/dev/gpt/disk2'
                whole_disk: 1
                create_txg: 4
            children[3]:
                type: 'disk'
                id: 3
                guid: 11011035160331724516
                path: '/dev/gpt/disk3'
                phys_path: '/dev/gpt/disk3'
                whole_disk: 1
                create_txg: 4
            children[4]:
                type: 'disk'
                id: 4
                guid: 17522530679015716424
                path: '/dev/gpt/disk4'
                phys_path: '/dev/gpt/disk4'
                whole_disk: 1
                create_txg: 4
            children[5]:
                type: 'disk'
                id: 5
                guid: 16647118440423800168
                path: '/dev/gpt/disk5'
                phys_path: '/dev/gpt/disk5'
                whole_disk: 1
                create_txg: 4

So, is there a way to change/fix/whatever this setup and not have to copy 40+G of data?

Thanks for the reply!
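One way to pull the ashift value out of a zdb dump like the one above is a small Python helper (the function name is made up and the excerpt is inlined for illustration; on a live system you would feed it real `zdb` output). Note that ashift is fixed per vdev at creation time, so -- as Tom described -- changing it means recreating the pool and copying the data back:

```python
import re

# Inline excerpt of zdb-style output; illustrative only.
zdb_excerpt = """
        vdev_tree:
            type: 'raidz'
            metaslab_shift: 34
            ashift: 9
            asize: 2374730514432
"""

def pool_ashift(text):
    # first "ashift: N" line in the dump, or None if absent
    m = re.search(r"^\s*ashift:\s*(\d+)", text, re.MULTILINE)
    return int(m.group(1)) if m else None

a = pool_ashift(zdb_excerpt)
print("ashift=%d -> %d-byte minimum block" % (a, 1 << a))
```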
-- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 512-248-2683 E-Mail: ler@lerctr.org US Mail: 430 Valona Loop, Round Rock, TX 78681-3893 From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 17:43:24 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A90EE106566B for ; Wed, 12 Oct 2011 17:43:24 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-gx0-f182.google.com (mail-gx0-f182.google.com [209.85.161.182]) by mx1.freebsd.org (Postfix) with ESMTP id 685C78FC18 for ; Wed, 12 Oct 2011 17:43:24 +0000 (UTC) Received: by ggeq3 with SMTP id q3so1204701gge.13 for ; Wed, 12 Oct 2011 10:43:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=ekBh3eHT9GSxnZpnyfY89maEJwJJtKRaKEMJYsbQq7I=; b=krLto2tensAdE2lf6GOJP1CpTvYR6UE5Z9WTBS0jVUwhgBl1IZM+prT/iLwNYTkPHK Ib9YdGjGJgvWklGToV3B5vKrJPAjBsqSdt5rVdM35vBnLD1G1ymTCoV5VtnSb7j9Yozx QKThCn/BgmnLqWgRgSFSXpRR3Cbuj4xyyiddA= MIME-Version: 1.0 Received: by 10.236.153.200 with SMTP id f48mr5018203yhk.114.1318441403812; Wed, 12 Oct 2011 10:43:23 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.236.103.33 with HTTP; Wed, 12 Oct 2011 10:43:23 -0700 (PDT) In-Reply-To: <20111012172912.GA27013@icarus.home.lan> References: <4E95AE08.7030105@lerctr.org> <20111012155938.GA24649@icarus.home.lan> <4E95C546.70904@digsys.bg> <20111012172912.GA27013@icarus.home.lan> Date: Wed, 12 Oct 2011 10:43:23 -0700 X-Google-Sender-Auth: CbaEozusSzwM2zM9bt0bl8nTrsQ Message-ID: From: Artem Belevich To: Jeremy Chadwick Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 17:43:24 -0000 On Wed, Oct 12, 2011 at 10:29 AM, Jeremy Chadwick wrote:
> How does ZFS determine this? I was under the impression that this
> behaviour was determined by (or "assisted by") ashift.
>
> Surely ZFS cannot ask the underlying storage provider (e.g. GEOM on
> FreeBSD) what logical vs. physical sector size to use (e.g. for SATA
> what's returned in the ATA IDENTIFY payload), because on SSDs such as
> Intel SSDs *both* of those sizes are reported as 512 bytes (camcontrol
> identify confirms).

In r222520 mav@ added ADA_Q_4K quirks for a bunch of Hitachi, Seagate and WD drives so that geom_disk will be aware of 4K sectors at least for some disks. I'm not sure whether ZFS picks it up, though.

--Artem

From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 18:15:13 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0F9361065677 for ; Wed, 12 Oct 2011 18:15:13 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta08.westchester.pa.mail.comcast.net (qmta08.westchester.pa.mail.comcast.net [76.96.62.80]) by mx1.freebsd.org (Postfix) with ESMTP id AEDAA8FC12 for ; Wed, 12 Oct 2011 18:15:12 +0000 (UTC) Received: from omta17.westchester.pa.mail.comcast.net ([76.96.62.89]) by qmta08.westchester.pa.mail.comcast.net with comcast id jsQz1h0061vXlb858uFDmy; Wed, 12 Oct 2011 18:15:13 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta17.westchester.pa.mail.comcast.net with comcast id juEA1h00U1t3BNj3duEAcw; Wed, 12 Oct 2011 18:14:11 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id D30F2102C1C; Wed, 12 Oct 2011 11:14:08 -0700 (PDT) Date: Wed, 12 Oct 2011 11:14:08 -0700 From: Jeremy Chadwick To: mav@freebsd.org
Message-ID: <20111012181408.GA27604@icarus.home.lan> References: <4E95AE08.7030105@lerctr.org> <20111012155938.GA24649@icarus.home.lan> <4E95C546.70904@digsys.bg> <20111012172912.GA27013@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 18:15:13 -0000 On Wed, Oct 12, 2011 at 10:43:23AM -0700, Artem Belevich wrote:
> On Wed, Oct 12, 2011 at 10:29 AM, Jeremy Chadwick
> wrote:
> > How does ZFS determine this? I was under the impression that this
> > behaviour was determined by (or "assisted by") ashift.
> >
> > Surely ZFS cannot ask the underlying storage provider (e.g. GEOM on
> > FreeBSD) what logical vs. physical sector size to use (e.g. for SATA
> > what's returned in the ATA IDENTIFY payload), because on SSDs such as
> > Intel SSDs *both* of those sizes are reported as 512 bytes (camcontrol
> > identify confirms).
>
> In r222520 mav@ added ADA_Q_4K quirks for a bunch of Hitachi, Seagate
> and WD drives so that geom_disk will be aware of 4K sectors at least
> for some disks.
> I'm not sure whether ZFS picks it up, though.

http://svnweb.freebsd.org/base?view=revision&revision=222520
http://svnweb.freebsd.org/base/head/sys/cam/ata/ata_da.c?r1=222520&r2=222519&pathrev=222520

So it looks like this commit is intended to work around drives which report an incorrect **physical** sector size of 512 bytes. Per disk(9), the stripesize and stripeoffset variables would be used to ensure that partitions/slices/etc. are all aligned to said value. I imagine this would predominantly play a role during partition/slice creation.
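The stripesize/stripeoffset hint matters because partition alignment decides whether logical blocks straddle physical 4k sectors. A minimal sketch of the check (helper name and values are illustrative, not read from a real device):

```python
# A partition starting at a byte offset that is not a multiple of the
# physical stripe (e.g. 4096) makes every nominally aligned 4k block
# straddle two physical sectors, doubling the I/O cost.
def is_aligned(partition_offset_bytes, stripesize):
    return partition_offset_bytes % stripesize == 0

# classic 63-sector MBR-era start vs. a 1 MiB-aligned start
assert not is_aligned(63 * 512, 4096)    # 32256 bytes: misaligned
assert is_aligned(2048 * 512, 4096)      # 1048576 bytes: aligned
```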
(This might explain some of the performance weirdness I've seen on some of our RELENG_8 boxes which were built/installed with SSDs off code much older than 4 months ago. Might need to get a newer snapshot and reinstall those...) As such, I believe this quirk list needs SSDs added to it, particularly Intel SSDs, which report 512 physical. Alexander, I can provide you either a patch/diff or a list of Intel SSD model numbers (I have X25-M, 320-series, and 510-series drives available to me easily, and maybe X25-V around here somewhere). All of these drives report 512 physical. I'm familiar with Intel's serial naming scheme (it's documented in some of their PDFs) as well as "oddities" with it (such as when they append "HP" to the string for OEM drives for HP, etc.). Let me know if you could, on-list or off-list. Thanks. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 18:38:14 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4FD49106564A for ; Wed, 12 Oct 2011 18:38:13 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from smtp-sofia.digsys.bg (smtp-sofia.digsys.bg [193.68.3.230]) by mx1.freebsd.org (Postfix) with ESMTP id 42DE08FC13 for ; Wed, 12 Oct 2011 18:38:12 +0000 (UTC) Received: from [192.92.129.186] ([192.92.129.186]) (authenticated bits=0) by smtp-sofia.digsys.bg (8.14.4/8.14.4) with ESMTP id p9CIc2ju006111 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Wed, 12 Oct 2011 21:38:08 +0300 (EEST) (envelope-from daniel@digsys.bg) Mime-Version: 1.0 (Apple Message framework v1244.3) Content-Type: text/plain; charset=us-ascii From: Daniel Kalchev In-Reply-To: <20111012172912.GA27013@icarus.home.lan> Date: Wed, 12 Oct 2011 21:38:02 +0300 Content-Transfer-Encoding: quoted-printable Message-Id: References: <4E95AE08.7030105@lerctr.org> <20111012155938.GA24649@icarus.home.lan> <4E95C546.70904@digsys.bg> <20111012172912.GA27013@icarus.home.lan> To: Jeremy Chadwick X-Mailer: Apple Mail (2.1244.3) Cc: freebsd-fs@freebsd.org Subject: Re: AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 18:38:14 -0000 On Oct 12, 2011, at 20:29 , Jeremy Chadwick wrote: >> The gnop trick is used not because you will ask a 512-byte sector >> drive to write 8 sectors with one I/O, but because you may ask an >> 4096-byte sector drive to write only 512 bytes -- which for the >> drive means it has to read 4096 bytes, modify 512 of these bytes and >> write back 4096 bytes. 
>
> If I'm reading this correctly, you're effectively stating ashift
> actually just defines (or helps in calculating) an LBA offset for the
> start of the pool-related data on that device? "ashift" seems like a
> badly-named term/variable for what this does, but oh well.

ashift defines the minimum block size of the vdev. The choice is fine, I believe, as it describes how one gets a power-of-2 size (by shifting 1 that number of times) :-)

>> The proper way to handle this is to create your zpool with 4096-byte
>> alignment, that is, for the time being by using the above gnop
>> 'hack'.
>
> ...which brings into question why this is needed at all, meaning, why
> the ZFS code cannot be changed to default to an ashift value that's
> calculated as 12 (or equivalent) regardless of 512-byte or 4096-byte
> sector drives.

Currently the ZFS block size is 512 bytes to 128 kilobytes. That is with an ashift of 9. If you have an ashift of 12, that effectively means a minimum block size of 4k and a maximum block size of 128k.

> How was this addressed on Solaris/OpenSolaris?
>

I don't think they have addressed it.

>> There should be no implications to having one vdev with 512 byte
>> alignment and another with 4096 byte alignment. ZFS is smart enough
>> to issue minimum of 512 byte writes to the former and 4096 bytes to
>> the latter thus not creating any bottleneck.
>
> How does ZFS determine this? I was under the impression that this
> behaviour was determined by (or "assisted by") ashift.

ZFS has a piece of data, say a 20-kbyte block, to write. Suppose you have 4 vdevs, two with ashift=9 (512 bytes) and two with ashift=12 (4096 bytes). All other issues ignored (equal-size vdevs, full to the same capacity, etc.),
it has to write a minimum of 9kb (512+512+4096+4096) -- apparently ZFS wants to fill all vdevs equally, so it will likely issue one 4k write to vdev1, one 4k write to vdev2, two 512b writes to vdev3 and two 512b writes to vdev4.

If, for example, it had 16k to write, it would write one 4k I/O to each of the 4k vdevs and 4 x 512b I/Os (or a single write of 4k, depending on the layering abstraction) to the 512b vdevs.

So yes, it is assisted by ashift.

But, for the time being, you need to assist ZFS in creating the vdevs with the proper ashift value. This is because today's 4k drives lie, claiming that their geometry is 512b. As mentioned, there are patches for FreeBSD to 'discover' this behavior. Another approach is via gnop -- only at vdev creation time. I haven't seen anything like this for Solaris.

Daniel

From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 19:31:09 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 90B7D106566B for ; Wed, 12 Oct 2011 19:31:09 +0000 (UTC) (envelope-from bsd@vink.pl) Received: from mail-yw0-f54.google.com (mail-yw0-f54.google.com [209.85.213.54]) by mx1.freebsd.org (Postfix) with ESMTP id 4FCA78FC0A for ; Wed, 12 Oct 2011 19:31:09 +0000 (UTC) Received: by ywp17 with SMTP id 17so452884ywp.13 for ; Wed, 12 Oct 2011 12:31:08 -0700 (PDT) Received: by 10.223.5.201 with SMTP id 9mr95922faw.5.1318446394511; Wed, 12 Oct 2011 12:06:34 -0700 (PDT) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx.google.com with ESMTPS id u6sm5268003fan.17.2011.10.12.12.06.33 (version=SSLv3 cipher=OTHER); Wed, 12 Oct 2011 12:06:34 -0700 (PDT) Received: by bkbzs8 with SMTP id zs8so530279bkb.13 for ; Wed, 12 Oct 2011 12:06:33 -0700 (PDT) MIME-Version: 1.0 Received: by 10.204.145.15 with SMTP id b15mr238821bkv.52.1318446392907; Wed, 12 Oct 2011 12:06:32 -0700 (PDT) Received: by 10.204.52.146 with HTTP; Wed, 12 Oct 2011 12:06:32 -0700 (PDT)
In-Reply-To: <94455706-B90D-4DBD-A7DE-E9A38F118D35@patpro.net> References: <20110915120007.F41FF10656E1@hub.freebsd.org> <4B8C8026-1E12-4C32-88E3-9B34A3E58A91@patpro.net> <86ipnwg1s0.fsf@kopusha.home.net> <94455706-B90D-4DBD-A7DE-E9A38F118D35@patpro.net> Date: Wed, 12 Oct 2011 21:06:32 +0200 Message-ID: From: Wiktor Niesiobedzki To: Patrick Proniewski Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: Mikolaj Golub , freebsd-fs@freebsd.org Subject: Re: measuring IO asynchronously X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 19:31:09 -0000 2011/10/12 Patrick Proniewski
>
> On 10 oct. 2011, at 20:05, Mikolaj Golub wrote:
>
> > On Mon, 10 Oct 2011 16:33:11 +0200 Patrick Proniewski wrote:
> > PP> I would like to monitor the storage on various FreeBSD servers,
> > PP> especially I/O per second. Is there any way to gather statistics
> > PP> about I/O via asynchronous requests, let's say, for example, using a
> > PP> munin plugin? `iostat -w 1` and `zpool iostat tank 1` are
> > PP> interesting, but not usable asynchronously.
> >
> > I use for this a simple program that I wrote some time ago. It uses the devstat(9)
> > kernel interface and outputs counters, like below
> >
> > kopusha:~% devstat ada0
> > ada0:
> >         1339552256 bytes read
> > ../..
> > You can find it in ports (sysutils/devstat).
>
> Thank you Mikolaj, I'm going to give it a try!

What's wrong with iostat?
(and how it is used by the munin plugin)

%iostat -I -x
                         extended device statistics
device         r/i          w/i          kr/i           kw/i  wait  svc_t  %b
ada0    26756321.0   77603114.0   411881657.5   1192108987.0     0    3.2   1
ada1    13483063.0    1131450.0   943635180.5     43584216.0     0    5.7   0
ada2    13831313.0    1131763.0   943961104.5     43587768.0     0    5.5   0

Cheers,
Wiktor Niesiobedzki

From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 19:39:42 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D8324106566B for ; Wed, 12 Oct 2011 19:39:42 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id 6FC848FC12 for ; Wed, 12 Oct 2011 19:39:41 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id p9CJdeKg005710; Wed, 12 Oct 2011 14:39:40 -0500 (CDT) Date: Wed, 12 Oct 2011 14:39:40 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: Jeremy Chadwick In-Reply-To: <20111012165126.GA26562@icarus.home.lan> Message-ID: References: <20111012165126.GA26562@icarus.home.lan> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Wed, 12 Oct 2011 14:39:40 -0500 (CDT) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 19:39:43 -0000 On Wed, 12 Oct 2011, Jeremy Chadwick wrote: > > That might be the case on OpenSolaris but the performance hit on > FreeBSD RELENG_8 is very high -- enough
that enabling compression (using > the defaults) causes stalls when I/O occurs (easily noticeable across > SSH; characters are delayed/stalled (not buffered)), etc.. Solaris solved the problem by putting the zfs writer threads into a special scheduling class so that they are usually lower priority than normal processing. Before this change, a desktop system would become almost unusable (intermittent loss of keyboard/mouse) while writing lots of data with compression enabled. Some NFS servers encountered severe enough issues that NFS clients reported NFS timeouts. > Another point: I haven't tinkered with compression on our Solaris 10 > machines at work so I don't know if it performs better, equal, or worse > than FreeBSD or OpenSolaris. >From what you describe, Solaris must be doing much better in this regard than FreeBSD. Solaris is not necessarily faster but there is now little impact on interactive tasks. Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 19:56:03 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DFD3D1065670 for ; Wed, 12 Oct 2011 19:56:03 +0000 (UTC) (envelope-from thomas.e.zander@googlemail.com) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id 737CA8FC13 for ; Wed, 12 Oct 2011 19:56:03 +0000 (UTC) Received: by eyd10 with SMTP id 10so1458855eyd.13 for ; Wed, 12 Oct 2011 12:56:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=GVau/7W6jAmoe8KkOdqenRl1ki4NeWKysuvnscRhRMs=; b=YN72HGO+P0r4E/jbd1bYPjm87XAjTABLP9DJuPxawQ0G3YPsbbFnqh8XLmePAGwN1Y 
DPMuwICzuCS4rwCX1j8hBakBvbHJ7D7uQOPOsMe/Eu8CMXA4GTjbEBvBKTCfpDMMdwPJ SVcKsYg2sBABGjiK3XStY4Nu3Ktd6w+L2zFas= MIME-Version: 1.0 Received: by 10.14.17.226 with SMTP id j74mr49104eej.69.1318447697309; Wed, 12 Oct 2011 12:28:17 -0700 (PDT) Received: by 10.14.37.69 with HTTP; Wed, 12 Oct 2011 12:28:17 -0700 (PDT) In-Reply-To: <20111012165126.GA26562@icarus.home.lan> References: <20111012165126.GA26562@icarus.home.lan> Date: Wed, 12 Oct 2011 21:28:17 +0200 Message-ID: From: Thomas Zander To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 19:56:04 -0000 On Wed, Oct 12, 2011 at 18:51, Jeremy Chadwick wrote: > That might be the case on OpenSolaris but the performance hit on > FreeBSD RELENG_8 is very high -- enough that enabling compression (using > the defaults) causes stalls when I/O occurs (easily noticeable across > SSH; characters are delayed/stalled (not buffered)), etc.. Has this improved considerably in RELENG_9? I couldn't try 9 yet. 
Riggs From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 20:05:21 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E1A80106564A; Wed, 12 Oct 2011 20:05:21 +0000 (UTC) (envelope-from patpro@patpro.net) Received: from rack.patpro.net (rack.patpro.net [193.30.227.216]) by mx1.freebsd.org (Postfix) with ESMTP id 930568FC12; Wed, 12 Oct 2011 20:05:21 +0000 (UTC) Received: from rack.patpro.net (localhost [127.0.0.1]) by rack.patpro.net (Postfix) with ESMTP id ADF0A1CC038; Wed, 12 Oct 2011 22:05:20 +0200 (CEST) X-Virus-Scanned: amavisd-new at patpro.net Received: from amavis-at-patpro.net ([127.0.0.1]) by rack.patpro.net (rack.patpro.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id tEZpYCA5v6Hx; Wed, 12 Oct 2011 22:05:18 +0200 (CEST) Received: from [127.0.0.1] (localhost [127.0.0.1]) by rack.patpro.net (Postfix) with ESMTP; Wed, 12 Oct 2011 22:05:18 +0200 (CEST) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: multipart/signed; boundary=Apple-Mail-8-840301751; protocol="application/pkcs7-signature"; micalg=sha1 From: Patrick Proniewski In-Reply-To: Date: Wed, 12 Oct 2011 22:05:17 +0200 Message-Id: References: <20110915120007.F41FF10656E1@hub.freebsd.org> <4B8C8026-1E12-4C32-88E3-9B34A3E58A91@patpro.net> <86ipnwg1s0.fsf@kopusha.home.net> <94455706-B90D-4DBD-A7DE-E9A38F118D35@patpro.net> To: Wiktor Niesiobedzki X-Mailer: Apple Mail (2.1084) X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Mikolaj Golub , freebsd-fs@freebsd.org Subject: Re: measuring IO asynchronously X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 20:05:22 -0000 --Apple-Mail-8-840301751 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii On 12 oct. 
2011, at 21:06, Wiktor Niesiobedzki wrote: > What's wrong about iostat? (and how it is used by munin plugin) > %iostat -I -x ho. I'm so ashamed. You know what, I think I need to sleep for a full week :) thanks, Pat --Apple-Mail-8-840301751-- From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 20:28:30 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 157C5106566C for ; Wed, 12 Oct 2011 20:28:30 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-vx0-f182.google.com (mail-vx0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id C51568FC17 for ; Wed, 12 Oct 2011 20:28:29 +0000 (UTC) Received: by vcbf13 with SMTP id f13so1356625vcb.13 for ; Wed, 12 Oct 2011 13:28:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=olWouScW6g10OFxlr70Jaqci+M0c5oUzLbLntUVUn0g=; b=snRtjrA1owB6Ri1vfBsxbdRjM0OrIT44hKY83MlkU8ioIKRJkLSNf9QpU9ANUTcFv4 iNao7xGLJPtk7YqsT3FKYywX6A/G3CR898vx4Cj0QjbKnStess59lsW9JpaKDcfSZqJp qYNDeAu2HvOIgmh3trlh1JCHOxZqeEa8sTUn0= MIME-Version: 1.0 Received: by 10.220.149.19 with SMTP id r19mr52069vcv.80.1318451309000; Wed, 12 Oct 2011 13:28:29 -0700 (PDT) Received: by 10.220.176.200 with HTTP; Wed, 12 Oct 2011 13:28:28 -0700 (PDT) In-Reply-To: <20111012165126.GA26562@icarus.home.lan> References: <20111012165126.GA26562@icarus.home.lan> Date: Wed, 12 Oct 2011 13:28:28 -0700 Message-ID: From: Freddie Cash To: Jeremy Chadwick Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 
20:28:30 -0000 On Wed, Oct 12, 2011 at 9:51 AM, Jeremy Chadwick wrote: > That might be the case on OpenSolaris but the performance hit on > FreeBSD RELENG_8 is very high -- enough that enabling compression (using > the defaults) causes stalls when I/O occurs (easily noticeable across > SSH; characters are delayed/stalled (not buffered)), etc.. > > The last time I tried it on RELENG_8 was right after ZFSv28 was MFC'd. > If things have improved I can try again (I don't remember seeing any > commits that could affect this), or if people really think changing the > compression model to lzjb will help. > I would try it again, with the latest RELENG_8 sources. The compression=lzjb performance hit is negligible on our newest backups server (8-core Opteron, 16 GB RAM, 4x 6-drive raidz2, dedupe enabled). I am able to connect via SSH, navigate around the filesystem, tail -f multiple log files, monitor the network via iftop/trafshow, and watch top, all via a tmux session (multiple tmux "windows"). This is with 5 rsync processes running, transferring data at 200-300 Mbps. The compression=gzip-9 performance hit is noticeable on our older backups servers (4-core Opteron, one with 8 GB RAM and the other with 12 GB RAM, 3x 8-drive raidz2, no dedupe). This pool is 93% full and heavily fragmented, though. SSH connections are slow, typing is choppy, basically everything is choppy, while running 5 rsync processes. This box used to run 12 rsyncs simultaneously, but will now lock up completely if you try to run 7 or more. I've switched to lzjb on these two boxes to see if that helps any. I'm thinking the fullness and fragmentation of the pool is the root cause of the slowness on these boxes now.
-- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 20:48:30 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7E0B8106566C for ; Wed, 12 Oct 2011 20:48:30 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id BAF0B8FC13 for ; Wed, 12 Oct 2011 20:48:29 +0000 (UTC) Received: by eyd10 with SMTP id 10so1520271eyd.13 for ; Wed, 12 Oct 2011 13:48:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=from:to:cc:subject:references:x-comment-to:sender:date:in-reply-to :message-id:user-agent:mime-version:content-type :content-transfer-encoding; bh=nQazewpLk7sLXXeN7gJjtt36bK4Ojj3Tu8KNhVoLzpA=; b=oLJLS5k9mZlNFt9oFI5h91i4/Lz45V6Dg652aboad4opRS8c8/bBWrTOnyv5YTKe1o AbLE412NkKvnf/E5wJuTDOID3WtnTlv5OEcUoMO4HcGbT1q9Ap7q/DLYyQFL0IEIn3UJ RhMIQxDodTcOJ6ZBsnp26cKqwC1+MFFiAW+r0= Received: by 10.223.17.11 with SMTP id q11mr1084336faa.13.1318452508374; Wed, 12 Oct 2011 13:48:28 -0700 (PDT) Received: from localhost ([95.69.173.122]) by mx.google.com with ESMTPS id u6sm5688421faf.3.2011.10.12.13.48.26 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 12 Oct 2011 13:48:27 -0700 (PDT) From: Mikolaj Golub To: Wiktor Niesiobedzki References: <20110915120007.F41FF10656E1@hub.freebsd.org> <4B8C8026-1E12-4C32-88E3-9B34A3E58A91@patpro.net> <86ipnwg1s0.fsf@kopusha.home.net> <94455706-B90D-4DBD-A7DE-E9A38F118D35@patpro.net> X-Comment-To: Wiktor Niesiobedzki Sender: Mikolaj Golub Date: Wed, 12 Oct 2011 23:48:24 +0300 In-Reply-To: (Wiktor Niesiobedzki's message of "Wed, 12 Oct 2011 21:06:32 +0200") Message-ID: <86d3e2lyw7.fsf@kopusha.home.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 8bit Cc: 
freebsd-fs@freebsd.org, Patrick Proniewski Subject: Re: measuring IO asynchronously X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 20:48:30 -0000 On Wed, 12 Oct 2011 21:06:32 +0200 Wiktor Niesiobedzki wrote:

WN> 2011/10/12 Patrick Proniewski
>>
>> On 10 oct. 2011, at 20:05, Mikolaj Golub wrote:
>>
>> > On Mon, 10 Oct 2011 16:33:11 +0200 Patrick Proniewski wrote:
>> > PP> I would like to monitor the storage on various FreeBSD servers,
>> > PP> especially I/O per second. Is there any way to gather statistics
>> > PP> about I/O via asynchronous requests, let's say, for example, using a
>> > PP> munin plugin? `iostat -w 1` and `zpool iostat tank 1` are
>> > PP> interesting, but not usable asynchronously.
>> >
>> > For this I use a simple program that I wrote some time ago. It uses the devstat(9)
>> > kernel interface and outputs counters, like below
>> >
>> > kopusha:~% devstat ada0
>> > ada0:
>> >         1339552256 bytes read
>> > ../..
>> > You can find it in ports (sysutils/devstat).
>>
>> Thank you Mikolaj, I'm going to give it a try!
>>
WN> What's wrong about iostat? (and how it is used by munin plugin)
WN> %iostat -I -x
WN>                         extended device statistics
WN> device       r/i        w/i        kr/i          kw/i  wait  svc_t  %b
WN> ada0   26756321.0 77603114.0 411881657.5  1192108987.0     0    3.2   1
WN> ada1   13483063.0  1131450.0 943635180.5    43584216.0     0    5.7   0
WN> ada2   13831313.0  1131763.0 943961104.5    43587768.0     0    5.5   0

Well, maybe nothing is wrong and I just don't fully understand all its capabilities. But taking your example above: usually I am very interested in busy %.
Running the following:

kopusha:~% iostat -I -x ada0 ; devstat ada0; sleep 60; iostat -I -x ada0 ; devstat ada0
                        extended device statistics
device      r/i       w/i       kr/i       kw/i  qlen  svc_t  %b
ada0   343745.0  105155.0  1321490.0  2021456.5     0    3.5  10
ada0:
        1353218560 bytes read
        2069971456 bytes written
        0 bytes freed
        343746 reads
        105155 writes
        0 frees
        32 other
duration:
        1273 9857428529740234978/2^64 sec reads
        300 1686376179659773404/2^64 sec writes
        0 0/2^64 sec frees
        1125 13002965656490398846/2^64 sec busy time
        5 18135162071037186601/2^64 sec creation time
        512 block size
tags sent:
        448933 simple
        0 ordered
        0 head of queue
supported statistics measurements flags: 0
device type: 32
devstat list insert priority: 272
                        extended device statistics
device      r/i       w/i       kr/i       kw/i  qlen  svc_t  %b
ada0   361949.0  106292.0  1342183.5  2060756.0     1    3.5  10
ada0:
        1374395904 bytes read
        2110214144 bytes written
        0 bytes freed
        361949 reads
        106292 writes
        0 frees
        32 other
duration:
        1313 2389256899477793268/2^64 sec reads
        308 8013777376704320252/2^64 sec writes
        0 0/2^64 sec frees
        1163 13731036726762718640/2^64 sec busy time
        5 18135162071037186601/2^64 sec creation time
        512 block size
tags sent:
        468273 simple
        0 ordered
        0 head of queue
supported statistics measurements flags: 0
device type: 32
devstat list insert priority: 272

The read/write counters are the same, so I could calculate e.g. the average writes per second for that minute. But the %b column from the iostat output is not very useful, while taking the busy time from the devstat output I can calculate that the disk was (1163 - 1125) / 60 * 100 = 63% busy for that minute. Also, there are "sec reads/writes", which give results that look a little confusing to me, but at least they tell whether the disk was mostly busy reading or writing.
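The calculation above generalizes: devstat reports busy time as whole seconds plus a fraction expressed in 1/2^64 units, so the busy percentage over an interval is just the scaled difference of two samples. A minimal sketch (the `busy_percent` helper is illustrative, not part of sysutils/devstat):

```python
# Compute %busy from two devstat "busy time" samples taken interval_s apart,
# mirroring the hand calculation (1163 - 1125) / 60 * 100 above. devstat
# prints busy time as "<seconds> <fraction>/2^64 sec busy time".
def busy_percent(sec0, frac0, sec1, frac1, interval_s):
    """Return the disk-busy percentage over interval_s seconds."""
    t0 = sec0 + frac0 / 2**64
    t1 = sec1 + frac1 / 2**64
    return (t1 - t0) / interval_s * 100.0

# Values from the two devstat dumps above, 60 seconds apart:
pct = busy_percent(1125, 13002965656490398846,
                   1163, 13731036726762718640, 60)
print(round(pct))  # prints 63, matching the hand calculation
```

Including the 1/2^64 fractional parts only refines the whole-second estimate slightly; for a 60-second interval the integer counters alone are usually accurate enough.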
And running this with my gather utility (which can be found in ports; an advertisement again :-), I can look at the disk busy % at any time interesting to me:

kopusha:~% gather show -t 10m grep 'ada0.*busy time' devstat | awk '{if (old) print $0 "\t", 100.0*($4-old)/60; old=$4}'
2011-10-12 23:34: ada0: 1293 14987467510083879798/2^64 sec busy time	 1.66667
2011-10-12 23:35: ada0: 1293 16332874492165216024/2^64 sec busy time	 0
2011-10-12 23:36: ada0: 1294 8579604187790414908/2^64 sec busy time	 1.66667
2011-10-12 23:37: ada0: 1295 3568670710727311468/2^64 sec busy time	 1.66667
2011-10-12 23:38: ada0: 1327 9449174491988114920/2^64 sec busy time	 53.3333
2011-10-12 23:39: ada0: 1328 291265921399568052/2^64 sec busy time	 1.66667
2011-10-12 23:40: ada0: 1328 17998157209099769080/2^64 sec busy time	 0
2011-10-12 23:41: ada0: 1329 8166685340768241914/2^64 sec busy time	 1.66667
2011-10-12 23:42: ada0: 1330 18381582750921335228/2^64 sec busy time	 1.66667
2011-10-12 23:43: ada0: 1331 7748302105755174350/2^64 sec busy time	 1.66667

On the other hand, on some problem production systems I have been permanently running something like the below :-)

iostat $IOSTATOPTIONS $IOSTATINTERVAL $IOSTATCOUNT | perl -MPOSIX -ne 'print strftime("%F %H:%M:%S: ", gmtime), $_;' > "$statdir/$IOSTATOUT"

which outputs

2011-10-12 12:00:01:                         extended device statistics
2011-10-12 12:00:01: device     r/s   w/s    kr/s    kw/s  wait  svc_t  %b
2011-10-12 12:00:01: mfid0      5.0  33.4   435.2  1132.4     0   78.4   7
2011-10-12 12:00:07:                         extended device statistics
2011-10-12 12:00:07: device     r/s   w/s    kr/s    kw/s  wait  svc_t  %b
2011-10-12 12:00:07: mfid0      3.3  19.5    44.5   311.4     0    0.9   2
2011-10-12 12:00:12:                         extended device statistics
2011-10-12 12:00:12: device     r/s   w/s    kr/s    kw/s  wait  svc_t  %b
2011-10-12 12:00:12: mfid0      0.4   0.6     2.7     9.3     0    2.7   0
2011-10-12 12:00:17:                         extended device statistics
2011-10-12 12:00:17: device     r/s   w/s    kr/s    kw/s  wait  svc_t  %b
2011-10-12 12:00:17: mfid0      0.2   1.8     2.8   115.1     0    0.7   0
...
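The perl one-liner simply prefixes each line read from the pipe with a UTC timestamp before it is logged. The same filter can be sketched in Python (a hypothetical `stamp.py`, not an existing tool):

```python
# Python equivalent of the timestamp-prefixing filter above:
#   perl -MPOSIX -ne 'print strftime("%F %H:%M:%S: ", gmtime), $_;'
import time

def stamp(line, now=None):
    """Prefix one line with a 'YYYY-MM-DD HH:MM:SS: ' UTC timestamp."""
    t = time.gmtime(now if now is not None else time.time())
    return time.strftime("%Y-%m-%d %H:%M:%S: ", t) + line

# Used as a filter:  iostat -x 5 100 | python3 stamp.py > iostat.out
# where stamp.py's main loop would be:
#   for line in sys.stdin: sys.stdout.write(stamp(line))
print(stamp("mfid0      5.0  33.4   435.2  1132.4    0  78.4   7\n", 0))
```

The `now` parameter exists only to make the function deterministic for testing; in the filter role it is left unset so each line gets the current time.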
-- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Wed Oct 12 22:41:57 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7A83E1065680 for ; Wed, 12 Oct 2011 22:41:57 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 366B58FC19 for ; Wed, 12 Oct 2011 22:41:56 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1RE7Un-0001yS-S5 for freebsd-fs@freebsd.org; Thu, 13 Oct 2011 00:41:53 +0200 Received: from dyn1247-37.vpn.ic.ac.uk ([129.31.247.37]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 13 Oct 2011 00:41:53 +0200 Received: from jtotz by dyn1247-37.vpn.ic.ac.uk with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 13 Oct 2011 00:41:53 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Johannes Totz Date: Wed, 12 Oct 2011 23:41:41 +0100 Lines: 15 Message-ID: References: <4E95AE08.7030105@lerctr.org> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: dyn1247-37.vpn.ic.ac.uk User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1 In-Reply-To: <4E95AE08.7030105@lerctr.org> Subject: Re: AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2011 22:41:57 -0000 On 12/10/2011 16:11, Larry Rosenman wrote: > I have a root on ZFS box with 6 drives, all 400G (except one 500G) in a > pool. 
> > I want to upgrade to 2T or 3T drives, but was wondering if you can > mix/match while doing the drive by drive > replacement. > > This is on 9.0-BETA3 if that matters. Not sure if this applies to you... I tried to add a 4k-gnop'd drive to an existing 512-byte-sector pool. And zpool tool complained about alignment mismatch. This was on a recent 8-stable. It didn't work. From owner-freebsd-fs@FreeBSD.ORG Thu Oct 13 01:00:28 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E381F1065695 for ; Thu, 13 Oct 2011 01:00:28 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id D26218FC0A for ; Thu, 13 Oct 2011 01:00:28 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9D10SZx048658 for ; Thu, 13 Oct 2011 01:00:28 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9D10SNo048657; Thu, 13 Oct 2011 01:00:28 GMT (envelope-from gnats) Date: Thu, 13 Oct 2011 01:00:28 GMT Message-Id: <201110130100.p9D10SNo048657@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Rick Macklem Cc: Subject: Re: amd64/161493: NFS v3 directory structure update slow X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Rick Macklem List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Oct 2011 01:00:29 -0000 The following reply was made to PR kern/161493; it has been noted by GNATS. 
From: Rick Macklem To: John Baldwin Cc: George Breahna , freebsd-gnats-submit@freebsd.org, Rick Macklem , freebsd-amd64@freebsd.org Subject: Re: amd64/161493: NFS v3 directory structure update slow Date: Wed, 12 Oct 2011 20:25:11 -0400 (EDT) John Baldwin wrote: > On Tuesday, October 11, 2011 11:07:13 am George Breahna wrote: > > > > >Number: 161493 > > >Category: amd64 > > >Synopsis: NFS v3 directory structure update slow > > >Confidential: no > > >Severity: critical > > >Priority: high > > >Responsible: freebsd-amd64 > > >State: open > > >Quarter: > > >Keywords: > > >Date-Required: > > >Class: sw-bug > > >Submitter-Id: current-users > > >Arrival-Date: Tue Oct 11 15:10:07 UTC 2011 > > >Closed-Date: > > >Last-Modified: > > >Originator: George Breahna > > >Release: 9.0 Beta 2 > > >Organization: > > >Environment: > > FreeBSD store2 9.0-BETA2 FreeBSD 9.0-BETA2 #0: Sun Sep 18 22:02:45 > > EDT 2011 > pulsar@store2.emailarray.com:/usr/obj/usr/src/sys/PULSAR amd64 > > >Description: > > We used to run a NFS server on FreeBSD 6.2 but we built a new box > > recently > and installed 9.0 Beta 2 on it. The data was moved over as it serves > as the > back-end for a mail system. It runs NFS v3 over TCP only and all the > NFS- > related processes (rpcbind, mountd, lockd, etc ) run with the -h > switch and > bind to the local IP address. > > > > The NFS server exports the data to 7 NFS clients ranging from > > FreeBSD 6.1 to > 8.2, the majority being 8.2 The mount on the NFS clients is done > simply with - > o tcp,rsize=32768,wsize=32768 > > > > Usual file operations, such as accessing files, creating > > directories, > removing files, chmod, chown, etc work perfectly but we noticed there > were > issues in removing directories that contained data. 
We had a strange > error: > > > > rm -rf nick/ > > rm: fts_read: Input/output error > > > > Using 'truss' on rm revealed this: > > > > open("..",O_RDONLY,00) ERR#5 'Input/output error' > > > > After much testing and debugging we realized the problem is in the > > NFS > protocol. ( either server or client but we assume server since this > used to > work very well with FreeBSD 6.2 ). The problem appears to be that NFS > does not > show the '..' after modifying a directory structure. Take the > following > example executed on a FreeBSD 8.2 client accessing the NFS share from > the > 9.0B2 server: > > > > imap5# mkdir test1 > > imap5# cd test1 > > imap5# touch file1 > > imap5# touch file2 > > imap5# ls -la > > ls: ..: Input/output error > > total 4 > > drwxr-xr-x 2 root vchkpw 512 Oct 11 10:55 . > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:55 file1 > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:55 file2 > > > > Notice the '..' is missing from the display. If we now try and > > remove the > directory 'test1' it will throw the "rm: fts_read: Input/output error" > error. > > > > If we wait in between 1 minute and 5 minutes, '..' will eventually > > appear by > itself. During this whole time, '..' effectively exists on the NFS > server but > it's not displayed by any of the NFS clients. > > > > I can force the NFS client to show it faster by doing an ls -la from > > the > parent level. For example: > > > > imap5# mkdir test1 > > imap5# touch test1/file1 > > imap5# touch test1/file2 > > imap5# touch test1/file3 > > imap5# ls -la test1 > > total 8 > > drwxr-xr-x 2 root vchkpw 512 Oct 11 10:59 . > > drwx------ 10 vpopmail vchkpw 1024 Oct 11 10:59 .. > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file1 > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file2 > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file3 > > imap5# cd test1 > > imap5# ls -la > > total 8 > > drwxr-xr-x 2 root vchkpw 512 Oct 11 10:59 . > > drwx------ 10 vpopmail vchkpw 1024 Oct 11 10:59 .. 
> > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file1 > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file2 > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file3 > > > > but if we wait 5 seconds after that display and try again: > > > > ls -la > > ls: ..: Input/output error > > total 4 > > drwxr-xr-x 2 root vchkpw 512 Oct 11 10:59 . > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file1 > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file2 > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file3 > > > > Again, if we wait longer ( 1-5 minutes ), the '..' will properly > > appear in > there. > > > > There are no error messages on the console or other log files. This > > is > reproducible 100% of the time with any FreeBSD client. Have tried > unmounting/remounting several times without any effect. Also tried > different > rsize/wsize, no effect. I think there is some delay in updating the > directory > structure and it's causing this bug. > > > > Here's also some output from nfsstat on the server: > > > > > > Server Info: > > Getattr Setattr Lookup Readlink Read Write Create > Remove > > 114731225 20496896 254966151 133 11697392 19963641 0 > 9228861 > > Rename Link Symlink Mkdir Rmdir Readdir RdirPlus > Access > > 4313471 1157651 39 1955 16511932 15479669 0 > 116927742 > > Mknod Fsstat Fsinfo PathConf Commit > > 0 4748487 48 0 14921747 > > Server Ret-Failed > > 0 > > Server Faults > > 0 > > Server Cache Stats: > > Inprog Idem Non-idem Misses > > 0 0 0 613368147 > > Server Write Gathering: > > WriteOps WriteRPC Opsaved > > 19963641 19963641 0 > > > > >How-To-Repeat: > > imap5# mkdir test1 > > imap5# cd test1 > > imap5# touch file1 > > imap5# touch file2 > > imap5# ls -la > > ls: ..: Input/output error > > total 4 > > drwxr-xr-x 2 root vchkpw 512 Oct 11 10:55 . > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:55 file1 > > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:55 file2 > > >Fix: > > Can you try using the "old" NFS server as a test? 
> Please make sure you have the patch in r225356 in your server's kernel sources (it went into head on Sep. 3, but I don't know if your Sep. 11 build would have it?). It fixed a problem that would cause lookup of ".." to fail intermittently, because a field in struct nameidata added on Aug. 13 wasn't initialized. You can find the one line patch here: http://svnweb.freebsd.org/base/head/sys/fs/nfsserver/nfs_nfsdport.c?r1=224911&r2=225356 Please let us know if you have this patch and, if not, apply it and see if the problem goes away. Thanks, rick From owner-freebsd-fs@FreeBSD.ORG Thu Oct 13 02:10:12 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4019C1065692 for ; Thu, 13 Oct 2011 02:10:12 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 057B08FC16 for ; Thu, 13 Oct 2011 02:10:12 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9D2ABxr012335 for ; Thu, 13 Oct 2011 02:10:11 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9D2AB9j012334; Thu, 13 Oct 2011 02:10:11 GMT (envelope-from gnats) Date: Thu, 13 Oct 2011 02:10:11 GMT Message-Id: <201110130210.p9D2AB9j012334@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: "George Breahna" Cc: Subject: RE: amd64/161493: NFS v3 directory structure update slow X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: George Breahna List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Oct 2011 02:10:12 -0000 The following reply was made to PR kern/161493; it has been noted by GNATS. 
From: "George Breahna" To: "'John Baldwin'" , Cc: , "'Rick Macklem'" Subject: RE: amd64/161493: NFS v3 directory structure update slow Date: Wed, 12 Oct 2011 21:41:34 -0400 I can also confirm that using -o option in nfsd makes the problem go away. George -----Original Message----- From: John Baldwin [mailto:jhb@freebsd.org] Sent: Wednesday, October 12, 2011 9:30 AM To: freebsd-amd64@freebsd.org Cc: George Breahna; freebsd-gnats-submit@freebsd.org; Rick Macklem Subject: Re: amd64/161493: NFS v3 directory structure update slow On Tuesday, October 11, 2011 11:07:13 am George Breahna wrote: > > >Number: 161493 > >Category: amd64 > >Synopsis: NFS v3 directory structure update slow > >Confidential: no > >Severity: critical > >Priority: high > >Responsible: freebsd-amd64 > >State: open > >Quarter: > >Keywords: > >Date-Required: > >Class: sw-bug > >Submitter-Id: current-users > >Arrival-Date: Tue Oct 11 15:10:07 UTC 2011 > >Closed-Date: > >Last-Modified: > >Originator: George Breahna > >Release: 9.0 Beta 2 > >Organization: > >Environment: > FreeBSD store2 9.0-BETA2 FreeBSD 9.0-BETA2 #0: Sun Sep 18 22:02:45 EDT 2011 pulsar@store2.emailarray.com:/usr/obj/usr/src/sys/PULSAR amd64 > >Description: > We used to run a NFS server on FreeBSD 6.2 but we built a new box recently and installed 9.0 Beta 2 on it. The data was moved over as it serves as the back-end for a mail system. It runs NFS v3 over TCP only and all the NFS- related processes (rpcbind, mountd, lockd, etc ) run with the -h switch and bind to the local IP address. > > The NFS server exports the data to 7 NFS clients ranging from FreeBSD 6.1 to 8.2, the majority being 8.2 The mount on the NFS clients is done simply with - o tcp,rsize=32768,wsize=32768 > > Usual file operations, such as accessing files, creating directories, removing files, chmod, chown, etc work perfectly but we noticed there were issues in removing directories that contained data. 
We had a strange error: > > rm -rf nick/ > rm: fts_read: Input/output error > > Using 'truss' on rm revealed this: > > open("..",O_RDONLY,00) ERR#5 'Input/output error' > > After much testing and debugging we realized the problem is in the NFS protocol. ( either server or client but we assume server since this used to work very well with FreeBSD 6.2 ). The problem appears to be that NFS does not show the '..' after modifying a directory structure. Take the following example executed on a FreeBSD 8.2 client accessing the NFS share from the 9.0B2 server: > > imap5# mkdir test1 > imap5# cd test1 > imap5# touch file1 > imap5# touch file2 > imap5# ls -la > ls: ..: Input/output error > total 4 > drwxr-xr-x 2 root vchkpw 512 Oct 11 10:55 . > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:55 file1 > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:55 file2 > > Notice the '..' is missing from the display. If we now try and remove the directory 'test1' it will throw the "rm: fts_read: Input/output error" error. > > If we wait in between 1 minute and 5 minutes, '..' will eventually appear by itself. During this whole time, '..' effectively exists on the NFS server but it's not displayed by any of the NFS clients. > > I can force the NFS client to show it faster by doing an ls -la from the parent level. For example: > > imap5# mkdir test1 > imap5# touch test1/file1 > imap5# touch test1/file2 > imap5# touch test1/file3 > imap5# ls -la test1 > total 8 > drwxr-xr-x 2 root vchkpw 512 Oct 11 10:59 . > drwx------ 10 vpopmail vchkpw 1024 Oct 11 10:59 .. > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file1 > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file2 > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file3 > imap5# cd test1 > imap5# ls -la > total 8 > drwxr-xr-x 2 root vchkpw 512 Oct 11 10:59 . > drwx------ 10 vpopmail vchkpw 1024 Oct 11 10:59 .. 
> -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file1 > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file2 > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file3 > > but if we wait 5 seconds after that display and try again: > > ls -la > ls: ..: Input/output error > total 4 > drwxr-xr-x 2 root vchkpw 512 Oct 11 10:59 . > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file1 > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file2 > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:59 file3 > > Again, if we wait longer ( 1-5 minutes ), the '..' will properly appear in there. > > There are no error messages on the console or other log files. This is reproducible 100% of the time with any FreeBSD client. Have tried unmounting/remounting several times without any effect. Also tried different rsize/wsize, no effect. I think there is some delay in updating the directory structure and it's causing this bug. > > Here's also some output from nfsstat on the server: > > > Server Info: > Getattr Setattr Lookup Readlink Read Write Create Remove > 114731225 20496896 254966151 133 11697392 19963641 0 9228861 > Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access > 4313471 1157651 39 1955 16511932 15479669 0 116927742 > Mknod Fsstat Fsinfo PathConf Commit > 0 4748487 48 0 14921747 > Server Ret-Failed > 0 > Server Faults > 0 > Server Cache Stats: > Inprog Idem Non-idem Misses > 0 0 0 613368147 > Server Write Gathering: > WriteOps WriteRPC Opsaved > 19963641 19963641 0 > > >How-To-Repeat: > imap5# mkdir test1 > imap5# cd test1 > imap5# touch file1 > imap5# touch file2 > imap5# ls -la > ls: ..: Input/output error > total 4 > drwxr-xr-x 2 root vchkpw 512 Oct 11 10:55 . > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:55 file1 > -rw-r--r-- 1 root vchkpw 0 Oct 11 10:55 file2 > >Fix: Can you try using the "old" NFS server as a test? 
-- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Thu Oct 13 04:17:41 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 360D21065674 for ; Thu, 13 Oct 2011 04:17:41 +0000 (UTC) (envelope-from freebsd@penx.com) Received: from Elmer.dco.penx.com (elmer.dco.penx.com [174.46.214.165]) by mx1.freebsd.org (Postfix) with ESMTP id E844B8FC15 for ; Thu, 13 Oct 2011 04:17:40 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by Elmer.dco.penx.com (8.14.5/8.14.4) with ESMTP id p9D4HbqD028476 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 12 Oct 2011 22:17:39 -0600 (MDT) (envelope-from freebsd@penx.com) Date: Wed, 12 Oct 2011 22:17:37 -0600 (MDT) From: Dennis Glatting X-X-Sender: dennisg@Elmer.dco.penx.com To: Steven Hartland Message-ID: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII Cc: freebsd-fs@freebsd.org Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Oct 2011 04:17:41 -0000 On Wed, 2011-10-12 at 08:59 +0100, Steven Hartland wrote: ----- Original Message ----- > From: "Dennis Glatting" > > > >I would appreciate someone knowledgeable in ZFS point me in the right > > direction. > > > > I have several ZFS arrays, some using gzip for compression. The > > compressed arrays hold very large text documents (10MB->20TB) > > and are highly compressible. Reading the files from a compressed > > data sets is fast with little load. However, writing to the > > compressed data sets incurs substantial load on the order of a > > load average from 12 to 20. > > > > My questions are: > > > > 1) Why such a heavy load on writing? 
> > 2) What kind of limiters can I put into effect to reduce load > > without impacting compressibility? For example, is there some > > variable that controls the number of parallel compression > > operations? > > > > I have a number of different systems. Memory is 24GB on each of the two > > large data systems, SSD (Revo) for cache, and a SATA II ZIL. One system is > > a 6 core i7 @ 3.33 GHz and the other a 4 core i7 @ 2.93 GHz. The arrays are > > RAIDz using cheap 2TB disks. > > Have you tried using the alternative compression algorithms, > e.g. lzjb or gzip-[1-5]? The default gzip = gzip-6 > I have tried lzjb and I am unimpressed. I have not tried different levels of gzip on ZFS, but I have tried them on documents, with the results I expected. As I mentioned, I have a lot of data. Two files were 26GB uncompressed, but I had to kill those data sets because I ran out of room (I have reorganized my arrays since then). My ZFS compression ratio is 4.93x, and I would require more storage at different gzip levels or with lzjb. An option is not to compress with ZFS but rather directly with gzip; however, I would still need lots of temporary storage for manipulation, which is what I am doing now (e.g., sort). Processing with zcat isn't always a good solution because some applications want files, but you have to do what you have to do. > Regards > Steve > > ================================================ > This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. > > In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 > or return the E.mail to postmaster@multiplay.co.uk. 
> > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > -- Dennis Glatting From owner-freebsd-fs@FreeBSD.ORG Thu Oct 13 14:27:54 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7655E1065670 for ; Thu, 13 Oct 2011 14:27:54 +0000 (UTC) (envelope-from ler@lerctr.org) Received: from thebighonker.lerctr.org (lrosenman-1-pt.tunnel.tserv8.dal1.ipv6.he.net [IPv6:2001:470:1f0e:3ad::2]) by mx1.freebsd.org (Postfix) with ESMTP id 2254C8FC08 for ; Thu, 13 Oct 2011 14:27:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lerctr.org; s=lerami; h=Content-Type:MIME-Version:References:Message-ID:In-Reply-To:Subject:cc:To:From:Date; bh=qxJXPoisfqO2Hs5JtviXAQDBJSW7IPYYmr7sqKSEbwQ=; b=FdmuvIWWvTlBcWp5hmY+d76S19Yh0MyRfjA8j9M/HEvmJzx1eytf+MlzG/SUzlQ+Fbs9oMkouLV7jvf7cfwtt4uR0dHKe4w3gYiV7ee6zreaHNz8jBdac3Q2qYlFgRbtb6wEJSCsja6kH5oGfNU8RE3KNGIwvuYbn6Jv9socejs=; Received: from lrosenman-1-pt.tunnel.tserv8.dal1.ipv6.he.net ([2001:470:1f0e:3ad::2]:37335) by thebighonker.lerctr.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1REMG6-0002KP-Us; Thu, 13 Oct 2011 09:27:50 -0500 Date: Thu, 13 Oct 2011 09:27:38 -0500 (CDT) From: Larry Rosenman To: Johannes Totz In-Reply-To: Message-ID: References: <4E95AE08.7030105@lerctr.org> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Spam-Score: -2.7 (--) X-LERCTR-Spam-Score: -2.7 (--) X-Spam-Report: SpamScore (-2.7/5.0) ALL_TRUSTED=-1, BAYES_00=-1.9, SARE_SUB_OBFU_OTHER=0.135, TW_ZF=0.077 X-LERCTR-Spam-Report: SpamScore (-2.7/5.0) ALL_TRUSTED=-1, BAYES_00=-1.9, SARE_SUB_OBFU_OTHER=0.135, TW_ZF=0.077 Cc: freebsd-fs@freebsd.org Subject: Re: 
AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Oct 2011 14:27:54 -0000 On Wed, 12 Oct 2011, Johannes Totz wrote: > On 12/10/2011 16:11, Larry Rosenman wrote: >> I have a root-on-ZFS box with 6 drives, all 400G (except one 500G) in a >> pool. >> >> I want to upgrade to 2T or 3T drives, but was wondering if you can >> mix/match while doing the drive-by-drive >> replacement. >> >> This is on 9.0-BETA3 if that matters. > > Not sure if this applies to you... > I tried to add a 4k-gnop'd drive to an existing 512-byte-sector pool, > and the zpool tool complained about an alignment mismatch. > This was on a recent 8-stable. It didn't work. I just rebuilt the entire pool with one gnop'd drive, and then removed the gnop, and it's now ashift=12. I also made sure that the first partition started at block 36. I think I'll be OK now. 
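For readers hitting this thread later, the rebuild Larry describes is the usual gnop(8) trick; a rough sketch, with illustrative device and pool names (not the ones from Larry's actual system), which requires root on FreeBSD and destroys the pool's contents:

```shell
# Create a 4k-sector gnop(8) alias for one member so that zpool
# chooses ashift=12 (2^12 = 4096) for the vdev at creation time.
gnop create -S 4096 /dev/ada0p3
zpool create zroot raidz ada0p3.nop ada1p3 ada2p3

# Export, destroy the nop layer, and re-import on the bare devices;
# ashift is fixed at vdev creation, so the pool stays ashift=12.
zpool export zroot
gnop destroy ada0p3.nop
zpool import zroot

# Verify from the cached configuration:
zdb -C zroot | grep ashift
```

Because ashift is baked in when the vdev is created, the pool has to be rebuilt this way rather than fixed in place.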
Here's what it looks like: pool: zroot state: ONLINE scan: scrub repaired 0 in 0h3m with 0 errors on Thu Oct 13 02:16:23 2011 config: NAME STATE READ WRITE CKSUM zroot ONLINE 0 0 0 raidz1-0 ONLINE 0 0 0 gptid/dab78d7a-f17f-11e0-a060-0030488e9ff3 ONLINE 0 0 0 gptid/54f70329-f180-11e0-a060-0030488e9ff3 ONLINE 0 0 0 gptid/551f4215-f180-11e0-a060-0030488e9ff3 ONLINE 0 0 0 gptid/554fbc57-f180-11e0-a060-0030488e9ff3 ONLINE 0 0 0 gptid/557e84df-f180-11e0-a060-0030488e9ff3 ONLINE 0 0 0 gptid/55ab4aad-f180-11e0-a060-0030488e9ff3 ONLINE 0 0 0 errors: No known data errors zroot: version: 28 name: 'zroot' state: 0 txg: 558 pool_guid: 5966906085647800520 hostid: 4114256494 hostname: '' vdev_children: 1 vdev_tree: type: 'root' id: 0 guid: 5966906085647800520 children[0]: type: 'raidz' id: 0 guid: 15251804863163772249 nparity: 1 metaslab_array: 30 metaslab_shift: 34 ashift: 12 asize: 2374730514432 is_log: 0 create_txg: 4 children[0]: type: 'disk' id: 0 guid: 13248457090896416694 path: '/dev/gptid/dab78d7a-f17f-11e0-a060-0030488e9ff3' phys_path: '/dev/gptid/dab78d7a-f17f-11e0-a060-0030488e9ff3' whole_disk: 1 DTL: 153 create_txg: 4 children[1]: type: 'disk' id: 1 guid: 14267002476812000053 path: '/dev/gptid/54f70329-f180-11e0-a060-0030488e9ff3' phys_path: '/dev/gptid/54f70329-f180-11e0-a060-0030488e9ff3' whole_disk: 1 DTL: 152 create_txg: 4 children[2]: type: 'disk' id: 2 guid: 4813683986967800595 path: '/dev/gptid/551f4215-f180-11e0-a060-0030488e9ff3' phys_path: '/dev/gptid/551f4215-f180-11e0-a060-0030488e9ff3' whole_disk: 1 DTL: 151 create_txg: 4 children[3]: type: 'disk' id: 3 guid: 6938997802835048973 path: '/dev/gptid/554fbc57-f180-11e0-a060-0030488e9ff3' phys_path: '/dev/gptid/554fbc57-f180-11e0-a060-0030488e9ff3' whole_disk: 1 DTL: 150 create_txg: 4 children[4]: type: 'disk' id: 4 guid: 18091841386475062099 path: '/dev/gptid/557e84df-f180-11e0-a060-0030488e9ff3' phys_path: '/dev/gptid/557e84df-f180-11e0-a060-0030488e9ff3' whole_disk: 1 DTL: 149 create_txg: 4 children[5]: 
type: 'disk' id: 5 guid: 5543901141375635781 path: '/dev/gptid/55ab4aad-f180-11e0-a060-0030488e9ff3' phys_path: '/dev/gptid/55ab4aad-f180-11e0-a060-0030488e9ff3' whole_disk: 1 DTL: 148 create_txg: 4 I think something(tm) should be put in the handbook about this. (oh, here's the partitions: Geom name: ada0 modified: false state: OK fwheads: 16 fwsectors: 63 last: 781422734 first: 34 entries: 128 scheme: GPT Providers: 1. Name: ada0p1 Mediasize: 65536 (64k) Sectorsize: 512 Stripesize: 0 Stripeoffset: 18432 Mode: r0w0e0 rawuuid: bc4c1c42-f17f-11e0-a060-0030488e9ff3 rawtype: 83bd6b9d-7f41-11dc-be0b-001560b84f0f label: (null) length: 65536 offset: 18432 type: freebsd-boot index: 1 end: 163 start: 36 2. Name: ada0p2 Mediasize: 4294967296 (4.0G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e1 rawuuid: ca30afcf-f17f-11e0-a060-0030488e9ff3 rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b label: swap0 length: 4294967296 offset: 83968 type: freebsd-swap index: 2 end: 8388771 start: 164 3. Name: ada0p3 Mediasize: 395793389056 (368G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e2 rawuuid: dab78d7a-f17f-11e0-a060-0030488e9ff3 rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b label: disk0 length: 395793389056 offset: 4295051264 type: freebsd-zfs index: 3 end: 781422734 start: 8388772 Consumers: 1. Name: ada0 Mediasize: 400088457216 (372G) Sectorsize: 512 Mode: r2w2e5 Geom name: ada1 modified: false state: OK fwheads: 16 fwsectors: 63 last: 781422734 first: 34 entries: 128 scheme: GPT Providers: 1. Name: ada1p1 Mediasize: 65536 (64k) Sectorsize: 512 Stripesize: 0 Stripeoffset: 18432 Mode: r0w0e0 rawuuid: 54e701ff-f180-11e0-a060-0030488e9ff3 rawtype: 83bd6b9d-7f41-11dc-be0b-001560b84f0f label: (null) length: 65536 offset: 18432 type: freebsd-boot index: 1 end: 163 start: 36 2. 
Name: ada1p2 Mediasize: 4294967296 (4.0G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e1 rawuuid: 54eebd95-f180-11e0-a060-0030488e9ff3 rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b label: swap1 length: 4294967296 offset: 83968 type: freebsd-swap index: 2 end: 8388771 start: 164 3. Name: ada1p3 Mediasize: 395793389056 (368G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e2 rawuuid: 54f70329-f180-11e0-a060-0030488e9ff3 rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b label: disk1 length: 395793389056 offset: 4295051264 type: freebsd-zfs index: 3 end: 781422734 start: 8388772 Consumers: 1. Name: ada1 Mediasize: 400088457216 (372G) Sectorsize: 512 Mode: r2w2e5 Geom name: ada2 modified: false state: OK fwheads: 16 fwsectors: 63 last: 976773134 first: 34 entries: 128 scheme: GPT Providers: 1. Name: ada2p1 Mediasize: 65536 (64k) Sectorsize: 512 Stripesize: 0 Stripeoffset: 18432 Mode: r0w0e0 rawuuid: 550e22c7-f180-11e0-a060-0030488e9ff3 rawtype: 83bd6b9d-7f41-11dc-be0b-001560b84f0f label: (null) length: 65536 offset: 18432 type: freebsd-boot index: 1 end: 163 start: 36 2. Name: ada2p2 Mediasize: 4294967296 (4.0G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e1 rawuuid: 5515f1c4-f180-11e0-a060-0030488e9ff3 rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b label: swap2 length: 4294967296 offset: 83968 type: freebsd-swap index: 2 end: 8388771 start: 164 3. Name: ada2p3 Mediasize: 495812793856 (461G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e2 rawuuid: 551f4215-f180-11e0-a060-0030488e9ff3 rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b label: disk2 length: 495812793856 offset: 4295051264 type: freebsd-zfs index: 3 end: 976773134 start: 8388772 Consumers: 1. Name: ada2 Mediasize: 500107862016 (465G) Sectorsize: 512 Mode: r2w2e5 Geom name: ada3 modified: false state: OK fwheads: 16 fwsectors: 63 last: 781422734 first: 34 entries: 128 scheme: GPT Providers: 1. 
Name: ada3p1 Mediasize: 65536 (64k) Sectorsize: 512 Stripesize: 0 Stripeoffset: 18432 Mode: r0w0e0 rawuuid: 553d6d5f-f180-11e0-a060-0030488e9ff3 rawtype: 83bd6b9d-7f41-11dc-be0b-001560b84f0f label: (null) length: 65536 offset: 18432 type: freebsd-boot index: 1 end: 163 start: 36 2. Name: ada3p2 Mediasize: 4294967296 (4.0G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e1 rawuuid: 554677f6-f180-11e0-a060-0030488e9ff3 rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b label: swap3 length: 4294967296 offset: 83968 type: freebsd-swap index: 2 end: 8388771 start: 164 3. Name: ada3p3 Mediasize: 395793389056 (368G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e2 rawuuid: 554fbc57-f180-11e0-a060-0030488e9ff3 rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b label: disk3 length: 395793389056 offset: 4295051264 type: freebsd-zfs index: 3 end: 781422734 start: 8388772 Consumers: 1. Name: ada3 Mediasize: 400088457216 (372G) Sectorsize: 512 Mode: r2w2e5 Geom name: ada4 modified: false state: OK fwheads: 16 fwsectors: 63 last: 781422734 first: 34 entries: 128 scheme: GPT Providers: 1. Name: ada4p1 Mediasize: 65536 (64k) Sectorsize: 512 Stripesize: 0 Stripeoffset: 18432 Mode: r0w0e0 rawuuid: 556c3554-f180-11e0-a060-0030488e9ff3 rawtype: 83bd6b9d-7f41-11dc-be0b-001560b84f0f label: (null) length: 65536 offset: 18432 type: freebsd-boot index: 1 end: 163 start: 36 2. Name: ada4p2 Mediasize: 4294967296 (4.0G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e1 rawuuid: 55754618-f180-11e0-a060-0030488e9ff3 rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b label: swap4 length: 4294967296 offset: 83968 type: freebsd-swap index: 2 end: 8388771 start: 164 3. 
Name: ada4p3 Mediasize: 395793389056 (368G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e2 rawuuid: 557e84df-f180-11e0-a060-0030488e9ff3 rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b label: disk4 length: 395793389056 offset: 4295051264 type: freebsd-zfs index: 3 end: 781422734 start: 8388772 Consumers: 1. Name: ada4 Mediasize: 400088457216 (372G) Sectorsize: 512 Mode: r2w2e5 Geom name: ada5 modified: false state: OK fwheads: 16 fwsectors: 63 last: 781422734 first: 34 entries: 128 scheme: GPT Providers: 1. Name: ada5p1 Mediasize: 65536 (64k) Sectorsize: 512 Stripesize: 0 Stripeoffset: 18432 Mode: r0w0e0 rawuuid: 5598ebf4-f180-11e0-a060-0030488e9ff3 rawtype: 83bd6b9d-7f41-11dc-be0b-001560b84f0f label: (null) length: 65536 offset: 18432 type: freebsd-boot index: 1 end: 163 start: 36 2. Name: ada5p2 Mediasize: 4294967296 (4.0G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e1 rawuuid: 55a1f6a0-f180-11e0-a060-0030488e9ff3 rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b label: swap5 length: 4294967296 offset: 83968 type: freebsd-swap index: 2 end: 8388771 start: 164 3. Name: ada5p3 Mediasize: 395793389056 (368G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 83968 Mode: r1w1e2 rawuuid: 55ab4aad-f180-11e0-a060-0030488e9ff3 rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b label: disk5 length: 395793389056 offset: 4295051264 type: freebsd-zfs index: 3 end: 781422734 start: 8388772 Consumers: 1. 
Name: ada5 Mediasize: 400088457216 (372G) Sectorsize: 512 Mode: r2w2e5 From owner-freebsd-fs@FreeBSD.ORG Thu Oct 13 16:21:38 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 90E0D106566C for ; Thu, 13 Oct 2011 16:21:38 +0000 (UTC) (envelope-from tevans.uk@googlemail.com) Received: from mail-vx0-f182.google.com (mail-vx0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id 437908FC08 for ; Thu, 13 Oct 2011 16:21:37 +0000 (UTC) Received: by vcbf13 with SMTP id f13so283864vcb.13 for ; Thu, 13 Oct 2011 09:21:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=Tr2baO8MaXUyZVPmDcJWgnEt2ciW/MDdKl0SlRYWOYI=; b=Gx9DTe8cBvM8LYDZu98WLgaZA2nXq4XYCSUEI9eTM5yBwNz0KMY2cHwhH+q0Ew+34C w8+TggKX75UMrKbDFdRUhmbhwV05Zq5ymRRy4EJIh28Km2Y5e3zPTL3a8CfrDCM9hFnr rNGOk9tMyZEzBz+rsBBLXcYpJWHzB4jEs26EI= MIME-Version: 1.0 Received: by 10.52.72.9 with SMTP id z9mr4493099vdu.70.1318522897477; Thu, 13 Oct 2011 09:21:37 -0700 (PDT) Received: by 10.52.111.201 with HTTP; Thu, 13 Oct 2011 09:21:37 -0700 (PDT) In-Reply-To: References: <4E95AE08.7030105@lerctr.org> Date: Thu, 13 Oct 2011 17:21:37 +0100 Message-ID: From: Tom Evans To: Larry Rosenman Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org, Johannes Totz Subject: Re: AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Oct 2011 16:21:38 -0000 On Thu, Oct 13, 2011 at 3:27 PM, Larry Rosenman wrote: > I think something(tm) should be put in the handbook about this. 
TBH I think that ZFS should just move its default ashift to 12 and have 4k blocks by default. That saves all this messing around with temporary gnop devices. Documentation that says (in effect) "if you do it the default way with the most common hard drives these days, performance sucks, so follow this convoluted workaround" is not that useful - a lot of people will only come across it after their performance sucks, and be disappointed. Cheers Tom From owner-freebsd-fs@FreeBSD.ORG Thu Oct 13 19:14:02 2011 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A0F9E106566C for ; Thu, 13 Oct 2011 19:14:02 +0000 (UTC) (envelope-from danfe@regency.nsu.ru) Received: from mx.nsu.ru (r2b9.nsu.ru [212.192.164.39]) by mx1.freebsd.org (Postfix) with ESMTP id 4A2088FC13 for ; Thu, 13 Oct 2011 19:14:01 +0000 (UTC) Received: from regency.nsu.ru ([193.124.210.26]) by mx.nsu.ru with esmtp (Exim 4.69) (envelope-from ) id 1REPnV-00089L-J8 for fs@freebsd.org; Fri, 14 Oct 2011 01:14:25 +0700 Received: from regency.nsu.ru (localhost [127.0.0.1]) by regency.nsu.ru (8.14.2/8.14.2) with ESMTP id p9DIGOc0046187 for ; Fri, 14 Oct 2011 01:16:24 +0700 (NOVST) (envelope-from danfe@regency.nsu.ru) Received: (from danfe@localhost) by regency.nsu.ru (8.14.2/8.14.2/Submit) id p9DIG3Sh046039 for fs@freebsd.org; Fri, 14 Oct 2011 01:16:03 +0700 (NOVST) (envelope-from danfe) Date: Fri, 14 Oct 2011 01:16:02 +0700 From: Alexey Dokuchaev To: fs@freebsd.org Message-ID: <20111013181602.GA35354@regency.nsu.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i Cc: Subject: Call for msdosfs/ntfs experts (or better, maintainers) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Oct 2011 19:14:02 -0000 
Hello there, For quite a while now, our FAT and NTFS support has needed some love, I believe. AFAICT, they currently have no maintainers and thus are not receiving proper care. Case 1. FreeBSD still has problems with the UTF-8 locale and correct handling of e.g. Chinese characters in both filesystems. Patches were worked out to address this problem; they are available here: http://deadshot.googlecode.com/svn/trunk/freebsd-patch/filesystem/ PR kern/133174 was filed on 29 Mar 2009 with the original patch for msdosfs, which I've cleaned up a bit per style(9). No action has been taken since then. I've contacted the original author of these patches. He's very collaborative and is eager to provide all the guidance required to review and include these changes in our code base. Any takers? It's a shame for us not to be on par with Apple and even OpenBSD/NetBSD (as I've been told, they support UTF-8 out of the box). Case 2. Apparently, Apple actually released their NTFS implementation under a BSD license, which seems well worth a look: http://opensource.apple.com/source/ntfs/ntfs-78/kext/ This does not cover the userland part (open source, but under the APSL only), but perhaps we don't really care about those parts, since they are mostly mount utilities and are trivial to rewrite from scratch (quoting delphij@). Are there plans to make at least some use of this code? Our NTFS implementation right now loses considerably to other Unixen, and since its original author and de jure maintainer is long gone from FreeBSD, we should do something about it. Not being a FS hacker myself, is there anything I can do to expedite progress on these issues (apart from becoming one)? 
./danfe From owner-freebsd-fs@FreeBSD.ORG Thu Oct 13 21:49:03 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4A11D106566B for ; Thu, 13 Oct 2011 21:49:03 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id 0D0B98FC0C for ; Thu, 13 Oct 2011 21:49:02 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id p9DLn15R013175; Thu, 13 Oct 2011 16:49:01 -0500 (CDT) Date: Thu, 13 Oct 2011 16:49:01 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: Tom Evans In-Reply-To: Message-ID: References: <4E95AE08.7030105@lerctr.org> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Thu, 13 Oct 2011 16:49:02 -0500 (CDT) Cc: freebsd-fs@freebsd.org, Johannes Totz Subject: Re: AF (4096 byte sector) drives: Can you mix/match in a ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Oct 2011 21:49:03 -0000 On Thu, 13 Oct 2011, Tom Evans wrote: > On Thu, Oct 13, 2011 at 3:27 PM, Larry Rosenman wrote: >> I think something(tm) should be put in the handbook about this. > > TBH I think that ZFS should just move its default ashift to 12 and > have 4k blocks by default. Saves all this messing around with > temporary gnop devices. There is a cost to using 4k blocks because ZFS metadata will then consume 4k bytes (each copy) rather than 512 bytes. 
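Bob's overhead figure can be illustrated with back-of-the-envelope arithmetic; the accounting below (one metadata block per 8K record, two copies, each copy padded to a full physical sector) is a simplifying assumption for illustration, not measured ZFS behavior:

```shell
# Space per 8K record = record itself + 2 metadata copies,
# each metadata copy padded up to one physical sector.
small=$((8192 + 2 * 512))    # 512-byte sectors: 9216 bytes per record
large=$((8192 + 2 * 4096))   # 4k sectors:      16384 bytes per record
awk -v s="$small" -v l="$large" 'BEGIN { printf "%.2fx\n", l / s }'  # 1.78x
```

Under this toy model, 4k sectors cost roughly 1.78x the space of 512-byte sectors at an 8K block size, which is in the ballpark of the "almost 2X" reported on zfs-discuss.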
As someone posted to the zfs-discuss list a few days ago, this consumes quite a lot of space when the ZFS block size is set to 8K: it results in almost 2X the disk space consumption. There is an additional performance cost because the drive will be doing I/O in 4K chunks rather than 512-byte chunks. Typical SAS enterprise drives (not near-line drives) surely still all use small sectors, because their storage capacity is not very high and because they aim for more IOPS. Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Thu Oct 13 23:04:00 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 28C18106566C for ; Thu, 13 Oct 2011 23:04:00 +0000 (UTC) (envelope-from dlin@panzura.com) Received: from mail131.messagelabs.com (mail131.messagelabs.com [216.82.242.99]) by mx1.freebsd.org (Postfix) with ESMTP id A613A8FC0A for ; Thu, 13 Oct 2011 23:03:59 +0000 (UTC) X-Env-Sender: dlin@panzura.com X-Msg-Ref: server-9.tower-131.messagelabs.com!1318545434!4085254!3 X-Originating-IP: [216.166.12.32] X-StarScan-Version: 6.3.6; banners=-,-,- X-VirusChecked: Checked Received: (qmail 1018 invoked from network); 13 Oct 2011 22:37:18 -0000 Received: from out001.collaborationhost.net (HELO out001.collaborationhost.net) (216.166.12.32) by server-9.tower-131.messagelabs.com with RC4-SHA encrypted SMTP; 13 Oct 2011 22:37:18 -0000 Received: from AUSP01VMBX02.collaborationhost.net ([10.2.8.8]) by AUSP01MHUB02.collaborationhost.net ([10.2.8.26]) with mapi; Thu, 13 Oct 2011 17:37:15 -0500 From: Dave Lin To: "freebsd-fs@freebsd.org" Date: Thu, 13 Oct 2011 17:33:02 -0500 Thread-Topic: cannot run ztest (zfs test suite) without seeing core dump on 8.2 Thread-Index: AcyJ9/g0sCs+DfXwQaC98ouiA8FKKw== Message-ID: Accept-Language: 
en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailman-Approved-At: Thu, 13 Oct 2011 23:56:12 +0000 Subject: cannot run ztest (zfs test suite) without seeing core dump on 8.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Oct 2011 23:04:00 -0000 Hello all, It seems that I cannot run ztest (the ZFS test suite) without seeing a core dump on the latest 8.2 release. Has anyone seen this before, and is there a way to resolve this issue? Thanks. Here's the output capture: dave-free82# ztest -V 5 vdevs, 7 datasets, 23 threads, 300 seconds... child died with signal 11 Here's the uname -a info: 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Thu Feb 17 02:41:51 UTC 2011 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64 Sincerely, -Dave From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 06:10:31 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3FD311065670 for ; Fri, 14 Oct 2011 06:10:31 +0000 (UTC) (envelope-from radiomlodychbandytow@o2.pl) Received: from moh2-ve3.go2.pl (moh2-ve3.go2.pl [193.17.41.208]) by mx1.freebsd.org (Postfix) with ESMTP id F1C048FC17 for ; Fri, 14 Oct 2011 06:10:30 +0000 (UTC) Received: from moh2-ve3.go2.pl (unknown [10.0.0.208]) by moh2-ve3.go2.pl (Postfix) with ESMTP id F3D34370F60 for ; Fri, 14 Oct 2011 08:10:25 +0200 (CEST) Received: from unknown (unknown [10.0.0.108]) by moh2-ve3.go2.pl (Postfix) with SMTP for ; Fri, 14 Oct 2011 08:10:25 +0200 (CEST) Received: from host892524678.com-promis.3s.pl [89.25.246.78] by poczta.o2.pl with ESMTP id jvUGhQ; Fri, 14 Oct 2011 08:12:06 +0200 Message-ID: <4E97D24C.4010606@o2.pl> Date: Fri, 
14 Oct 2011 08:10:20 +0200 From: =?UTF-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= User-Agent: Mozilla/5.0 (Windows NT 5.2; WOW64; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <20111013120032.D6BA71065760@hub.freebsd.org> In-Reply-To: <20111013120032.D6BA71065760@hub.freebsd.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-O2-Trust: 2, 61 X-O2-SPF: neutral Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 06:10:31 -0000 On 2011-10-13 14:00, freebsd-fs-request@freebsd.org wrote: > An option is not to compress with ZFS but rather directly with gzip; however, I > would still need lots of temporary storage for manipulation, which is what > I am doing now (e.g., sort). Processing with zcat isn't always a good > solution because some applications want files, but you have to do what you > have to do. It seems that, with your data, gzipping directly is a better option, though I suggest that you experiment with codecs that support a larger dictionary, e.g. 7-Zip; I expect you would see a huge compression improvement with something like: 7z a -mx=1 -md=26 out.7z in You can use higher -md values if you have enough memory; compression mode 1 (-mx=1) uses 4.5 * 2^md bytes of RAM, so if my maths is good, md=26 uses ~288 MB. If LZMA is too slow, you can at least try 7-Zip's deflate64. It's not great, but not as bad as gzip. 
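The poster's arithmetic checks out; a quick way to verify it, treating 4.5 * 2^md as their rule of thumb rather than an official 7-Zip figure:

```shell
# RAM estimate for 7z -mx=1 with dictionary exponent md=26:
# 4.5 * 2^26 bytes, expressed in MiB.
awk -v md=26 'BEGIN { printf "md=%d needs ~%.0f MB\n", md, 4.5 * 2^md / 2^20 }'
# -> md=26 needs ~288 MB
```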
-- Twoje radio From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 06:34:59 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B51071065670; Fri, 14 Oct 2011 06:34:59 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 5D1FF8FC15; Fri, 14 Oct 2011 06:34:58 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id JAA16728; Fri, 14 Oct 2011 09:34:53 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1REbM5-000PXI-Gc; Fri, 14 Oct 2011 09:34:53 +0300 Message-ID: <4E97D80B.90204@FreeBSD.org> Date: Fri, 14 Oct 2011 09:34:51 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:7.0.1) Gecko/20111002 Thunderbird/7.0.1 MIME-Version: 1.0 To: Dave Lin References: In-Reply-To: X-Enigmail-Version: undefined Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: "freebsd-fs@freebsd.org" Subject: Re: cannot run ztest (zfs test suite) without seeing core dump on 8.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 06:34:59 -0000 on 14/10/2011 01:33 Dave Lin said the following: > Hello all, > > It seems that I cannot run ztest (zfs test suite) without seeing core dump on latest 8.2 release. Has anyone seen this before and is there way to resolve this issue? Thanks. > > Here's the output capture: > > dave-free82# ztest -V > 5 vdevs, 7 datasets, 23 threads, 300 seconds... 
> child died with signal 11 > > Here's the uname -a info: > > 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Thu Feb 17 02:41:51 UTC 2011 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64 Please try the following changes: https://gitorious.org/~avg/freebsd/avgbsd/commit/5f0bbe6ff83f463f583358b76cfe2e179449091b https://gitorious.org/~avg/freebsd/avgbsd/commit/b430b23e6cd579c577f8ff1dae445a8ee2603ffa https://gitorious.org/~avg/freebsd/avgbsd/commit/96e8886589b0c6bb91e1019efb204e6aac87f4ef https://gitorious.org/~avg/freebsd/avgbsd/commit/5479bc325beb8fa85f50e50f3dc18069489a2119 -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 08:47:16 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 330DF106564A for ; Fri, 14 Oct 2011 08:47:16 +0000 (UTC) (envelope-from bra@fsn.hu) Received: from people.fsn.hu (people.fsn.hu [195.228.252.137]) by mx1.freebsd.org (Postfix) with ESMTP id 5686D8FC0C for ; Fri, 14 Oct 2011 08:47:15 +0000 (UTC) Received: by people.fsn.hu (Postfix, from userid 1001) id 8C308AA5C3E; Fri, 14 Oct 2011 10:47:13 +0200 (CEST) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.2 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MF-ACE0E1EA [pR: 8.4953] X-CRM114-CacheID: sfid-20111014_10471_04BFCFB5 X-CRM114-Status: Good ( pR: 8.4953 ) X-DSPAM-Result: Whitelisted X-DSPAM-Processed: Fri Oct 14 10:47:13 2011 X-DSPAM-Confidence: 0.5429 X-DSPAM-Probability: 0.0000 X-DSPAM-Signature: 4e97f711836501250914652 X-DSPAM-Factors: 27, From*Attila Nagy , 0.00010, cache, 0.00256, cache, 0.00256, adding, 0.00510, STABLE, 0.00556, logs, 0.00763, config, 0.00796, 8+STABLE, 0.01000, CEST, 0.01000, 15+01, 0.01000, STATE, 0.99000, Received*14+Oct, 0.99000, Date*47+12, 0.99000, Date*14+Oct, 0.99000, 01+25, 0.01000, format+The, 0.99000, Date*10+47, 0.99000, reboot, 0.01000, reboot, 0.01000, software+versions, 0.01000, READ, 
0.99000, removing, 0.01000, Date*Fri+14, 0.99000, Received*Fri+14, 0.99000, User-Agent*i686, 0.01216, problem+is, 0.01302, X-Spambayes-Classification: ham; 0.00 Received: from japan.t-online.private (japan.t-online.co.hu [195.228.243.99]) by people.fsn.hu (Postfix) with ESMTPSA id 104AFAA5C30 for ; Fri, 14 Oct 2011 10:47:13 +0200 (CEST) Message-ID: <4E97F710.8000004@fsn.hu> Date: Fri, 14 Oct 2011 10:47:12 +0200 From: Attila Nagy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.23) Gecko/20090817 Thunderbird/2.0.0.23 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org X-Stationery: 0.7.5 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: cache devices come up as dsk/original_device_name in zpools X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 08:47:16 -0000 Hi, I have a zpool with cache devices on 8-STABLE (csup'ed and compiled at Sep 14 15:01:25 CEST 2011). The problem is that every time I reboot, the cache devices turn to UNAVAIL (because the device name changes to dsk/daXX): dsk/da37 UNAVAIL 0 0 0 cannot open dsk/da38 UNAVAIL 0 0 0 cannot open After removing and re-adding them, everything goes back to normal, until the next reboot. I have no /boot/zfs/zpool.cache (because the machine is netbooted); maybe this is the cause? In previous versions everything was fine. # zpool remove home dsk/da37 # zpool remove home dsk/da38 # zpool add home cache da37 # zpool add home cache da38 zpool status pool: home state: ONLINE status: The pool is formatted using an older on-disk format. The pool can still be used, but some features are unavailable. action: Upgrade the pool using 'zpool upgrade'. Once this is done, the pool will no longer be accessible on older software versions. 
  scan: none requested
config:

 NAME          STATE     READ WRITE CKSUM
 home          ONLINE       0     0     0
   mirror-0    ONLINE       0     0     0
     da0       ONLINE       0     0     0
     da1       ONLINE       0     0     0
   mirror-1    ONLINE       0     0     0
     da10      ONLINE       0     0     0
     da11      ONLINE       0     0     0
   mirror-2    ONLINE       0     0     0
     da12      ONLINE       0     0     0
     da13      ONLINE       0     0     0
   mirror-3    ONLINE       0     0     0
     da14      ONLINE       0     0     0
     da15      ONLINE       0     0     0
   mirror-4    ONLINE       0     0     0
     da16      ONLINE       0     0     0
     da17      ONLINE       0     0     0
   mirror-5    ONLINE       0     0     0
     da18      ONLINE       0     0     0
     da19      ONLINE       0     0     0
   mirror-6    ONLINE       0     0     0
     da2       ONLINE       0     0     0
     da20      ONLINE       0     0     0
   mirror-7    ONLINE       0     0     0
     da21      ONLINE       0     0     0
     da22      ONLINE       0     0     0
   mirror-8    ONLINE       0     0     0
     da23      ONLINE       0     0     0
     da24      ONLINE       0     0     0
   mirror-9    ONLINE       0     0     0
     da25      ONLINE       0     0     0
     da26      ONLINE       0     0     0
   mirror-10   ONLINE       0     0     0
     da27      ONLINE       0     0     0
     da28      ONLINE       0     0     0
   mirror-11   ONLINE       0     0     0
     da29      ONLINE       0     0     0
     da3       ONLINE       0     0     0
   mirror-12   ONLINE       0     0     0
     da30      ONLINE       0     0     0
     da31      ONLINE       0     0     0
   mirror-13   ONLINE       0     0     0
     da32      ONLINE       0     0     0
     da33      ONLINE       0     0     0
   mirror-14   ONLINE       0     0     0
     da34      ONLINE       0     0     0
     da4       ONLINE       0     0     0
   mirror-15   ONLINE       0     0     0
     da5       ONLINE       0     0     0
     da6       ONLINE       0     0     0
   mirror-16   ONLINE       0     0     0
     da7       ONLINE       0     0     0
     da8       ONLINE       0     0     0
 logs
   da35        ONLINE       0     0     0
 cache
   da37        ONLINE       0     0     0
   da38        ONLINE       0     0     0

errors: No known data errors

From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 09:00:03 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6CCF61065672 for ; Fri, 14 Oct 2011 09:00:03 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from QMTA11.westchester.pa.mail.comcast.net (qmta11.westchester.pa.mail.comcast.net [76.96.59.211]) by mx1.freebsd.org (Postfix) with ESMTP id 06D348FC0C for ; Fri, 14 Oct 2011 09:00:02 +0000 (UTC) Received: from omta20.westchester.pa.mail.comcast.net ([76.96.62.71]) by QMTA11.westchester.pa.mail.comcast.net with comcast id kZ011h0021YDfWL5BZ03BA; Fri, 14 Oct 2011 09:00:03 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta20.westchester.pa.mail.comcast.net with comcast id kZ021h00F1t3BNj3gZ0201; 
Fri, 14 Oct 2011 09:00:02 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id BC00C102C1C; Fri, 14 Oct 2011 02:00:00 -0700 (PDT) Date: Fri, 14 Oct 2011 02:00:00 -0700 From: Jeremy Chadwick To: Attila Nagy Message-ID: <20111014090000.GA66602@icarus.home.lan> References: <4E97F710.8000004@fsn.hu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4E97F710.8000004@fsn.hu> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: cache devices come up as dsk/original_device_name in zpools X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 09:00:03 -0000 On Fri, Oct 14, 2011 at 10:47:12AM +0200, Attila Nagy wrote: > Hi, > > I have a zpool with cache devices on 8-STABLE (csuped and compiled > at Sep 14 15:01:25 CEST 2011). The problem is every time I reboot, > the cache devices turn to UNAVAIL (because device name changes to > dsk/daXX): > dsk/da37 UNAVAIL 0 0 0 cannot open > dsk/da38 UNAVAIL 0 0 0 cannot open > > After removing and re-adding them, everything goes back to normal, > until the next reboot. I have no /boot/zfs/zpool.cache (because the > machine is netbooted), maybe this is the cause? In previous versions > everything was fine. Obviously at some point when you built this system you entered "dsk/da37" and "dsk/da38". So the metadata on those drives probably contains references to those strings. You need to clear/change that. I'm not sure how to go about doing that, especially on a system which lacks /boot/zfs/zpool.cache. A one-time "zpool export" then a reboot, I imagine, would suffice, but I'm not sure if export actually changes the metadata on the disk itself or just updates the zpool.cache file. 
If you ran "zdb" on this system (the output will be HUGE given the number of vdevs and devices you have!), you should see some relevant information under each disk (child), specifically "path" vs. "phys_path". Maybe these differ? You might also try tinkering about with the loader.conf(5) variables zpool_cache_*. Depending on your setup, you might be able to move the zpool.cache file to a different location -- I realise you PXE boot, but if you have any sort of storage media on that system that isn't under ZFS that *is* available (e.g. a small UFS partition, etc.) then you might consider storing it there. See /boot/defaults/loader.conf. Otherwise I'm not sure how to go about changing the actual strings in the disk metadata. Maybe remove the cache devices entirely, zero out the first and last ~16MBytes of the da37 and da38 disks (using dd), then re-add them using their "daXX" name? That might suffice. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. 
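The dd-based cleanup suggested above could be sketched roughly as follows. This is a hypothetical, untested outline: the pool name `home`, the daXX device names, and the ~16 MB figure come from this thread, and `diskinfo`'s third output field is the media size in bytes. Only function definitions are loaded; nothing destructive runs until `wipe_and_readd` is actually invoked:

```shell
#!/bin/sh
# Sketch of the suggestion above: drop each misnamed cache vdev, zero
# the first and last ~16 MB of the disk to clear any stale ZFS labels,
# then re-add it under its plain daXX name.  DESTRUCTIVE once invoked.
WIPE_MB=16

# Seek offset (in 1 MB blocks) for zeroing the tail of a disk of $1
# bytes; rounds the media size down to a whole MB, so it is approximate.
tail_seek_mb() { echo $(( $1 / 1048576 - WIPE_MB )); }

wipe_and_readd() {  # usage: wipe_and_readd <pool> <daXX>
    pool=$1 disk=$2
    size=$(diskinfo "/dev/$disk" | awk '{print $3}')  # media size, bytes
    zpool remove "$pool" "dsk/$disk"
    dd if=/dev/zero of="/dev/$disk" bs=1m count="$WIPE_MB"
    dd if=/dev/zero of="/dev/$disk" bs=1m \
        seek="$(tail_seek_mb "$size")" count="$WIPE_MB"
    zpool add "$pool" cache "$disk"
}
# e.g.: wipe_and_readd home da37 && wipe_and_readd home da38
```

Whether wiping the labels is even necessary (versus a one-time export/import) is exactly the open question in this thread, so treat the above as a last resort.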
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 09:18:45 2011 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 67239106564A for ; Fri, 14 Oct 2011 09:18:45 +0000 (UTC) (envelope-from gleb.kurtsou@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id E69A48FC17 for ; Fri, 14 Oct 2011 09:18:44 +0000 (UTC) Received: by bkbzx1 with SMTP id zx1so1621582bkb.13 for ; Fri, 14 Oct 2011 02:18:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=TBpa33rT8/vXBWK9Xaf7295O4AjoA0225xPlaaEgrEQ=; b=CHsp+VJtxigdEuZfuszPRs6SZR/5C4lDzpmRn8gAgaCBP7szgvIpJfl+menPBWU3tV ORtSMr5VE4tSRxzqt0yUJZiSGtyAhDpjq7yA5223MfOeFhsBHOJySUcas55TRLmPjuYF XVyGmI9OMS3b4cXzF5lj9OxpAab00O7JfgKIM= Received: by 10.204.138.216 with SMTP id b24mr6006031bku.68.1318582585212; Fri, 14 Oct 2011 01:56:25 -0700 (PDT) Received: from localhost ([78.157.92.5]) by mx.google.com with ESMTPS id k6sm7621784bkv.8.2011.10.14.01.56.23 (version=SSLv3 cipher=OTHER); Fri, 14 Oct 2011 01:56:23 -0700 (PDT) Date: Fri, 14 Oct 2011 11:54:05 +0300 From: Gleb Kurtsou To: Alexey Dokuchaev Message-ID: <20111014085405.GA3711@tops> References: <20111013181602.GA35354@regency.nsu.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20111013181602.GA35354@regency.nsu.ru> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: fs@freebsd.org Subject: Re: Call for msdosfs/ntfs experts (or better, maintainers) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 09:18:45 -0000 On (14/10/2011 01:16), 
Alexey Dokuchaev wrote: > Hello there, > > For quite a while already, our FAT and NTFS support need some love to > shine on them, I believe. AFAICT, they currently have no maintainers > and thus are not receiving proper care. > > Case 1. FreeBSD still has problems with UTF-8 locale and correct > handling of e.g. Chinese characters in both filesystems. Patches were > worked out to address this problem; they are available here: > > http://deadshot.googlecode.com/svn/trunk/freebsd-patch/filesystem/ > > PR kern/133174 was filed on 29 Mar 2009 with the original patch for > msdosfs, which I've cleaned up a bit per style(9). No action was taken > since then. > > I've contacted the original author of these patches. He's very > collaborative and is eager to provide all the guidance required to > review and include these changes in our code base. Any takers? It's > a shame for us not to be on par with Apple and even OpenBSD/NetBSD (as > I've been told, they support UTF-8 out of the box). > > Case 2. Apparently, Apple actually released their NTFS implementation > under BSD license which seems quite worthy to take a look at: > > http://opensource.apple.com/source/ntfs/ntfs-78/kext/ Sounds very interesting. I'd pick it up and do a FreeBSD port, but I won't be able to start for at least two months. I've just started reviewing and testing this year's FUSE GSoC project. It works quite OK for me (ntfs, encfs), no more random panics etc. But I was told it somewhat reduces the number of supported file systems. I'm going to publish it on github soon. The question remains whether it's worth doing the port from Darwin if we have a working FUSE ntfs. How mature is the Darwin implementation? Is there a public source repository with history? 
attilio@ is working on MPSAFE NTFS: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS > > Not the userland part (open source but under APSL only) but perhaps we > don't really care about these parts since they are mostly mount utilities > and are trivial to rewrite from scratch (quoting delphij@). > > Are there plans to make at least some use of this code? Our NTFS > implementation right now loses considerably to other Unixen, and since > its original author and de-jure maintainer is long gone from FreeBSD, we > should do something about it. > > Not being a FS hacker myself, is there anything I can do to expedite the > progress on these issues (apart from becoming one)? > > ./danfe > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 09:20:33 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 288801065670 for ; Fri, 14 Oct 2011 09:20:33 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4]) by mx1.freebsd.org (Postfix) with ESMTP id E1AD38FC08 for ; Fri, 14 Oct 2011 09:20:32 +0000 (UTC) Received: from elsa.codelab.cz (localhost [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 9869028431 for ; Fri, 14 Oct 2011 11:20:31 +0200 (CEST) Received: from [192.168.1.2] (ip-86-49-61-235.net.upcbroadband.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id 9269C28429 for ; Fri, 14 Oct 2011 11:20:30 +0200 (CEST) Message-ID: <4E97FEDD.7060205@quip.cz> Date: Fri, 14 Oct 2011 11:20:29 +0200 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; 
rv:1.9.1.19) Gecko/20110420 Lightning/1.0b1 SeaMonkey/2.0.14 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-2; format=flowed Content-Transfer-Encoding: 7bit Subject: dirhash and dynamic memory allocation X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 09:20:33 -0000 Hi all, I tried some tuning of dirhash on our servers and after googling a bit, I found an old GSoC project wiki page about Dynamic Memory Allocation for Dirhash: http://wiki.freebsd.org/DirhashDynamicMemory Is there any reason not to use it / not commit it to HEAD? And a second question - is there any negative impact with higher vfs.ufs.dirhash_maxmem? It still defaults to 2MB (on FreeBSD 8.2) after 10 years, but I think we are all using bigger filesystems these days, with lots of files and directories, and 2MB is not enough. Miroslav Lachman From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 10:35:32 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 565DD1065670 for ; Fri, 14 Oct 2011 10:35:32 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id D63568FC20 for ; Fri, 14 Oct 2011 10:35:31 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1REf6v-0003Aj-TB for freebsd-fs@freebsd.org; Fri, 14 Oct 2011 12:35:29 +0200 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 14 Oct 2011 12:35:29 +0200 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 14 Oct 2011 12:35:29 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org 
From: Ivan Voras Date: Fri, 14 Oct 2011 12:35:05 +0200 Lines: 53 Message-ID: References: <4E97FEDD.7060205@quip.cz> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig4D4328D43A65108185353101" X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:7.0.1) Gecko/20111004 Thunderbird/7.0.1 In-Reply-To: <4E97FEDD.7060205@quip.cz> X-Enigmail-Version: 1.1.2 Subject: Re: dirhash and dynamic memory allocation X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 10:35:32 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig4D4328D43A65108185353101 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 14/10/2011 11:20, Miroslav Lachman wrote: > Hi all, > > I tried some tuning of dirhash on our servers and after googling a bit, I > found an old GSoC project wiki page about Dynamic Memory Allocation for > Dirhash: http://wiki.freebsd.org/DirhashDynamicMemory > Is there any reason not to use it / not commit it to HEAD? AFAIK it's sort-of already present. In 8-stable and recent kernels you can give huge amounts of memory to dirhash via vfs.ufs.dirhash_maxmem (but except in really large edge cases I don't think you *need* more than 32 MB), and the kernel will scale down or free the memory if not needed. In effect, vfs.ufs.dirhash_maxmem is the upper limit - the kernel will use less and will free the allocated memory in low-memory situations (which I've tried, and it works). > And a second question - is there any negative impact with higher > vfs.ufs.dirhash_maxmem? It still defaults to 2MB (on FreeBSD 8.2) after Not that I know of. 
> 10 years, but I think we all are using bigger FS in these days with lot= > of files and directories and 2MB is not enough. AFAIK I've changed it to autotune so it's configured to approximately 4 MB on a 4 GB machine (and scales up) in 9. --------------enig4D4328D43A65108185353101 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6YEF8ACgkQldnAQVacBcgr/QCeLLxe/5TD3zr4EKC9DGG8dmhC q5oAoPrVCYgrh5bFIl7CwSzEIgc45Ty3 =NAht -----END PGP SIGNATURE----- --------------enig4D4328D43A65108185353101-- From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 11:20:38 2011 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9DA57106566C for ; Fri, 14 Oct 2011 11:20:38 +0000 (UTC) (envelope-from danfe@regency.nsu.ru) Received: from mx.nsu.ru (r2b9.nsu.ru [212.192.164.39]) by mx1.freebsd.org (Postfix) with ESMTP id 3F7588FC17 for ; Fri, 14 Oct 2011 11:20:37 +0000 (UTC) Received: from regency.nsu.ru ([193.124.210.26]) by mx.nsu.ru with esmtp (Exim 4.69) (envelope-from ) id 1REfo1-00083R-Ua; Fri, 14 Oct 2011 18:20:02 +0700 Received: from regency.nsu.ru (localhost [127.0.0.1]) by regency.nsu.ru (8.14.2/8.14.2) with ESMTP id p9EBM2lc089310; Fri, 14 Oct 2011 18:22:02 +0700 (NOVST) (envelope-from danfe@regency.nsu.ru) Received: (from danfe@localhost) by regency.nsu.ru (8.14.2/8.14.2/Submit) id p9EBLkvA089146; Fri, 14 Oct 2011 18:21:46 +0700 (NOVST) (envelope-from danfe) Date: Fri, 14 Oct 2011 18:21:46 +0700 From: Alexey Dokuchaev To: Gleb Kurtsou Message-ID: <20111014112146.GA80058@regency.nsu.ru> References: <20111013181602.GA35354@regency.nsu.ru> <20111014085405.GA3711@tops> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline In-Reply-To: <20111014085405.GA3711@tops> User-Agent: Mutt/1.4.2.1i Cc: fs@freebsd.org Subject: Re: Call for msdosfs/ntfs experts (or better, maintainers) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 11:20:38 -0000 On Fri, Oct 14, 2011 at 11:54:05AM +0300, Gleb Kurtsou wrote: > > Case 2. Apparently, Apple actually released their NTFS implementation > > under BSD license which seems quite worthy to take a look at: > > > > http://opensource.apple.com/source/ntfs/ntfs-78/kext/ > > > Sounds very interesting. I'd peek it up and do a FreeBSD port, but I > won't be able to start at least in two months. I've just started > reviewing and testing this year FUSE GSoC project. It works quite ok for > me (ntfs, encfs), no more random panics etc. > > Question remains if it's worth doing the port from darwin if we have > working fuse ntfs. Real in-kernel FS implementation should be faster; also, some people do not like extra level (that is, fuse) sitting between their data and system. > How mature darwin implementation is? Is there public source repository > with history? If this code is what they actually use in Mac OS X, I would assume it should be at least not worse than ours. :-) As for public repo, perhaps it could be arranged if we contact Apple and/or Anton Altaparmakov (assuming he is the author of this code). 
./danfe From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 12:33:21 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 94B42106564A; Fri, 14 Oct 2011 12:33:21 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4]) by mx1.freebsd.org (Postfix) with ESMTP id 527DF8FC15; Fri, 14 Oct 2011 12:33:21 +0000 (UTC) Received: from elsa.codelab.cz (localhost [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 7AE4228435; Fri, 14 Oct 2011 14:33:20 +0200 (CEST) Received: from [192.168.1.2] (ip-86-49-61-235.net.upcbroadband.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id 56B9F28434; Fri, 14 Oct 2011 14:33:19 +0200 (CEST) Message-ID: <4E982C0E.2060900@quip.cz> Date: Fri, 14 Oct 2011 14:33:18 +0200 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1.19) Gecko/20110420 Lightning/1.0b1 SeaMonkey/2.0.14 MIME-Version: 1.0 To: Ivan Voras References: <4E97FEDD.7060205@quip.cz> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: dirhash and dynamic memory allocation X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 12:33:21 -0000 Ivan Voras wrote: > On 14/10/2011 11:20, Miroslav Lachman wrote: >> Hi all, >> >> I tried some tuning of dirhash on our servers and after googlig a bit, I >> found an old GSoC project wiki page about Dynamic Memory Allocation for >> Dirhash: http://wiki.freebsd.org/DirhashDynamicMemory >> Is there any reason not to use it / not commit it to HEAD? 
> > AFAIK it's sort-of already present. In 8-stable and recent kernels you > can give huge amounts of memory to dirhash via vfs.ufs.dirhash_maxmem > (but except in really large edge cases I don't think you *need* more > than 32 MB), and the kernel will scale-down or free the memory if not > needed. Is this change documented somewhere? Maybe it could be noted on the DirhashDynamicMemory wiki page; otherwise it looks like an abandoned GSoC project. > In effect, vfs.ufs.dirhash_maxmem is the upper limit - the kernel will > use less and will free the allocated memory in low memory situations > (which I've tried and it works). > >> And second question - is there any negative impact with higher >> vfs.ufs.dirhash_maxmem? It still defaults to 2MB (on FreeBSD 8.2) after > > Not that I know of. > >> 10 years, but I think we all are using bigger FS in these days with lot >> of files and directories and 2MB is not enough. > > AFAIK I've changed it to autotune so it's configured to approximately 4 > MB on a 4 GB machine (and scales up) in 9. I don't have 9 installed to test it, only 8-STABLE. I hope to test it soon. Thank you for the information! 
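For anyone wanting to try this on 8.x, raising the cap survives reboots if put in /etc/sysctl.conf. This is only a sketch: the 32 MB value is the rough practical ceiling Ivan mentions above, and since the kernel frees dirhash memory under pressure anyway, the setting is just an upper bound, not a reservation:

```
# /etc/sysctl.conf -- raise the dirhash ceiling; vfs.ufs.dirhash_maxmem
# is a runtime tunable, so it can also be set live with sysctl(8).
vfs.ufs.dirhash_maxmem=33554432   # 32 * 1024 * 1024 bytes = 32 MB
```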
Miroslav Lachman From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 13:52:23 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 18CD11065672 for ; Fri, 14 Oct 2011 13:52:23 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from QMTA11.westchester.pa.mail.comcast.net (qmta11.westchester.pa.mail.comcast.net [76.96.59.211]) by mx1.freebsd.org (Postfix) with ESMTP id BA04E8FC15 for ; Fri, 14 Oct 2011 13:52:22 +0000 (UTC) Received: from omta22.westchester.pa.mail.comcast.net ([76.96.62.73]) by QMTA11.westchester.pa.mail.comcast.net with comcast id kbvv1h0081ap0As5BdsNoV; Fri, 14 Oct 2011 13:52:23 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta22.westchester.pa.mail.comcast.net with comcast id kdsM1h00x1t3BNj3idsN1E; Fri, 14 Oct 2011 13:52:22 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 475C5102C1C; Fri, 14 Oct 2011 06:52:20 -0700 (PDT) Date: Fri, 14 Oct 2011 06:52:20 -0700 From: Jeremy Chadwick To: Miroslav Lachman <000.fbsd@quip.cz> Message-ID: <20111014135220.GA73637@icarus.home.lan> References: <4E97FEDD.7060205@quip.cz> <4E982C0E.2060900@quip.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4E982C0E.2060900@quip.cz> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org, Ivan Voras Subject: Re: dirhash and dynamic memory allocation X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 13:52:23 -0000 On Fri, Oct 14, 2011 at 02:33:18PM +0200, Miroslav Lachman wrote: > > > Ivan Voras wrote: > >On 14/10/2011 11:20, Miroslav Lachman wrote: > >>Hi all, > >> > >>I tried some tuning of dirhash on our servers and after googlig a bit, I > >>found an old GSoC project wiki page about Dynamic Memory 
Allocation for > >>Dirhash: http://wiki.freebsd.org/DirhashDynamicMemory > >>Is there any reason not to use it / not commit it to HEAD? > > > >AFAIK it's sort-of already present. In 8-stable and recent kernels you > >can give huge amounts of memory to dirhash via vfs.ufs.dirhash_maxmem > >(but except in really large edge cases I don't think you *need* more > >than 32 MB), and the kernel will scale-down or free the memory if not > >needed. > > Is this change documented somewhere? Maybe it could be noticed on > DirhashDynamicMemory wiki page. Otherwise it seems as abandoned GSoC > project. There is no real form of "documentation" for this kind of change, but I do remember it being discussed on the mailing list at some point (an announcement or something? I forget -- man it was a while ago). There's probably a commit message. I can dig it up if need be, but commit messages are often "Add Snake Wanglesnort support, I like samgyeopsal" and not much else. I'm not even sure what man page would be relevant to document such pieces. How about this? $ sysctl -a | grep dirhash vfs.ufs.dirhash_reclaimage: 5 vfs.ufs.dirhash_lowmemcount: 0 vfs.ufs.dirhash_docheck: 0 vfs.ufs.dirhash_mem: 1075212 vfs.ufs.dirhash_maxmem: 16777216 vfs.ufs.dirhash_minsize: 2560 $ for i in `sysctl -a | grep dirhash | cut -d: -f1`; do sysctl -d $i; done vfs.ufs.dirhash_reclaimage: max time in seconds of hash inactivity before deletion in low VM events vfs.ufs.dirhash_lowmemcount: number of times low memory hook called vfs.ufs.dirhash_docheck: enable extra sanity tests vfs.ufs.dirhash_mem: current dirhash memory usage vfs.ufs.dirhash_maxmem: maximum allowed dirhash memory usage vfs.ufs.dirhash_minsize: minimum directory size in bytes for which to use hashed lookup No idea what docheck does. But it looks like reclaimage (that's "reclaim age") is a pretty good indicator of what Ivan is describing. 
So basically, if you're worried about how much memory dirhash is taking up, you could log/graph vfs.ufs.dirhash_mem (unit = bytes). > >In effect, vfs.ufs.dirhash_maxmem is the upper limit - the kernel will > >use less and will free the allocated memory in low memory situations > >(which I've tried and it works). > > > >>And second question - is there any negative impact with higher > >>vfs.ufs.dirhash_maxmem? It stil defaults to 2MB (on FreeBSD 8.2) after > > > >Not that I know of. > > > >>10 years, but I think we all are using bigger FS in these days with lot > >>of files and directories and 2MB is not enough. > > > >AFAIK I've changed it to autotune so it's configured to approximately 4 > >MB on a 4 GB machine (and scales up) in 9. > > I don't have 9 installed to test it. Only 8-STABLE. Hope I will test > it soon. The above data I showed was taken from a RELENG_8 box. The dirhash change in question I believe was either implemented in very early 8.x, or very late 7.x. Again: I can dig up the commit if requested. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. 
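The logging idea above could be sketched as a small cron job. This is hypothetical: the sysctl names are the real FreeBSD 8.x dirhash tunables quoted earlier in this message, but the log path and sampling policy are made up for illustration:

```shell
#!/bin/sh
# Append one timestamped sample of dirhash memory use (plus percent of
# the cap) to a CSV; run from cron(8) and graph the file elsewhere.
LOG=/var/log/dirhash_mem.csv

# Integer percentage of $1 (current bytes) against $2 (cap bytes).
pct_used() { echo $(( $1 * 100 / $2 )); }

sample() {
    cur=$(sysctl -n vfs.ufs.dirhash_mem)
    max=$(sysctl -n vfs.ufs.dirhash_maxmem)
    printf '%s,%s,%s,%s\n' "$(date +%s)" "$cur" "$max" \
        "$(pct_used "$cur" "$max")"
}
# e.g. a crontab line: */5 * * * * /path/to/this_script >> $LOG
```

With the values Jeremy shows above (1075212 of 16777216 bytes), the last column would read 6, i.e. dirhash is nowhere near its cap on that box.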
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 14:07:36 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D6CE3106564A for ; Fri, 14 Oct 2011 14:07:36 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id 813D88FC16 for ; Fri, 14 Oct 2011 14:07:36 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id p9EE7Z8i018557; Fri, 14 Oct 2011 09:07:35 -0500 (CDT) Date: Fri, 14 Oct 2011 09:07:35 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: Jeremy Chadwick In-Reply-To: <20111014090000.GA66602@icarus.home.lan> Message-ID: References: <4E97F710.8000004@fsn.hu> <20111014090000.GA66602@icarus.home.lan> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Fri, 14 Oct 2011 09:07:35 -0500 (CDT) Cc: freebsd-fs@freebsd.org Subject: Re: cache devices come up as dsk/original_device_name in zpools X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 14:07:36 -0000 On Fri, 14 Oct 2011, Jeremy Chadwick wrote: > > I'm not sure how to go about doing that, especially on a system which > lacks /boot/zfs/zpool.cache. A one-time "zpool export" then a reboot, I > imagine, would suffice, but I'm not sure if export actually changes the > metadata on the disk itself or just updates the zpool.cache file. 
Clearly export changes disk metadata since a zfs pool knows if it is (was) currently imported on a machine and will complain if an attempt is made to import the pool on another machine without it being exported first. Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 14:50:38 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4F843106567B for ; Fri, 14 Oct 2011 14:50:38 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id 0F0E28FC1A for ; Fri, 14 Oct 2011 14:50:37 +0000 (UTC) Received: by gyd8 with SMTP id 8so1533536gyd.13 for ; Fri, 14 Oct 2011 07:50:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=18D+nV6xM8IixPxl5ZCXus4ohDyRJy1f+Fa5QDAxEeo=; b=wMS2/zN5PvfXwEd2OD9HPKmIu7bptj9TRi6rGxVU+eC9RTsO0uMZsUp8bjW7Ju497Q NITmaJvd6LvXbagG5IVdRmutnT0P1jhk/TM9/0xnU66gfcMqi/SExvgkmYANdbVLy93I vJPlfIt1SV6ZG8kLtH3BIlimDzOz1+WlV/S4g= MIME-Version: 1.0 Received: by 10.236.187.101 with SMTP id x65mr12364384yhm.63.1318603837498; Fri, 14 Oct 2011 07:50:37 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.236.103.33 with HTTP; Fri, 14 Oct 2011 07:50:37 -0700 (PDT) In-Reply-To: <4E97D24C.4010606@o2.pl> References: <20111013120032.D6BA71065760@hub.freebsd.org> <4E97D24C.4010606@o2.pl> Date: Fri, 14 Oct 2011 07:50:37 -0700 X-Google-Sender-Auth: qqWSdnnf0BWaDCw5MwoSeNeNjWU Message-ID: From: Artem Belevich To: =?ISO-8859-2?Q?Radio_m=B3odych_bandyt=F3w?= Content-Type: text/plain; charset=ISO-8859-2 
Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 14:50:38 -0000 2011/10/13 Radio młodych bandytów : > On 2011-10-13 14:00, freebsd-fs-request@freebsd.org wrote: >> >> An option is not to compress with ZFS but rather directly with gzip; however I >> would still need lots of temporary storage for manipulation, which is what >> I am doing now (e.g., sort). Processing with zcat isn't always a good >> solution because some applications want files, but you have to do what you >> have to do. > > It seems that with your data gzipping directly is a better option. Though I > suggest that you experiment with codecs that support a larger dictionary, i.e. > 7zip. I expect that you would see a huge strength improvement with something > like 7z a -mx=1 -md=26 out.7z in. You can use higher -md values if you have > enough memory; compression mode 1 (-mx=1) uses 4.5*2^md bytes of RAM, so if > my maths is good, md=26 uses ~288 MB. If LZMA is too slow, you can at least > try 7-zip's deflate64. It's not great, but not as bad as gzip. Yup. A stand-alone archiver may work better. ZFS compression works on blocks: subsequent blocks can't benefit from the data gathered while compressing the preceding block, so the overall compression rate with ZFS will be lower than that of stand-alone gzip at the same compression level. On the other hand, ZFS will parallelize compression, and on a multi-core machine it will compress the same amount of data in less time than a single instance of gzip would. 
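The memory rule of thumb quoted above is easy to sanity-check. A small sketch (the 4.5*2^md formula comes from the mail above, not from 7-Zip's own documentation, so treat it as an approximation):

```shell
#!/bin/sh
# Estimate RAM for 7z's -mx=1 LZMA mode from the 4.5 * 2^md rule of
# thumb; 4.5 * 2^md == 9 * 2^(md-1), reported in whole MB.
lzma_mx1_mem_mb() { echo $(( 9 * (1 << ($1 - 1)) / 1048576 )); }

lzma_mx1_mem_mb 26   # ~288 MB for -md=26, matching the figure above
```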
--Artem From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 15:23:11 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BA9D7106564A for ; Fri, 14 Oct 2011 15:23:11 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from QMTA11.westchester.pa.mail.comcast.net (qmta11.westchester.pa.mail.comcast.net [76.96.59.211]) by mx1.freebsd.org (Postfix) with ESMTP id 67A848FC08 for ; Fri, 14 Oct 2011 15:23:11 +0000 (UTC) Received: from omta22.westchester.pa.mail.comcast.net ([76.96.62.73]) by QMTA11.westchester.pa.mail.comcast.net with comcast id kf5h1h0031ap0As5BfPBWJ; Fri, 14 Oct 2011 15:23:11 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta22.westchester.pa.mail.comcast.net with comcast id kfPA1h03z1t3BNj3ifPBni; Fri, 14 Oct 2011 15:23:11 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id A4156102C1C; Fri, 14 Oct 2011 08:23:09 -0700 (PDT) Date: Fri, 14 Oct 2011 08:23:09 -0700 From: Jeremy Chadwick To: freebsd-fs@freebsd.org Message-ID: <20111014152309.GA75162@icarus.home.lan> References: <20111012165126.GA26562@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20111012165126.GA26562@icarus.home.lan> User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 15:23:11 -0000 On Wed, Oct 12, 2011 at 09:51:26AM -0700, Jeremy Chadwick wrote: > On Wed, Oct 12, 2011 at 09:37:10AM -0700, David Brodbeck wrote: > > On Wed, Oct 12, 2011 at 5:02 AM, Johannes Totz wrote: > > > > > I just did a simple write test yesterday: > > > > > > 1) 6 MB/sec for gzip, 1.36x ratio > > > 2) 34 MB/sec for lzjb, 1.23x ratio > > > > > > I'll stick with lzjb. 
It's good enough to get rid of most of the > > > redundancy and speed is acceptable. > > > > > > > That's what we use on our text-heavy filesystems on our OpenSolaris server. > > (We work with large text corpora, so we have hundreds of gigabytes of pure > > text.) My benchmarks showed the performance hit for reads is nonexistent > > when viewed over NFS, and the performance hit for writes is relatively > > small...plus we don't write to that filesystem much. We see about 1.5x > > compression overall, with a little over 2x on some datasets that are > > particularly compressible. > > That might be the case on OpenSolaris but the performance hit on > FreeBSD RELENG_8 is very high -- enough that enabling compression (using > the defaults) causes stalls when I/O occurs (easily noticeable across > SSH; characters are delayed/stalled (not buffered)), etc. > > The last time I tried it on RELENG_8 was right after ZFSv28 was MFC'd. > If things have improved I can try again (I don't remember seeing any > commits that could affect this), or if people really think changing the > compression model to lzjb will help. Follow-up: Tried this out on RELENG_8 (2011/09/28) with ZFS v28 filesystems: zfs create -o mountpoint=/test -o compression=lzjb data/test cd /test dd if=/dev/urandom of=testfile bs=16k Then in another SSH session to the machine, held down a single key at my bash prompt. Every 3 seconds, like clockwork, SSH I/O would stall/drop for ~0.4 seconds. The CPU in the system is an Intel C2D E8400 (3GHz), with the ULE scheduler in use. Then I did this: rm testfile zfs set compression=none data/test dd if=/dev/urandom of=testfile bs=16k And repeated the procedure: no stalls. Then I tried using gzip-1: even worse. The stalls were about 3-4 full seconds long. I imagine this is expected since gzip is much slower than lzjb. The important thing to note here is that the entire machine appears to spin hard when ZFS compression is in use.
Even things like switching virtual consoles (Alt-Fx) lock up until the compression bits do whatever they need to do. I will try to find a Solaris 10 test system at work later today and tinker with compression there to see if it behaves the same, but given what Bob described below I'm doubting it will: http://lists.freebsd.org/pipermail/freebsd-fs/2011-October/012726.html Just an FYI for folks considering use of ZFS compression on RELENG_8 as of this writing. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 16:34:53 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 09A88106566C; Fri, 14 Oct 2011 16:34:53 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4]) by mx1.freebsd.org (Postfix) with ESMTP id 57C198FC08; Fri, 14 Oct 2011 16:34:51 +0000 (UTC) Received: from elsa.codelab.cz (localhost [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 1B7D22842C; Fri, 14 Oct 2011 18:34:50 +0200 (CEST) Received: from [192.168.1.2] (ip-86-49-61-235.net.upcbroadband.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id 4C81E2842A; Fri, 14 Oct 2011 18:34:48 +0200 (CEST) Message-ID: <4E9864A7.9030209@quip.cz> Date: Fri, 14 Oct 2011 18:34:47 +0200 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1.19) Gecko/20110420 Lightning/1.0b1 SeaMonkey/2.0.14 MIME-Version: 1.0 To: Jeremy Chadwick References: <4E97FEDD.7060205@quip.cz> <4E982C0E.2060900@quip.cz> <20111014135220.GA73637@icarus.home.lan> In-Reply-To: <20111014135220.GA73637@icarus.home.lan> 
Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, Ivan Voras Subject: Re: dirhash and dynamic memory allocation X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 16:34:53 -0000 Jeremy Chadwick wrote: > On Fri, Oct 14, 2011 at 02:33:18PM +0200, Miroslav Lachman wrote: >> >> >> Ivan Voras wrote: >>> On 14/10/2011 11:20, Miroslav Lachman wrote: >>>> Hi all, >>>> >>>> I tried some tuning of dirhash on our servers and after googlig a bit, I >>>> found an old GSoC project wiki page about Dynamic Memory Allocation for >>>> Dirhash: http://wiki.freebsd.org/DirhashDynamicMemory >>>> Is there any reason not to use it / not commit it to HEAD? >>> >>> AFAIK it's sort-of already present. In 8-stable and recent kernels you >>> can give huge amounts of memory to dirhash via vfs.ufs.dirhash_maxmem >>> (but except in really large edge cases I don't think you *need* more >>> than 32 MB), and the kernel will scale-down or free the memory if not >>> needed. >> >> Is this change documented somewhere? Maybe it could be noticed on >> DirhashDynamicMemory wiki page. Otherwise it seems as abandoned GSoC >> project. > > There is no real form of "documentation" for this kind of change, but I > do remember it being discussed on the mailing list at some point (an > announcement or something? I forget -- man it was a while ago). I didn't mean real doc (man page or handbook), but just some official place (release notes?) stating the change of the dirhash behavior. > There's probably a commit message. I can dig it up if need be, but > commit messages are often "Add Snake Wanglesnort support, I like > samgyeopsal" and not much else. I'm not even sure what man page would > be relevant to document such pieces. How about this? 
> > $ sysctl -a | grep dirhash > vfs.ufs.dirhash_reclaimage: 5 > vfs.ufs.dirhash_lowmemcount: 0 > vfs.ufs.dirhash_docheck: 0 > vfs.ufs.dirhash_mem: 1075212 > vfs.ufs.dirhash_maxmem: 16777216 > vfs.ufs.dirhash_minsize: 2560 > > $ for i in `sysctl -a | grep dirhash | cut -d: -f1`; do sysctl -d $i; done > vfs.ufs.dirhash_reclaimage: max time in seconds of hash inactivity before deletion in low VM events > vfs.ufs.dirhash_lowmemcount: number of times low memory hook called > vfs.ufs.dirhash_docheck: enable extra sanity tests > vfs.ufs.dirhash_mem: current dirhash memory usage > vfs.ufs.dirhash_maxmem: maximum allowed dirhash memory usage > vfs.ufs.dirhash_minsize: minimum directory size in bytes for which to use hashed lookup Yes, I did this listing in the past when I tried tuning dirhash on heavily loaded servers. > No idea what docheck does. But it looks like reclaimage (that's > "reclaim age") is a pretty good indicator of what Ivan is describing. I don't know the real purpose of docheck either, but it is mentioned in a 10-year-old paper, http://www.maths.tcd.ie/~dwmalone/p/bsdcon01.pdf: "Make sure vfs.ufs.dirhash docheck set to 0." Nothing more. > So basically, if you're worried about how much memory dirhash is taking > up, you could log/graph vfs.ufs.dirhash_mem (unit = bytes). I know this; my concern (and surprise) is that after some change the default value of dirhash_maxmem remains at 2 MB, which is why I think the dynamic allocation GSoC was not committed. >>> In effect, vfs.ufs.dirhash_maxmem is the upper limit - the kernel will >>> use less and will free the allocated memory in low memory situations >>> (which I've tried and it works). >>> >>>> And second question - is there any negative impact with higher >>>> vfs.ufs.dirhash_maxmem? It still defaults to 2MB (on FreeBSD 8.2) after >>> >>> Not that I know of. >>> >>>> 10 years, but I think we are all using bigger filesystems these days, with lots >>>> of files and directories, and 2MB is not enough.
>>> >>> AFAIK I've changed it to autotune so it's configured to approximately 4 >>> MB on a 4 GB machine (and scales up) in 9. >> >> I don't have 9 installed to test it. Only 8-STABLE. Hope I will test >> it soon. > > The above data I showed was taken from a RELENG_8 box. The dirhash > change in question I believe was either implemented in very early 8.x, > or very late 7.x. Again: I can dig up the commit if requested. The data is the same on one of our old 7.2 boxes, and not much different on 4.11 (default values of min and max): sysctl -a | grep dirhash vfs.ufs.dirhash_minsize: 2560 vfs.ufs.dirhash_maxmem: 2097152 vfs.ufs.dirhash_mem: 1954584 vfs.ufs.dirhash_docheck: 0 And this is from an already-tuned 7.3 box: sysctl -a | grep dirhash vfs.ufs.dirhash_reclaimage: 5 vfs.ufs.dirhash_lowmemcount: 20823 vfs.ufs.dirhash_docheck: 0 vfs.ufs.dirhash_mem: 5122176 vfs.ufs.dirhash_maxmem: 8388608 vfs.ufs.dirhash_minsize: 2560 When I read the wiki page about dynamic allocation, I thought there must be some bigger change in sysctl (no maxmem at all, or a much higher default value). The current state requires me to manually raise maxmem to 8M (or more) on almost all our webservers and fileservers. That's why I asked the question on the list, because nothing indicated the change in dirhash. (Ivan already wrote about the changes, so my question is answered :} ) Anyway - thank you for your reply.
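For reference, the tuning Miroslav describes boils down to a single sysctl; a sketch follows (the 8 MB figure is his example for busy web/file servers, not an official recommendation):

```shell
# Raise the dirhash ceiling at runtime (8 MB, as in the tuned 7.3 box above):
sysctl vfs.ufs.dirhash_maxmem=8388608

# Make it persistent across reboots by adding this line to /etc/sysctl.conf:
#   vfs.ufs.dirhash_maxmem=8388608

# Then watch actual usage to see whether the new ceiling is ever reached:
sysctl vfs.ufs.dirhash_mem vfs.ufs.dirhash_maxmem
```

Logging or graphing vfs.ufs.dirhash_mem over time, as suggested earlier in the thread, shows whether the ceiling is actually the bottleneck before raising it further.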
Miroslav Lachman From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 17:15:26 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9F42C106564A for ; Fri, 14 Oct 2011 17:15:26 +0000 (UTC) (envelope-from prvs=1268c3f8f1=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 14B4E8FC0C for ; Fri, 14 Oct 2011 17:15:25 +0000 (UTC) X-MDAV-Processed: mail1.multiplay.co.uk, Fri, 14 Oct 2011 18:04:56 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Fri, 14 Oct 2011 18:04:56 +0100 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on mail1.multiplay.co.uk X-Spam-Level: X-Spam-Status: No, score=-5.0 required=6.0 tests=USER_IN_WHITELIST shortcircuit=ham autolearn=disabled version=3.2.5 Received: from r2d2 ([188.220.16.49]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50015582991.msg for ; Fri, 14 Oct 2011 18:04:55 +0100 X-MDRemoteIP: 188.220.16.49 X-Return-Path: prvs=1268c3f8f1=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk X-MDaemon-Deliver-To: freebsd-fs@freebsd.org Message-ID: <480AA0893BC748759163FE8D677EF0D8@multiplay.co.uk> From: "Steven Hartland" To: "Jeremy Chadwick" , References: <20111012165126.GA26562@icarus.home.lan> <20111014152309.GA75162@icarus.home.lan> Date: Fri, 14 Oct 2011 18:04:46 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6109 Cc: Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Fri, 14 Oct 2011 17:15:26 -0000 ----- Original Message ----- From: "Jeremy Chadwick" To: Sent: Friday, October 14, 2011 4:23 PM Subject: Re: ZFS/compression/performance > On Wed, Oct 12, 2011 at 09:51:26AM -0700, Jeremy Chadwick wrote: >> On Wed, Oct 12, 2011 at 09:37:10AM -0700, David Brodbeck wrote: >> > On Wed, Oct 12, 2011 at 5:02 AM, Johannes Totz wrote: >> > >> > > I just did a simple write test yesterday: >> > > >> > > 1) 6 MB/sec for gzip, 1.36x ratio >> > > 2) 34 MB/sec for lzjb, 1.23x ratio >> > > >> > > I'll stick with lzjb. It's good enough to get rid of most of the >> > > redundancy and speed is acceptable. >> > > >> > >> > That's what we use on our text-heavy filesystems on our OpenSolaris server. >> > (We work with large text corpora, so we have hundreds of gigabytes of pure >> > text.) My benchmarks showed the performance hit for reads is nonexistent >> > when viewed over NFS, and the performance hit for writes is relatively >> > small...plus we don't write to that filesystem much. We see about 1.5x >> > compression overall, with a little over 2x on some datasets that are >> > particularly compressible. >> >> That might be the case on OpenSolaris but the performance hit on >> FreeBSD RELENG_8 is very high -- enough that enabling compression (using >> the defaults) causes stalls when I/O occurs (easily noticeable across >> SSH; characters are delayed/stalled (not buffered)), etc.. >> >> The last time I tried it on RELENG_8 was right after ZFSv28 was MFC'd. >> If things have improved I can try again (I don't remember seeing any >> commits that could affect this), or if people really think changing the >> compression model to lzjb will help. > > Follow-up: > > Tried this out on RELENG_8 (2011/09/28) with ZFS v28 filesystems: > > zfs create -o mountpoint=/test -o compression=lzjb data/test > cd /test > dd if=/dev/urandom of=testfile bs=16k > > Then in another SSH session to the machine, held down a single key at my > bash prompt. 
Every 3 seconds, like clockwork, SSH I/O would stall/drop > for about ~0.4 seconds. CPU in the system is an Intel C2D E8400 (3GHz), > with ULE scheduler in use. > > Then I did this: > > rm testfile > zfs set compression=none data/test > dd if=/dev/urandom of=testfile bs=16k > > And repeated the procedure: no stalls. > > Then I tried using gzip-1: even worse. The stalls were about 3-4 full > seconds long. I imagine this is expected since gzip is much slower than > lzjb. > > The important thing to note here is that the entire machine appears to > spin hard when ZFS compression is in use. Even things like switching > virtual consoles (Alt-Fx) lock up until the compression bits do whatever > they need to do. > > I will try to find a Solaris 10 test system at work later today and > tinker with compression there to see if it behaves the same, but given > what Bob described below I'm doubting it will: > > http://lists.freebsd.org/pipermail/freebsd-fs/2011-October/012726.html > > Just an FYI for folks considering use of ZFS compression on RELENG_8 as > of this writing. We use compression heavily here on 8.2-RELEASE and can't say we've noticed such behaviour; we will do some testing to see if we can reproduce it here as well. Regards Steve ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.
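For anyone who wants to quantify such stalls rather than eyeball a held-down key over SSH, a crude probe like the sketch below (my own, with 1-second resolution, so it would catch the multi-second gzip stalls but not the ~0.4 s lzjb ones) can run in a second terminal while the dd is going:

```shell
#!/bin/sh
# Sleep 1 second per iteration; if an iteration's wall-clock time comes
# out as 3 seconds or more, the extra time was spent stalled elsewhere
# (scheduler starvation, I/O, compression threads, ...).
prev=$(date +%s)
for i in 1 2 3; do
    sleep 1
    now=$(date +%s)
    if [ $((now - prev)) -ge 3 ]; then
        echo "stall: iteration $i took $((now - prev))s"
    fi
    prev=$now
done
```

For a real test session, raise the iteration count to cover the whole dd run and redirect the output to a log file.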
From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 17:44:20 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 15ABE106564A for ; Fri, 14 Oct 2011 17:44:20 +0000 (UTC) (envelope-from bf1783@googlemail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id A64668FC14 for ; Fri, 14 Oct 2011 17:44:19 +0000 (UTC) Received: by wyj26 with SMTP id 26so4558572wyj.13 for ; Fri, 14 Oct 2011 10:44:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=mime-version:reply-to:date:message-id:subject:from:to:content-type; bh=HJtsqBjSXImBCRECqOjZKrhhvHjcs/CN88dRLc/S39Y=; b=mV7Z4PJNVa2R5A/B0OLkUbe5AW63yxqsyzPwV+rrPbQ5Wci8EtBMU4n46wWye/Jf4g 1cyc4Y5ToPTsQkjvV2nPcAu3zHO7xLI3/jyqQCFCt7uAoQ+MhnYbVbtFA8TRSVTZU5RU NlIErNh7gRtcRF7/OABn123+qroFp3ORqn0qo= MIME-Version: 1.0 Received: by 10.227.199.197 with SMTP id et5mr3300047wbb.89.1318612621285; Fri, 14 Oct 2011 10:17:01 -0700 (PDT) Received: by 10.180.105.164 with HTTP; Fri, 14 Oct 2011 10:17:01 -0700 (PDT) Date: Fri, 14 Oct 2011 13:17:01 -0400 Message-ID: From: "b. f." To: freebsd-fs@FreeBSD.org, Miroslav Lachman <000.fbsd@quip.cz> Content-Type: text/plain; charset=ISO-8859-1 Cc: Subject: Re: dirhash and dynamic memory allocation X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: bf1783@gmail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 17:44:20 -0000 Miroslav Lachman wrote: > Jeremy Chadwick wrote: > > On Fri, Oct 14, 2011 at 02:33:18PM +0200, Miroslav Lachman wrote: > >> Ivan Voras wrote: > >>> On 14/10/2011 11:20, Miroslav Lachman wrote: ... 
> >>>> I tried some tuning of dirhash on our servers and after googlig a bit, I > >>>> found an old GSoC project wiki page about Dynamic Memory Allocation for > >>>> Dirhash: http://wiki.freebsd.org/DirhashDynamicMemory > >>>> Is there any reason not to use it / not commit it to HEAD? ... > >> Is this change documented somewhere? Maybe it could be noticed on > >> DirhashDynamicMemory wiki page. Otherwise it seems as abandoned GSoC > >> project. > > > > There is no real form of "documentation" for this kind of change, but I > > do remember it being discussed on the mailing list at some point (an > > announcement or something? I forget -- man it was a while ago). > > I didn't mean real doc (man page or handbook), but just some official > place (release notes?) stating the change of the dirhash behavior. From the page you cited: "Get code in a state suitable for being committed to -CURRENT. Done ... (2009-7-7) I committed the dirhash vm_lowmem handler to -CURRENT about a month ago, and it will be included in 8.0-RELEASE. Also I plan to commit a backport of these changes to 7-STABLE, probably around September." From http://www.FreeBSD.org/releases/7.3R/relnotes-detailed.html : "UFS_DIRHASH (enabled by default) now supports removing the cache data when the system memory is low (via vm_lowmem event handler). A bug that the system caused a panic when decreasing a sysctl variable vfs.ufs.dirhash_maxmem below the current amount of memory used by UFS_DIRHASH, has been fixed." In the commit logs: http://svnweb.FreeBSD.org/base/head/sys/ufs/ufs/ufs_dirhash.c?view=log So the changes have been documented. Perhaps not in exhaustive detail, but enough to provide a basis for further inquiry. And as someone pointed out, there are the suggestively-named OIDs, and their descriptions. b.
From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 17:58:51 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AC75A106566B for ; Fri, 14 Oct 2011 17:58:51 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id 730E08FC0A for ; Fri, 14 Oct 2011 17:58:51 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id p9EHwo2Q020592; Fri, 14 Oct 2011 12:58:50 -0500 (CDT) Date: Fri, 14 Oct 2011 12:58:50 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: Jeremy Chadwick In-Reply-To: <20111014152309.GA75162@icarus.home.lan> Message-ID: References: <20111012165126.GA26562@icarus.home.lan> <20111014152309.GA75162@icarus.home.lan> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Fri, 14 Oct 2011 12:58:50 -0500 (CDT) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 17:58:51 -0000 On Fri, 14 Oct 2011, Jeremy Chadwick wrote: > > I will try to find a Solaris 10 test system at work later today and > tinker with compression there to see if it behaves the same, but given > what Bob described below I'm doubting it will: > > http://lists.freebsd.org/pipermail/freebsd-fs/2011-October/012726.html It needs to be a new enough Solaris 10 release. Perhaps U8 or U8 plus a kernel patch. 
If you run 'prstat' you will see at least one 'zpool-poolname' thread per pool. I notice that these are all priority 99 (very high) on my system while it is idle. Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 19:00:03 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 037FD1065673 for ; Fri, 14 Oct 2011 19:00:02 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4]) by mx1.freebsd.org (Postfix) with ESMTP id A92568FC08 for ; Fri, 14 Oct 2011 19:00:01 +0000 (UTC) Received: from elsa.codelab.cz (localhost [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 8E7DE28426; Fri, 14 Oct 2011 20:59:59 +0200 (CEST) Received: from [192.168.1.2] (ip-86-49-61-235.net.upcbroadband.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id 44A2328432; Fri, 14 Oct 2011 20:59:58 +0200 (CEST) Message-ID: <4E9886AD.3080207@quip.cz> Date: Fri, 14 Oct 2011 20:59:57 +0200 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1.19) Gecko/20110420 Lightning/1.0b1 SeaMonkey/2.0.14 MIME-Version: 1.0 To: bf1783@gmail.com References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org Subject: Re: dirhash and dynamic memory allocation X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 19:00:03 -0000 b. f. wrote: > Miroslav Lachman wrote: [...] 
>> I didn't mean real doc (man page or handbook), but just some official >> place (release notes?) stating the change of the dirhash behavior. > >> From the page you cited: > > "Get code in a state suitable for being committed to -CURRENT. Done > ... > (2009-7-7) I committed the dirhash vm_lowmem handler to -CURRENT about > a month ago, and it will be included in 8.0-RELEASE. Also I plan to > commit a backport of these changes to 7-STABLE, probably around > September." OK, I overlooked the sentence saying it was already committed. >> From http://www.FreeBSD.org/releases/7.3R/relnotes-detailed.html : > > "UFS_DIRHASH (enabled by default) now supports removing the cache data > when the system memory is low (via vm_lowmem event handler). A bug > that the system caused a panic when decreasing a sysctl variable > vfs.ufs.dirhash_maxmem below the current amount of memory used by > UFS_DIRHASH, has been fixed." That message reads like a bug-fix note to me (I am not a native English speaker, so it is rather my problem, and you are right that it was documented). > In the commit logs: > > http://svnweb.FreeBSD.org/base/head/sys/ufs/ufs/ufs_dirhash.c?view=log > > So the changes have been documented. Perhaps not in exhaustive > detail, but enough to provide a basis for further inquiry. And as > someone pointed out, there are the suggestively-named OIDs, and their > descriptions. The OIDs and their descriptions... well, max, min, etc. are suggestive, but I was not sure about "lowmemcount" > sysctl -d vfs.ufs.dirhash_lowmemcount vfs.ufs.dirhash_lowmemcount: number of times low memory hook called Was it meant as "the number of times dirhash_mem was equal to dirhash_maxmem and new entries could not be added", or rather "the number of times the system was low on free memory, so some dirhash entries were removed and some dirhash memory freed"? I initially thought the first, but now I think the second. Thank you for your detailed explanation and links. Shame on me!
(this week was hard in work... and it's friday) Miroslav Lachman From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 20:12:33 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 393C5106566C for ; Fri, 14 Oct 2011 20:12:33 +0000 (UTC) (envelope-from kraduk@gmail.com) Received: from mail-yx0-f177.google.com (mail-yx0-f177.google.com [209.85.213.177]) by mx1.freebsd.org (Postfix) with ESMTP id EC5228FC1A for ; Fri, 14 Oct 2011 20:12:32 +0000 (UTC) Received: by yxk36 with SMTP id 36so1539324yxk.8 for ; Fri, 14 Oct 2011 13:12:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=vMHeI19K2zwl1c4b5WY+uWFr/o8T4p2loGJHu9QH1iY=; b=dnCLflGh/B8Mh3jdo6kQZzsdcl9qy9jXwnijD9W9EyTzNpMAQNypMcW1prG8mvoKUK yj5JY24OPdhX2Eq1ixBkxUIplytPxxSud0c4Hp2RguwbWQC1o14u5LP+WlJVAEA25kuG 8iakpLp3axiQJ+2Q4vqTo/Cq0XzmKDbol8O3g= MIME-Version: 1.0 Received: by 10.236.79.231 with SMTP id i67mr14753110yhe.33.1318623152417; Fri, 14 Oct 2011 13:12:32 -0700 (PDT) Received: by 10.236.109.133 with HTTP; Fri, 14 Oct 2011 13:12:32 -0700 (PDT) In-Reply-To: <4E97D24C.4010606@o2.pl> References: <20111013120032.D6BA71065760@hub.freebsd.org> <4E97D24C.4010606@o2.pl> Date: Fri, 14 Oct 2011 21:12:32 +0100 Message-ID: From: krad To: =?ISO-8859-2?Q?Radio_m=B3odych_bandyt=F3w?= Content-Type: text/plain; charset=ISO-8859-2 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS/compression/performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 20:12:33 -0000 2011/10/14 Radio m=B3odych bandyt=F3w > On 2011-10-13 14:00, 
freebsd-fs-request@freebsd.org wrote: > >> An option is not to compress with ZFS but rather directly with gzip; however I would still need lots of temporary storage for manipulation, which is what I am doing now (e.g., sort). Processing with zcat isn't always a good solution because some applications want files, but you have to do what you have to do. >> > It seems that with your data gzipping directly is a better option. Though I suggest that you experiment with codecs that support a larger dictionary, e.g. 7zip; I expect that you would see a huge strength improvement with something like 7z a -mx=1 -md=26 out.7z in. You can use higher -md values if you have enough memory; compression mode 1 (mx=1) uses 4.5*2^md bytes of RAM, so if my maths is good, md=26 uses ~288 MB. If LZMA is too slow, you can at least try 7-zip's deflate64. It's not great, but not as bad as gzip. > > -- > Twoje radio > > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" If speed is an issue, make sure you get one of the multithreaded compression utilities, as most aren't, and that can often be the bottleneck. From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 21:15:08 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 38148106564A; Fri, 14 Oct 2011 21:15:08 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 0F02D8FC13; Fri, 14 Oct 2011 21:15:08 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9ELF7At008834; Fri, 14 Oct 2011 21:15:07 GMT (envelope-from
linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9ELF7Uf008829; Fri, 14 Oct 2011 21:15:07 GMT (envelope-from linimon) Date: Fri, 14 Oct 2011 21:15:07 GMT Message-Id: <201110142115.p9ELF7Uf008829@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/161579: [smbfs] FreeBSD sometimes panics when an smb share is mounted and the serving machine is disconnected/rebooted X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 21:15:08 -0000 Old Synopsis: FreeBSD sometimes panics when an smb share is mounted and the serving machine is disconnected/rebooted New Synopsis: [smbfs] FreeBSD sometimes panics when an smb share is mounted and the serving machine is disconnected/rebooted Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Fri Oct 14 21:14:56 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=161579 From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 21:21:05 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EB3AE106566C; Fri, 14 Oct 2011 21:21:05 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id C268D8FC0C; Fri, 14 Oct 2011 21:21:05 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9ELL5M7014106; Fri, 14 Oct 2011 21:21:05 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9ELL566014099; Fri, 14 Oct 2011 21:21:05 GMT (envelope-from linimon) Date: Fri, 14 Oct 2011 21:21:05 GMT Message-Id: <201110142121.p9ELL566014099@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/161511: [unionfs] Filesystem deadlocks when using multiple unionfs mounts on top of single filesystem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 21:21:06 -0000 Old Synopsis: Filesystem deadlocks when using multiple unionfs mounts on top of single filesystem New Synopsis: [unionfs] Filesystem deadlocks when using multiple unionfs mounts on top of single filesystem Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Fri Oct 14 21:20:54 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=161511 From owner-freebsd-fs@FreeBSD.ORG Fri Oct 14 21:21:39 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 204E31065675; Fri, 14 Oct 2011 21:21:39 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id EBE1A8FC17; Fri, 14 Oct 2011 21:21:38 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9ELLc75016476; Fri, 14 Oct 2011 21:21:38 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9ELLcEm016464; Fri, 14 Oct 2011 21:21:38 GMT (envelope-from linimon) Date: Fri, 14 Oct 2011 21:21:38 GMT Message-Id: <201110142121.p9ELLcEm016464@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/161533: [zfs] [panic] zfs receive panic: system ioctl returning with 1 locks held X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2011 21:21:39 -0000 Old Synopsis: zfs receive panic: system ioctl returning with 1 locks held New Synopsis: [zfs] [panic] zfs receive panic: system ioctl returning with 1 locks held Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Fri Oct 14 21:21:28 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=161533 From owner-freebsd-fs@FreeBSD.ORG Sat Oct 15 00:42:15 2011 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B773C106566B for ; Sat, 15 Oct 2011 00:42:15 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 70B5F8FC0C for ; Sat, 15 Oct 2011 00:42:15 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap8EAFXPmE6DaFvO/2dsb2JhbABDhHakdoFuAQEBAQMBAQEaBgQnHQMLGxgCAg0ZAikBCSYGCAcEARwEh2alVJF5gSyFOYEUBJFdghqRcQ X-IronPort-AV: E=Sophos;i="4.69,349,1315195200"; d="scan'208";a="141552640" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 14 Oct 2011 20:12:29 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 79DD1B3F2C; Fri, 14 Oct 2011 20:12:29 -0400 (EDT) Date: Fri, 14 Oct 2011 20:12:29 -0400 (EDT) From: Rick Macklem To: Gleb Kurtsou Message-ID: <241418811.3153064.1318637549468.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20111014085405.GA3711@tops> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: Alexey Dokuchaev , fs@freebsd.org Subject: Re: Call for msdosfs/ntfs experts (or better, maintainers) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 15 Oct 2011 00:42:15 -0000 Gleb Kurtsou wrote: > On (14/10/2011 01:16), Alexey Dokuchaev wrote: > > Hello there, > > > > For quite a while already, our FAT and NTFS 
support needs some love to shine on them, I believe. AFAICT, they currently have no maintainers and thus are not receiving proper care.
> >
> > Case 1. FreeBSD still has problems with the UTF-8 locale and correct handling of e.g. Chinese characters in both filesystems. Patches were worked out to address this problem; they are available here:
> >
> > http://deadshot.googlecode.com/svn/trunk/freebsd-patch/filesystem/
> >
> > PR kern/133174 was filed on 29 Mar 2009 with the original patch for msdosfs, which I've cleaned up a bit per style(9). No action has been taken since then.
> >
> > I've contacted the original author of these patches. He's very collaborative and is eager to provide all the guidance required to review and include these changes in our code base. Any takers? It's a shame for us not to be on par with Apple and even OpenBSD/NetBSD (as I've been told, they support UTF-8 out of the box).
> >
> > Case 2. Apparently, Apple actually released their NTFS implementation under a BSD license, which seems quite worthy of a look:
> >
> > http://opensource.apple.com/source/ntfs/ntfs-78/kext/
>
> Sounds very interesting. I'd pick it up and do a FreeBSD port, but I won't be able to start for at least two months. I've just started reviewing and testing this year's FUSE GSoC project. It works quite OK for me (ntfs, encfs), no more random panics etc. But I was told it somewhat reduces the number of supported file systems. I'm going to publish it on github soon.
>
> The question remains whether it's worth doing the port from Darwin if we have a working FUSE ntfs. How mature is the Darwin implementation? Is there a public source repository with history?

Just FYI, in case you haven't looked, the Mac OS X VFS/VOP interface is quite different from FreeBSD's these days, so the port won't be trivial.
(The VFS/VOP interfaces diverged mostly during the Panther->Tiger era, so if it was stable in Panther with the BSD license, that might be the place to start. The Panther kernel sources are xnu-5nn, but I have no idea if there is a BSD-licensed ntfs in the tarball.)

As far as I know, there are no public repos from Apple. All there is are source tarballs that correspond to the production releases. However, it can't hurt to check in case ntfs is an exception?

rick

> attilio@ is working on MPSAFE NTFS:
> http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS
>
> > Not the userland part (open source but under APSL only), but perhaps we don't really care about these parts since they are mostly mount utilities and are trivial to rewrite from scratch (quoting delphij@).
> >
> > Are there plans to make at least some use of this code? Our NTFS implementation right now loses considerably to other Unixen, and since its original author and de jure maintainer is long gone from FreeBSD, we should do something about it.
> >
> > Not being a FS hacker myself, is there anything I can do to expedite the progress on these issues (apart from becoming one)?
> > > > ./danfe > > _______________________________________________ > > freebsd-fs@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > > To unsubscribe, send any mail to > > "freebsd-fs-unsubscribe@freebsd.org" > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Sat Oct 15 18:41:15 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 58FBB106566C; Sat, 15 Oct 2011 18:41:15 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id D924C8FC0A; Sat, 15 Oct 2011 18:41:13 +0000 (UTC) Received: by eyd10 with SMTP id 10so2542175eyd.13 for ; Sat, 15 Oct 2011 11:41:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=from:to:cc:subject:references:x-comment-to:sender:date:message-id :user-agent:mime-version:content-type; bh=KyaBxG4AV2wYQu7b84aZ/EyfEfivtcMTiE3l8at5k+o=; b=ZysTP5QHscw7HbESapiNEQgecI/k1S3tTmSHw/o+mCCWFi2KNER1nqf5hUy1AcHw41 V5uxYbgGLAB3M31sHUsJvdBlM1/WPMEVWAvynWDrdbf+n+gtYiEdoCThAaQI06FkDST8 xP1b9X0FCAWCcmaoq0ykFr5LH6k6qtPihsKLI= Received: by 10.223.16.149 with SMTP id o21mr12319787faa.6.1318704072878; Sat, 15 Oct 2011 11:41:12 -0700 (PDT) Received: from localhost ([95.69.173.122]) by mx.google.com with ESMTPS id r6sm11923289fam.0.2011.10.15.11.41.08 (version=TLSv1/SSLv3 cipher=OTHER); Sat, 15 Oct 2011 11:41:09 -0700 (PDT) From: Mikolaj Golub To: Robert Millan References: <201108102152.p7ALqUl4075207@red.freebsd.org> <201108102200.p7AM0Nu9026320@freefall.freebsd.org> <86k48wz3mc.fsf@kopusha.home.net> X-Comment-To: Mikolaj Golub Sender: Mikolaj Golub Date: Sat, 15 
Oct 2011 21:41:07 +0300 Message-ID: <86obxim724.fsf@kopusha.home.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Josef Karthauser , freebsd-bugs@freebsd.org, Adrian Chadd , freebsd-fs@freebsd.org, Robert Watson Subject: Re: kern/159663: sockets don't work though nullfs mounts X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 15 Oct 2011 18:41:15 -0000 On Mon, 26 Sep 2011 00:58:03 +0300 Mikolaj Golub wrote to Robert Millan: MG> Hi, MG> On Sun, 25 Sep 2011 17:32:27 +0200 Robert Millan wrote: RM>> 2011/9/24 Robert Millan : >>> I found a thread from 2007 with further discussion about this problem: >>> >>> http://lists.freebsd.org/pipermail/freebsd-fs/2007-February/002669.html RM>> Hi, RM>> I've looked at the situation in a bit more detail, for now only with RM>> sockets in mind (not named pipes). My understanding is (please RM>> correct me if I'm wrong): RM>> - nullfs holds reference counts for each vnode, but sockets have their RM>> own mechanism for reference counting (so_count / soref / sorele). RM>> vnode reference counting doesn't protect against socket being closed, RM>> which would leave a stale pointer in the upper nullfs layer. RM>> - Increasing the reference count of the socket itself can't be done in RM>> null_nodeget() because this function is merely a getter whose call RM>> doesn't indicate any meaningful event. RM>> - It's not clear to me that there's any event in time where the socket RM>> reference can be increased. If mounting a nullfs were that event, RM>> then all existing sockets would be soref'ed but we wouldn't be RM>> soref'ing future sockets created in the lower layer after the mount. RM>> This doesn't seem correct. 
RM>> - Possible solution: null_nodeget() semantics are replaced with something that actually allows vnodes in the upper layer to be created and destroyed.
RM>> - Possible solution: the upper layer has a memory structure to keep track of which sockets in the lower layer have been soref'ed.

MG> It looks like there is no need to set vp->v_un = lowervp->v_un for VFIFO. They work without this modification, bypassing vnode operations to the lower node, and lowervp->v_un is used.
MG> The issue is only with local sockets, because when bind or connect is called for a nullfs file the upper v_un is used.
MG> For me the approach "vp->v_un = lowervp->v_un" has many complications. Maybe it is much easier to always use only the lower vnode? What we need for this is to make bind and connect get the lower vnode when they are called on a nullfs file.

Thinking more about the "vp->v_un = lowervp->v_un" approach, it seems to me that there should not be any coherency issues with the contents of v_un between the two file system layers (the main worry about this approach in the thread mentioned above).

Consider a scenario where we bind to the lower fs vnode and then connect to the upper fs path. On connect, lookup returns a nullfs node with:

lvnp->v_un = bind_socket
uvnp->v_un = bind_socket

uvnp is locked (usecount is 1). bind_socket is used to establish the connection. After the connection is established, uvnp is released by vput(), usecount is 0, so the nullfs vnode is deactivated and destroyed. Thus uvnp->v_un has a short lifetime, and it looks like it can't become stale during this time.

When we bind to the upper fs vnode, in bind VOP_CREATE will return a nullfs node with:

lvnp->v_un = NULL
uvnp->v_un = NULL

bind sets uvnp->v_un; lvnp->v_un remains NULL. The nullfs node remains active until the bind socket is closed, so on connect the uvnp->v_un of this node is used. A connection to the lower fs will return ECONNREFUSED. Thus I don't see a scenario in which uvnp->v_un would be stale.
I did some crash testing and did not manage to panic the system. But the issue is that if we bind to an upper fs path, we can't connect to the lower fs path. This behavior contradicts the overall nullfs behavior (all changes done on the upper layer are seen from the lower layer) and is more unionfs-like. That is why my proposal (return the lower vnode instead of the upper vnode in null_lookup and null_create if the vnode type is VSOCK) looks more interesting to me. But as I wrote, it also has an issue: you can bind using the upper fs path and then unmount nullfs without force while the socket is still bound. The updated patch can be found here:

http://people.freebsd.org/~trociny/nullfs.VSOCK.patch

Anyway, for me any of these solutions, although not ideal, looks better than having nothing at all, or maybe just documenting the behavior in a BUGS section.

MG> As a proof of concept, below is a patch that implements it. Currently I am not sure that the vrele/vref magic is done properly, but it looks like it works for me.
MG> The issues with this approach I see so far:
MG> - we need an additional flag for namei;
MG> - nullfs can be unmounted with a socket file still open.
MG> -- MG> Mikolaj Golub -- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Sat Oct 15 20:03:00 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1B3CC106564A for ; Sat, 15 Oct 2011 20:03:00 +0000 (UTC) (envelope-from florian@wagner-flo.net) Received: from umbracor.wagner-flo.net (umbracor.wagner-flo.net [213.165.81.202]) by mx1.freebsd.org (Postfix) with ESMTP id D43038FC17 for ; Sat, 15 Oct 2011 20:02:59 +0000 (UTC) Received: from naclador.mos32.de (ppp-188-174-49-251.dynamic.mnet-online.de [188.174.49.251]) by umbracor.wagner-flo.net (Postfix) with ESMTPSA id 0EA133C0612F for ; Sat, 15 Oct 2011 21:43:52 +0200 (CEST) Date: Sat, 15 Oct 2011 21:43:47 +0200 From: Florian Wagner To: freebsd-fs@freebsd.org Message-ID: <20111015214347.09f68e4e@naclador.mos32.de> X-Mailer: Claws Mail 3.7.9 (GTK+ 2.24.5; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: multipart/signed; micalg=PGP-SHA1; boundary="Sig_/NAIlwJKcT31XX0+ohyvgNvt"; protocol="application/pgp-signature" Subject: Extending zfsboot.c to allow selecting filesystem from boot.config X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 15 Oct 2011 20:03:00 -0000 --Sig_/NAIlwJKcT31XX0+ohyvgNvt Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Hi, from looking at the code in sys/boot/i386/zfsboot/zfsboot.c the ZFS aware boot block already allows to select pool to load the kernel from by adding : to the boot.config. As this code calls the zfs_mount_pool function it will look for the bootfs property on the new pool or use its root dataset to get the file from there. How much work would it be to extend the loader to also allow selecting a ZFS filesystem? 
What I'd like to do is place a boot.config on the (otherwise empty) root of my system pool and then tell it to get the loader from another filesystem by putting "rpool/root/stable-8-r226381:/boot/zfsloader" in there. Regards Florian Wagner --Sig_/NAIlwJKcT31XX0+ohyvgNvt Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iEYEARECAAYFAk6Z4nQACgkQLvW/2gp2pPx2qgCgjyXEiP1V2P8iEAGEwmHfJW5q aBUAoKBA/A36iei9VZEtCz8mdPx0ZIbs =2FCJ -----END PGP SIGNATURE----- --Sig_/NAIlwJKcT31XX0+ohyvgNvt--