From owner-freebsd-fs@freebsd.org Sun Sep 17 04:16:44 2017
Return-Path:
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 334E1E21BC5
 for ; Sun, 17 Sep 2017 04:16:44 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 22457711B3
 for ; Sun, 17 Sep 2017 04:16:44 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8H4Ghb2084036
 for ; Sun, 17 Sep 2017 04:16:44 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 222288] g_bio leak after zfs ABD commit
Date: Sun, 17 Sep 2017 04:16:43 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 11.1-STABLE
X-Bugzilla-Keywords:
X-Bugzilla-Severity: Affects Many People
X-Bugzilla-Who: fk@fabiankeil.de
X-Bugzilla-Status: Open
X-Bugzilla-Resolution:
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: avg@FreeBSD.org
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: cc
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 17 Sep 2017 04:16:44 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222288

Fabian Keil changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fk@fabiankeil.de

--- Comment #3 from Fabian Keil ---
Thanks a lot for the report Dan. I noticed that something was leaking
but didn't have time to track it down yet. Thanks to the report I didn't
have to.

To work around the issue in ElectroBSD I've reverted r321610/a0dddc24c9050
after reverting the follow-up commits that would cause revert conflicts.

Your patch looks good to me, Andriy. I've imported it and will test it in
the next couple of days. Thanks.

It occurred to me that this issue could be easily detected automatically
if there was a way to specify a time limit between uma_zalloc() and
uma_zfree() calls for a given zone (or item from the zone). Obviously this
only works if an upper limit makes sense (and items are expected to be
freed), but in case of g_bio I believe that this is the case and there are
a bunch of other zones where enforcing allocation time limits should work.

I wouldn't be surprised if there were a bunch of other zone item leaks
that haven't been detected yet because they don't occur frequently enough
to have a big impact.
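
The time-limit idea above can be illustrated with a small user-space model.
This is only a sketch in Python, not kernel code: the class AgeLimitedZone
and its methods alloc(), free() and overdue() are hypothetical names invented
for this example, and nothing here corresponds to an existing UMA interface.
It records a timestamp per live item and reports the items that have
outlived the zone's limit.

```python
import time


class AgeLimitedZone:
    """Toy model of an allocator zone with a per-item lifetime limit.

    Hypothetical illustration only; this is not the UMA API.
    """

    def __init__(self, name, max_age_seconds, clock=time.monotonic):
        self.name = name
        self.max_age_seconds = max_age_seconds
        self._clock = clock      # injectable so the behaviour can be tested
        self._live = {}          # item id -> time of allocation
        self._next_id = 0

    def alloc(self):
        """Hand out a new item and remember when it was allocated."""
        item = self._next_id
        self._next_id += 1
        self._live[item] = self._clock()
        return item

    def free(self, item):
        """Return an item; it no longer counts against the limit."""
        del self._live[item]

    def overdue(self):
        """Return the ids of live items older than the limit, oldest id first."""
        now = self._clock()
        return sorted(
            item
            for item, allocated_at in self._live.items()
            if now - allocated_at > self.max_age_seconds
        )
```

A periodic check that logs or asserts on a non-empty overdue() list would
flag a slow leak long before the zone grows large enough to be noticed,
provided an upper bound on item lifetime actually makes sense for the zone.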
-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-fs@freebsd.org Sun Sep 17 21:02:52 2017
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 218626] [PATCH] cuse: new error code CUSE_ERR_NO_DEVICE (ENODEV)
Date: Sun, 17 Sep 2017 21:02:52 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: CURRENT
X-Bugzilla-Keywords: patch
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: jan.kokemueller@gmail.com
X-Bugzilla-Status: Open
X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org
X-Bugzilla-Changed-Fields: attachments.isobsolete attachments.created

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218626

Jan Kokemüller changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #181753|0                           |1
        is obsolete|                            |

--- Comment #4 from Jan Kokemüller ---
Created attachment 186488
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=186488&action=edit
patch adding a CUSE_ERR_NO_DEVICE error code - v2

I have bumped the CUSE_VERSION macro and added CUSE_ERR_NO_DEVICE to the
man page.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Mon Sep 18 06:52:42 2017
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 218626] [PATCH] cuse: new error code CUSE_ERR_NO_DEVICE (ENODEV)
Date: Mon, 18 Sep 2017 06:52:43 +0000
X-Bugzilla-Who: hselasky@FreeBSD.org
X-Bugzilla-Status: In Progress
X-Bugzilla-Changed-Fields: bug_status

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218626

Hans Petter Selasky changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Open                        |In Progress

--- Comment #5 from Hans Petter Selasky ---
Looks good. Give me a few days and I'll get it in.

--HPS

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Mon Sep 18 14:26:35 2017
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 222288] g_bio leak after zfs ABD commit
Date: Mon, 18 Sep 2017 14:26:34 +0000
X-Bugzilla-Who: avg@FreeBSD.org
X-Bugzilla-Status: Open
X-Bugzilla-Changed-Fields: attachments.created

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222288

--- Comment #4 from Andriy Gapon ---
Created attachment 186510
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=186510&action=edit
alternative patch

Here is an alternative patch that does not require the modification of
ZIO_IOCTL_PIPELINE.
-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-fs@freebsd.org Tue Sep 19 09:33:29 2017
Message-ID: <59C0E45E.3070603@incore.de>
Date: Tue, 19 Sep 2017 11:33:18 +0200
From: Andreas Longwitz
To: freebsd-fs@freebsd.org
Subject: nullfs with double mount is broken in FreeBSD 10 Stable

The described problem does not exist in V8, but in V10 Stable r317936.

I use a device /dev/md2 defined in /etc/rc.conf:

   mdconfig_md2="-t swap -s 384m"
   mdconfig_md2_newfs="-n"
   mdconfig_md2_owner="root:wheel"
   mdconfig_md2_perms="750"

In /etc/fstab I have:

   /dev/md2   /tmp1        ufs     rw,async,noatime,noauto   0  0
   /tmp1      /home/tmp1   nullfs  rw,noauto                 0  0
   /tmp1      /var/tmp1    nullfs  rw,noauto                 0  0

Now I run:

   mount /home/tmp1
   mount /var/tmp1
   while true; do
      rm -f /home/tmp1/*
      cpdup /boot/kernel /home/tmp1
      cp /var/tmp1/kernel /dev/null
      df -h /tmp1
   done

The output is

   Filesystem    Size    Used   Avail Capacity  Mounted on
   /dev/md2      372M     88M    254M    26%    /tmp1
   Filesystem    Size    Used   Avail Capacity  Mounted on
   /dev/md2      372M    100M    242M    29%    /tmp1
   Filesystem    Size    Used   Avail Capacity  Mounted on
   /dev/md2      372M    112M    229M    33%    /tmp1
   Filesystem    Size    Used   Avail Capacity  Mounted on
   /dev/md2      372M    125M    217M    36%    /tmp1
   .....

The "Used" space grows quickly until "No space available".
After umount /var/tmp1 the space is back again.

I see the same behaviour when md is replaced by tmpfs.

I have tried the patch from commit 317936 (V11) without success.
The problem exists on i386 and amd64. In production (V8) I use this
double mount construction for jail communication without any problems.
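
For anyone repeating the test above, the growth can be checked mechanically
instead of by eye. The following is a minimal Python sketch, assuming the
loop output has been captured as text in the same df -h layout shown in the
report; the helper names parse_size(), used_values() and
grows_every_iteration() are made up for this example.

```python
import re

# Binary multipliers for the unit suffixes that df -h prints.
_UNITS = {"": 1, "B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a df -h style size such as '88M' or '1.5G' into bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([BKMGT]?)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


def used_values(df_output, device):
    """Collect the 'Used' column for every line that describes `device`."""
    values = []
    for line in df_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == device:
            values.append(parse_size(fields[2]))
    return values


def grows_every_iteration(values):
    """True if there are at least two samples and each exceeds the previous."""
    return len(values) >= 2 and all(b > a for a, b in zip(values, values[1:]))


# The four samples quoted in the report.
SAMPLE = """\
Filesystem    Size    Used   Avail Capacity  Mounted on
/dev/md2      372M     88M    254M    26%    /tmp1
Filesystem    Size    Used   Avail Capacity  Mounted on
/dev/md2      372M    100M    242M    29%    /tmp1
Filesystem    Size    Used   Avail Capacity  Mounted on
/dev/md2      372M    112M    229M    33%    /tmp1
Filesystem    Size    Used   Avail Capacity  Mounted on
/dev/md2      372M    125M    217M    36%    /tmp1
"""

if __name__ == "__main__":
    print(grows_every_iteration(used_values(SAMPLE, "/dev/md2")))
```

Since every iteration removes the files before copying them again, "Used"
should stay roughly flat on a healthy system; strictly increasing samples
match the behaviour described in the report.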
-- 
Andreas Longwitz

From owner-freebsd-fs@freebsd.org Tue Sep 19 09:54:52 2017
Date: Tue, 19 Sep 2017 12:54:47 +0300
From: Konstantin Belousov
To: Andreas Longwitz
Cc: freebsd-fs@freebsd.org
Subject: Re: nullfs with double mount is broken in FreeBSD 10 Stable
Message-ID: <20170919095447.GZ78693@kib.kiev.ua>
References: <59C0E45E.3070603@incore.de>
In-Reply-To: <59C0E45E.3070603@incore.de>

On Tue, Sep 19, 2017 at 11:33:18AM +0200, Andreas Longwitz wrote:
> The described problem does not exist in V8, but in V10 Stable r317936.
>
> I use a device /dev/md2 defined in /etc/rc.conf:
>
>    mdconfig_md2="-t swap -s 384m"
>    mdconfig_md2_newfs="-n"
>    mdconfig_md2_owner="root:wheel"
>    mdconfig_md2_perms="750"
>
> In /etc/fstab I have:
>
>    /dev/md2   /tmp1        ufs     rw,async,noatime,noauto   0  0
>    /tmp1      /home/tmp1   nullfs  rw,noauto                 0  0
>    /tmp1      /var/tmp1    nullfs  rw,noauto                 0  0
>
> Now I run:
>
>    mount /home/tmp1
>    mount /var/tmp1
>    while true; do
>       rm -f /home/tmp1/*
>       cpdup /boot/kernel /home/tmp1
>       cp /var/tmp1/kernel /dev/null
>       df -h /tmp1
>    done
>
> The output is
>
>    Filesystem    Size    Used   Avail Capacity  Mounted on
>    /dev/md2      372M     88M    254M    26%    /tmp1
>    Filesystem    Size    Used   Avail Capacity  Mounted on
>    /dev/md2      372M    100M    242M    29%    /tmp1
>    Filesystem    Size    Used   Avail Capacity  Mounted on
>    /dev/md2      372M    112M    229M    33%    /tmp1
>    Filesystem    Size    Used   Avail Capacity  Mounted on
>    /dev/md2      372M    125M    217M    36%    /tmp1
>    .....
>
> The "Used" space grows quickly until "No space available".
> After umount /var/tmp1 the space is back again.
>
> I see the same behaviour when md is replaced by tmpfs.
>
> I have tried the patch from commit 317936 (V11) without success.
> The problem exists on i386 and amd64. In production (V8) I use this
> double mount construction for jail communication without any problems.

Can you try the _latest_ stable/11 ? It is enough to only install the
kernel, keep the userspace.

If it does not help, perhaps you will need the -o nocache mount option.
From owner-freebsd-fs@freebsd.org Tue Sep 19 10:05:09 2017
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 222288] g_bio leak after zfs ABD commit
Date: Tue, 19 Sep 2017 10:05:08 +0000
X-Bugzilla-Who: borjam@sarenet.es
X-Bugzilla-Status: Open
X-Bugzilla-Changed-Fields: cc

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222288

Borja Marcos changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |borjam@sarenet.es

--- Comment #5 from Borja Marcos ---
I've applied the second patch. Seems to be working fine, no leaks.

Only one thing has me slightly puzzled, a very high used UMA slabs count.

ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP

UMA Kegs:               384,      0,     242,       8,     242,   0,   0
UMA Zones:             4736,      0,     259,       0,     259,   0,   0
UMA Slabs:               80,      0,  542261,      39,  553319,   0,   0
UMA Hash:               256,      0,      60,      60,     109,   0,   0

The other server running Elasticsearch, which is still on 11.1-RELEASE and
has also 8 GB of memory, shows this. Note the high number of free UMA slabs.

ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP

UMA Kegs:               384,      0,     240,       0,     240,   0,   0
UMA Zones:             2176,      0,     257,       0,     257,   0,   0
UMA Slabs:               80,      0,  207526,  243674, 2339176,   0,   0

I'll let it run for several days anyway. As I said it's not critical.

Thanks!

-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-fs@freebsd.org Tue Sep 19 14:20:10 2017
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 222288] g_bio leak after zfs ABD commit
Date: Tue, 19 Sep 2017 14:20:10 +0000
X-Bugzilla-Who: borjam@sarenet.es
X-Bugzilla-Status: Open
X-Bugzilla-Changed-Fields:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222288

--- Comment #6 from Borja Marcos ---
(In reply to Borja Marcos from comment #5)

Ignore the UMA slab comment, it's irrelevant. Sorry.
-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-fs@freebsd.org Tue Sep 19 14:58:37 2017
Message-ID: <59C13096.9070402@incore.de>
Date: Tue, 19 Sep 2017 16:58:30 +0200
From: Andreas Longwitz
To: Konstantin Belousov
CC: freebsd-fs@freebsd.org
Subject: Re: nullfs with double mount is broken in FreeBSD 10 Stable
References: <59C0E45E.3070603@incore.de> <20170919095447.GZ78693@kib.kiev.ua>
In-Reply-To: <20170919095447.GZ78693@kib.kiev.ua>

Thanks for the quick answer!

Konstantin Belousov wrote:
> On Tue, Sep 19, 2017 at 11:33:18AM +0200, Andreas Longwitz wrote:
>> [...]
>
> Can you try the _latest_ stable/11 ? It is enough to only install the
> kernel, keep the userspace.
>
> If it does not help, perhaps you will need the -o nocache mount option.

Yes, the option -o nocache worked for me.

Please can you tell me whether the commit 305659 (nullfs: plug vnode ref
leak in null_vptocnp) should be used in stable/10 (in stable/11 it is
commit 306682) ?

-- 
Andreas Longwitz

From owner-freebsd-fs@freebsd.org Tue Sep 19 15:39:54 2017
Date: Tue, 19 Sep 2017 18:39:48 +0300
From: Konstantin Belousov
To: Andreas Longwitz
Cc: freebsd-fs@freebsd.org
Subject: Re: nullfs with double mount is broken in FreeBSD 10 Stable
Message-ID: <20170919153948.GC78693@kib.kiev.ua>
References: <59C0E45E.3070603@incore.de> <20170919095447.GZ78693@kib.kiev.ua> <59C13096.9070402@incore.de>
In-Reply-To: <59C13096.9070402@incore.de>

On Tue, Sep 19, 2017 at 04:58:30PM +0200, Andreas Longwitz wrote:
> Yes, the option -o nocache worked for me.
> Please can you tell me whether the commit 305659 (nullfs: plug vnode ref
> leak in null_vptocnp) should be used in stable/10 (in stable/11 it is
> commit 306682) ?

If it is not merged to 10, perhaps yes.

From owner-freebsd-fs@freebsd.org Tue Sep 19 16:14:11 2017
From: Juan Manuel Palacios
Date: Tue, 19 Sep 2017 12:13:36 -0400
Subject: Trying to understand confusing 'zfs diff' output
To: freebsd-fs@freebsd.org

Hi everyone,

I'm trying to understand a result I'm seeing when comparing the diff of two
local ZFS snapshots that were successfully replicated to a remote pool over
SSH against the diff of their remote counterparts. The latter shows three
files as having been deleted, while the local diff of the exact same two
snapshots doesn't.

Following are those diffs, with empty lines entered manually into the local
diff where the remote one shows the deleted files; moreover, lines right
above & below this confusing part of the diffs have been trimmed for
brevity's sake, since they were identical across both outputs.

1) Local diff (FreeBSD 10.3-RELEASE-p21 system):

-> zfs diff zroot/mysql@automated_2017-07-31_23:45:04-EDT zroot/mysql@automated_2017-08-01_23:45:03-EDT | gawk '{ match($2, /\/(.*)/, matches); printf "%s\t%s\n", $1, matches[1]; }'

(trimmed)
M       mysql/data/knet@002dlrs/lrs_providers.ibd

M       mysql/data/snap/sessions.ibd
M       mysql/data/knet@002dlrs/sessions.ibd
M       mysql/data/leads/demo_requests.ibd
M       mysql/tmp/nk-dev.sql.gz


M       mysql/data/knet/account_manager_memberships.ibd
(trimmed)

2) Remote diff (FreeBSD 10.3-RELEASE-p19 system):

-> zfs diff backup/mysql@automated_2017-07-31_23:45:04-EDT backup/mysql@automated_2017-08-01_23:45:03-EDT | gawk '{ match($2, /\/mnt\/backup\/(.*)/, matches); printf "%s\t%s\n", $1, matches[1]; }'

(trimmed)
M       mysql/data/knet@002dlrs/lrs_providers.ibd
-       mysql/tmp/nk-dump--2017-07-19_01:32:16-EDT.sql
M       mysql/data/snap/sessions.ibd
M       mysql/data/knet@002dlrs/sessions.ibd
M       mysql/data/leads/demo_requests.ibd
M       mysql/tmp/nk-dev.sql.gz
-       mysql/replication/mysql-bin.001241
-       mysql/replication/mysql-bin.001242
M       mysql/data/knet/account_manager_memberships.ibd
(trimmed)

So, as I was saying, the remote diff shows these three files as having been
deleted, because they were:

1) Locally:

-> ls -l /mysql/.zfs/snapshot/automated_2017-07-31_23\:45\:04-EDT/tmp/nk-dump--2017-07-19_01\:32\:16-EDT.sql
-rw-r--r--  1 jmpp  jmpp  46M Jul 19 01:32 /mysql/.zfs/snapshot/automated_2017-07-31_23:45:04-EDT/tmp/nk-dump--2017-07-19_01:32:16-EDT.sql

-> ls -l /mysql/.zfs/snapshot/automated_2017-08-01_23\:45\:03-EDT/tmp/nk-dump--2017-07-19_01\:32\:16-EDT.sql
ls: /mysql/.zfs/snapshot/automated_2017-08-01_23:45:03-EDT/tmp/nk-dump--2017-07-19_01:32:16-EDT.sql: No such file or directory

2) On the remote pool:
-> ls -l /mnt/backup/mysql/.zfs/snapshot/automated_2017-07-31_ 23\:45\:04-EDT/tmp/nk-dump--2017-07-19_01:32:16-EDT.sql -rw-r--r-- 1 1001 1001 48288840 Jul 18 22:32 /mnt/backup/mysql/.zfs/ snapshot/automated_2017-07-31_23:45:04-EDT/tmp/nk-dump-- 2017-07-19_01:32:16-EDT.sql -> ls -l /mnt/backup/mysql/.zfs/snapshot/automated_2017-08-01_ 23\:45\:03-EDT/tmp/nk-dump--2017-07-19_01:32:16-EDT.sql ls: /mnt/backup/mysql/.zfs/snapshot/automated_2017-08-01_ 23:45:03-EDT/tmp/nk-dump--2017-07-19_01:32:16-EDT.sql: No such file or directory And so for the other two files. So, my question is, if the remote diff is correct in showing these files as having been deleted between the two snapshots (it *is* correct in doing that, right?), why does the local diff not show it also? Thanks in advance for any help! -- Juan Palacios Senior Software Architect 135 West 26th St l 12th Floor l NY, NY 10001 212.675.9234 l 646.217.3677 Register for our upcoming webinar with The Healthy Minds Network and AUCCCD: Trends in Higher Education Mental Health: Research Highlights Connect with us! 
From owner-freebsd-fs@freebsd.org Tue Sep 19 17:51:14 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 98E2EE20CEA for ; Tue, 19 Sep 2017 17:51:14 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 867096524C for ; Tue, 19 Sep 2017 17:51:14 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8JHpE6F091759 for ; Tue, 19 Sep 2017 17:51:14 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 176857] [softupdates] [panic] 9.1-RELEASE/amd64/GENERIC panic in softdepflush/remove_from_journal Date: Tue, 19 Sep 2017 17:51:14 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 9.1-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: eugen@freebsd.org X-Bugzilla-Status: Closed X-Bugzilla-Resolution: Overcome By Events X-Bugzilla-Priority: Normal X-Bugzilla-Assigned-To: eugen@freebsd.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_status assigned_to cc resolution Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Tue, 19 Sep 2017 17:51:14 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=176857 Eugene Grosbein changed: What |Removed |Added ---------------------------------------------------------------------------- Status|In Progress |Closed Assignee|freebsd-fs@FreeBSD.org |eugen@freebsd.org CC| |eugen@freebsd.org Resolution|--- |Overcome By Events --- Comment #2 from Eugene Grosbein --- My PR. Closing this as it is believed to be fixed already, since it has not reproduced since then. -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Wed Sep 20 08:28:22 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B9605E023E0 for ; Wed, 20 Sep 2017 08:28:22 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A863F24CD for ; Wed, 20 Sep 2017 08:28:22 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8K8SL9A077690 for ; Wed, 20 Sep 2017 08:28:22 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 222288] g_bio leak after zfs ABD commit Date: Wed, 20 Sep 2017 08:28:22 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.1-STABLE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Many People X-Bugzilla-Who: commit-hook@freebsd.org X-Bugzilla-Status: Open X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: 
avg@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 08:28:22 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222288 --- Comment #7 from commit-hook@freebsd.org --- A commit references this bug: Author: avg Date: Wed Sep 20 08:27:21 UTC 2017 New revision: 323796 URL: https://svnweb.freebsd.org/changeset/base/323796 Log: fix memory leak in g_bio zone introduced in r320452, another ABD fallout I overlooked the fact that ZIO_IOCTL_PIPELINE does not include ZIO_STAGE_VDEV_IO_DONE stage. We do allocate a struct bio for an ioctl zio (a disk cache flush), but we never freed it. This change splits bio handling into two groups, one for normal read/write i/o that passes data around and, thus, needs the abd data transform; the other group is for "data-less" i/o such as trim and cache flush. 
PR: 222288 Reported by: Dan Nelson Tested by: Borja Marcos MFC after: 10 days Changes: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c --=20 You are receiving this mail because: You are on the CC list for the bug.= From owner-freebsd-fs@freebsd.org Wed Sep 20 08:32:58 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 94232E026FF for ; Wed, 20 Sep 2017 08:32:58 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 831D62826 for ; Wed, 20 Sep 2017 08:32:58 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8K8WwEi091650 for ; Wed, 20 Sep 2017 08:32:58 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 222288] g_bio leak after zfs ABD commit Date: Wed, 20 Sep 2017 08:32:58 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.1-STABLE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Many People X-Bugzilla-Who: avg@FreeBSD.org X-Bugzilla-Status: In Progress X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: avg@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_status Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 
Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 08:32:58 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D222288 Andriy Gapon changed: What |Removed |Added ---------------------------------------------------------------------------- Status|Open |In Progress --=20 You are receiving this mail because: You are on the CC list for the bug.= From owner-freebsd-fs@freebsd.org Wed Sep 20 08:51:54 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6F235E035C4 for ; Wed, 20 Sep 2017 08:51:54 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5E46C3544 for ; Wed, 20 Sep 2017 08:51:54 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8K8psTV035395 for ; Wed, 20 Sep 2017 08:51:54 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 222377] ZFS ABD wasteful... 
Date: Wed, 20 Sep 2017 08:51:54 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: avg@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 08:51:54 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D222377 Andriy Gapon changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|freebsd-bugs@FreeBSD.org |freebsd-fs@FreeBSD.org --=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Wed Sep 20 11:07:54 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E86B8E08F4C for ; Wed, 20 Sep 2017 11:07:54 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D71AF66CC9 for ; Wed, 20 Sep 2017 11:07:54 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by 
kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8KB7sPA060017 for ; Wed, 20 Sep 2017 11:07:54 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 222377] ZFS ABD wasteful... Date: Wed, 20 Sep 2017 11:07:55 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: borjam@sarenet.es X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 11:07:55 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222377 Borja Marcos changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |borjam@sarenet.es --- Comment #2 from Borja Marcos --- Just for the record, I've applied the patch to 11-STABLE from today and it's working fine. 
--=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Wed Sep 20 11:09:58 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3D3D6E090E3 for ; Wed, 20 Sep 2017 11:09:58 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2C3EE66E64 for ; Wed, 20 Sep 2017 11:09:58 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8KB9wML063140 for ; Wed, 20 Sep 2017 11:09:58 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 222377] ZFS ABD wasteful... 
Date: Wed, 20 Sep 2017 11:09:58 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: borjam@sarenet.es X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 11:09:58 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D222377 --- Comment #3 from Borja Marcos --- (In reply to Borja Marcos from comment #2) Please disregard, I added the comment to the wrong bug :/ --=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Wed Sep 20 11:11:31 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8F339E091D7 for ; Wed, 20 Sep 2017 11:11:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7E4AF66FB4 for ; Wed, 20 Sep 2017 11:11:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id 
v8KBBVDv072142 for ; Wed, 20 Sep 2017 11:11:31 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 222288] g_bio leak after zfs ABD commit Date: Wed, 20 Sep 2017 11:11:31 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.1-STABLE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Many People X-Bugzilla-Who: borjam@sarenet.es X-Bugzilla-Status: In Progress X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: avg@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 11:11:31 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D222288 --- Comment #8 from Borja Marcos --- For the record, I have applied the patch to 11-STABLE from today and so far it's running fine on my worst case server (running Elasticsearch). No leaks. 
--=20 You are receiving this mail because: You are on the CC list for the bug.= From owner-freebsd-fs@freebsd.org Wed Sep 20 18:13:19 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2FF49E1B84A for ; Wed, 20 Sep 2017 18:13:19 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1E6C475ED9 for ; Wed, 20 Sep 2017 18:13:19 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8KIDIAR039821 for ; Wed, 20 Sep 2017 18:13:18 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 214981] POLA violation: ZFS happily and silently remounts any existing mount on pool import Date: Wed, 20 Sep 2017 18:13:19 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: vlad-fbsd@acheronmedia.com X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 18:13:19 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214981 --- Comment #2 from Vladimir Krstulja --- I now believe this problem should be taken more seriously. I'd also like to formally request the FreeBSD project to assign a CVE to this issue. While I managed to train myself to always use -R or -N for zpool import, I now found out the hard way that if you have zfs_enable="YES" in rc.conf, which you would if you wanted your "local" datasets to be mounted on boot, it has a side-effect of automatically importing and mounting datasets for any pool that becomes visible. In other words, anything you "plug in" that contains a ZFS pool. Say, a sneaky USB stick. Merely unlocking geli'd drives will result in any pools on those drives being imported, datasets automounted, existing mountpoints remounted, root included, with zero warning, notification or complaint. So technically, we don't even have the protection of import -R or -N. This is a security issue. 
--=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Wed Sep 20 18:13:40 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 85352E1B8C1 for ; Wed, 20 Sep 2017 18:13:40 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 72D0875F65 for ; Wed, 20 Sep 2017 18:13:40 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8KIDexB040234 for ; Wed, 20 Sep 2017 18:13:40 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 214981] ZFS happily and silently remounts any existing mount on pool import (POLA violation and security issue!) 
Date: Wed, 20 Sep 2017 18:13:40 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: vlad-fbsd@acheronmedia.com X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: short_desc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 18:13:40 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D214981 Vladimir Krstulja changed: What |Removed |Added ---------------------------------------------------------------------------- Summary|POLA violation: ZFS happily |ZFS happily and silently |and silently remounts any |remounts any existing mount |existing mount on pool |on pool import (POLA |import |violation and security | |issue!) 
--=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Wed Sep 20 18:34:09 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8C084E1C815 for ; Wed, 20 Sep 2017 18:34:09 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7A59F76A8E for ; Wed, 20 Sep 2017 18:34:09 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8KIY9TD088827 for ; Wed, 20 Sep 2017 18:34:09 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 214981] ZFS happily and silently remounts any existing mount on pool import (POLA violation and security issue!) 
Date: Wed, 20 Sep 2017 18:34:09 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: avg@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 18:34:09 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214981 --- Comment #3 from Andriy Gapon --- (In reply to Vladimir Krstulja from comment #2) FreeBSD does not import any pool that has not been manually imported first with 'zpool import'. The root pool is the only exception for obvious reasons. 
--=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Wed Sep 20 21:01:31 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 525D2E233CB for ; Wed, 20 Sep 2017 21:01:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 401AD7F732 for ; Wed, 20 Sep 2017 21:01:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8KL1Uuk029269 for ; Wed, 20 Sep 2017 21:01:31 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 214981] ZFS happily and silently remounts any existing mount on pool import (POLA violation and security issue!) 
Date: Wed, 20 Sep 2017 21:01:30 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: vlad-fbsd@acheronmedia.com X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 21:01:31 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214981 --- Comment #4 from Vladimir Krstulja --- (In reply to Andriy Gapon from comment #3) Unfortunately, in my view, that doesn't change anything. One major problem is with ZFS receives, which is what hit me in this case. The server was receiving backup pools from production, a root pool included. The obvious part is solved with import -R or -N, and giving -u to `zfs receive` so it doesn't mount received snapshots. All was well until, after quite a long time, I had to reboot the server. The act of unlocking the drives that contained the backup datasets, the very act of hitting enter on the last geli passphrase, imported and mounted everything it found, so I never had a chance to use -R or -N. The security problem in this is also through received datasets. One could argue that you have to trust data you receive, and I partially agree. It doesn't help that ZFS does not, with this, offer any safety net in the form of an ability to prevent automatic importing and mounting from happening at all. 
Oh yeah, disable zfs service maybe. But totally not a solution. Automatic, implicit, quiet, non-obvious remounts, especially of /, without the user explicitly stating it's okay to do so, should NEVER happen. Ever. I really hope this issue will be treated as a serious problem. -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Wed Sep 20 21:32:31 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D768AE24DAA for ; Wed, 20 Sep 2017 21:32:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C5AEF8113D for ; Wed, 20 Sep 2017 21:32:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8KLWV7a007476 for ; Wed, 20 Sep 2017 21:32:31 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 214981] ZFS happily and silently remounts any existing mount on pool import (POLA violation and security issue!) 
Date: Wed, 20 Sep 2017 21:32:31 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: smh@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 21:32:32 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214981

Steven Hartland changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smh@FreeBSD.org

--- Comment #5 from Steven Hartland ---
I think the option you're looking for is the canmount property.

At the end of the day there are loads of ways to break things, from rm -rf or
zfs destroy to pulling out a physical disk.

ZFS is a very powerful tool and it rightly assumes you know what you're doing.
Ensuring you're aware of how receiving streams works, and that unless told
otherwise you want the file systems mounted, is just part of your responsibility
when you have that power.

Have I shot myself in the foot by receiving a stream without disabling mount?
Yes, I have. Do I believe ZFS should have prevented me from doing something so
stupid? Absolutely not.
--
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Wed Sep 20 21:40:13 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 099A3E25108 for ; Wed, 20 Sep 2017 21:40:13 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id EB63881308 for ; Wed, 20 Sep 2017 21:40:12 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8KLeCBV018356 for ; Wed, 20 Sep 2017 21:40:12 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 214981] ZFS happily and silently remounts any existing mount on pool import (POLA violation and security issue!)
Date: Wed, 20 Sep 2017 21:40:13 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: vlad-fbsd@acheronmedia.com X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2017 21:40:13 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214981

--- Comment #6 from Vladimir Krstulja ---
Except you can't rm -rf /. Why is it that you can't rm -rf /, but you can
remount it with a random dataset that becomes available, with no questions
asked, and no warnings given?

And it's simply not comparable. Running rm -rf is a deliberate, explicit action.
Unlocking a geli provider and getting your root remounted is nowhere near that.

I'm sorry, but I don't accept that. Plus, requiring confirmation or a --force
flag for such destructive actions would take nothing away from the power and
flexibility of ZFS.
--
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Thu Sep 21 08:09:43 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id EED2EE1B827 for ; Thu, 21 Sep 2017 08:09:43 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DC8EC6E3A1 for ; Thu, 21 Sep 2017 08:09:43 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8L89hHC019926 for ; Thu, 21 Sep 2017 08:09:43 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 214981] ZFS happily and silently remounts any existing mount on pool import (POLA violation and security issue!)
Date: Thu, 21 Sep 2017 08:09:44 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: smh@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Sep 2017 08:09:44 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214981

--- Comment #7 from Steven Hartland ---
Running zfs import is also a deliberate action and as Andriy already pointed
out zfs doesn't automatically import pools that have not already been manually
imported.

If you don't want the filesystems mounted just set canmount=off or
canmount=noauto.
--
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Thu Sep 21 08:59:00 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BB9B1E1DE2A for ; Thu, 21 Sep 2017 08:59:00 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from dss.incore.de (dss.incore.de [195.145.1.138]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6843C6FC76 for ; Thu, 21 Sep 2017 08:59:00 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from inetmail.dmz (inetmail.dmz [10.3.0.3]) by dss.incore.de (Postfix) with ESMTP id C8997ABB; Thu, 21 Sep 2017 10:58:50 +0200 (CEST) X-Virus-Scanned: amavisd-new at incore.de Received: from dss.incore.de ([10.3.0.3]) by inetmail.dmz (inetmail.dmz [10.3.0.3]) (amavisd-new, port 10024) with LMTP id l158Ua42Rgib; Thu, 21 Sep 2017 10:58:48 +0200 (CEST) Received: from mail.local.incore (fwintern.dmz [10.0.0.253]) by dss.incore.de (Postfix) with ESMTP id B99F74F2; Thu, 21 Sep 2017 10:58:46 +0200 (CEST) Received: from bsdlo.incore (bsdlo.incore [192.168.0.84]) by mail.local.incore (Postfix) with ESMTP id A351E508D1; Thu, 21 Sep 2017 10:58:46 +0200 (CEST) Message-ID: <59C37F46.80509@incore.de> Date: Thu, 21 Sep 2017 10:58:46 +0200 From: Andreas Longwitz User-Agent: Thunderbird 2.0.0.19 (X11/20090113) MIME-Version: 1.0 To: Konstantin Belousov CC: Kirk McKusick , freebsd-fs@freebsd.org Subject: Re: fsync: giving up on dirty on ufs partitions running vfs_write_suspend() References: <201709110519.v8B5JVmf060773@chez.mckusick.com> <59BD0EAC.8030206@incore.de> <20170916183117.GF78693@kib.kiev.ua> In-Reply-To: <20170916183117.GF78693@kib.kiev.ua> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Sep 2017 08:59:00 -0000

Konstantin Belousov wrote:
> On Sat, Sep 16, 2017 at 01:44:44PM +0200, Andreas Longwitz wrote:
>> Ok, I understand your thoughts about the "big loop" and I agree. On the
>> other side it is not easy to measure the progress of the dirty buffers
>> because these buffers are created from another process at the same time we
>> loop in vop_stdfsync(). I can explain from my tests, where I use the
>> following loop on a gjournaled partition:
>>
>>    while true; do
>>       cp -p bigfile bigfile.tmp
>>       rm bigfile
>>       mv bigfile.tmp bigfile
>>    done
>>
>> When g_journal_switcher starts vfs_write_suspend() immediately after the
>> rm command has started to do his "rm stuff" (ufs_inactive, ffs_truncate,
>> ffs_indirtrunc at different levels, ffs_blkfree, ...) then we must loop
>> (that means wait) in vop_stdfsync() until the rm process has finished
>> his work. A lot of locking overhead is needed for coordination.
>> Returning from bufobj_wwait() we always see one left dirty buffer (very
>> seldom two), that is not optimal. Therefore I have tried the following
>> patch (instead of bumping maxretry):
>>
>> --- vfs_default.c.orig  2016-10-24 12:26:57.000000000 +0200
>> +++ vfs_default.c       2017-09-15 12:30:44.792274000 +0200
>> @@ -688,6 +688,8 @@
>>                         bremfree(bp);
>>                         bawrite(bp);
>>                 }
>> +               if( maxretry < 1000)
>> +                       DELAY(waitns);
>>                 BO_LOCK(bo);
>>                 goto loop2;
>>         }
>>
>> with different values for waitns. If I run the testloop 5000 times on my
>> testserver, the problem is triggered always round about 10 times.
>> The results from several runs are given in the following table:
>>
>>    waitns     max time   max loops
>>    -------------------------------
>>    no DELAY   0.5 sec    8650 (maxres = 100000)
>>    1000       0.2 sec    24
>>    10000      0.8 sec    3
>>    100000     7.2 sec    3
>>
>> "time" means spent time in vop_stdfsync() measured from entry to return
>> by a dtrace script. "loops" means the number of times "--maxretry" is
>> executed. I am not sure if DELAY() is the best way to wait or if waiting
>> has other drawbacks. Anyway with DELAY() it does not take more than five
>> iterations to finish.
> This is not explicitly stated in your message, but I suppose that the
> vop_stdfsync() is called due to VOP_FSYNC(devvp, MNT_SUSPEND) call in
> ffs_sync().  Am I right ?

Yes, the stack trace given by dtrace script looks always like this:
  4  22140 vop_stdfsync:entry
              kernel`devfs_fsync+0x26
              kernel`VOP_FSYNC_APV+0xa7
              kernel`ffs_sync+0x3bb
              kernel`vfs_write_suspend+0x1cd
              geom_journal.ko`g_journal_switcher+0x9a4
              kernel`fork_exit+0x9a
              kernel`0xffffffff8095502e

> If yes, then the solution is most likely to continue looping in the
> vop_stdfsync() until there is no dirty buffers or the mount point
> mnt_secondary_writes counter is zero.  The pauses trick you tried might
> be still useful, e.g. after some threshold of the performed loop
> iterations.

I have checked your proposal and found that indeed the
mnt_secondary_writes counter goes to zero when the dirties have reached
zero. During the loop the mnt_secondary_writes counter is always equal to
one, so there is nothing like a countdown, and that's what Kirk wanted to
see.
A dtrace output (with DELAY of 1ms in the loop) for the biggest
loop count on a three day test is this:

 18  32865 kern_unlinkat:entry path=bigfile, tid=101201, tid=101201, execname=rm
 18  12747 ufs_remove:entry gj=mirror/gmbkp4p5.journal, inum=11155630, blocks=22301568, size=11415525660
 18  12748 ufs_remove:return returncode=0, inum=11155630, blocks=22301568
 18  18902 ffs_truncate:entry gj=mirror/gmbkp4p5.journal, inum=11155630, size=11415525660, mnt_flag=0x12001040, mnt_kern_flag=0x40006142, blocks=22301568
  6  33304 vfs_write_suspend:entry gj=mirror/gmbkp4p5.journal, mnt_kern_flag=0x40006142, tid=100181
  6  22140 vop_stdfsync:entry mounted on /home, waitfor=1, numoutput=0, clean=10, dirty=6, secondary_writes=1
 10  28117 bufobj_wwait:return calls to bufobj_wait = 1, dirtycnt=2, secondary_writes=1
 10  28117 bufobj_wwait:return calls to bufobj_wait = 2, dirtycnt=1, secondary_writes=1
 10  28117 bufobj_wwait:return calls to bufobj_wait = 3, dirtycnt=1, secondary_writes=1
 10  28117 bufobj_wwait:return calls to bufobj_wait = 4, dirtycnt=3, secondary_writes=1
 10  28117 bufobj_wwait:return calls to bufobj_wait = 5, dirtycnt=2, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 6, dirtycnt=3, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 7, dirtycnt=3, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 8, dirtycnt=3, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 9, dirtycnt=3, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 10, dirtycnt=2, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 11, dirtycnt=2, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 12, dirtycnt=3, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 13, dirtycnt=3, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 14, dirtycnt=3, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 15, dirtycnt=4, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 16, dirtycnt=3, secondary_writes=1
  6  28117 bufobj_wwait:return calls to bufobj_wait = 17, dirtycnt=3, secondary_writes=1
  2  18903 ffs_truncate:return returncode=0, inum=11155630, blocks=0
  2  32866 kern_unlinkat:return returncode=0, errno=0, number io's: 791/791
  6  28117 bufobj_wwait:return calls to bufobj_wait = 18, dirtycnt=3, secondary_writes=0
  6  28117 bufobj_wwait:return calls to bufobj_wait = 19, dirtycnt=0, secondary_writes=0
  6  22141 vop_stdfsync:return returncode=0, pid=26, tid=100181, spent 240373850 nsecs

So the time spent in vop_stdfsync() is 0.24 sec in the worst case I
found using DELAY with 1 ms. I would prefer this solution. My first
approach (simply bumping maxres from 1000 to 100000) is also ok, but the
max time spent will rise to 0.5 sec. Perhaps you like something like

   if( maxretry < 1000 && maxretry % 10 = 0)
      DELAY(waitns);

That is also ok but does not make a noteworthy difference. The main
argument remains: we have to wait until there are no dirties left.

For me the problem with the "giving up on dirty" is solved.
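
As a rough illustration of the loop discussed in this thread (keep
retrying while a secondary writer may still dirty buffers, and pause
between passes once a threshold is reached), here is a minimal userland
sketch. It is not the vfs_default.c code: struct sim, flush_pass(),
sync_loop() and PAUSE_AFTER are hypothetical names, and the concurrent
rm is simply assumed to re-dirty one buffer per pass until it is done.

```c
#include <assert.h>
#include <stddef.h>

#define MAXRETRY     1000  /* retry budget, the value mentioned in the thread */
#define PAUSE_AFTER  10    /* hypothetical threshold before pausing */

/* Hypothetical stand-in for kernel state; not a FreeBSD structure. */
struct sim {
	int dirty;             /* dirty buffers seen after a flush pass */
	int secondary_writes;  /* writers that may still dirty buffers */
	int writer_work;       /* passes left until the simulated rm is done */
	int pauses;            /* number of times the loop paused */
};

/* One simulated flush pass: the writer re-dirties a buffer until done. */
static void
flush_pass(struct sim *s)
{
	if (s->writer_work > 0) {
		s->writer_work--;
		s->dirty = 1;
		s->secondary_writes = (s->writer_work > 0);
	} else {
		s->dirty = 0;
		s->secondary_writes = 0;
	}
}

/* Returns the number of passes needed, or -1 if the loop gave up. */
static int
sync_loop(struct sim *s)
{
	int passes = 0;
	int retries = MAXRETRY;

	assert(s != NULL);
	for (;;) {
		flush_pass(s);
		passes++;
		if (s->dirty == 0)
			return (passes);
		/* Spend the retry budget only when no secondary writer is active. */
		if (s->secondary_writes == 0 && --retries <= 0)
			return (-1);
		/* A kernel version would call DELAY() or pause() here. */
		if (passes >= PAUSE_AFTER)
			s->pauses++;
	}
}
```

With writer_work set to 17, sync_loop() returns after 18 passes and
counts 8 pauses. The point of the sketch is only the control flow: the
loop ends when no dirty buffers are left, not when a fixed retry count
runs out while a writer is still active.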
-- Andreas Longwitz From owner-freebsd-fs@freebsd.org Thu Sep 21 10:59:02 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id ED9C2E24628 for ; Thu, 21 Sep 2017 10:59:02 +0000 (UTC) (envelope-from gljennjohn@gmail.com) Received: from mail-wr0-x22b.google.com (mail-wr0-x22b.google.com [IPv6:2a00:1450:400c:c0c::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7B6A474496 for ; Thu, 21 Sep 2017 10:59:02 +0000 (UTC) (envelope-from gljennjohn@gmail.com) Received: by mail-wr0-x22b.google.com with SMTP id l22so4295531wrc.10 for ; Thu, 21 Sep 2017 03:59:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:cc:subject:message-id:in-reply-to:references:reply-to :mime-version:content-transfer-encoding; bh=1g6fWncLWBjkoghq934lXgpyEneRk8N7Zp4Vi6hTYwM=; b=e3QHzpScvSm1+spgrcU6Q533cjV0gWF5GI2Z3ocD7R8kq/zt+XO+hRpnp2RFRYrTz9 dFmk3YHfDQiIOTKZzowC+xAd5+cOsDkP5aQC5r08o584qrcL8513y0a6SN2X4DYimyBm VZtpr6Sp3nASlqRXBgx2+txesY2ZemScxi8wXe11WOeCGeN4KOIYPDJ7vY3uDHL7qMGf 8yQk7CAFElJwe457Pu0CAeh8MQELdtifk9aRpkx613am9DIJk4LYeds3zND64yguNew5 rKpp6Z1QthF9EJICG9WHlmZ7Vy2WcPtytYer9DK9x9epsIhAwUcXJABv8IMTeLjjhmAh LVkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:reply-to:mime-version:content-transfer-encoding; bh=1g6fWncLWBjkoghq934lXgpyEneRk8N7Zp4Vi6hTYwM=; b=p336YeoFQnrZSIvtQpWDYjva/hHVDQI+4Zff2/0h2JZdTSEY7/p/0Uss90+KJhoTJ3 dkxhmet0lNo1QX/VaUMTxIOVJBP53WmZAaIng/tRodwHh8TU5iKtZLZ/B+ETkb56tyVK 62mIKfqrY/HIbNDPV7wgARE/CSYqrr3+s9a9I1PN1CC+oK2fC075geHVHcy6Hv4Zc8xu nJSv5qz/7Ci8r36iqsjQFfp2IVgErr8OLkPosFmIznxM7JxzRAjc64d9AmP/KI3AfvFE 
hvuuP7MWiKBlsj3Y+34G/VpxoAZnJWpdlkrpN40c8VGB6/qlXaTdm0WU+jD4Ibjx3mHb HPJQ== X-Gm-Message-State: AHPjjUilrNIudDOU7bwQb6O5XYS2HOlgGOH8D0JlMeow13d3264VsAqy KJnp8wAuQj3+guf39AkDiFb9Gw== X-Google-Smtp-Source: AOwi7QCp2IWI7ccAvUyTeFmeFXqZVVEyQEbsuvecxhFOIeSuF82EDdrQS34c8l0vYhzzGL2wF3nCMw== X-Received: by 10.223.138.235 with SMTP id z40mr1670862wrz.14.1505991540845; Thu, 21 Sep 2017 03:59:00 -0700 (PDT) Received: from ernst.home (p4FCA62DB.dip0.t-ipconnect.de. [79.202.98.219]) by smtp.gmail.com with ESMTPSA id l91sm1021092wrc.16.2017.09.21.03.58.59 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 21 Sep 2017 03:59:00 -0700 (PDT) Date: Thu, 21 Sep 2017 12:58:58 +0200 From: Gary Jennejohn To: Andreas Longwitz Cc: freebsd-fs@freebsd.org Subject: Re: fsync: giving up on dirty on ufs partitions running vfs_write_suspend() Message-ID: <20170921125858.64ffe077@ernst.home> In-Reply-To: <59C37F46.80509@incore.de> References: <201709110519.v8B5JVmf060773@chez.mckusick.com> <59BD0EAC.8030206@incore.de> <20170916183117.GF78693@kib.kiev.ua> <59C37F46.80509@incore.de> Reply-To: gljennjohn@gmail.com X-Mailer: Claws Mail 3.15.1 (GTK+ 2.24.31; amd64-portbld-freebsd12.0) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Sep 2017 10:59:03 -0000 On Thu, 21 Sep 2017 10:58:46 +0200 Andreas Longwitz wrote: [snip] > I have checked your proposal and found that indeed the > mnt_secondary_writes counter goes to zero when the dirties have reached > zero. During the loop the mnt_secondary_write counter is always equal to > one, so there is not something like a countdown and thats Kirk wanted to > see. 
> A dtrace output (with DELAY of 1ms in the loop) for the biggest
> loop count on a three day test is this:
>
>  18  32865 kern_unlinkat:entry path=bigfile, tid=101201,
> tid=101201, execname=rm
>  18  12747 ufs_remove:entry gj=mirror/gmbkp4p5.journal,
> inum=11155630, blocks=22301568, size=11415525660
>  18  12748 ufs_remove:return returncode=0, inum=11155630,
> blocks=22301568
>  18  18902 ffs_truncate:entry gj=mirror/gmbkp4p5.journal,
> inum=11155630, size=11415525660, mnt_flag=0x12001040,
> mnt_kern_flag=0x40006142, blocks=22301568
>   6  33304 vfs_write_suspend:entry gj=mirror/gmbkp4p5.journal,
> mnt_kern_flag=0x40006142, tid=100181
>   6  22140 vop_stdfsync:entry mounted on /home, waitfor=1,
> numoutput=0, clean=10, dirty=6, secondary_writes=1
>  10  28117 bufobj_wwait:return calls to bufobj_wait = 1,
> dirtycnt=2, secondary_writes=1
>  10  28117 bufobj_wwait:return calls to bufobj_wait = 2,
> dirtycnt=1, secondary_writes=1
>  10  28117 bufobj_wwait:return calls to bufobj_wait = 3,
> dirtycnt=1, secondary_writes=1
>  10  28117 bufobj_wwait:return calls to bufobj_wait = 4,
> dirtycnt=3, secondary_writes=1
>  10  28117 bufobj_wwait:return calls to bufobj_wait = 5,
> dirtycnt=2, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 6,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 7,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 8,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 9,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 10,
> dirtycnt=2, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 11,
> dirtycnt=2, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 12,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 13,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 14,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 15,
> dirtycnt=4, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 16,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 17,
> dirtycnt=3, secondary_writes=1
>   2  18903 ffs_truncate:return returncode=0, inum=11155630, blocks=0
>   2  32866 kern_unlinkat:return returncode=0, errno=0, number
> io's: 791/791
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 18,
> dirtycnt=3, secondary_writes=0
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 19,
> dirtycnt=0, secondary_writes=0
>   6  22141 vop_stdfsync:return returncode=0, pid=26, tid=100181,
> spent 240373850 nsecs
>
> So the time spent in vop_stdfsync() is 0.24 sec in the worst case I
> found using DELAY with 1 ms. I would prefer this solution. My first
> approach (simply bumping maxres from 1000 to 100000) is also ok, but the
> max time spent will rise to 0.5 sec. Perhaps you like something like
>
>    if( maxretry < 1000 && maxretry % 10 = 0)
                                          ^ ==
>       DELAY(waitns);
>

The argument to DELAY is in microseconds.
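
To make the unit explicit: since DELAY() takes microseconds, the values
tried earlier in the thread (1000, 10000, 100000) correspond to pauses
of 1 ms, 10 ms and 100 ms, despite the variable name waitns. A small
sketch of a conversion helper follows; msec_to_usec() and USEC_PER_MSEC
are hypothetical names used only for illustration, not kernel API.

```c
#define USEC_PER_MSEC 1000  /* microseconds per millisecond */

/*
 * Convert a pause length in milliseconds to the microsecond count
 * that DELAY() expects. Hypothetical helper, not a kernel function.
 */
static int
msec_to_usec(int msec)
{
	return (msec * USEC_PER_MSEC);
}
```

A call such as DELAY(msec_to_usec(1)) would then state the intended
1 ms pause directly.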
-- Gary Jennejohn From owner-freebsd-fs@freebsd.org Thu Sep 21 16:51:50 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BD1CEE103EE for ; Thu, 21 Sep 2017 16:51:50 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from dss.incore.de (dss.incore.de [195.145.1.138]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 82100A17 for ; Thu, 21 Sep 2017 16:51:49 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from inetmail.dmz (inetmail.dmz [10.3.0.3]) by dss.incore.de (Postfix) with ESMTP id 36B82F0C; Thu, 21 Sep 2017 18:51:47 +0200 (CEST) X-Virus-Scanned: amavisd-new at incore.de Received: from dss.incore.de ([10.3.0.3]) by inetmail.dmz (inetmail.dmz [10.3.0.3]) (amavisd-new, port 10024) with LMTP id bCzo0qt1h4Tf; Thu, 21 Sep 2017 18:51:46 +0200 (CEST) Received: from mail.local.incore (fwintern.dmz [10.0.0.253]) by dss.incore.de (Postfix) with ESMTP id 4704FC68; Thu, 21 Sep 2017 18:51:46 +0200 (CEST) Received: from bsdlo.incore (bsdlo.incore [192.168.0.84]) by mail.local.incore (Postfix) with ESMTP id 1E6A4508D6; Thu, 21 Sep 2017 18:51:46 +0200 (CEST) Message-ID: <59C3EE21.1010401@incore.de> Date: Thu, 21 Sep 2017 18:51:45 +0200 From: Andreas Longwitz User-Agent: Thunderbird 2.0.0.19 (X11/20090113) MIME-Version: 1.0 To: gljennjohn@gmail.com, freebsd-fs@freebsd.org Subject: Re: fsync: giving up on dirty on ufs partitions running vfs_write_suspend() References: <201709110519.v8B5JVmf060773@chez.mckusick.com> <59BD0EAC.8030206@incore.de> <20170916183117.GF78693@kib.kiev.ua> <59C37F46.80509@incore.de> <20170921125858.64ffe077@ernst.home> In-Reply-To: <20170921125858.64ffe077@ernst.home> Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org 
X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Sep 2017 16:51:50 -0000

Gary Jennejohn wrote:

>> So the time spent in vop_stdfsync() is 0.24 sec in the worst case I
>> found using DELAY with 1 ms. I would prefer this solution. My first
>> approach (simply bumping maxres from 1000 to 100000) is also ok, but the
>> max time spent will rise to 0.5 sec. Perhaps you like something like
>>
>>    if( maxretry < 1000 && maxretry % 10 = 0)
>                                          ^ ==
>>       DELAY(waitns);
>>
>
> The argument to DELAY is in microseconds.
>

Thanks for your hint. The correct version is

   waitns = 1000;
   if( maxretry < 1000 && maxretry % 10 == 0)
      DELAY(waitns);

Andreas Longwitz

From owner-freebsd-fs@freebsd.org Thu Sep 21 17:29:15 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0DCBAE141C5 for ; Thu, 21 Sep 2017 17:29:15 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id AA3513539 for ; Thu, 21 Sep 2017 17:29:14 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v8LHT2IS048804 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Thu, 21 Sep 2017 20:29:02 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v8LHT2IS048804 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id v8LHT2Sh048793; Thu, 21 Sep 2017 20:29:02 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 21
Sep 2017 20:29:02 +0300 From: Konstantin Belousov To: Andreas Longwitz Cc: Kirk McKusick , freebsd-fs@freebsd.org Subject: Re: fsync: giving up on dirty on ufs partitions running vfs_write_suspend() Message-ID: <20170921172902.GW78693@kib.kiev.ua> References: <201709110519.v8B5JVmf060773@chez.mckusick.com> <59BD0EAC.8030206@incore.de> <20170916183117.GF78693@kib.kiev.ua> <59C37F46.80509@incore.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <59C37F46.80509@incore.de> User-Agent: Mutt/1.9.0 (2017-09-02) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Sep 2017 17:29:15 -0000

On Thu, Sep 21, 2017 at 10:58:46AM +0200, Andreas Longwitz wrote:
> Konstantin Belousov wrote:
> > On Sat, Sep 16, 2017 at 01:44:44PM +0200, Andreas Longwitz wrote:
> >> Ok, I understand your thoughts about the "big loop" and I agree. On the
> >> other side it is not easy to measure the progress of the dirty buffers
> >> because these buffers are created from another process at the same time we
> >> loop in vop_stdfsync(). I can explain from my tests, where I use the
> >> following loop on a gjournaled partition:
> >>
> >>    while true; do
> >>       cp -p bigfile bigfile.tmp
> >>       rm bigfile
> >>       mv bigfile.tmp bigfile
> >>    done
> >>
> >> When g_journal_switcher starts vfs_write_suspend() immediately after the
> >> rm command has started to do his "rm stuff" (ufs_inactive, ffs_truncate,
> >> ffs_indirtrunc at different levels, ffs_blkfree, ...) then we must loop
> >> (that means wait) in vop_stdfsync() until the rm process has finished
> >> his work.
> >> A lot of locking overhead is needed for coordination.
> >> Returning from bufobj_wwait() we always see one left dirty buffer (very
> >> seldom two), that is not optimal. Therefore I have tried the following
> >> patch (instead of bumping maxretry):
> >>
> >> --- vfs_default.c.orig  2016-10-24 12:26:57.000000000 +0200
> >> +++ vfs_default.c       2017-09-15 12:30:44.792274000 +0200
> >> @@ -688,6 +688,8 @@
> >>                         bremfree(bp);
> >>                         bawrite(bp);
> >>                 }
> >> +               if( maxretry < 1000)
> >> +                       DELAY(waitns);
> >>                 BO_LOCK(bo);
> >>                 goto loop2;
> >>         }
> >>
> >> with different values for waitns. If I run the testloop 5000 times on my
> >> testserver, the problem is triggered always round about 10 times. The
> >> results from several runs are given in the following table:
> >>
> >>    waitns     max time   max loops
> >>    -------------------------------
> >>    no DELAY   0.5 sec    8650 (maxres = 100000)
> >>    1000       0.2 sec    24
> >>    10000      0.8 sec    3
> >>    100000     7.2 sec    3
> >>
> >> "time" means spent time in vop_stdfsync() measured from entry to return
> >> by a dtrace script. "loops" means the number of times "--maxretry" is
> >> executed. I am not sure if DELAY() is the best way to wait or if waiting
> >> has other drawbacks. Anyway with DELAY() it does not take more than five
> >> iterations to finish.
> > This is not explicitly stated in your message, but I suppose that the
> > vop_stdfsync() is called due to VOP_FSYNC(devvp, MNT_SUSPEND) call in
> > ffs_sync().  Am I right ?
>
> Yes, the stack trace given by dtrace script looks always like this:
>   4  22140 vop_stdfsync:entry
>               kernel`devfs_fsync+0x26
>               kernel`VOP_FSYNC_APV+0xa7
>               kernel`ffs_sync+0x3bb
>               kernel`vfs_write_suspend+0x1cd
>               geom_journal.ko`g_journal_switcher+0x9a4
>               kernel`fork_exit+0x9a
>               kernel`0xffffffff8095502e
>
> > If yes, then the solution is most likely to continue looping in the
> > vop_stdfsync() until there is no dirty buffers or the mount point
> > mnt_secondary_writes counter is zero.
> > The pauses trick you tried might
> > be still useful, e.g. after some threshold of the performed loop
> > iterations.
>
> I have checked your proposal and found that indeed the
> mnt_secondary_writes counter goes to zero when the dirties have reached
> zero. During the loop the mnt_secondary_writes counter is always equal to
> one, so there is nothing like a countdown, and that's what Kirk wanted to
> see.

This is because mnt_secondary_writes counts the number of threads which
have entered the vn_start_secondary_write() block and can potentially
issue a write that dirties a buffer. In principle, some writer may start
the secondary write block again even if the counter is zero, but in
practice some primary writer must make a modification for secondary
writers to have work. I.e., the change does not cover the problem well
enough to claim that it is completely solved, but for the current UFS
code I doubt that the issue can be triggered.

> A dtrace output (with DELAY of 1ms in the loop) for the biggest
> loop count on a three day test is this:
>
>  18  32865 kern_unlinkat:entry path=bigfile, tid=101201,
> tid=101201, execname=rm
>  18  12747 ufs_remove:entry gj=mirror/gmbkp4p5.journal,
> inum=11155630, blocks=22301568, size=11415525660
>  18  12748 ufs_remove:return returncode=0, inum=11155630,
> blocks=22301568
>  18  18902 ffs_truncate:entry gj=mirror/gmbkp4p5.journal,
> inum=11155630, size=11415525660, mnt_flag=0x12001040,
> mnt_kern_flag=0x40006142, blocks=22301568
>   6  33304 vfs_write_suspend:entry gj=mirror/gmbkp4p5.journal,
> mnt_kern_flag=0x40006142, tid=100181
>   6  22140 vop_stdfsync:entry mounted on /home, waitfor=1,
> numoutput=0, clean=10, dirty=6, secondary_writes=1
>  10  28117 bufobj_wwait:return calls to bufobj_wait = 1,
> dirtycnt=2, secondary_writes=1
>  10  28117 bufobj_wwait:return calls to bufobj_wait = 2,
> dirtycnt=1, secondary_writes=1
>  10  28117 bufobj_wwait:return calls to bufobj_wait = 3,
> dirtycnt=1, secondary_writes=1
>  10  28117 bufobj_wwait:return calls to bufobj_wait = 4,
> dirtycnt=3, secondary_writes=1
>  10  28117 bufobj_wwait:return calls to bufobj_wait = 5,
> dirtycnt=2, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 6,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 7,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 8,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 9,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 10,
> dirtycnt=2, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 11,
> dirtycnt=2, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 12,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 13,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 14,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 15,
> dirtycnt=4, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 16,
> dirtycnt=3, secondary_writes=1
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 17,
> dirtycnt=3, secondary_writes=1
>   2  18903 ffs_truncate:return returncode=0, inum=11155630, blocks=0
>   2  32866 kern_unlinkat:return returncode=0, errno=0, number
> io's: 791/791
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 18,
> dirtycnt=3, secondary_writes=0
>   6  28117 bufobj_wwait:return calls to bufobj_wait = 19,
> dirtycnt=0, secondary_writes=0
>   6  22141 vop_stdfsync:return returncode=0, pid=26, tid=100181,
> spent 240373850 nsecs
>
> So the time spent in vop_stdfsync() is 0.24 sec in the worst case I
> found using DELAY with 1 ms. I would prefer this solution. My first
> approach (simply bumping maxres from 1000 to 100000) is also ok, but the
> max time spent will rise to 0.5 sec.
Perhaps you like something like > > if( maxretry < 1000 && maxretry % 10 = 0) > DELAY(waitns); > > That is also ok but does not make a noteworthy difference. The main > argument remains: we have to wait until there are no dirties left. > > For me the problem with the "giving up on dirty" is solved. Will you provide the patch ? From owner-freebsd-fs@freebsd.org Thu Sep 21 20:00:11 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 93448E1E63A for ; Thu, 21 Sep 2017 20:00:11 +0000 (UTC) (envelope-from bsd@vink.pl) Received: from mail-qk0-x234.google.com (mail-qk0-x234.google.com [IPv6:2607:f8b0:400d:c09::234]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4DF1367D2B for ; Thu, 21 Sep 2017 20:00:11 +0000 (UTC) (envelope-from bsd@vink.pl) Received: by mail-qk0-x234.google.com with SMTP id t184so6872444qke.10 for ; Thu, 21 Sep 2017 13:00:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vink-pl.20150623.gappssmtp.com; s=20150623; h=mime-version:from:date:message-id:subject:to; bh=u38NirGb+SPp0kstCaRyyK9QqwUGMCyYvnGV88SRvzo=; b=IKHL+VficnwtlUSqDYmPLA0KLb4hiqxPxcldvYKqhYak44YMntV1rzGoCVpVEAXUGo 7Qt0d+FBMcQaSgHBaGFU8AB75Uz3Q2366LDnXbg+prTiBeUpskWqAtAneG792GMuGSeL j7xKUGSA5r9RQ7HHvzbuoFyVrAfLQcNEZ+y6IIqeBFFk3HObgEvXy6tfc+sFVH6MYB+H XfJo7EZhUPnNeEoLdxAcQD6rqBv6YgaswDrrcQbsN/9irceRs8/8MXkecAMx8DZxMjSW KT73bSwHf6RICcUSfW/EBLfK0ruFhYJIjSSC5QisQqWjh0ri4ccJLIjUDbDCQgh+SDLE G1FA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=u38NirGb+SPp0kstCaRyyK9QqwUGMCyYvnGV88SRvzo=; b=FkLNRqH08hUfo2o2f5yZRicRj5B9jGT9bKV93UMUWxwONUaI+avWoZ00ytRMnbdK5/ 
bnxignWZ5ffc/7b+RckUXtsTrh6wYPc81EeX8QVGIQJydtCzOVKFuPDtmeJiSenCDD65 NuU7vKLvx5a+7LLOZT+aBJkwNQoCOjv8w396aYE7sC+MI03cbeerTNiTtXWkehHSA3ni v6OxawFsZ/HifupPLr5OG12A9xHduXO4q3nF5M2uy87fyY7r2r/HhnsGqZuwcDoyj5Ym vz4AaIOaQBnjGRMbvvPC3Lkq/OVcqV3iFtl1H/Hh20D/lEm0p3KuO5RFaWrIdJyukFNh lSKQ== X-Gm-Message-State: AHPjjUgaUPT+YQK9de4eJzlaxxO4aZEUqyzG4OAOaznORxYhVzjp8DSR /0jDCzjc9jeu+OQyq/dfHe2oCRe3 X-Received: by 10.55.167.135 with SMTP id q129mr4738278qke.311.1506024009995; Thu, 21 Sep 2017 13:00:09 -0700 (PDT) Received: from mail-qk0-f178.google.com (mail-qk0-f178.google.com. [209.85.220.178]) by smtp.gmail.com with ESMTPSA id p31sm1606467qtp.12.2017.09.21.13.00.09 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 21 Sep 2017 13:00:09 -0700 (PDT) Received: by mail-qk0-f178.google.com with SMTP id j5so6911577qkd.0 for ; Thu, 21 Sep 2017 13:00:09 -0700 (PDT) X-Google-Smtp-Source: AOwi7QDcTfElYgGTdbFZj3qb1oVMovtJdbyG4hkpyuqFITV23NlXwC2qKs0aMCuIxOVgpW9Z+MaJJmcCzkS2fCmQbNo= X-Received: by 10.55.155.203 with SMTP id d194mr4565101qke.288.1506024009006; Thu, 21 Sep 2017 13:00:09 -0700 (PDT) MIME-Version: 1.0 Received: by 10.12.139.1 with HTTP; Thu, 21 Sep 2017 13:00:08 -0700 (PDT) From: Wiktor Niesiobedzki Date: Thu, 21 Sep 2017 22:00:08 +0200 X-Gmail-Original-Message-ID: Message-ID: Subject: ZVOL with volblocksize=64k+ results in data corruption [was Re: Resolving errors with ZVOL-s] To: freebsd-fs Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Sep 2017 20:00:11 -0000 Hi, I've conducted additional tests. 
It looks like when I create volumes with following commands:
# zfs create -V50g -o volmode=dev -o volblocksize=64k -o compression=off -o com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=128k -o compression=off -o com.sun:auto-snapshot=false tank/test

I'm able to get checksum errors quite reliably in 2-12h of normal work of the volume. I tested also different volblocksizes:
# zfs create -V50g -o volmode=dev -o volblocksize=32k -o compression=off -o com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=8k -o compression=off -o com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=4k -o compression=off -o com.sun:auto-snapshot=false tank/test

And gave them more than 24h of work with no apparent errors (I also moved other volumes to 4k and they did not show any checksum errors for more than 2 weeks).

I was running with volblocksize=128k from January this year. The problem started to appear only after I updated from 11.0 to 11.1.

Should I file bug report for this? What additional information should I gather?

Cheers,

Wiktor Niesiobędzki

PS. I found a way to solve errors reporting in zpool status. It turns out, that they disappear after scrub, but only if scrub was run substantially later than zfs destroy. Maybe some references in ZIL prevent these errors from being removed? Is this a bug?

2017-09-04 19:12 GMT+02:00 Wiktor Niesiobedzki :

> Hi,
>
> I can follow up on my issue - the same problem just happened on the second
> ZVOL that I've created:
> # zpool status -v
> pool: tank
> state: ONLINE
> status: One or more devices has experienced an error resulting in data
> corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
> entire pool from backup.
> see: http://illumos.org/msg/ZFS-8000-8A
> scan: scrub repaired 0 in 5h27m with 0 errors on Sat Sep 2 15:30:59 2017
> config:
>
> NAME               STATE   READ WRITE CKSUM
> tank               ONLINE     0     0    14
>   mirror-0         ONLINE     0     0    28
>     gpt/tank1.eli  ONLINE     0     0    28
>     gpt/tank2.eli  ONLINE     0     0    28
>
> errors: Permanent errors have been detected in the following files:
>
> tank/docker-big:<0x1>
> <0x5095>:<0x1>
>
>
> I suspect that these errors might be related to my recent upgrade to 11.1.
> Until 19 of August I was running 11.0. I consider rolling back to 11.0
> right now.
>
> For reference:
> # zfs get all tank/docker-big
> NAME             PROPERTY               VALUE                 SOURCE
> tank/docker-big  type                   volume                -
> tank/docker-big  creation               Sat Sep 2 10:09 2017  -
> tank/docker-big  used                   100G                  -
> tank/docker-big  available              747G                  -
> tank/docker-big  referenced             10.5G                 -
> tank/docker-big  compressratio          4.58x                 -
> tank/docker-big  reservation            none                  default
> tank/docker-big  volsize                100G                  local
> tank/docker-big  volblocksize           128K                  -
> tank/docker-big  checksum               skein                 inherited from tank
> tank/docker-big  compression            lz4                   inherited from tank
> tank/docker-big  readonly               off                   default
> tank/docker-big  copies                 1                     default
> tank/docker-big  refreservation         100G                  local
> tank/docker-big  primarycache           all                   default
> tank/docker-big  secondarycache         all                   default
> tank/docker-big  usedbysnapshots        0                     -
> tank/docker-big  usedbydataset          10.5G                 -
> tank/docker-big  usedbychildren         0                     -
> tank/docker-big  usedbyrefreservation   89.7G                 -
> tank/docker-big  logbias                latency               default
> tank/docker-big  dedup                  off                   default
> tank/docker-big  mlslabel                                     -
> tank/docker-big  sync                   standard              default
> tank/docker-big  refcompressratio       4.58x                 -
> tank/docker-big  written                10.5G                 -
> tank/docker-big  logicalused            47.8G                 -
> tank/docker-big  logicalreferenced      47.8G                 -
> tank/docker-big  volmode                dev                   local
> tank/docker-big  snapshot_limit         none                  default
> tank/docker-big  snapshot_count         none                  default
> tank/docker-big  redundant_metadata     all                   default
> tank/docker-big  com.sun:auto-snapshot  false                 local
>
> Any
> ideas what should I try before rolling back?
>
>
> Cheers,
>
> Wiktor
>
> 2017-09-02 19:17 GMT+02:00 Wiktor Niesiobedzki :
>
>> Hi,
>>
>> I have recently encountered errors on my ZFS Pool on my 11.1-R:
>> $ uname -a
>> FreeBSD kadlubek 11.1-RELEASE-p1 FreeBSD 11.1-RELEASE-p1 #0: Wed Aug 9 11:55:48 UTC 2017 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
>>
>> # zpool status -v tank
>> pool: tank
>> state: ONLINE
>> status: One or more devices has experienced an error resulting in data
>> corruption. Applications may be affected.
>> action: Restore the file in question if possible. Otherwise restore the
>> entire pool from backup.
>> see: http://illumos.org/msg/ZFS-8000-8A
>> scan: scrub repaired 0 in 5h27m with 0 errors on Sat Sep 2 15:30:59 2017
>> config:
>>
>> NAME               STATE   READ WRITE CKSUM
>> tank               ONLINE     0     0    98
>>   mirror-0         ONLINE     0     0   196
>>     gpt/tank1.eli  ONLINE     0     0   196
>>     gpt/tank2.eli  ONLINE     0     0   196
>>
>> errors: Permanent errors have been detected in the following files:
>>
>> dkr-test:<0x1>
>>
>> dkr-test is ZVOL that I use within bhyve and indeed - within bhyve I have
>> noticed I/O errors on this volume. This ZVOL did not have any snapshots.
>>
>> Following the advice mentioned in action I tried to restore the ZVOL:
>> # zfs destroy tank/dkr-test
>>
>> But still errors are mentioned in zpool status:
>> errors: Permanent errors have been detected in the following files:
>>
>> <0x5095>:<0x1>
>>
>> I can't find any reference to this dataset in zdb:
>> # zdb -d tank | grep 5095
>> # zdb -d tank | grep 20629
>>
>>
>> I tried also getting statistics about metadata in this pool:
>> # zdb -b tank
>>
>> Traversing all blocks to verify nothing leaked ...
>>
>> loading space map for vdev 0 of 1, metaslab 159 of 174 ...
>> No leaks (block sum matches space maps exactly)
>>
>> bp count:            24426601
>> ganged count:               0
>> bp logical:     1983127334912    avg: 81187
>> bp physical:    1817897247232    avg: 74422    compression: 1.09
>> bp allocated:   1820446928896    avg: 74527    compression: 1.09
>> bp deduped:                 0    ref>1:   0    deduplication: 1.00
>> SPA allocated:  1820446928896    used: 60.90%
>>
>> additional, non-pointer bps of type 0: 57981
>> Dittoed blocks on same vdev: 296490
>>
>> And zdb got stuck using 100% CPU
>>
>> And now to my questions:
>> 1. Do I interpret correctly, that this situation is probably due to error
>> during write, and both copies of the block got checksum mismatching their
>> data? And if it is a hardware problem, it is probably something other than
>> disk? (No, I don't use ECC RAM)
>>
>> 2. Is there any way to remove offending dataset and clean the pool of the
>> errors?
>>
>> 3. Is my metadata OK? Or should I restore entire pool from backup?
>>
>> 4. I tried also running zdb -bc tank, but this resulted in kernel panic.
>> I might try to get the stack trace once I get physical access to machine
>> next week. Also - checksum verification slows down process from 1000MB/s to
>> less than 1MB/s. Is this expected?
>>
>> 5. When I work with zdb (as above) should I try to limit writes to the
>> pool (e.g. by unmounting the datasets)?
>>
>> Cheers,
>>
>> Wiktor Niesiobedzki
>>
>> PS. Sorry for previous partial message.
>> >> > From owner-freebsd-fs@freebsd.org Fri Sep 22 08:48:09 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A0ED3E1F17A for ; Fri, 22 Sep 2017 08:48:09 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citapm.icyb.net.ua (citapm.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id EB1A71C1D for ; Fri, 22 Sep 2017 08:48:08 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citapm.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id LAA11305; Fri, 22 Sep 2017 11:48:06 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1dvJd0-0008ab-7C; Fri, 22 Sep 2017 11:48:06 +0300 Subject: Re: ZVOL with volblocksize=64k+ results in data corruption [was Re: Resolving errors with ZVOL-s] To: Wiktor Niesiobedzki , freebsd-fs References: From: Andriy Gapon Message-ID: <062dfcf9-b56e-c6c4-4039-c48bf7bdd610@FreeBSD.org> Date: Fri, 22 Sep 2017 11:46:44 +0300 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Sep 2017 08:48:09 -0000 On 21/09/2017 23:00, Wiktor Niesiobedzki wrote: > I've conducted additional tests. 
It looks like when I create volumes with > following commands: > # zfs create -V50g -o volmode=dev -o volblocksize=64k -o compression=off -o > com.sun:auto-snapshot=false tank/test > # zfs create -V50g -o volmode=dev -o volblocksize=128k -o compression=off > -o com.sun:auto-snapshot=false tank/test > > I'm able to get checksum errors quite reliably in 2-12h of normal work of > the volume. I tested also different volbocksizes: > # zfs create -V50g -o volmode=dev -o volblocksize=32k -o compression=off -o > com.sun:auto-snapshot=false tank/test > # zfs create -V50g -o volmode=dev -o volblocksize=8k -o compression=off -o > com.sun:auto-snapshot=false tank/test > # zfs create -V50g -o volmode=dev -o volblocksize=4k -o compression=off -o > com.sun:auto-snapshot=false tank/test > > And gave them more than 24h of work with no apparent errors (I also moved > other volumes to 4k and they did not show any checksum errors for more than > 2 weeks). > > I was running with volblocksize=128k from January this year. The problem > started to appear only after I updated from 11.0 to 11.1. > > Should I file bug report for this? What additional information should I > gather? Are you able to patch your version of FreeBSD? Could you please try https://svnweb.freebsd.org/changeset/base/323918 and see if it makes things better? Thanks! 
-- Andriy Gapon From owner-freebsd-fs@freebsd.org Fri Sep 22 09:15:09 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A6664E2078C for ; Fri, 22 Sep 2017 09:15:09 +0000 (UTC) (envelope-from bsd@vink.pl) Received: from mail-qt0-x22f.google.com (mail-qt0-x22f.google.com [IPv6:2607:f8b0:400d:c0d::22f]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5CC7436BD for ; Fri, 22 Sep 2017 09:15:09 +0000 (UTC) (envelope-from bsd@vink.pl) Received: by mail-qt0-x22f.google.com with SMTP id b1so436781qtc.4 for ; Fri, 22 Sep 2017 02:15:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vink-pl.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=MVBnYDicVX36CkvB/eNH6qDzWUZl2MRtnLbkrwnHQ8k=; b=ZJ1H+7+nuOPqnuRCBXXsMJsPInqAol2HfhqbR1s0RN0+FMNVitzz94FeC5E+yQMZoW ALneyX5+t+0MSnQn4DWNyiJ+8OlYmIKnFpzetQw0w2DXcTKj54BqqQMn078d6G7mmxDj MRDKqYw39Lsg2od84RI4CG3ZddF4mu45jkWV79O8WUk2RBI/Kd9xPhb5VF5/JaKV1qo6 Ty18RDR5dcHvWCT6CU3rgZUrER0IoJzz/76cbUl5i6Ya9GpWtkrLZ/Htj6hQIkz6lguc 8P2b0V7y9o3VqxR+9EUUB4nPwAgB/lMy9dBhi+Cu9xdavlGHM43wgxWWt4ziKQcGs92/ uQtQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=MVBnYDicVX36CkvB/eNH6qDzWUZl2MRtnLbkrwnHQ8k=; b=RaxxIVaTf/s/UzYflAnWQ7C7WXslUKSmknHH/DYJDOYGad1ZxGnI/hrWKqRiPg6hqX ja6D4Cwb8US5f2xn7QKo/73mgSZE/ULtBt1RXohbWWXDRoq6nW6DXHX6TBEm1u4eZd0I kqBiyL/JdlLgwsYa19JIfS7Apk7EN8clpeXoUiecp/TcpChLvx+6dXRL+20RZKoe3vrk 5w8GNk2fL5VTBux0LY4GQ0pKNx/JVUrqndig9c/Jo1c2yEU0rD79iZCLNRGpgqbTLUV4 vyM6e355x63jHFxcyWR/6EthjAzeI5pSQqOaDSqEb5eor2quxYg8kWArChl7Hl920b2/ dkqQ== 
X-Gm-Message-State: AHPjjUhdKZJ0CGElHET7vHnOCK86cu3XX6wWoqdI7DllkQw+VCjKAB2M 41ACt1hrg+e5VvuYKFKCQPvA6Ero X-Received: by 10.200.38.170 with SMTP id 39mr7474753qto.114.1506071708237; Fri, 22 Sep 2017 02:15:08 -0700 (PDT) Received: from mail-qt0-f182.google.com (mail-qt0-f182.google.com. [209.85.216.182]) by smtp.gmail.com with ESMTPSA id y9sm2545680qth.6.2017.09.22.02.15.06 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 22 Sep 2017 02:15:07 -0700 (PDT) Received: by mail-qt0-f182.google.com with SMTP id q4so430505qtq.8 for ; Fri, 22 Sep 2017 02:15:06 -0700 (PDT) X-Google-Smtp-Source: AOwi7QAUBFwgU1nWnx6a5EqNVC0k0XXyoVX5j9IoYxoaXwp4a29pg0Yp9O9W5jM//k5j+Yv7UJDzPiCE9jhpq4mwFI0= X-Received: by 10.200.6.8 with SMTP id d8mr7019349qth.142.1506071705779; Fri, 22 Sep 2017 02:15:05 -0700 (PDT) MIME-Version: 1.0 References: <062dfcf9-b56e-c6c4-4039-c48bf7bdd610@FreeBSD.org> In-Reply-To: <062dfcf9-b56e-c6c4-4039-c48bf7bdd610@FreeBSD.org> From: Wiktor Niesiobedzki Date: Fri, 22 Sep 2017 09:14:53 +0000 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: ZVOL with volblocksize=64k+ results in data corruption [was Re: Resolving errors with ZVOL-s] To: Andriy Gapon , freebsd-fs Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Sep 2017 09:15:09 -0000 On Fri, 22 Sep 2017 at 10:48, Andriy Gapon wrote: > On 21/09/2017 23:00, Wiktor Niesiobedzki wrote: > > I've conducted additional tests. 
It looks like when I create volumes with > > following commands: > > # zfs create -V50g -o volmode=dev -o volblocksize=64k -o compression=off > -o > > com.sun:auto-snapshot=false tank/test > > # zfs create -V50g -o volmode=dev -o volblocksize=128k -o compression=off > > -o com.sun:auto-snapshot=false tank/test > > > > I'm able to get checksum errors quite reliably in 2-12h of normal work of > > the volume. I tested also different volbocksizes: > > # zfs create -V50g -o volmode=dev -o volblocksize=32k -o compression=off > -o > > com.sun:auto-snapshot=false tank/test > > # zfs create -V50g -o volmode=dev -o volblocksize=8k -o compression=off > -o > > com.sun:auto-snapshot=false tank/test > > # zfs create -V50g -o volmode=dev -o volblocksize=4k -o compression=off > -o > > com.sun:auto-snapshot=false tank/test > > > > And gave them more than 24h of work with no apparent errors (I also moved > > other volumes to 4k and they did not show any checksum errors for more > than > > 2 weeks). > > > > I was running with volblocksize=128k from January this year. The problem > > started to appear only after I updated from 11.0 to 11.1. > > > > Should I file bug report for this? What additional information should I > > gather? > > Are you able to patch your version of FreeBSD? > Could you please try https://svnweb.freebsd.org/changeset/base/323918 > and see if it makes things better? > Yes, I'll give it a try for a weekend and let you know about the result. 
Cheers, Wiktor From owner-freebsd-fs@freebsd.org Fri Sep 22 10:02:22 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7E67CE21D6D for ; Fri, 22 Sep 2017 10:02:22 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from dss.incore.de (dss.incore.de [195.145.1.138]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 10AAC641D6 for ; Fri, 22 Sep 2017 10:02:21 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from inetmail.dmz (inetmail.dmz [10.3.0.3]) by dss.incore.de (Postfix) with ESMTP id C73C21469; Fri, 22 Sep 2017 12:02:11 +0200 (CEST) X-Virus-Scanned: amavisd-new at incore.de Received: from dss.incore.de ([10.3.0.3]) by inetmail.dmz (inetmail.dmz [10.3.0.3]) (amavisd-new, port 10024) with LMTP id zQ40TYN-rJp1; Fri, 22 Sep 2017 12:02:09 +0200 (CEST) Received: from mail.local.incore (fwintern.dmz [10.0.0.253]) by dss.incore.de (Postfix) with ESMTP id 3E6301321; Fri, 22 Sep 2017 12:01:50 +0200 (CEST) Received: from bsdlo.incore (bsdlo.incore [192.168.0.84]) by mail.local.incore (Postfix) with ESMTP id 2F0B3508DE; Fri, 22 Sep 2017 12:01:50 +0200 (CEST) Message-ID: <59C4DF8D.5070004@incore.de> Date: Fri, 22 Sep 2017 12:01:49 +0200 From: Andreas Longwitz User-Agent: Thunderbird 2.0.0.19 (X11/20090113) MIME-Version: 1.0 To: Konstantin Belousov CC: Kirk McKusick , freebsd-fs@freebsd.org Subject: Re: fsync: giving up on dirty on ufs partitions running vfs_write_suspend() References: <201709110519.v8B5JVmf060773@chez.mckusick.com> <59BD0EAC.8030206@incore.de> <20170916183117.GF78693@kib.kiev.ua> <59C37F46.80509@incore.de> <20170921172902.GW78693@kib.kiev.ua> In-Reply-To: <20170921172902.GW78693@kib.kiev.ua> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-BeenThere: 
freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Sep 2017 10:02:22 -0000

Konstantin Belousov wrote:
> On Thu, Sep 21, 2017 at 10:58:46AM +0200, Andreas Longwitz wrote:
>> Konstantin Belousov wrote:
>>> On Sat, Sep 16, 2017 at 01:44:44PM +0200, Andreas Longwitz wrote:
>>>> Ok, I understand your thoughts about the "big loop" and I agree. On the
>>>> other side it is not easy to measure the progress of the dirty buffers
>>>> because these buffers are created from another process at the same time we
>>>> loop in vop_stdfsync(). I can explain from my tests, where I use the
>>>> following loop on a gjournaled partition:
>>>>
>>>> while true; do
>>>>     cp -p bigfile bigfile.tmp
>>>>     rm bigfile
>>>>     mv bigfile.tmp bigfile
>>>> done
>>>>
>>>> When g_journal_switcher starts vfs_write_suspend() immediately after the
>>>> rm command has started to do its "rm stuff" (ufs_inactive, ffs_truncate,
>>>> ffs_indirtrunc at different levels, ffs_blkfree, ...) then we must loop
>>>> (that means wait) in vop_stdfsync() until the rm process has finished
>>>> its work. A lot of locking overhead is needed for coordination.
>>>> Returning from bufobj_wwait() we always see one left dirty buffer (very
>>>> seldom two), that is not optimal. Therefore I have tried the following
>>>> patch (instead of bumping maxretry):
>>>>
>>>> --- vfs_default.c.orig 2016-10-24 12:26:57.000000000 +0200
>>>> +++ vfs_default.c 2017-09-15 12:30:44.792274000 +0200
>>>> @@ -688,6 +688,8 @@
>>>>              bremfree(bp);
>>>>              bawrite(bp);
>>>>          }
>>>> +        if( maxretry < 1000)
>>>> +            DELAY(waitns);
>>>>          BO_LOCK(bo);
>>>>          goto loop2;
>>>>      }
>>>>
>>>> with different values for waitns. If I run the testloop 5000 times on my
>>>> testserver, the problem is triggered always round about 10 times.
>>>> The results from several runs are given in the following table:
>>>>
>>>> waitns     max time   max loops
>>>> -------------------------------
>>>> no DELAY   0.5 sec    8650 (maxres = 100000)
>>>> 1000       0.2 sec    24
>>>> 10000      0.8 sec    3
>>>> 100000     7.2 sec    3
>>>>
>>>> "time" means spent time in vop_stdfsync() measured from entry to return
>>>> by a dtrace script. "loops" means the number of times "--maxretry" is
>>>> executed. I am not sure if DELAY() is the best way to wait or if waiting
>>>> has other drawbacks. Anyway with DELAY() it does not take more than five
>>>> iterations to finish.
>>> This is not explicitly stated in your message, but I suppose that the
>>> vop_stdfsync() is called due to VOP_FSYNC(devvp, MNT_SUSPEND) call in
>>> ffs_sync(). Am I right ?
>> Yes, the stack trace given by dtrace script looks always like this:
>> 4 22140 vop_stdfsync:entry
>>     kernel`devfs_fsync+0x26
>>     kernel`VOP_FSYNC_APV+0xa7
>>     kernel`ffs_sync+0x3bb
>>     kernel`vfs_write_suspend+0x1cd
>>     geom_journal.ko`g_journal_switcher+0x9a4
>>     kernel`fork_exit+0x9a
>>     kernel`0xffffffff8095502e
>>
>>
>>> If yes, then the solution is most likely to continue looping in the
>>> vop_stdfsync() until there is no dirty buffers or the mount point
>>> mnt_secondary_writes counter is zero. The pauses trick you tried might
>>> be still useful, e.g. after some threshold of the performed loop
>>> iterations.
>> I have checked your proposal and found that indeed the
>> mnt_secondary_writes counter goes to zero when the dirties have reached
>> zero. During the loop the mnt_secondary_write counter is always equal to
>> one, so there is not something like a countdown and that's what Kirk wanted to
>> see.
> This is because mnt_secondary_write counts number of threads which entered
> the vn_start_secondary_write() block and potentially can issue a write
> dirtying a buffer.
In principle, some writer may start the secondary > write block again even if the counter is zero, but practically some > primary writer must make a modification for secondary writers to have > work. > > I.e., the change would not cover the problem to claim it being completely > solved, but for the current UFS code I doubt that the issue can be triggered. > >> A dtrace output (with DELAY of 1ms in the loop) for the biggest >> loop count on a three day test is this: >> >> 18 32865 kern_unlinkat:entry path=bigfile, tid=101201, >> tid=101201, execname=rm >> 18 12747 ufs_remove:entry gj=mirror/gmbkp4p5.journal, >> inum=11155630, blocks=22301568, size=11415525660 >> 18 12748 ufs_remove:return returncode=0, inum=11155630, >> blocks=22301568 >> 18 18902 ffs_truncate:entry gj=mirror/gmbkp4p5.journal, >> inum=11155630, size=11415525660, mnt_flag=0x12001040, >> mnt_kern_flag=0x40006142, blocks=22301568 >> 6 33304 vfs_write_suspend:entry gj=mirror/gmbkp4p5.journal, >> mnt_kern_flag=0x40006142, tid=100181 >> 6 22140 vop_stdfsync:entry mounted on /home, waitfor=1, >> numoutput=0, clean=10, dirty=6, secondary_writes=1 >> 10 28117 bufobj_wwait:return calls to bufobj_wait = 1, >> dirtycnt=2, secondary_writes=1 >> 10 28117 bufobj_wwait:return calls to bufobj_wait = 2, >> dirtycnt=1, secondary_writes=1 >> 10 28117 bufobj_wwait:return calls to bufobj_wait = 3, >> dirtycnt=1, secondary_writes=1 >> 10 28117 bufobj_wwait:return calls to bufobj_wait = 4, >> dirtycnt=3, secondary_writes=1 >> 10 28117 bufobj_wwait:return calls to bufobj_wait = 5, >> dirtycnt=2, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 6, >> dirtycnt=3, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 7, >> dirtycnt=3, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 8, >> dirtycnt=3, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 9, >> dirtycnt=3, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 
10, >> dirtycnt=2, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 11, >> dirtycnt=2, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 12, >> dirtycnt=3, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 13, >> dirtycnt=3, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 14, >> dirtycnt=3, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 15, >> dirtycnt=4, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 16, >> dirtycnt=3, secondary_writes=1 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 17, >> dirtycnt=3, secondary_writes=1 >> 2 18903 ffs_truncate:return returncode=0, inum=11155630, blocks=0 >> 2 32866 kern_unlinkat:return returncode=0, errno=0, number >> io's: 791/791 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 18, >> dirtycnt=3, secondary_writes=0 >> 6 28117 bufobj_wwait:return calls to bufobj_wait = 19, >> dirtycnt=0, secondary_writes=0 >> 6 22141 vop_stdfsync:return returncode=0, pid=26, tid=100181, >> spent 240373850 nsecs >> >> So the spent time in vop_stdfsync() is 0,24 sec in the worst case I >> found using DELAY with 1 ms. I would prefer this solution. My first >> appoach (simple bumping maxres from 1000 to 100000) is also ok, but max >> spend time will be raise up to 0,5 sec. Perhaps you like something like >> >> if( maxretry < 1000 && maxretry % 10 = 0) >> DELAY(waitns); >> >> That is also ok but does not make a noteworthy difference. The main >> argument remains: we have to wait until there are no dirties left. >> >> For me the problem with the "giving up on dirty" is solved. > Will you provide the patch ? 
Patch against HEAD:
--- vfs_default.c.orig 2017-09-22 11:56:26.950084000 +0200
+++ vfs_default.c 2017-09-22 11:58:33.211196000 +0200
@@ -690,6 +690,8 @@
             bremfree(bp);
             bawrite(bp);
         }
+        if( maxretry < 1000)
+            DELAY(1000); /* 1 ms */
         BO_LOCK(bo);
         goto loop2;
     }

--
Andreas Longwitz

From owner-freebsd-fs@freebsd.org Fri Sep 22 10:29:25 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4F2FCE22D01 for ; Fri, 22 Sep 2017 10:29:25 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DA06564F1C for ; Fri, 22 Sep 2017 10:29:24 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v8MATHMK018754 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Fri, 22 Sep 2017 13:29:17 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v8MATHMK018754 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id v8MATHaZ018753; Fri, 22 Sep 2017 13:29:17 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 22 Sep 2017 13:29:17 +0300 From: Konstantin Belousov To: Andreas Longwitz Cc: Kirk McKusick , freebsd-fs@freebsd.org Subject: Re: fsync: giving up on dirty on ufs partitions running vfs_write_suspend() Message-ID: <20170922102917.GC2271@kib.kiev.ua> References: <201709110519.v8B5JVmf060773@chez.mckusick.com> <59BD0EAC.8030206@incore.de> <20170916183117.GF78693@kib.kiev.ua> <59C37F46.80509@incore.de> <20170921172902.GW78693@kib.kiev.ua> <59C4DF8D.5070004@incore.de>
MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <59C4DF8D.5070004@incore.de> User-Agent: Mutt/1.9.0 (2017-09-02) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Sep 2017 10:29:25 -0000 On Fri, Sep 22, 2017 at 12:01:49PM +0200, Andreas Longwitz wrote: > Patch against HEAD: Of course I meant the patch which waits for secondary writers to pass. > --- vfs_default.c.orig 2017-09-22 11:56:26.950084000 +0200 > +++ vfs_default.c 2017-09-22 11:58:33.211196000 +0200 > @@ -690,6 +690,8 @@ > bremfree(bp); > bawrite(bp); > } > + if( maxretry < 1000) > + DELAY(1000); /* 1 ms */ > BO_LOCK(bo); > goto loop2; > } From owner-freebsd-fs@freebsd.org Fri Sep 22 15:46:48 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3C0E5E0263F for ; Fri, 22 Sep 2017 15:46:48 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from dss.incore.de (dss.incore.de [195.145.1.138]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 00B1B711DF for ; Fri, 22 Sep 2017 15:46:47 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from inetmail.dmz (inetmail.dmz [10.3.0.3]) by dss.incore.de (Postfix) with ESMTP id B6E1011FF; Fri, 22 Sep 2017 17:46:44 +0200 (CEST) X-Virus-Scanned: amavisd-new at incore.de Received: from dss.incore.de ([10.3.0.3]) by inetmail.dmz (inetmail.dmz [10.3.0.3]) (amavisd-new, port 
10024) with LMTP id kKOC4Mspx48t; Fri, 22 Sep 2017 17:46:43 +0200 (CEST) Received: from mail.local.incore (fwintern.dmz [10.0.0.253]) by dss.incore.de (Postfix) with ESMTP id 10E34125F; Fri, 22 Sep 2017 17:46:43 +0200 (CEST) Received: from bsdlo.incore (bsdlo.incore [192.168.0.84]) by mail.local.incore (Postfix) with ESMTP id F0A8D508A2; Fri, 22 Sep 2017 17:46:42 +0200 (CEST) Message-ID: <59C53062.5070709@incore.de> Date: Fri, 22 Sep 2017 17:46:42 +0200 From: Andreas Longwitz User-Agent: Thunderbird 2.0.0.19 (X11/20090113) MIME-Version: 1.0 To: Konstantin Belousov CC: Kirk McKusick , freebsd-fs@freebsd.org Subject: Re: fsync: giving up on dirty on ufs partitions running vfs_write_suspend() References: <201709110519.v8B5JVmf060773@chez.mckusick.com> <59BD0EAC.8030206@incore.de> <20170916183117.GF78693@kib.kiev.ua> <59C37F46.80509@incore.de> <20170921172902.GW78693@kib.kiev.ua> <59C4DF8D.5070004@incore.de> <20170922102917.GC2271@kib.kiev.ua> In-Reply-To: <20170922102917.GC2271@kib.kiev.ua> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Sep 2017 15:46:48 -0000 Konstantin Belousov wrote: > On Fri, Sep 22, 2017 at 12:01:49PM +0200, Andreas Longwitz wrote: >> Patch against HEAD: > Of course I meant the patch which waits for secondary writers to pass. > >> --- vfs_default.c.orig 2017-09-22 11:56:26.950084000 +0200 >> +++ vfs_default.c 2017-09-22 11:58:33.211196000 +0200 >> @@ -690,6 +690,8 @@ >> bremfree(bp); >> bawrite(bp); >> } >> + if( maxretry < 1000) >> + DELAY(1000); /* 1 ms */ >> BO_LOCK(bo); >> goto loop2; >> } Excuse me, but I don't have a patch which waits for secondary writers to pass. 
As I posted before, I have checked with a dtrace script that the counter bo->bo_dirty.bv_cnt (which is used by the kernel) always goes to zero when the secondary writes counter (as you explained, the number of threads in vn_start_secondary_write) goes to zero. To be exact: bo->bo_dirty.bv_cnt goes to zero one loop step later than mnt_secondary_writes. But without bumping up maxretry or introducing some kind of DELAY we will sometimes trigger the "giving up on dirty" message. First of all, I would like to get rid of this message. Your proposal to check the secondary writes would be a further improvement of the code. -- Andreas Longwitz From owner-freebsd-fs@freebsd.org Fri Sep 22 22:20:26 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6ED70E12190 for ; Fri, 22 Sep 2017 22:20:26 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 31272812C0 for ; Fri, 22 Sep 2017 22:20:26 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id v8MMKEW5085371; Fri, 22 Sep 2017 15:20:18 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201709222220.v8MMKEW5085371@gw.catspoiler.org> Date: Fri, 22 Sep 2017 15:20:14 -0700 (PDT) From: Don Lewis Subject: Re: fsync: giving up on dirty on ufs partitions running vfs_write_suspend() To: longwitz@incore.de cc: kostikbel@gmail.com, mckusick@mckusick.com, freebsd-fs@freebsd.org In-Reply-To: <59C4DF8D.5070004@incore.de> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-fs@freebsd.org 
X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Sep 2017 22:20:26 -0000 On 22 Sep, Andreas Longwitz wrote: > Konstantin Belousov wrote: >> On Thu, Sep 21, 2017 at 10:58:46AM +0200, Andreas Longwitz wrote: >>> Konstantin Belousov wrote: >>>> On Sat, Sep 16, 2017 at 01:44:44PM +0200, Andreas Longwitz wrote: >>>>> Ok, I understand your thoughts about the "big loop" and I agree. On the >>>>> other side it is not easy to measure the progress of the dirty buffers >>>>> because these buffers are created by another process at the same time as we >>>>> loop in vop_stdfsync(). I can explain from my tests, where I use the >>>>> following loop on a gjournaled partition: >>>>> >>>>> while true; do >>>>> cp -p bigfile bigfile.tmp >>>>> rm bigfile >>>>> mv bigfile.tmp bigfile >>>>> done >>>>> >>>>> When g_journal_switcher starts vfs_write_suspend() immediately after the >>>>> rm command has started to do its "rm stuff" (ufs_inactive, ffs_truncate, >>>>> ffs_indirtrunc at different levels, ffs_blkfree, ...) then we must loop >>>>> (that means wait) in vop_stdfsync() until the rm process has finished >>>>> its work. A lot of locking overhead is needed for coordination. >>>>> Returning from bufobj_wwait() we always see one dirty buffer left (very >>>>> seldom two), which is not optimal. Therefore I have tried the following >>>>> patch (instead of bumping maxretry): >>>>> >>>>> --- vfs_default.c.orig 2016-10-24 12:26:57.000000000 +0200 >>>>> +++ vfs_default.c 2017-09-15 12:30:44.792274000 +0200 >>>>> @@ -688,6 +688,8 @@ >>>>> bremfree(bp); >>>>> bawrite(bp); >>>>> } >>>>> + if( maxretry < 1000) >>>>> + DELAY(waitns); >>>>> BO_LOCK(bo); >>>>> goto loop2; >>>>> } >>>>> >>>>> with different values for waitns. If I run the testloop 5000 times on my >>>>> testserver, the problem is always triggered about 10 times. 
The >>>>> results from several runs are given in the following table: >>>>>
>>>>> waitns     max time   max loops
>>>>> -------------------------------
>>>>> no DELAY   0,5 sec    8650 (maxretry = 100000)
>>>>> 1000       0,2 sec    24
>>>>> 10000      0,8 sec    3
>>>>> 100000     7,2 sec    3
>>>>> >>>>> "time" means spent time in vop_stdfsync() measured from entry to return >>>>> by a dtrace script. "loops" means the number of times "--maxretry" is >>>>> executed. I am not sure if DELAY() is the best way to wait or if waiting >>>>> has other drawbacks. Anyway with DELAY() it does not take more than five >>>>> iterations to finish. >>>> This is not explicitly stated in your message, but I suppose that the >>>> vop_stdfsync() is called due to VOP_FSYNC(devvp, MNT_SUSPEND) call in >>>> ffs_sync(). Am I right ? >>> Yes, the stack trace given by the dtrace script always looks like this: >>> 4 22140 vop_stdfsync:entry >>> kernel`devfs_fsync+0x26 >>> kernel`VOP_FSYNC_APV+0xa7 >>> kernel`ffs_sync+0x3bb >>> kernel`vfs_write_suspend+0x1cd >>> geom_journal.ko`g_journal_switcher+0x9a4 >>> kernel`fork_exit+0x9a >>> kernel`0xffffffff8095502e >>> >>> >>>> If yes, then the solution is most likely to continue looping in the >>>> vop_stdfsync() until there are no dirty buffers or the mount point >>>> mnt_secondary_writes counter is zero. The pauses trick you tried might >>>> be still useful, e.g. after some threshold of the performed loop >>>> iterations. >>> I have checked your proposal and found that indeed the >>> mnt_secondary_writes counter goes to zero when the dirties have reached >>> zero. During the loop the mnt_secondary_write counter is always equal to >>> one, so there is nothing like a countdown, which is what Kirk wanted to >>> see. >> This is because mnt_secondary_write counts the number of threads which entered >> the vn_start_secondary_write() block and potentially can issue a write >> dirtying a buffer. 
In principle, some writer may start the secondary >> write block again even if the counter is zero, but practically some >> primary writer must make a modification for secondary writers to have >> work. >> >> I.e., the change would not cover the problem to claim it being completely >> solved, but for the current UFS code I doubt that the issue can be triggered. >> >>> A dtrace output (with DELAY of 1ms in the loop) for the biggest >>> loop count on a three day test is this: >>> >>> 18 32865 kern_unlinkat:entry path=bigfile, tid=101201, >>> tid=101201, execname=rm >>> 18 12747 ufs_remove:entry gj=mirror/gmbkp4p5.journal, >>> inum=11155630, blocks=22301568, size=11415525660 >>> 18 12748 ufs_remove:return returncode=0, inum=11155630, >>> blocks=22301568 >>> 18 18902 ffs_truncate:entry gj=mirror/gmbkp4p5.journal, >>> inum=11155630, size=11415525660, mnt_flag=0x12001040, >>> mnt_kern_flag=0x40006142, blocks=22301568 >>> 6 33304 vfs_write_suspend:entry gj=mirror/gmbkp4p5.journal, >>> mnt_kern_flag=0x40006142, tid=100181 >>> 6 22140 vop_stdfsync:entry mounted on /home, waitfor=1, >>> numoutput=0, clean=10, dirty=6, secondary_writes=1 >>> 10 28117 bufobj_wwait:return calls to bufobj_wait = 1, >>> dirtycnt=2, secondary_writes=1 >>> 10 28117 bufobj_wwait:return calls to bufobj_wait = 2, >>> dirtycnt=1, secondary_writes=1 >>> 10 28117 bufobj_wwait:return calls to bufobj_wait = 3, >>> dirtycnt=1, secondary_writes=1 >>> 10 28117 bufobj_wwait:return calls to bufobj_wait = 4, >>> dirtycnt=3, secondary_writes=1 >>> 10 28117 bufobj_wwait:return calls to bufobj_wait = 5, >>> dirtycnt=2, secondary_writes=1 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 6, >>> dirtycnt=3, secondary_writes=1 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 7, >>> dirtycnt=3, secondary_writes=1 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 8, >>> dirtycnt=3, secondary_writes=1 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 9, >>> dirtycnt=3, secondary_writes=1 >>> 6 28117 
bufobj_wwait:return calls to bufobj_wait = 10, >>> dirtycnt=2, secondary_writes=1 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 11, >>> dirtycnt=2, secondary_writes=1 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 12, >>> dirtycnt=3, secondary_writes=1 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 13, >>> dirtycnt=3, secondary_writes=1 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 14, >>> dirtycnt=3, secondary_writes=1 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 15, >>> dirtycnt=4, secondary_writes=1 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 16, >>> dirtycnt=3, secondary_writes=1 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 17, >>> dirtycnt=3, secondary_writes=1 >>> 2 18903 ffs_truncate:return returncode=0, inum=11155630, blocks=0 >>> 2 32866 kern_unlinkat:return returncode=0, errno=0, number >>> io's: 791/791 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 18, >>> dirtycnt=3, secondary_writes=0 >>> 6 28117 bufobj_wwait:return calls to bufobj_wait = 19, >>> dirtycnt=0, secondary_writes=0 >>> 6 22141 vop_stdfsync:return returncode=0, pid=26, tid=100181, >>> spent 240373850 nsecs >>> >>> So the time spent in vop_stdfsync() is 0,24 sec in the worst case I >>> found using DELAY with 1 ms. I would prefer this solution. My first >>> approach (simply bumping maxretry from 1000 to 100000) is also ok, but the max >>> time spent will rise to 0,5 sec. Perhaps you would like something like >>> >>> if( maxretry < 1000 && maxretry % 10 == 0) >>> DELAY(waitns); >>> >>> That is also ok but does not make a noteworthy difference. The main >>> argument remains: we have to wait until there are no dirties left. >>> >>> For me the problem with the "giving up on dirty" is solved. >> Will you provide the patch ? 
> > Patch against HEAD: > --- vfs_default.c.orig 2017-09-22 11:56:26.950084000 +0200 > +++ vfs_default.c 2017-09-22 11:58:33.211196000 +0200 > @@ -690,6 +690,8 @@ > bremfree(bp); > bawrite(bp); > } > + if( maxretry < 1000) > + DELAY(1000); /* 1 ms */ > BO_LOCK(bo); > goto loop2; > } Do you need to use a busy loop here, or can you yield the cpu by using something like pause(9)? From owner-freebsd-fs@freebsd.org Sat Sep 23 12:35:14 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 88817E0488F for ; Sat, 23 Sep 2017 12:35:14 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 71836720A3 for ; Sat, 23 Sep 2017 12:35:14 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v8NCZETR032144 for ; Sat, 23 Sep 2017 12:35:14 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 219972] Unable to zpool export following some zfs recv Date: Sat, 23 Sep 2017 12:35:14 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: matthias.pfaller@familie-pfaller.de X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" 
Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Sep 2017 12:35:14 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219972 --- Comment #7 from Matthias Pfaller --- Additional information: I'm sending zvols. @pfribeiro: Do you have zvols in your recv as well? -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Sat Sep 23 16:18:15 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id EEEF7E085F8 for ; Sat, 23 Sep 2017 16:18:15 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from dss.incore.de (dss.incore.de [195.145.1.138]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B32B177C08; Sat, 23 Sep 2017 16:18:15 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from inetmail.dmz (inetmail.dmz [10.3.0.3]) by dss.incore.de (Postfix) with ESMTP id 652CB10A7; Sat, 23 Sep 2017 18:18:06 +0200 (CEST) X-Virus-Scanned: amavisd-new at incore.de Received: from dss.incore.de ([10.3.0.3]) by inetmail.dmz (inetmail.dmz [10.3.0.3]) (amavisd-new, port 10024) with LMTP id HH6UjIAJ_NlA; Sat, 23 Sep 2017 18:18:02 +0200 (CEST) Received: from mail.local.incore (fwintern.dmz [10.0.0.253]) by dss.incore.de (Postfix) with ESMTP id 784B4E0E; Sat, 23 Sep 2017 18:18:02 +0200 (CEST) Received: from bsdmhs.longwitz (unknown [192.168.99.6]) by mail.local.incore (Postfix) with ESMTP id 30FD0508D6; Sat, 23 Sep 2017 18:18:02 +0200 (CEST) Message-ID: <59C68939.2040509@incore.de> 
Date: Sat, 23 Sep 2017 18:18:01 +0200 From: Andreas Longwitz User-Agent: Thunderbird 2.0.0.19 (X11/20090113) MIME-Version: 1.0 To: Don Lewis CC: kostikbel@gmail.com, mckusick@mckusick.com, freebsd-fs@freebsd.org Subject: Re: fsync: giving up on dirty on ufs partitions running vfs_write_suspend() References: <201709222220.v8MMKEW5085371@gw.catspoiler.org> In-Reply-To: <201709222220.v8MMKEW5085371@gw.catspoiler.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Sep 2017 16:18:16 -0000 Don Lewis wrote: >> Patch against HEAD: >> --- vfs_default.c.orig 2017-09-22 11:56:26.950084000 +0200 >> +++ vfs_default.c 2017-09-22 11:58:33.211196000 +0200 >> @@ -690,6 +690,8 @@ >> bremfree(bp); >> bawrite(bp); >> } >> + if( maxretry < 1000) >> + DELAY(1000); /* 1 ms */ >> BO_LOCK(bo); >> goto loop2; >> } > > Do you need to use a busy loop here, or can you yield the cpu by using > something like pause(9)? > No, I don't need a busy loop, and it is even harmful if we have only one CPU. I assumed that the person who eventually commits a fix for this problem will know much better than I do how to wait 1 ms in the kernel. I have repeated my tests with "pause("dirty", hz/1000)" instead of "DELAY(1000)" and got the same results for maximum loop counts and time spent as before. -- Andreas Longwitz
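To illustrate the discussion above, here is a minimal userspace sketch of why a short wait per retry keeps the loop within its retry budget. It is a toy model only, not kernel code, and it does not call DELAY(9) or pause(9). The names ms_to_ticks() and flush_loops() are hypothetical, the 50 us cost per pass is invented, and the 240000 us writer time is merely rounded from the 240373850 nsecs seen in the trace. The helper also shows one arithmetic detail worth keeping in mind: an expression like hz/1000 is integer division and evaluates to 0 whenever hz is below 1000.

```c
/*
 * Hypothetical helper (not a kernel API): convert milliseconds to scheduler
 * ticks for a given hz, never returning less than one tick. Plain integer
 * division such as hz / 1000 would yield 0 for hz < 1000.
 */
static int
ms_to_ticks(int ms, int hz)
{
	int ticks = (ms * hz) / 1000;

	return (ticks < 1 ? 1 : ticks);
}

/*
 * Toy model of the retry loop. All times are simulated microseconds.
 * Assumption: a concurrent writer keeps producing dirty buffers until
 * writer_busy_us has elapsed; each pass over the dirty list costs
 * iter_cost_us, and wait_us stands in for the DELAY()/pause() between passes.
 * Returns the number of passes needed, or -1 once the retry budget is used
 * up (the "giving up on dirty" case).
 */
static int
flush_loops(long writer_busy_us, long iter_cost_us, long wait_us, int maxretry)
{
	long now_us = 0;
	int loops = 0;

	for (;;) {
		now_us += iter_cost_us;		/* one pass over the dirty list */
		loops++;
		if (now_us >= writer_busy_us)
			return (loops);		/* no dirty buffers left */
		if (--maxretry < 0)
			return (-1);		/* retry budget exhausted */
		now_us += wait_us;		/* wait before the next pass */
	}
}
```

With these made-up inputs, flush_loops(240000, 50, 0, 1000) returns -1, because 1000 retries without any wait cover only about 50 ms of simulated time, while flush_loops(240000, 50, 1000, 1000) returns 230. The model is not calibrated against the measurements in this thread (where the real loop also blocks in bufobj_wwait()), so only the qualitative effect carries over.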