From owner-freebsd-fs@freebsd.org Sun Apr 1 10:09:49 2018
In-Reply-To: <201802250642.w1P6g8Wm064509@chez.mckusick.com>
References: <201802250642.w1P6g8Wm064509@chez.mckusick.com>
From: Aijaz Baig
Date: Sun, 1 Apr 2018 15:39:47 +0530
Subject: Re: Regarding vop_vector
To: Kirk McKusick
Cc: freebsd-fs@freebsd.org

Thank you for the detailed summary! Will go through it.

On Sun, Feb 25, 2018 at 12:12 PM, Kirk McKusick wrote:

> > From: Aijaz Baig
> > Date: Sun, 25 Feb 2018 07:24:12 +0530
> > Subject: Regarding vop_vector
> > To: freebsd-fs@freebsd.org
> >
> > Hello
> >
> > I am trying to understand how the VFS layer within FreeBSD works and I was
> > rather stumped while trying to find where vop_vector was declared. Upon
> > searching the internet, I realized that an awk script is used to "generate"
> > this like so:
> >
> > /sys/tools/vnode_if.awk /sys/kern/vnode_if.src -q
> >
> > So I was wondering if anyone could provide me a (brief would be fine as
> > well) walk down memory lane as to why such an 'odd looking' way was
> > adopted (perhaps it is brilliant but my thick skull is unable to fathom
> > its brilliance).
> >
> > Keen to hear from you experts out there please!
> > --
> > Best Regards,
> > Aijaz Baig
>
> As the person that came up with this idea (in the 1980's) let me try to
> explain my thinking. Suppose you want to add a new VFS operation. Before
> I created the script, you had to create macros for it in at least four
> different files. Now you just need to add a concise and readable
> description of it in /sys/kern/vnode_if.src then run the script and all
> the boilerplate gets generated for you.
>
> Let me give you a short example. Consider the VFS operator to check to
> see if a vnode is locked, typically used as
>
>         if (VOP_ISLOCKED(vp)) {
>                 do something with locked vp;
>         }
>
> The description of it in /sys/kern/vnode_if.src is three lines:
>
> vop_islocked {
>         IN struct vnode *vp;
> };
>
> Here is what is generated for it in the compile directory by the awk
> script. In /sys/amd64/compile/GENERIC/vnode_if.c:
>
> static int vop_islocked_vp_offsets[] = {
>         VOPARG_OFFSETOF(struct vop_islocked_args,a_vp),
>         VDESC_NO_OFFSET
> };
>
>
> SDT_PROBE_DEFINE2(vfs, vop, vop_islocked, entry, "struct vnode *",
>     "struct vop_islocked_args *");
>
> SDT_PROBE_DEFINE3(vfs, vop, vop_islocked, return, "struct vnode *",
>     "struct vop_islocked_args *", "int");
>
>
> int
> VOP_ISLOCKED_AP(struct vop_islocked_args *a)
> {
>
>         return(VOP_ISLOCKED_APV(a->a_vp->v_op, a));
> }
>
> int
> VOP_ISLOCKED_APV(struct vop_vector *vop, struct vop_islocked_args *a)
> {
>         int rc;
>
>         VNASSERT(a->a_gen.a_desc == &vop_islocked_desc, a->a_vp,
>             ("Wrong a_desc in vop_islocked(%p, %p)", a->a_vp, a));
>         while(vop != NULL && \
>             vop->vop_islocked == NULL && vop->vop_bypass == NULL)
>                 vop = vop->vop_default;
>         VNASSERT(vop != NULL, a->a_vp, ("No vop_islocked(%p, %p)",
>             a->a_vp, a));
>         SDT_PROBE2(vfs, vop, vop_islocked, entry, a->a_vp, a);
>
>         KTR_START1(KTR_VOP, "VOP", "VOP_ISLOCKED", (uintptr_t)a,
>             "vp:0x%jX", (uintptr_t)a->a_vp);
>         VFS_PROLOGUE(a->a_vp->v_mount);
>         if (vop->vop_islocked != NULL)
>                 rc = vop->vop_islocked(a);
>         else
>                 rc = vop->vop_bypass(&a->a_gen);
>         VFS_EPILOGUE(a->a_vp->v_mount);
>         SDT_PROBE3(vfs, vop, vop_islocked, return, a->a_vp, a, rc);
>
>         if (rc == 0) {
>         } else {
>         }
>         KTR_STOP1(KTR_VOP, "VOP", "VOP_ISLOCKED", (uintptr_t)a,
>             "vp:0x%jX", (uintptr_t)a->a_vp);
>         return (rc);
> }
>
> struct vnodeop_desc vop_islocked_desc = {
>         "vop_islocked",
>         0,
>         (vop_bypass_t *)VOP_ISLOCKED_AP,
>         vop_islocked_vp_offsets,
>         VDESC_NO_OFFSET,
>         VDESC_NO_OFFSET,
>         VDESC_NO_OFFSET,
>         VDESC_NO_OFFSET,
> };
>
> In /sys/amd64/compile/GENERIC/vnode_if.h:
>
> struct vop_islocked_args {
>         struct vop_generic_args a_gen;
>         struct vnode *a_vp;
> };
>
> extern struct vnodeop_desc vop_islocked_desc;
>
> int VOP_ISLOCKED_AP(struct vop_islocked_args *);
> int VOP_ISLOCKED_APV(struct vop_vector *vop, struct vop_islocked_args *);
>
> static __inline int VOP_ISLOCKED(
>         struct vnode *vp)
> {
>         struct vop_islocked_args a;
>
>         a.a_gen.a_desc = &vop_islocked_desc;
>         a.a_vp = vp;
>         return (VOP_ISLOCKED_APV(vp->v_op, &a));
> }
>
> In /sys/amd64/compile/GENERIC/vnode_if_newproto.h,
> an entry in the vop_vector structure:
>
> struct vop_vector {
>         struct vop_vector *vop_default;
>         vop_bypass_t *vop_bypass;
>         vop_islocked_t *vop_islocked;
>         ...
> };
>
> And finally in /sys/amd64/compile/GENERIC/vnode_if_typedef.h:
>
> struct vop_islocked_args;
> typedef int vop_islocked_t(struct vop_islocked_args *);
>
> And absent this script, every time you wanted to make a change in the
> boilerplate, you would have to go through and fix it for every existing
> VFS operator (and trust me the boilerplate has changed a *lot* since I
> first did it in the 1980's :-)
>
>         Kirk McKusick

-- 
Best Regards,
Aijaz Baig

From owner-freebsd-fs@freebsd.org Sun Apr 1 21:01:01 2018
Message-Id: <201804012100.w31L0x92040065@kenobi.freebsd.org>
From: bugzilla-noreply@FreeBSD.org
To: freebsd-fs@FreeBSD.org
Subject: Problem reports for freebsd-fs@FreeBSD.org that need special attention
Date: Sun, 1 Apr 2018 21:00:59 +0000

To view an individual PR, use:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id).

The following is a listing of current problems submitted by FreeBSD users,
which need special attention. These represent problem reports covering
all versions including experimental development code and obsolete releases.

Status      |    Bug Id | Description
------------+-----------+---------------------------------------------------
New         |    203492 | mount_unionfs -o below causes panic
New         |    217062 | for file systems mounted with -o noexec, exec=off
New         |    221909 | [ZFS] Add a sysctl to toggle send_corrupt_data
Open        |    136470 | [nfs] Cannot mount / in read-only, over NFS
Open        |    139651 | [nfs] mount(8): read-only remount of NFS volume d
Open        |    140068 | [smbfs] [patch] smbfs does not allow semicolon in
Open        |    144447 | [zfs] sharenfs fsunshare() & fsshare_main() non f
Open        |    211491 | System hangs after "Uptime" on reboot with ZFS

8 problems total for which you should take action.
From owner-freebsd-fs@freebsd.org Mon Apr 2 13:36:58 2018
From: Stilez Stilezy
Date: Mon, 2 Apr 2018 14:36:26 +0100
Subject: ZFS dedup write pathway - possible inefficiency or..?
To: freebsd-fs@freebsd.org

Hi list,

I'm writing because of the fairly technical nature of this question about 2 possibly related issues within ZFS. The first issue is specific to the dedup write pathway. I've tested locally to a point where it seems it's not due to inadequate hardware, and it's very consistent and specific, even under idle conditions/minimal load. I'm wondering whether there's a code bottleneck specifically affecting just the dedup write pathway. The second issue seems to be that in some scenarios, ZFS doesn't read from the IO buffers where I'd expect it to, causing netio issues elsewhere.

I should say that I'm aware of the intense nature of dedup processing and hopefully I'm not a noob user asking as usual about dedup on crap hardware that can't do the job with data that shouldn't be anywhere near a dedup engine. That's not the case here. The system I'm testing on is built and specced to handle dedup and has a ton of dedupable data with a high ratio, so it's not an edge case: it should be ideal.

I'm not really asking for help in resolving the issue.
My question is aimed at understanding technically more about the bottlenecks/issues so I can make intelligent decisions how to approach it.

More to the point, my gut feeling is that there is some kind of issue/inefficiency in the dedup write pathway, and a second issue whereby ZFS isn't always reading as it should from the network buffers: netio rcv buffers periodically fail to empty during ZFS processing. This is apparently somehow related to txg handling, and it causes netrcv buffer backup and TCP zero-window issuance within milliseconds, lasting almost continually for lengthy periods. That doesn't look right.

My main reason for posting here is that if these do seem to be genuine inefficiencies/issues, I'd like to ask if it's sensible to put in an enhancement request on bugzilla for either or both of them. So either way these probably need technical/dev/committer insight, as I'd like to find out (1) if it's possible to guess what the internal underlying ZFS issues are, and (2) if it's worth putting in enhancement/fix requests.

*TEST HARDWARE / OS:*

- *Baseboard/CPU/RAM* = Supermicro X10 series + Xeon E5-1680 v4 (3.4+ GHz, octo core, 20MB cache, Broadwell generation) + 128 GB ECC @ 2400
- *Main pool* = array of 12 enterprise 7200 SAS HDDs hanging off 2 x LSI 9311 PCIe3 HBA. The HDDs are configured in ZFS as 4 x (3 way mirrors). Cache drives = Intel P3700 NVMe SLOG/ZIL (reckoned to be very good for reliable low write latency) and 250GB Samsung NVMe (L2ARC)
- *NIC* = 10G Chelsio (if it matters)
- *Power stability* = EVGA Supernova Platinum 1600W + APC 1500VA UPS
- *OS version* = clean install full ISO FreeBSD 11.1 amd64 onto wiped boot SSD mirror (tested with both bare install and also prepackaged as "FreeNAS")
- *Installed sw:* Very little running beyond bare OS - no jails, no bhyve, no mods/patches, no custom kernel. Samba and iperf for testing across LAN (see below).
- *Main pool:* The main pool has >22 TB capacity and in physical terms it's about 55% full.
  The data is nicely balanced across the disks, which are almost all the same (or very similar) performance. The data in the existing pool is highly dedupable - it has a ratio of about 4x and judging by zdb's output (total blocks x bytes needed per block) the DDT is about 50 GB.
- *Sysctls / loader / rc:* Various sysctls - can list if required. In particular, metadata is allowed about 75GB of RAM, so that DDTs aren't likely to be forced out, with the remainder split between OS and other file cache (about 10G for OS and about 35G for ARC not reserved for metadata). *Significant values if needed:* *vfs.zfs.arc_meta_limit* 75G, *vfs.zfs.l2arc_write_max/write_boost* 300000000 (300MB/s), *vfs.zfs.vdev.cache.size* 200MB, *vfs.zfs.delay_min_dirty_percent* 70. Also various tunables for efficient 10G networking, including testing with large receive buffer sizes.

In theory, it should be a fairly powerful setup for handling the heavy workload of a small-scale dedup pool, with no parity data/RaidZ, in quiet circumstances. Certainly I'm not expecting the dedup write outcomes I'm seeing.

*TEST SETUP:*

I attached 2 x fast wiped SSDs capable of 500 MB/s+ rw, formatted as UFS, and an additional 3 way mirror of 7200 enterprise SAS drives on the same HBAs, for testing. I created a second pool on the temporary HDD mirror, configured identically to the main pool but empty and with dedup=off. I copied a few very large files and a directory of smaller files onto the SSDs (30, 50 and 110GB single files, plus a mixed dir of 3MB mp3s, datasets, ISOs etc), and also copied them to twin SSDs on my directly connected workstation. I hash-checked the copies to ensure they were identical, so that dedup would probably match the blocks they contain. Then I tested copying the files onto both dedup and non-dedup pools, locally (CLI) and across the LAN (Samba) as well as testing raw networking IO (iperf).
In each case I copied the files to/from newly created empty dirs, with the intent that the dedup pool would dedup these against existing copies, and the non-dedup pool would just write them as normal. The network and server were both checked as being quiet/idle apart from these copies (previous write flushing finished, all netio/diskio/CPU idle for several seconds, etc).

I copied the files from SSD (client/UFS) to the dedup pool and the non-dedup pool, repeatedly and in turn, to offset/minimise issues related to non-cached vs cached data, and to ensure that if dedup was on, performance was measured when DDTs already contained entries for the blocks and they were already known to be cached in ARC. In theory, the writes would be identical other than dedup on/off. I repeated the runs to check for stable results.

When copying data, I watched common system stats (gstat, iostat, netstat, top, via SSH in multiple windows, all updating every 1 sec).

*RESULTS:*

Checking with iperf and Samba showed that the system was very fast for reading from the pool, and networking (both ways). It was capable of up to 1 GByte/sec both ways (duplex) on Samba, fractionally more on iperf. But when writing data, whether locally via CLI or across the LAN with Samba, writing to the dedup pool was consistently 10x ~ 20x slower than writing to the non-dedup pool (raw file write speeds 30 ~ 50 MB/s dedup vs 400 ~ 1000 MB/s non-dedup, as seen by the client on a 100 GB single file transfer, before allowing for caching effects, nothing else going on).

I had known dedup would impact RAM and performance but I had expected a good CPU and hardware to mitigate it a lot, and it wasn't being mitigated much if at all. The impact was so large that when writing across Samba, the networking subsystem could be seen in tcpdump to be driven to smaller windows and floods of tiny- and zero-windows on TCP, in order to allow *something* within ZFS a lot of extra time for write request handling.
Nothing like this outcome happened during non-dedup pool writing, or during dedup reading, of these files. But the server's performance consistently dropped by 10x ~ 20x when writing to the dedup pool.

The system should almost surely have enough RAM, and a high standard of hardware/setup. DiskIO and txg's looked about right, networking looked sane, and the issue affected only dedup writing. The main suspect seems likely to be either CPU/threading, or an unexpectedly huge avalanche of required metadata updates. With a large amount of RAM to play with and very fast ZIL, and large diskIO caching setup to even out diskIO, it doesn't seem *that* likely to be down to metadata updates, and "top" showed that most of the CPU was idle, but I can't tell if that's cause or effect. I altered a number of tunables to increase TXG/max dirty data/write coalescing, to the point it was writing in noticeable bursts, and even so it didn't help.

I also noticed as an aside, if relevant, that where I'd expected one TXG to be building up while the previous TXG was writing out, that wasn't what was happening when writing across the LAN. What I saw consistently was that regularly, ZFS would stop pulling data off the network buffers for a number of seconds. At 10G speeds the netrcv buffer backed up in milliseconds, causing zero windows. Then, abruptly, the buffer would almost instantly empty and this seemed to coincide with the start of a high level of HDD writing-out. I'm not sure why networking is being stalled and not continuing smoothly, perhaps someone will know?

I posted some technical details elsewhere - graphs showing netrcv buffer fill rate (which matches to the millisecond what you'd see if ZFS completely stopped reading incoming data for a lengthy period), and other screenshots. If useful I'll add links in a followup.
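
As a rough check on the "backed up in milliseconds" observation, the time a receive buffer takes to fill at line rate while nothing is reading from it is simple arithmetic. The sketch below is illustrative only: the function name rcvbuf_fill_time_us is hypothetical, and the buffer size used in the note that follows is an assumed figure, not a value taken from the test system.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time, in microseconds, for a socket receive buffer to fill when the
 * sender transmits at full line rate and the receiver reads nothing.
 * Both arguments are illustrative inputs (assumptions), not values
 * measured on the test system described in this message.
 */
uint64_t
rcvbuf_fill_time_us(uint64_t buf_bytes, uint64_t link_bits_per_sec)
{
	assert(link_bits_per_sec > 0);
	/* bytes -> bits, seconds -> microseconds; integer division truncates */
	return ((buf_bytes * 8 * 1000000) / link_bits_per_sec);
}
```

With an assumed 2 MiB receive buffer and a 10 Gbit/s link, rcvbuf_fill_time_us(2097152, 10000000000) comes to roughly 1.7 ms, which is consistent with zero windows appearing within milliseconds of ZFS ceasing to read.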

*DISCUSSION / QUESTIONS:*

I suspect that the reason dedup writes (only) were so slow is that somewhere in the dedup write pathway, where it hashes data, matches it to a DDT entry, and optionally verifies it, something inefficient is going on, and it's slowing down the entire pathway. Perhaps it's only using a single core? Perhaps metadata updates are more serious than I realised, or inefficient? I'm not sure what's up. But it's very consistent; I've repeated this on multiple platforms and installs since doing the first tests. I don't know where to look further and I probably need input from someone knowledgeable with the internals of the ZFS subsystem to do more.

I'd like to nail this down closer and get ideas what's (probably) up. And I'd like to see if it can be enhanced for others, by feeding back into bugzilla if helpful. So my questions are -

1. There seem to be two ZFS issues, and they're somehow linked: *(A)* the dedup write pathway suffers from what feels like an unexpectedly horrible slowdown that's excessive in the scenario; *and* *(B)* ZFS seems to stop pulling data from the network rcv buffer during a significant part of its processing cycle, to the point that netio is forced to zero window for lengthy periods and most of the time, whereas I'd expect incoming network data to be processed into a new txg regardless of other processing going on, and not cause congestion unless a lot more was going on. Does anyone "in the know" on the technical side have an insight into what might be going on with either of these, or suggest any diagnostics/further info that's useful to pin it down?

2. I'd expect dedup write performance to have *some* kind of impact due to the processing required. But is the slowdown I'm seeing on dedup writes (specifically) usual *to this extent, on this class of hardware*, on an idle system with just one large file being written between 2 local file systems?

3. If either of these matters *does* turn out to be a threading or other clear inefficiency on the write pathway or anywhere else, is it likely to be useful if I file an enhancement request in bugzilla? After all, dedup is incredibly useful in a small number of scenarios and if a server with this hardware under a single-user load is struggling that much, it would be interesting to know the technical point where it's occurring. (Equally, I guess many people advocate not using ZFS dedup in almost any scenario, because end users inevitably use it on completely inadequate hardware or on totally unsuitable data, so perhaps it's a pathologised area with little patience and "don't expect much to be done now it's stable"!)

Anyhow, hoping for an insightful reply!

Thank you

Stilez

From owner-freebsd-fs@freebsd.org Mon Apr 2 17:00:34 2018
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 227204] Combination of gmirror and enabled softupdates journalling cause slow filesystem degradation
Date: Mon, 02 Apr 2018 17:00:33 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227204

Mark Linimon changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-fs@FreeBSD.org
                 CC|                            |freebsd-geom@FreeBSD.org

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Mon Apr 2 18:30:36 2018
In-Reply-To: <14c857cc-463f-a56e-bcf6-c0702da6d3bc@FreeBSD.org>
References: <14c857cc-463f-a56e-bcf6-c0702da6d3bc@FreeBSD.org>
From: Stilez Stilezy
Date: Mon, 2 Apr 2018 19:30:05 +0100
Subject: Re: ZFS dedup write pathway - possible inefficiency or..?
To: Andriy Gapon
Cc: freebsd-fs@freebsd.org

Thanks for that tip, Andriy.

I don't have the knowledge to tell if that's the first of the 2 issues I'm seeing. It could be. What exactly is that bug describing and what behaviour would it create?

Assuming it's the same -

1) Are there any known workarounds or sysctls that help to reduce the issue?

2) Are there any easy diagnostic tests I can do, to confirm if this is the same behaviour as that bug? Nobody else uses the test system, so I can change any sysctls to sane or extreme values for testing, or create large txg's and spread out reads and writes to see what's happening and what coincides with what.

Thanks again

On 2 April 2018 at 18:47, Andriy Gapon wrote:

> On 02/04/2018 16:36, Stilez Stilezy wrote:
> > The first issue is specific to the
> > dedup write pathway. I've tested locally to a point where it seems it's
> > not due to inadequate hardware and it's very consistent and specific, even
> > on idle conditions/minimal load. I'm wondering whether there's a code
> > bottleneck specifically affecting just the dedup write pathway.
>
> I think that this might be https://www.illumos.org/issues/8353
>
> --
> Andriy Gapon
>

From owner-freebsd-fs@freebsd.org Mon Apr 2 19:15:31 2018
Subject: Re: ZFS dedup write pathway - possible inefficiency or..?
To: Stilez Stilezy, freebsd-fs@freebsd.org
From: Andriy Gapon
lCpSTZV/EVHnNe45FVBlvK7k7HFfDDkryM7BTQRZuCyIARAAlq0slcsVboY/+IUJdcbEiJRW be9HKVz4SUchq0z9MZPX/0dcnvz/gkyYA+OuM78dNS7Mbby5dTvOqfpLJfCuhaNYOhlE0wY+ 1T6Tf1f4c/uA3U/YiadukQ3+6TJuYGAdRZD5EqYFIkreARTVWg87N9g0fT9BEqLw9lJtEGDY EWUE7L++B8o4uu3LQFEYxcrb4K/WKmgtmFcm77s0IKDrfcX4doV92QTIpLiRxcOmCC/OCYuO jB1oaaqXQzZrCutXRK0L5XN1Y1PYjIrEzHMIXmCDlLYnpFkK+itlXwlE2ZQxkfMruCWdQXye syl2fynAe8hvp7Mms9qU2r2K9EcJiR5N1t1C2/kTKNUhcRv7Yd/vwusK7BqJbhlng5ZgRx0m WxdntU/JLEntz3QBsBsWM9Y9wf2V4tLv6/DuDBta781RsCB/UrU2zNuOEkSixlUiHxw1dccI 6CVlaWkkJBxmHX22GdDFrcjvwMNIbbyfQLuBq6IOh8nvu9vuItup7qemDG3Ms6TVwA7BD3j+ 3fGprtyW8Fd/RR2bW2+LWkMrqHffAr6Y6V3h5kd2G9Q8ZWpEJk+LG6Mk3fhZhmCnHhDu6CwN MeUvxXDVO+fqc3JjFm5OxhmfVeJKrbCEUJyM8ESWLoNHLqjywdZga4Q7P12g8DUQ1mRxYg/L HgZY3zfKOqcAEQEAAcLBfAQYAQgAJhYhBL4sQ7ueBCdcDgGOvnfybtNRzXKPBQJZuCyIAhsM BQkFo5qAAAoJEHfybtNRzXKPBVwQAKfFy9P7N3OsLDMB56A4Kf+ZT+d5cIx0Yiaf4n6w7m3i ImHHHk9FIetI4Xe54a2IXh4Bq5UkAGY0667eIs+Z1Ea6I2i27Sdo7DxGwq09Qnm/Y65ADvXs 3aBvokCcm7FsM1wky395m8xUos1681oV5oxgqeRI8/76qy0hD9WR65UW+HQgZRIcIjSel9vR XDaD2HLGPTTGr7u4v00UeTMs6qvPsa2PJagogrKY8RXdFtXvweQFz78NbXhluwix2Tb9ETPk LIpDrtzV73CaE2aqBG/KrboXT2C67BgFtnk7T7Y7iKq4/XvEdDWscz2wws91BOXuMMd4c/c4 OmGW9m3RBLufFrOag1q5yUS9QbFfyqL6dftJP3Zq/xe+mr7sbWbhPVCQFrH3r26mpmy841ym dwQnNcsbIGiBASBSKksOvIDYKa2Wy8htPmWFTEOPRpFXdGQ27awcjjnB42nngyCK5ukZDHi6 w0qK5DNQQCkiweevCIC6wc3p67jl1EMFY5+z+zdTPb3h7LeVnGqW0qBQl99vVFgzLxchKcl0 R/paSFgwqXCZhAKMuUHncJuynDOP7z5LirUeFI8qsBAJi1rXpQoLJTVcW72swZ42IdPiboqx NbTMiNOiE36GqMcTPfKylCbF45JNX4nF9ElM0E+Y8gi4cizJYBRr2FBJgay0b9Cp Message-ID: <14c857cc-463f-a56e-bcf6-c0702da6d3bc@FreeBSD.org> Date: Mon, 2 Apr 2018 20:47:12 +0300 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2018 19:15:31 -0000 On 02/04/2018 16:36, Stilez Stilezy wrote: > The first issue is specific to the > dedup write pathway. I've tested locally to a point where it seems it's > not due to inadequate hardware and it's very consistent and specific, even > on idle conditions/minimal load. I'm wondering whether there's a code > bottleneck specifically affecting just the dedup write pathway. I think that this might be https://www.illumos.org/issues/8353 -- Andriy Gapon From owner-freebsd-fs@freebsd.org Mon Apr 2 22:23:41 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 853A2F533C2 for ; Mon, 2 Apr 2018 22:23:41 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id F17DE75CE5 for ; Mon, 2 Apr 2018 22:23:40 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id 0D3E6123F3 for ; Mon, 2 Apr 2018 22:23:40 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w32MNdQf009826 for ; Mon, 2 Apr 2018 22:23:39 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from www@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w32MNdd9009825 for freebsd-fs@FreeBSD.org; Mon, 2 Apr 2018 22:23:39 GMT (envelope-from 
bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 227204] Combination of gmirror and enabled softupdates journalling cause slow filesystem degradation Date: Mon, 02 Apr 2018 22:23:39 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.3-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: mckusick@FreeBSD.org X-Bugzilla-Status: Closed X-Bugzilla-Resolution: Works As Intended X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_status cc resolution Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Apr 2018 22:23:41 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227204

Kirk McKusick changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |Closed
                 CC|                            |mckusick@FreeBSD.org
         Resolution|---                         |Works As Intended

--- Comment #1 from Kirk McKusick ---
This is a problem that is endemic to all overwriting filesystems that use journalling. Specifically, the journal only checks and corrects things that it knows need to be fixed. Under normal circumstances it knows about everything that might be wrong. Unfortunately most disks are run with `write cache enabled' which means that they can lie about completing writes to stable store.
Specifically they report that a write is on the platter (or in the flash) when in fact it is only in the disk's volatile cache. If there is a power-fail event, they are usually able to flush their cache, but not always. Since the journal has been told that the write completed, it does not check for the missed write and the corresponding corruption of the filesystem remains until a full fsck is run (which checks all of the metadata integrity). If the missed write was an update to a cylinder-group map, then you can end up double-allocating a block (such as you see in your example). When an attempt is made to free a double-allocated block you will get a system panic with "freeing free block".

Some systems have tried periodically forcing a full fsck (on the order of every month or so) to catch these types of errors, but the disruption if the reboot happened during a busy period led them to drop this practice. Still it is a good idea to periodically run a full fsck just to ensure that your filesystems stay healthy. If this is not practical you should consider using ZFS which provides a great deal more redundancy and integrity though requires considerably more resources (disk + CPU + memory) for a given storage load than does UFS.

I am closing this report with "Works as Intended" as that is the closest to "This is a known shortcoming of journalled overwriting filesystems".
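The failure mode described in this comment can be sketched with a toy model. This is a simplified illustration in Python, not UFS or soft-updates journalling code: the `CachingDisk` class, the `journal` list, and the key names `inode:A` and `cg_map` are all hypothetical. It assumes a disk that acknowledges a write as soon as it reaches a volatile cache, and a journal that only re-checks operations it believes were left incomplete.

```python
class CachingDisk:
    """Toy disk that acknowledges writes while they sit in a volatile cache."""

    def __init__(self):
        self.platter = {"cg_map": frozenset()}  # stable storage
        self.cache = {}                          # volatile write cache

    def write(self, key, value):
        self.cache[key] = value
        return True  # reported as done, although nothing is stable yet

    def flush(self, key):
        self.platter[key] = self.cache.pop(key)

    def power_fail(self):
        self.cache.clear()  # unflushed writes are silently lost


def journal_recovery(journal):
    """Return only the operations the journal believes did not complete."""
    return [entry["key"] for entry in journal if not entry["acked"]]


def full_check(platter):
    """Cross-check every inode's block against the allocation map."""
    allocated = platter["cg_map"]
    return [key for key, block in platter.items()
            if key.startswith("inode:") and block not in allocated]


disk = CachingDisk()
journal = []

# Allocate block 7 to file A: update the inode and the cylinder-group map.
for key, value in (("inode:A", 7), ("cg_map", frozenset({7}))):
    journal.append({"key": key, "acked": disk.write(key, value)})

disk.flush("inode:A")  # only the inode update reaches stable storage
disk.power_fail()      # the map update is lost along with the cache

to_repair = journal_recovery(journal)
damaged = full_check(disk.platter)
print("journal wants to re-check:", to_repair)
print("full check finds:", damaged)
```

In this model the journal has nothing to re-check because both writes were acknowledged, while the full cross-check still finds that block 7 is referenced by an inode but marked free in the map, which is the state that can later lead to a double allocation.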
--=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Tue Apr 3 16:03:29 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3819DF53BD9 for ; Tue, 3 Apr 2018 16:03:29 +0000 (UTC) (envelope-from anthoine.bourgeois@blade-group.com) Received: from mail-it0-x244.google.com (mail-it0-x244.google.com [IPv6:2607:f8b0:4001:c0b::244]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CC14C84703 for ; Tue, 3 Apr 2018 16:03:28 +0000 (UTC) (envelope-from anthoine.bourgeois@blade-group.com) Received: by mail-it0-x244.google.com with SMTP id 15-v6so13249142itl.1 for ; Tue, 03 Apr 2018 09:03:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=blade-group-com.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=6Xr74/MyobwhhFdgXrC8ZFDtTLN2ElEjUEZ2Olfg4NI=; b=E8cpdvd2EZdLf9ikHTx94BeiX95xYkHmeDwuZlW/J2B+O6VNZu2QzND2C9NKE1mShZ zFbXQ3T0kPC42vOs8eUbYDYRn1vN6+/AfGpZtzmKz3HJOw8MH+54Jf3BfBo947YV94Gn IYYEXriu/chXVK7IG9Htp2jI+0V2IckpYuvVHr1eVYT4sDD4svOJFv4KR0VIb251HDsU DCXwJg6cGNnYIMwSuvsv2A0uicOQWjP0M2Q/qAN9GfPi40b6hMxm5aDu8WgLX5jEzlw8 85YbPsFZh8c637XxiPX3KyKKfbVqTjNp/aDi6ztNf5NCQXTAg/rwNDzS/rhP7G1qD8ym UvCg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=6Xr74/MyobwhhFdgXrC8ZFDtTLN2ElEjUEZ2Olfg4NI=; b=kc92L5tvAUHALqSX3oi7iCplAWX8lASeq7OTxyvGSuSUU7Y5lPHM1rYyrzkOczpzEg WJRnW0wfJyf/Fn37xZBRHNODfnXUBDjvvie79YtQaA52FwImIVxuG6C3Uw216+d8Dncj hUsZ+CQaqewz92Tze7A3/2GUmvkYykoSx1xWp4stLfMxQvROkSBxn1D4cGAX/nEght1S 
6nTCiAKJg0vA29hWiATRfT4q08Vi0a14YKzDGCrWzWFBL3iMGh5xuxZNtqepeYh9GVbq X5gvQKMmW9bflG0S4dmPxqOQVWLmQEjf/amInzPZD8IPRqn6NZpbfBU/P+KpXorU9xew FmMQ== X-Gm-Message-State: ALQs6tBBd5SEfDq9HpOGcYeR4W7fRTkrUi0F6vbwOaOCEIX/3iAhjTYY 8QWPBjNGGP5U0wGE5EHN/rr2ljIu+xJ9KbsOwJRhVQ== X-Google-Smtp-Source: AIpwx48cQsxilRuR99gQB6Q1mMm2QU4jEu8w4uIpafHiQFUaMH9tpARnrgHZyuNY0bEp1eBsHf+XHuBWNmkFC7zSksc= X-Received: by 2002:a24:5b87:: with SMTP id g129-v6mr5911394itb.90.1522771407799; Tue, 03 Apr 2018 09:03:27 -0700 (PDT) MIME-Version: 1.0 Received: by 10.2.63.100 with HTTP; Tue, 3 Apr 2018 09:03:27 -0700 (PDT) In-Reply-To: <20180403160149.73nxnm6lgez5n6xk@gmail.com> References: <20180403160149.73nxnm6lgez5n6xk@gmail.com> From: Anthoine Bourgeois Date: Tue, 3 Apr 2018 18:03:27 +0200 Message-ID: Subject: ZFS: move zvol device operation out of txg sync thread To: freebsd-fs@freebsd.org Content-Type: text/plain; charset="UTF-8" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2018 16:03:29 -0000

Hello,

I posted the review https://reviews.freebsd.org/D14669

I wonder whether I should insert the command in the queue earlier in the functions, or whether it is fine to keep it close to the dsl_dataset operations as it is currently. Any review is welcome.

Another question: I created a new review, D14669, because I couldn't find how to update review D7179, which I think would have been more proper. When I click on "Update diff" and select my diff file, the "Attach To" list only shows me the "Create a new revision..." option. Is this normal?

Best, Anthoine From owner-freebsd-fs@freebsd.org Tue Apr 3 18:18:19 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A56FCF74ACF for ; Tue, 3 Apr 2018 18:18:19 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3CC7B6AE29 for ; Tue, 3 Apr 2018 18:18:19 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id 8896A1C995 for ; Tue, 3 Apr 2018 18:18:18 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w33IIIAN085010 for ; Tue, 3 Apr 2018 18:18:18 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from www@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w33IIIBH084996 for freebsd-fs@FreeBSD.org; Tue, 3 Apr 2018 18:18:18 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 219972] Unable to zpool export following some zfs recv Date: Tue, 03 Apr 2018 18:18:18 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: 
Affects Some People X-Bugzilla-Who: patrik@hildingsson.se X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Apr 2018 18:18:19 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219972

Patrik Hildingsson changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |patrik@hildingsson.se

--- Comment #12 from Patrik Hildingsson ---
I'm running FreeBSD 11.1-RELEASE-p8 and I am also experiencing the very same issue. I too zfs send|recv zvols on the same machine between two pools.

--=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Wed Apr 4 17:04:42 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 42D26F9D3B8 for ; Wed, 4 Apr 2018 17:04:42 +0000 (UTC) (envelope-from agapon@gmail.com) Received: from mail-wm0-f43.google.com (mail-wm0-f43.google.com [74.125.82.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 91F76862C8 for ; Wed, 4 Apr 2018 17:04:41 +0000 (UTC) (envelope-from agapon@gmail.com) Received: by mail-wm0-f43.google.com with SMTP id x82so44409105wmg.1 for ; Wed, 04 Apr 2018 10:04:41 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:openpgp:autocrypt :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=v4tkc+vD8QdEJe7wohDTmhUKWaExzuB1XoMovjfCSCo=; b=OY07I/cV9Si4KSXtGkNObNfHntMdcQrdOQ1yEhYHSreSLlAgtf38+fPvuUqCj7GV6r 41GqYDjeRXvN2M+xH+z2wv7gAyWkPwOX0V0g3+B1U02fJhWgZ1h+hjdrA2CnouxaBD/n RBypUwIb1mqZcMwBmRcrofaXOI7Dz36J72KWCls2vvwySyThBZcr+cB6ho5M+F4ZNHcq MSNEyfJU3DCG+6RuHbkAyjQAi8ncVKxPEewzSdSpUgkslCWRBf/4kWlIog3WhHvGcjHv 5mNHDHeqPq30Ol+6e7zTVZYTSHwJ0JrbVu01wFmj8HOxDdPq1/MuOrgE7uQybP+yyEpY tJuw== X-Gm-Message-State: ALQs6tChQ4FBluE/4guMPKC/5AYVML8fhHgWj5vq0oDepFjVtqpX1h8C yFcaOBVVdUA85D/B6iyQoHae+3mm X-Google-Smtp-Source: AIpwx4+ka7R6J7ePJeOFNYVGX9MY7cJrCIYiuGb4rS6FFbntlf0jXO1Ku9LxXA9dA7g3PxSc5aK1Ng== X-Received: by 10.46.135.6 with SMTP id m6mr11676049lji.124.1522861474395; Wed, 04 Apr 2018 10:04:34 -0700 (PDT) Received: from [192.168.0.88] (east.meadow.volia.net. 
[93.72.151.96]) by smtp.googlemail.com with ESMTPSA id o26sm942974ljc.62.2018.04.04.10.04.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Apr 2018 10:04:33 -0700 (PDT) Subject: Re: ZFS dedup write pathway - possible inefficiency or..? To: Stilez Stilezy Cc: freebsd-fs@freebsd.org References: <14c857cc-463f-a56e-bcf6-c0702da6d3bc@FreeBSD.org> From: Andriy Gapon Openpgp: preference=signencrypt Autocrypt: addr=avg@FreeBSD.org; prefer-encrypt=mutual; keydata= xsFNBFm4LIgBEADNB/3lT7f15UKeQ52xCFQx/GqHkSxEdVyLFZTmY3KyNPQGBtyvVyBfprJ7 mAeXZWfhat6cKNRAGZcL5EmewdQuUfQfBdYmKjbw3a9GFDsDNuhDA2QwFt8BmkiVMRYyvI7l N0eVzszWCUgdc3qqM6qqcgBaqsVmJluwpvwp4ZBXmch5BgDDDb1MPO8AZ2QZfIQmplkj8Y6Z AiNMknkmgaekIINSJX8IzRzKD5WwMsin70psE8dpL/iBsA2cpJGzWMObVTtCxeDKlBCNqM1i gTXta1ukdUT7JgLEFZk9ceYQQMJJtUwzWu1UHfZn0Fs29HTqawfWPSZVbulbrnu5q55R4PlQ /xURkWQUTyDpqUvb4JK371zhepXiXDwrrpnyyZABm3SFLkk2bHlheeKU6Yql4pcmSVym1AS4 dV8y0oHAfdlSCF6tpOPf2+K9nW1CFA8b/tw4oJBTtfZ1kxXOMdyZU5fiG7xb1qDgpQKgHUX8 7Rd2T1UVLVeuhYlXNw2F+a2ucY+cMoqz3LtpksUiBppJhw099gEXehcN2JbUZ2TueJdt1FdS ztnZmsHUXLxrRBtGwqnFL7GSd6snpGIKuuL305iaOGODbb9c7ne1JqBbkw1wh8ci6vvwGlzx rexzimRaBzJxlkjNfMx8WpCvYebGMydNoeEtkWldtjTNVsUAtQARAQABzR5BbmRyaXkgR2Fw b24gPGF2Z0BGcmVlQlNELm9yZz7CwZQEEwEIAD4WIQS+LEO7ngQnXA4Bjr538m7TUc1yjwUC WbgsiAIbIwUJBaOagAULCQgHAgYVCAkKCwIEFgIDAQIeAQIXgAAKCRB38m7TUc1yj+JAEACV l9AK/nOWAt/9cufV2fRj0hdOqB1aCshtSrwHk/exXsDa4/FkmegxXQGY+3GWX3deIyesbVRL rYdtdK0dqJyT1SBqXK1h3/at9rxr9GQA6KWOxTjUFURsU7ok/6SIlm8uLRPNKO+yq0GDjgaO LzN+xykuBA0FlhQAXJnpZLcVfPJdWv7sSHGedL5ln8P8rxR+XnmsA5TUaaPcbhTB+mG+iKFj GghASDSfGqLWFPBlX/fpXikBDZ1gvOr8nyMY9nXhgfXpq3B6QCRYKPy58ChrZ5weeJZ29b7/ QdEO8NFNWHjSD9meiLdWQaqo9Y7uUxN3wySc/YUZxtS0bhAd8zJdNPsJYG8sXgKjeBQMVGuT eCAJFEYJqbwWvIXMfVWop4+O4xB+z2YE3jAbG/9tB/GSnQdVSj3G8MS80iLS58frnt+RSEw/ psahrfh0dh6SFHttE049xYiC+cM8J27Aaf0i9RflyITq57NuJm+AHJoU9SQUkIF0nc6lfA+o JRiyRlHZHKoRQkIg4aiKaZSWjQYRl5Txl0IZUP1dSWMX4s3XTMurC/pnja45dge/4ESOtJ9R 
8XuIWg45Oq6MeIWdjKddGhRj3OohsltKgkEU3eLKYtB6qRTQypHHUawCXz88uYt5e3w4V16H lCpSTZV/EVHnNe45FVBlvK7k7HFfDDkryM7BTQRZuCyIARAAlq0slcsVboY/+IUJdcbEiJRW be9HKVz4SUchq0z9MZPX/0dcnvz/gkyYA+OuM78dNS7Mbby5dTvOqfpLJfCuhaNYOhlE0wY+ 1T6Tf1f4c/uA3U/YiadukQ3+6TJuYGAdRZD5EqYFIkreARTVWg87N9g0fT9BEqLw9lJtEGDY EWUE7L++B8o4uu3LQFEYxcrb4K/WKmgtmFcm77s0IKDrfcX4doV92QTIpLiRxcOmCC/OCYuO jB1oaaqXQzZrCutXRK0L5XN1Y1PYjIrEzHMIXmCDlLYnpFkK+itlXwlE2ZQxkfMruCWdQXye syl2fynAe8hvp7Mms9qU2r2K9EcJiR5N1t1C2/kTKNUhcRv7Yd/vwusK7BqJbhlng5ZgRx0m WxdntU/JLEntz3QBsBsWM9Y9wf2V4tLv6/DuDBta781RsCB/UrU2zNuOEkSixlUiHxw1dccI 6CVlaWkkJBxmHX22GdDFrcjvwMNIbbyfQLuBq6IOh8nvu9vuItup7qemDG3Ms6TVwA7BD3j+ 3fGprtyW8Fd/RR2bW2+LWkMrqHffAr6Y6V3h5kd2G9Q8ZWpEJk+LG6Mk3fhZhmCnHhDu6CwN MeUvxXDVO+fqc3JjFm5OxhmfVeJKrbCEUJyM8ESWLoNHLqjywdZga4Q7P12g8DUQ1mRxYg/L HgZY3zfKOqcAEQEAAcLBfAQYAQgAJhYhBL4sQ7ueBCdcDgGOvnfybtNRzXKPBQJZuCyIAhsM BQkFo5qAAAoJEHfybtNRzXKPBVwQAKfFy9P7N3OsLDMB56A4Kf+ZT+d5cIx0Yiaf4n6w7m3i ImHHHk9FIetI4Xe54a2IXh4Bq5UkAGY0667eIs+Z1Ea6I2i27Sdo7DxGwq09Qnm/Y65ADvXs 3aBvokCcm7FsM1wky395m8xUos1681oV5oxgqeRI8/76qy0hD9WR65UW+HQgZRIcIjSel9vR XDaD2HLGPTTGr7u4v00UeTMs6qvPsa2PJagogrKY8RXdFtXvweQFz78NbXhluwix2Tb9ETPk LIpDrtzV73CaE2aqBG/KrboXT2C67BgFtnk7T7Y7iKq4/XvEdDWscz2wws91BOXuMMd4c/c4 OmGW9m3RBLufFrOag1q5yUS9QbFfyqL6dftJP3Zq/xe+mr7sbWbhPVCQFrH3r26mpmy841ym dwQnNcsbIGiBASBSKksOvIDYKa2Wy8htPmWFTEOPRpFXdGQ27awcjjnB42nngyCK5ukZDHi6 w0qK5DNQQCkiweevCIC6wc3p67jl1EMFY5+z+zdTPb3h7LeVnGqW0qBQl99vVFgzLxchKcl0 R/paSFgwqXCZhAKMuUHncJuynDOP7z5LirUeFI8qsBAJi1rXpQoLJTVcW72swZ42IdPiboqx NbTMiNOiE36GqMcTPfKylCbF45JNX4nF9ElM0E+Y8gi4cizJYBRr2FBJgay0b9Cp Message-ID: Date: Wed, 4 Apr 2018 20:04:32 +0300 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 17:04:42 -0000 On 02/04/2018 21:30, Stilez Stilezy wrote: > Thanks on that tip, Andriy, > > I don't have the knowledge to tell if that's the first of the 2 issues I'm > seeing.  It could be. What exactly is that bug describing and what behaviour > would it create? Sub-optimal performance for ZFS write throughput (and latency) when dedup is enabled and you are trying to write as fast as you can. > Assuming it's the same - > > 1) Are there any known workarounds or sysctls that help to reduce the issue? I don't know of any. > 2) Are there any easy diagnostic tests I can easily do, to confirm if this is > the same behaviour as that bug?  Nobody else uses the test system, so I can > change any sysctls to sane or extreme values for testing, or create large txg's > and spread out reads and writes to see what's happening and what coincides with > what. We used DTrace to observe internal ZFS behavior. I do not have any simple recipes. > On 2 April 2018 at 18:47, Andriy Gapon > wrote: > > On 02/04/2018 16:36, Stilez Stilezy wrote: > > The first issue is specific to the > > dedup write pathway.   I've tested locally to a point where it seems it's > > not due to inadequate hardware and it's very consistent and specific, even > > on idle conditions/minimal load.  I'm wondering whether there's a code > > bottleneck specifically affecting just the dedup write pathway. 
> > I think that this might be https://www.illumos.org/issues/8353 > > > -- > Andriy Gapon > > -- Andriy Gapon From owner-freebsd-fs@freebsd.org Wed Apr 4 18:27:33 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CD235F57F79 for ; Wed, 4 Apr 2018 18:27:32 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [IPv6:2607:f3e0:80:80::2]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "smarthost.sentex.ca", Issuer "smarthost.sentex.ca" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 6EA0469E80; Wed, 4 Apr 2018 18:27:32 +0000 (UTC) (envelope-from mike@sentex.net) Received: from lava.sentex.ca (lava.sentex.ca [IPv6:2607:f3e0:0:5::11]) by smarthost2.sentex.ca (8.15.2/8.15.2) with ESMTPS id w34IRV4s095532 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 4 Apr 2018 14:27:31 -0400 (EDT) (envelope-from mike@sentex.net) Received: from [192.168.43.26] (saphire3.sentex.ca [192.168.43.26]) by lava.sentex.ca (8.15.2/8.15.2) with ESMTP id w34IRTL4017005; Wed, 4 Apr 2018 14:27:29 -0400 (EDT) (envelope-from mike@sentex.net) To: "freebsd-fs@freebsd.org" From: Mike Tancsa Subject: Linux NFS client and FreeBSD server strangeness Organization: Sentex Communications Message-ID: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> Date: Wed, 4 Apr 2018 14:27:30 -0400 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------------0092DAC1F1203FC0CDCBEF55" Content-Language: en-US X-Scanned-By: MIMEDefang 2.78 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 18:27:33 -0000 This is a 
multi-part message in MIME format. --------------0092DAC1F1203FC0CDCBEF55 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit

Not sure where the tweaking needs to happen, but I am getting strange behaviour between a Linux nfs client and FreeBSD RELENG_11 NFS server.

The FreeBSD server starts with

nfs_client_enable="YES"
nfs_server_enable="YES"
rpcbind_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"
nfs_server_flags="-u -t -n 16"

and on the Linux client I have been trying various options to no avail. The mount works, but on a straight up write to the FreeBSD server, everything is very bursty. I noticed this (I think) a few months ago where Linux dumps across an nfs mount seemed to take a lot longer and were getting very bursty.

It seems if there are a mixture of reads and writes, everything is pretty fast. But if a client is just writing to the server, something, somewhere is blocking. Doing something simple like

ls -l /nfsmount

from the client "wakes" up the server/client so that write stream can keep going. Otherwise, it will do a big blast of writes and then several seconds of pausing on the dump.

Linux Dump is a simple

/sbin/dump u -0 -f - / | /bin/bzip2 >/backup/dump-root-0.bz2

Mount is

mount.nfs -o tcp,intr,noatime,vers=3 192.168.yy.xx:/path

If I run ifstat on the FreeBSD nfs server, the traffic pattern looks like

# ifstat -b -i cxl0
        cxl0
 Kbps in  Kbps out
    0.00      0.00
    0.00      0.00
    0.00      0.00
    0.00      0.00
    0.00      0.00
8.12e+06  45127.03
    0.00      0.00
    0.00      0.00
    0.00      0.00
    0.00      0.00
6.04e+06  33525.76
901122.1   4983.72
    0.00      0.00

if I do a bunch of ls -l /nfsmount on the client eg

while true
do
ls -l /backup/ > /dev/null
done

traffic pattern is

        cxl0
 Kbps in  Kbps out
    0.00      0.00
3.31e+06  18520.03
5.89e+06  32571.52
4.84e+06  28325.71
2.12e+06  19466.56
614727.0  12246.10
874927.6  13557.18
1.06e+06  14386.78
917865.4  13696.87
1.09e+06  14608.64
1.06e+06  14376.12
164077.3   5286.64

Leading up to the stall, pcap snippet attached.

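One way to put a number on the burstiness is to count idle intervals in the two ifstat samples above. The following is a small sketch, assuming each row is one sampling interval and treating anything under 1 Kbps inbound as idle. The threshold and the names `write_only`, `with_ls_loop`, and `idle_intervals` are illustrative choices, not part of any tool.

```python
# Inbound Kbps on cxl0, copied from the two ifstat samples in this message.
write_only = [0.00, 0.00, 0.00, 0.00, 0.00, 8.12e+06, 0.00, 0.00, 0.00,
              0.00, 6.04e+06, 901122.1, 0.00]
with_ls_loop = [0.00, 3.31e+06, 5.89e+06, 4.84e+06, 2.12e+06, 614727.0,
                874927.6, 1.06e+06, 917865.4, 1.09e+06, 1.06e+06, 164077.3]

IDLE_KBPS = 1.0  # assumed cut-off below which an interval counts as idle


def idle_intervals(samples, threshold=IDLE_KBPS):
    """Count sampling intervals whose inbound rate is below the threshold."""
    return sum(1 for kbps in samples if kbps < threshold)


for label, samples in (("write only", write_only),
                       ("with ls loop", with_ls_loop)):
    idle = idle_intervals(samples)
    print(f"{label}: {idle} of {len(samples)} intervals idle")
```

With the samples as pasted, the write-only run is idle in 10 of 13 intervals, while the run with the ls loop is idle in 1 of 12.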
Note, doing something like dd if=/dev/zero of=/backup/test.bin bs=4096 count=5000000 I can saturate the 10G link and max out the disk on the server # dd if=/dev/zero of=/backup/test.bin bs=4096 count=5000000 5000000+0 records in 5000000+0 records out 20480000000 bytes (20 GB, 19 GiB) copied, 36.6238 s, 559 MB/s and its a pretty steady stream unlike the dump. Any ideas whats going on and how I might be able to work around this ? 192.168.xx.yy:/zbackup1/virtbox4b/backup on /backup type nfs (rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.242.254,mountvers=3,mountport=774,mountproto=tcp,local_lock=none,addr=192.168.yy.xx) ---Mike ------------------- Mike Tancsa, tel +1 519 651 3400 x203 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada --------------0092DAC1F1203FC0CDCBEF55 Content-Type: text/plain; charset=UTF-8; name="nfs.txt" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="nfs.txt" MTM6MjE6MTUuNDU3NDg0IElQICh0b3MgMHgwLCB0dGwgNjQsIGlkIDU2MDk3LCBvZmZzZXQg MCwgZmxhZ3MgW0RGXSwgcHJvdG8gVENQICg2KSwgbGVuZ3RoIDE3NikKICAgIDE5Mi4xNjgu Y2xpZW50LjkzNyA+IDE5Mi4xNjguc2VydmVyLjIwNDk6IEZsYWdzIFtQLl0sIGNrc3VtIDB4 OGNhOCAoY29ycmVjdCksIHNlcSA4NzUxMzI5Mjo4NzUxMzQxNiwgYWNrIDEwOTYyMSwgd2lu IDU2MzIsIG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDEwNDg2ODMgZWNyIDUwMDEwMzg3Ml0s IGxlbmd0aCAxMjQ6IE5GUyByZXF1ZXN0IHhpZCA0Njg3MzM4NTAgMTIwIGNvbW1pdCBmaCBV bmtub3duL0IzMEJGNDFFREU0OEVDNUMwQTAwMApCMDAwMDAwMDAwMDg0N0Q4MjAxMDAwMDAw MDAwMDAwMDAwMCAwIGJ5dGVzIEAgMAoxMzoyMToxNS40NTc1MDIgSVAgKHRvcyAweDAsIHR0 bCA2NCwgaWQgMCwgb2Zmc2V0IDAsIGZsYWdzIFtERl0sIHByb3RvIFRDUCAoNiksIGxlbmd0 aCAyMDgsIGJhZCBja3N1bSAwICgtPmQzY2UpISkKICAgIDE5Mi4xNjguc2VydmVyLjIwNDkg PiAxOTIuMTY4LmNsaWVudC45Mzc6IEZsYWdzIFtQLl0sIGNrc3VtIDB4NjcxYyAoaW5jb3Jy ZWN0IC0+IDB4NDEwOSksIHNlcSAxMDk2MjE6MTA5Nzc3LCBhY2sgODc1MTM0MTYsIHdpbiAy 
OTEyNywgb3B0aW9ucyBbbm9wLG5vcCxUUyB2YWwgNTAwMTAzODczIGVjciAxMDQ4NjgzXSwg bGVuZ3RoIDE1NjogTkZTIHJlcGx5IHhpZCA0Njg3MzM4NTAgcmVwbHkgb2sgMTUyIGNvbW1p dCBQUkU6IHN6IDQxNjAyMgo1MjggbXRpbWUgMTUyMjg2MjQ3NS40NTY5MDQwMDAgY3RpbWUg MTUyMjg2MjQ3NS40NTY5MDQwMDAgUE9TVDogUkVHIDY0NCBpZHMgMjg3NjcvMCBzeiA0MTYw MjI1MjggbmxpbmsgMSByZGV2IDEwMi84MzAzNDEyMTYgZnNpZCAxZWY0MGJiMyBmaWxlaWQg YiBhL20vY3RpbWUgMTUyMjg2MjM1OS4xMDIxNzcwMDAgMTUyMjg2MjQ3NS40NTY5MDQwMDAg MTUyMjg2MjQ3NS40NTY5MDQwMDAKMTM6MjE6MTUuNDYyNTA0IElQICh0b3MgMHgwLCB0dGwg NjQsIGlkIDU2MDk4LCBvZmZzZXQgMCwgZmxhZ3MgW0RGXSwgcHJvdG8gVENQICg2KSwgbGVu Z3RoIDE2NCkKICAgIDE5Mi4xNjguY2xpZW50LjkzNyA+IDE5Mi4xNjguc2VydmVyLjIwNDk6 IEZsYWdzIFtQLl0sIGNrc3VtIDB4YTNiYSAoY29ycmVjdCksIHNlcSA4NzUxMzQxNjo4NzUx MzUyOCwgYWNrIDEwOTc3Nywgd2luIDU2MzIsIG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDEw NDg2ODQgZWNyIDUwMDEwMzg3M10sIGxlbmd0aCAxMTI6IE5GUyByZXF1ZXN0IHhpZCA0ODU1 MTEwNjYgMTA4IGdldGF0dHIgZmggVW5rbm93bi9CMzBCRjQxRURFNDhFQzVDMEEwMAowOTAw MDAwMDAwMDA2RDdEODIwMTAwMDAwMDAwMDAwMDAwMDAKMTM6MjE6MTUuNDYyNTM1IElQICh0 b3MgMHgwLCB0dGwgNjQsIGlkIDAsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1Ag KDYpLCBsZW5ndGggMTY4LCBiYWQgY2tzdW0gMCAoLT5kM2Y2KSEpCiAgICAxOTIuMTY4LnNl cnZlci4yMDQ5ID4gMTkyLjE2OC5jbGllbnQuOTM3OiBGbGFncyBbUC5dLCBja3N1bSAweDY2 ZjQgKGluY29ycmVjdCAtPiAweGNjYjIpLCBzZXEgMTA5Nzc3OjEwOTg5MywgYWNrIDg3NTEz NTI4LCB3aW4gMjkxMjcsIG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDUwMDEwMzg3OCBlY3Ig MTA0ODY4NF0sIGxlbmd0aCAxMTY6IE5GUyByZXBseSB4aWQgNDg1NTExMDY2IHJlcGx5IG9r IDExMiBnZXRhdHRyIFJFRyA2NDQgaWRzIDIKODc2Ny8wIHN6IDQwOTYwMDAwMDAKMTM6MjE6 MTUuNDYyNjY5IElQICh0b3MgMHgwLCB0dGwgNjQsIGlkIDU2MDk5LCBvZmZzZXQgMCwgZmxh Z3MgW0RGXSwgcHJvdG8gVENQICg2KSwgbGVuZ3RoIDE2NCkKICAgIDE5Mi4xNjguY2xpZW50 LjkzNyA+IDE5Mi4xNjguc2VydmVyLjIwNDk6IEZsYWdzIFtQLl0sIGNrc3VtIDB4ODlkMSAo Y29ycmVjdCksIHNlcSA4NzUxMzUyODo4NzUxMzY0MCwgYWNrIDEwOTg5Mywgd2luIDU2MzIs IG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDEwNDg2ODQgZWNyIDUwMDEwMzg3OF0sIGxlbmd0 aCAxMTI6IE5GUyByZXF1ZXN0IHhpZCA1MDIyODgyODIgMTA4IGdldGF0dHIgZmggVW5rbm93 
bi9CMzBCRjQxRURFNDhFQzVDMEEwMAowQTAwMDAwMDAwMDA4NDdEODIwMTAwMDAwMDAwMDAw MDAwMDAKMTM6MjE6MTUuNDYyNzEzIElQICh0b3MgMHgwLCB0dGwgNjQsIGlkIDAsIG9mZnNl dCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1AgKDYpLCBsZW5ndGggMTY4LCBiYWQgY2tzdW0g MCAoLT5kM2Y2KSEpCiAgICAxOTIuMTY4LnNlcnZlci4yMDQ5ID4gMTkyLjE2OC5jbGllbnQu OTM3OiBGbGFncyBbUC5dLCBja3N1bSAweDY2ZjQgKGluY29ycmVjdCAtPiAweDEyZmQpLCBz ZXEgMTA5ODkzOjExMDAwOSwgYWNrIDg3NTEzNjQwLCB3aW4gMjkxMjcsIG9wdGlvbnMgW25v cCxub3AsVFMgdmFsIDUwMDEwMzg3OCBlY3IgMTA0ODY4NF0sIGxlbmd0aCAxMTY6IE5GUyBy ZXBseSB4aWQgNTAyMjg4MjgyIHJlcGx5IG9rIDExMiBnZXRhdHRyIFJFRyA2NDQgaWRzIDIK ODc2Ny8wIHN6IDMwNDUKMTM6MjE6MTUuNDYyODQ1IElQICh0b3MgMHgwLCB0dGwgNjQsIGlk IDU2MTAwLCBvZmZzZXQgMCwgZmxhZ3MgW0RGXSwgcHJvdG8gVENQICg2KSwgbGVuZ3RoIDE4 OCkKICAgIDE5Mi4xNjguY2xpZW50LjkzNyA+IDE5Mi4xNjguc2VydmVyLjIwNDk6IEZsYWdz IFtQLl0sIGNrc3VtIDB4ZTJkYiAoY29ycmVjdCksIHNlcSA4NzUxMzY0MDo4NzUxMzc3Niwg YWNrIDExMDAwOSwgd2luIDU2MzIsIG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDEwNDg2ODQg ZWNyIDUwMDEwMzg3OF0sIGxlbmd0aCAxMzY6IE5GUyByZXF1ZXN0IHhpZCA1MTkwNjU0OTgg MTMyIHJlYWRkaXJwbHVzIGZoIFVua25vd24vQjMwQkY0MUVERTQ4RUM1QwowQTAwMDgwMDAw MDAwMDAwNTQ3RDgyMDEwMDAwMDAwMDAwMDAwMDAwIDQwOTYgYnl0ZXMgQCAwIG1heCAzMjc2 OCB2ZXJmIDAwMDAwMDAwNTc0NjdhMDAKMTM6MjE6MTUuNDYyOTA3IElQICh0b3MgMHgwLCB0 dGwgNjQsIGlkIDAsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1AgKDYpLCBsZW5n dGggOTY4LCBiYWQgY2tzdW0gMCAoLT5kMGQ2KSEpCiAgICAxOTIuMTY4LnNlcnZlci4yMDQ5 ID4gMTkyLjE2OC5jbGllbnQuOTM3OiBGbGFncyBbUC5dLCBja3N1bSAweDZhMTQgKGluY29y cmVjdCAtPiAweGM4M2IpLCBzZXEgMTEwMDA5OjExMDkyNSwgYWNrIDg3NTEzNzc2LCB3aW4g MjkxMjcsIG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDUwMDEwMzg3OCBlY3IgMTA0ODY4NF0s IGxlbmd0aCA5MTY6IE5GUyByZXBseSB4aWQgNTE5MDY1NDk4IHJlcGx5IG9rIDkxMiByZWFk ZGlycGx1cyBQT1NUOiBESVIKIDcwMCBpZHMgMjg3NjcvMCBzeiA1IG5saW5rIDIgcmRldiAw LzkgZnNpZCAxZWY0MGJiMyBmaWxlaWQgOCBhL20vY3RpbWUgMTUyMjg2MjQ3NS40NjI4NzIw MDAgMTUyMjg2MjM1OS4xMDIxODEwMDAgMTUyMjg2MjM1OS4xMDIxODEwMDAgdmVyZiAwMDAw MDAwMDU3NDY3YTAwCjEzOjIxOjE1LjQ5OTYyNSBJUCAodG9zIDB4MCwgdHRsIDY0LCBpZCA1 
NjEwMSwgb2Zmc2V0IDAsIGZsYWdzIFtERl0sIHByb3RvIFRDUCAoNiksIGxlbmd0aCA1MikK ICAgIDE5Mi4xNjguY2xpZW50LjkzNyA+IDE5Mi4xNjguc2VydmVyLjIwNDk6IEZsYWdzIFsu XSwgY2tzdW0gMHhlMjQyIChjb3JyZWN0KSwgc2VxIDg3NTEzNzc2LCBhY2sgMTEwOTI1LCB3 aW4gNTYzMiwgb3B0aW9ucyBbbm9wLG5vcCxUUyB2YWwgMTA0ODY5NCBlY3IgNTAwMTAzODc4 XSwgbGVuZ3RoIDAKMTM6MjE6MTcuNDc2NDE5IElQICh0b3MgMHgwLCB0dGwgNjQsIGlkIDU2 MTAyLCBvZmZzZXQgMCwgZmxhZ3MgW0RGXSwgcHJvdG8gVENQICg2KSwgbGVuZ3RoIDkwMDAp CiAgICAxOTIuMTY4LmNsaWVudC45MzcgPiAxOTIuMTY4LnNlcnZlci4yMDQ5OiBGbGFncyBb Ll0sIGNrc3VtIDB4ZDZmMSAoY29ycmVjdCksIHNlcSA4NzUxMzc3Njo4NzUyMjcyNCwgYWNr IDExMDkyNSwgd2luIDU2MzIsIG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDEwNDkxODggZWNy IDUwMDEwMzg3OF0sIGxlbmd0aCA4OTQ4OiBORlMgcmVxdWVzdCB4aWQgNTM1ODQyNzE0IDg5 NDQgd3JpdGUgZmggVW5rbm93bi9CMzBCRjQxRURFNDhFQzVDMEEwMDBCMDAwMDAwMDAwMDg0 N0Q4MjAxMDAwMDAwMDAwMDAwMDAwMCAxMzEwNzIgKDEzMTA3MikgYnl0ZXMgQCA0MTYwMjI1 MjggPHVuc3RhYmxlPgoxMzoyMToxNy40NzY0MjkgSVAgKHRvcyAweDAsIHR0bCA2NCwgaWQg NTYxMDMsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1AgKDYpLCBsZW5ndGggOTAw MCkKICAgIDE5Mi4xNjguY2xpZW50LjkzNyA+IDE5Mi4xNjguc2VydmVyLjIwNDk6IEZsYWdz IFsuXSwgY2tzdW0gMHgwN2U4IChjb3JyZWN0KSwgc2VxIDg3NTIyNzI0Ojg3NTMxNjcyLCBh Y2sgMTEwOTI1LCB3aW4gNTYzMiwgb3B0aW9ucyBbbm9wLG5vcCxUUyB2YWwgMTA0OTE4OCBl Y3IgNTAwMTAzODc4XSwgbGVuZ3RoIDg5NDgKMTM6MjE6MTcuNDc2NDQ5IElQICh0b3MgMHgw LCB0dGwgNjQsIGlkIDAsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1AgKDYpLCBs ZW5ndGggNTIsIGJhZCBja3N1bSAwICgtPmQ0NmEpISkKICAgIDE5Mi4xNjguc2VydmVyLjIw NDkgPiAxOTIuMTY4LmNsaWVudC45Mzc6IEZsYWdzIFsuXSwgY2tzdW0gMHg2NjgwIChpbmNv cnJlY3QgLT4gMHgzN2RmKSwgc2VxIDExMDkyNSwgYWNrIDg3NTMxNjcyLCB3aW4gMjg4NDcs IG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDUwMDEwNTg5MiBlY3IgMTA0OTE4OF0sIGxlbmd0 aCAwCjEzOjIxOjE3LjQ3NjQ1MSBJUCAodG9zIDB4MCwgdHRsIDY0LCBpZCA1NjEwNCwgb2Zm c2V0IDAsIGZsYWdzIFtERl0sIHByb3RvIFRDUCAoNiksIGxlbmd0aCA5MDAwKQogICAgMTky LjE2OC5jbGllbnQuOTM3ID4gMTkyLjE2OC5zZXJ2ZXIuMjA0OTogRmxhZ3MgWy5dLCBja3N1 bSAweDlhMGMgKGNvcnJlY3QpLCBzZXEgODc1MzE2NzI6ODc1NDA2MjAsIGFjayAxMTA5MjUs 
IHdpbiA1NjMyLCBvcHRpb25zIFtub3Asbm9wLFRTIHZhbCAxMDQ5MTg4IGVjciA1MDAxMDM4 NzhdLCBsZW5ndGggODk0OAoxMzoyMToxNy40NzY0NTMgSVAgKHRvcyAweDAsIHR0bCA2NCwg aWQgNTYxMDUsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1AgKDYpLCBsZW5ndGgg OTAwMCkKICAgIDE5Mi4xNjguY2xpZW50LjkzNyA+IDE5Mi4xNjguc2VydmVyLjIwNDk6IEZs YWdzIFsuXSwgY2tzdW0gMHg5Mjg1IChjb3JyZWN0KSwgc2VxIDg3NTQwNjIwOjg3NTQ5NTY4 LCBhY2sgMTEwOTI1LCB3aW4gNTYzMiwgb3B0aW9ucyBbbm9wLG5vcCxUUyB2YWwgMTA0OTE4 OCBlY3IgNTAwMTAzODc4XSwgbGVuZ3RoIDg5NDgKMTM6MjE6MTcuNDc2NDU1IElQICh0b3Mg MHgwLCB0dGwgNjQsIGlkIDAsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1AgKDYp LCBsZW5ndGggNTIsIGJhZCBja3N1bSAwICgtPmQ0NmEpISkKICAgIDE5Mi4xNjguc2VydmVy LjIwNDkgPiAxOTIuMTY4LmNsaWVudC45Mzc6IEZsYWdzIFsuXSwgY2tzdW0gMHg2NjgwIChp bmNvcnJlY3QgLT4gMHhmMzBlKSwgc2VxIDExMDkyNSwgYWNrIDg3NTQ5NTY4LCB3aW4gMjg1 NjcsIG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDUwMDEwNTg5MiBlY3IgMTA0OTE4OF0sIGxl bmd0aCAwCjEzOjIxOjE3LjQ3NjQ1NSBJUCAodG9zIDB4MCwgdHRsIDY0LCBpZCA1NjEwNiwg b2Zmc2V0IDAsIGZsYWdzIFtERl0sIHByb3RvIFRDUCAoNiksIGxlbmd0aCA5MDAwKQogICAg MTkyLjE2OC5jbGllbnQuOTM3ID4gMTkyLjE2OC5zZXJ2ZXIuMjA0OTogRmxhZ3MgWy5dLCBj a3N1bSAweDFlYmUgKGNvcnJlY3QpLCBzZXEgODc1NDk1Njg6ODc1NTg1MTYsIGFjayAxMTA5 MjUsIHdpbiA1NjMyLCBvcHRpb25zIFtub3Asbm9wLFRTIHZhbCAxMDQ5MTg4IGVjciA1MDAx MDM4NzhdLCBsZW5ndGggODk0OAoxMzoyMToxNy40NzY0NTcgSVAgKHRvcyAweDAsIHR0bCA2 NCwgaWQgNTYxMDcsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1AgKDYpLCBsZW5n dGggOTAwMCkKICAgIDE5Mi4xNjguY2xpZW50LjkzNyA+IDE5Mi4xNjguc2VydmVyLjIwNDk6 IEZsYWdzIFsuXSwgY2tzdW0gMHgwYjE3IChjb3JyZWN0KSwgc2VxIDg3NTU4NTE2Ojg3NTY3 NDY0LCBhY2sgMTEwOTI1LCB3aW4gNTYzMiwgb3B0aW9ucyBbbm9wLG5vcCxUUyB2YWwgMTA0 OTE4OCBlY3IgNTAwMTAzODc4XSwgbGVuZ3RoIDg5NDgKMTM6MjE6MTcuNDc2NDU5IElQICh0 b3MgMHgwLCB0dGwgNjQsIGlkIDAsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1Ag KDYpLCBsZW5ndGggNTIsIGJhZCBja3N1bSAwICgtPmQ0NmEpISkKICAgIDE5Mi4xNjguc2Vy dmVyLjIwNDkgPiAxOTIuMTY4LmNsaWVudC45Mzc6IEZsYWdzIFsuXSwgY2tzdW0gMHg2Njgw IChpbmNvcnJlY3QgLT4gMHhhZTNkKSwgc2VxIDExMDkyNSwgYWNrIDg3NTY3NDY0LCB3aW4g 
MjgyODgsIG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDUwMDEwNTg5MiBlY3IgMTA0OTE4OF0s IGxlbmd0aCAwCjEzOjIxOjE3LjQ3NjQ2MCBJUCAodG9zIDB4MCwgdHRsIDY0LCBpZCA1NjEw OCwgb2Zmc2V0IDAsIGZsYWdzIFtERl0sIHByb3RvIFRDUCAoNiksIGxlbmd0aCA5MDAwKQog ICAgMTkyLjE2OC5jbGllbnQuOTM3ID4gMTkyLjE2OC5zZXJ2ZXIuMjA0OTogRmxhZ3MgWy5d LCBja3N1bSAweDBlZTMgKGNvcnJlY3QpLCBzZXEgODc1Njc0NjQ6ODc1NzY0MTIsIGFjayAx MTA5MjUsIHdpbiA1NjMyLCBvcHRpb25zIFtub3Asbm9wLFRTIHZhbCAxMDQ5MTg4IGVjciA1 MDAxMDM4NzhdLCBsZW5ndGggODk0OAoxMzoyMToxNy40NzY0NjMgSVAgKHRvcyAweDAsIHR0 bCA2NCwgaWQgNTYxMDksIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1AgKDYpLCBs ZW5ndGggOTAwMCkKICAgIDE5Mi4xNjguY2xpZW50LjkzNyA+IDE5Mi4xNjguc2VydmVyLjIw NDk6IEZsYWdzIFsuXSwgY2tzdW0gMHgzNjU3IChjb3JyZWN0KSwgc2VxIDg3NTc2NDEyOjg3 NTg1MzYwLCBhY2sgMTEwOTI1LCB3aW4gNTYzMiwgb3B0aW9ucyBbbm9wLG5vcCxUUyB2YWwg MTA0OTE4OCBlY3IgNTAwMTAzODc4XSwgbGVuZ3RoIDg5NDgKMTM6MjE6MTcuNDc2NDY1IElQ ICh0b3MgMHgwLCB0dGwgNjQsIGlkIDAsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBU Q1AgKDYpLCBsZW5ndGggNTIsIGJhZCBja3N1bSAwICgtPmQ0NmEpISkKICAgIDE5Mi4xNjgu c2VydmVyLjIwNDkgPiAxOTIuMTY4LmNsaWVudC45Mzc6IEZsYWdzIFsuXSwgY2tzdW0gMHg2 NjgwIChpbmNvcnJlY3QgLT4gMHg2OTZkKSwgc2VxIDExMDkyNSwgYWNrIDg3NTg1MzYwLCB3 aW4gMjgwMDgsIG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDUwMDEwNTg5MiBlY3IgMTA0OTE4 OF0sIGxlbmd0aCAwCjEzOjIxOjE3LjQ3NjQ2NiBJUCAodG9zIDB4MCwgdHRsIDY0LCBpZCA1 NjExMCwgb2Zmc2V0IDAsIGZsYWdzIFtERl0sIHByb3RvIFRDUCAoNiksIGxlbmd0aCA5MDAw KQogICAgMTkyLjE2OC5jbGllbnQuOTM3ID4gMTkyLjE2OC5zZXJ2ZXIuMjA0OTogRmxhZ3Mg Wy5dLCBja3N1bSAweDI1MzkgKGNvcnJlY3QpLCBzZXEgODc1ODUzNjA6ODc1OTQzMDgsIGFj ayAxMTA5MjUsIHdpbiA1NjMyLCBvcHRpb25zIFtub3Asbm9wLFRTIHZhbCAxMDQ5MTg4IGVj ciA1MDAxMDM4NzhdLCBsZW5ndGggODk0OAoxMzoyMToxNy40NzY0ODMgSVAgKHRvcyAweDAs IHR0bCA2NCwgaWQgNTYxMTEsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1AgKDYp LCBsZW5ndGggOTAwMCkKICAgIDE5Mi4xNjguY2xpZW50LjkzNyA+IDE5Mi4xNjguc2VydmVy LjIwNDk6IEZsYWdzIFsuXSwgY2tzdW0gMHgzMzk5IChjb3JyZWN0KSwgc2VxIDg3NTk0MzA4 Ojg3NjAzMjU2LCBhY2sgMTEwOTI1LCB3aW4gNTYzMiwgb3B0aW9ucyBbbm9wLG5vcCxUUyB2 
YWwgMTA0OTE4OCBlY3IgNTAwMTAzODc4XSwgbGVuZ3RoIDg5NDgKMTM6MjE6MTcuNDc2NDg1 IElQICh0b3MgMHgwLCB0dGwgNjQsIGlkIDAsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90 byBUQ1AgKDYpLCBsZW5ndGggNTIsIGJhZCBja3N1bSAwICgtPmQ0NmEpISkKICAgIDE5Mi4x Njguc2VydmVyLjIwNDkgPiAxOTIuMTY4LmNsaWVudC45Mzc6IEZsYWdzIFsuXSwgY2tzdW0g MHg2NjgwIChpbmNvcnJlY3QgLT4gMHgyMDNlKSwgc2VxIDExMDkyNSwgYWNrIDg3NjAzMjU2 LCB3aW4gMjg4NDcsIG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDUwMDEwNTg5MiBlY3IgMTA0 OTE4OF0sIGxlbmd0aCAwCjEzOjIxOjE3LjQ3NjU2NyBJUCAodG9zIDB4MCwgdHRsIDY0LCBp ZCA1NjExMiwgb2Zmc2V0IDAsIGZsYWdzIFtERl0sIHByb3RvIFRDUCAoNiksIGxlbmd0aCA5 MDAwKQogICAgMTkyLjE2OC5jbGllbnQuOTM3ID4gMTkyLjE2OC5zZXJ2ZXIuMjA0OTogRmxh Z3MgWy5dLCBja3N1bSAweGY3OTkgKGNvcnJlY3QpLCBzZXEgODc2MDMyNTY6ODc2MTIyMDQs IGFjayAxMTA5MjUsIHdpbiA1NjMyLCBvcHRpb25zIFtub3Asbm9wLFRTIHZhbCAxMDQ5MTg4 IGVjciA1MDAxMDU4OTJdLCBsZW5ndGggODk0OAoxMzoyMToxNy40NzY1NzEgSVAgKHRvcyAw eDAsIHR0bCA2NCwgaWQgNTYxMTMsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBUQ1Ag KDYpLCBsZW5ndGggOTAwMCkKICAgIDE5Mi4xNjguY2xpZW50LjkzNyA+IDE5Mi4xNjguc2Vy dmVyLjIwNDk6IEZsYWdzIFsuXSwgY2tzdW0gMHg2YTEyIChjb3JyZWN0KSwgc2VxIDg3NjEy MjA0Ojg3NjIxMTUyLCBhY2sgMTEwOTI1LCB3aW4gNTYzMiwgb3B0aW9ucyBbbm9wLG5vcCxU UyB2YWwgMTA0OTE4OCBlY3IgNTAwMTA1ODkyXSwgbGVuZ3RoIDg5NDgKMTM6MjE6MTcuNDc2 NTc0IElQICh0b3MgMHgwLCB0dGwgNjQsIGlkIDAsIG9mZnNldCAwLCBmbGFncyBbREZdLCBw cm90byBUQ1AgKDYpLCBsZW5ndGggNTIsIGJhZCBja3N1bSAwICgtPmQ0NmEpISkKICAgIDE5 Mi4xNjguc2VydmVyLjIwNDkgPiAxOTIuMTY4LmNsaWVudC45Mzc6IEZsYWdzIFsuXSwgY2tz dW0gMHg2NjgwIChpbmNvcnJlY3QgLT4gMHhkYjZkKSwgc2VxIDExMDkyNSwgYWNrIDg3NjIx MTUyLCB3aW4gMjg1NjcsIG9wdGlvbnMgW25vcCxub3AsVFMgdmFsIDUwMDEwNTg5MiBlY3Ig MTA0OTE4OF0sIGxlbmd0aCAwCjEzOjIxOjE3LjQ3NjU3NSBJUCAodG9zIDB4MCwgdHRsIDY0 LCBpZCA1NjExNCwgb2Zmc2V0IDAsIGZsYWdzIFtERl0sIHByb3RvIFRDUCAoNiksIGxlbmd0 aCA5MDAwKQogICAgMTkyLjE2OC5jbGllbnQuOTM3ID4gMTkyLjE2OC5zZXJ2ZXIuMjA0OTog RmxhZ3MgWy5dLCBja3N1bSAweDFmM2MgKGNvcnJlY3QpLCBzZXEgODc2MjExNTI6ODc2MzAx MDAsIGFjayAxMTA5MjUsIHdpbiA1NjMyLCBvcHRpb25zIFtub3Asbm9wLFRTIHZhbCAxMDQ5 
MTg4IGVjciA1MDAxMDU4OTJdLCBsZW5ndGggODk0OAoxMzoyMToxNy40NzY1NzggSVAgKHRv cyAweDAsIHR0bCA2NCwgaWQgNTYxMTUsIG9mZnNldCAwLCBmbGFncyBbREZdLCBwcm90byBU Q1AgKDYpLCBsZW5ndGggOTAwMCkKICAgIDE5Mi4xNjguY2xpZW50LjkzNyA+IDE5Mi4xNjgu c2VydmVyLjIwNDk6IEZsYWdzIFsuXSwgY2tzdW0gMHhhMThjIChjb3JyZWN0KSwgc2VxIDg3 NjMwMTAwOjg3NjM5MDQ4LCBhY2sgMTEwOTI1LCB3aW4gNTYzMiwgb3B0aW9ucyBbbm9wLG5v cCxUUyB2YWwgMTA0OTE4OCBlY3IgNTAwMTA1ODkyXSwgbGVuZ3RoIDg5NDgKMTM6MjE6MTcu NDc2NTgwIElQICh0b3MgMHgwLCB0dGwgNjQsIGlkIDAsIG9mZnNldCAwLCBmbGFncyBbREZd LCBwcm90byBUQ1AgKDYpLCBsZW5ndGggNTIsIGJhZCBja3N1bSAwICgtPmQ0NmEpISkK --------------0092DAC1F1203FC0CDCBEF55-- From owner-freebsd-fs@freebsd.org Wed Apr 4 19:14:26 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 718E7F6EAF7 for ; Wed, 4 Apr 2018 19:14:26 +0000 (UTC) (envelope-from mad@madpilot.net) Received: from mail.madpilot.net (grunt.madpilot.net [78.47.145.38]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0BF916EBFE for ; Wed, 4 Apr 2018 19:14:25 +0000 (UTC) (envelope-from mad@madpilot.net) Received: from mail (mail [192.168.254.3]) by mail.madpilot.net (Postfix) with ESMTP id 40Gb3m6PCmzZr4; Wed, 4 Apr 2018 21:04:20 +0200 (CEST) Received: from mail.madpilot.net ([192.168.254.3]) by mail (mail.madpilot.net [192.168.254.3]) (amavisd-new, port 10024) with ESMTP id Qj7lXimlZ47g; Wed, 4 Apr 2018 21:04:09 +0200 (CEST) Received: from tommy.madpilot.net (host190-122-dynamic.6-87-r.retail.telecomitalia.it [87.6.122.190]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.madpilot.net (Postfix) with ESMTPSA; Wed, 4 Apr 2018 21:04:09 +0200 (CEST) Subject: Re: Linux NFS client and FreeBSD server strangeness To: Mike Tancsa , "freebsd-fs@freebsd.org" 
References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> From: Guido Falsi Message-ID: <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> Date: Wed, 4 Apr 2018 21:04:09 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 19:14:26 -0000 On 04/04/18 20:27, Mike Tancsa wrote: > Not sure where the tweaking needs to happen, but I am getting strange > behaviour between a Linux nfs client and FreeBSD RELENG_11 NFS server. > > The FreeBSD server starts with > > > nfs_client_enable="YES" > nfs_server_enable="YES" > > > rpcbind_enable="YES" > rpc_lockd_enable="YES" > rpc_statd_enable="YES" > nfs_server_flags="-u -t -n 16" > > and on the Linux client I have been trying various options to no avail. > The mount works, but on a straight up write to the FreeBSD server, > everything is very bursty. I noticed this (I think) a few months ago > where Linux dumps across an nfs mount seemed to take a lot longer and > were getting very bursty. > > It seems if there are a mixture of reads and writes, everything is > pretty fast. But if a client is just writing to the server, something, > somewhere is blocking. Doing something simple like > ls -l /nfsmount > from the client "wakes" up the server/client so that write stream can > keep going. Otherwise, it will do a big blast of writes and then several > seconds of pausing on the dump. Looks like you're using ZFS. Most probably that's the cause. NFS requires frequent syncs, and ZFS respects those, becoming slow, being unable to take advantage of the ZIL. 
you can find more details here: https://wiki.freebsd.org/ZFSTuningGuide in the NFS tuning section. If you're not using ZFS I don't have an idea right away what your problem could be. -- Guido Falsi From owner-freebsd-fs@freebsd.org Wed Apr 4 19:25:52 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 56A7EF6F864 for ; Wed, 4 Apr 2018 19:25:52 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [IPv6:2607:f3e0:80:80::2]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "smarthost.sentex.ca", Issuer "smarthost.sentex.ca" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 93D726F423 for ; Wed, 4 Apr 2018 19:25:51 +0000 (UTC) (envelope-from mike@sentex.net) Received: from lava.sentex.ca (lava.sentex.ca [IPv6:2607:f3e0:0:5::11]) by smarthost2.sentex.ca (8.15.2/8.15.2) with ESMTPS id w34JP0QK006996 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Wed, 4 Apr 2018 15:25:01 -0400 (EDT) (envelope-from mike@sentex.net) Received: from [192.168.43.26] (saphire3.sentex.ca [192.168.43.26]) by lava.sentex.ca (8.15.2/8.15.2) with ESMTP id w34JOwTY017147; Wed, 4 Apr 2018 15:24:58 -0400 (EDT) (envelope-from mike@sentex.net) Subject: Re: Linux NFS client and FreeBSD server strangeness To: Guido Falsi , "freebsd-fs@freebsd.org" References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> From: Mike Tancsa Organization: Sentex Communications Message-ID: Date: Wed, 4 Apr 2018 15:24:59 -0400 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.78 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 19:25:52 -0000 On 4/4/2018 3:04 PM, Guido Falsi wrote: > > https://wiki.freebsd.org/ZFSTuningGuide > > in the NFS tuning section. > > > If you're not using ZFS I don't have an idea right away what your > problem could be. Thanks, same sort of bursty traffic patterns with a ufs filesystem. I just tried with a spare disk I made into a UFS2 partition and exported it to the linux client. Also no difference if I disable sync for the underlying file system when using zfs. ---Mike -- ------------------- Mike Tancsa, tel +1 519 651 3400 x203 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada From owner-freebsd-fs@freebsd.org Wed Apr 4 19:36:52 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F1DB5F70296 for ; Wed, 4 Apr 2018 19:36:51 +0000 (UTC) (envelope-from mad@madpilot.net) Received: from mail.madpilot.net (grunt.madpilot.net [78.47.145.38]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 90B6A6F9E3 for ; Wed, 4 Apr 2018 19:36:51 +0000 (UTC) (envelope-from mad@madpilot.net) Received: from mail (mail [192.168.254.3]) by mail.madpilot.net (Postfix) with ESMTP id 40GbnG20bgzZr4; Wed, 4 Apr 2018 21:36:50 +0200 (CEST) Received: from mail.madpilot.net ([192.168.254.3]) by mail (mail.madpilot.net [192.168.254.3]) (amavisd-new, port 10024) with ESMTP id 8ubAZ9kOivi3; Wed, 4 Apr 2018 21:36:39 +0200 (CEST) Received: from tommy.madpilot.net (host190-122-dynamic.6-87-r.retail.telecomitalia.it [87.6.122.190]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.madpilot.net (Postfix) with ESMTPSA; Wed, 4 Apr 2018 21:36:39 +0200 (CEST) Subject: Re: Linux NFS client and FreeBSD server strangeness To: Mike Tancsa , "freebsd-fs@freebsd.org" References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> From: Guido Falsi Message-ID: Date: Wed, 4 Apr 2018 21:36:38 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 19:36:52 -0000 On 04/04/18 21:24, Mike Tancsa wrote: > On 4/4/2018 3:04 PM, Guido Falsi wrote: >> >> https://wiki.freebsd.org/ZFSTuningGuide >> >> in the NFS tuning section. >> >> >> If you're not using ZFS I don't have an idea right away what your >> problem could be. > > Thanks, same sort of bursty traffic patterns with a ufs filesystem. I > just tried with a spare disk I made into a UFS2 partition and exported > it to the linux client. > > Also no difference if I disable sync for the underlying file system when > using zfs. Well, my suggestion was worth a spin. I've never used NFS on linux much so I can't help much more. 
-- Guido Falsi From owner-freebsd-fs@freebsd.org Wed Apr 4 19:50:08 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 67E6BF7108A for ; Wed, 4 Apr 2018 19:50:08 +0000 (UTC) (envelope-from karli@inparadise.se) Received: from mail.inparadise.se (h-246-50.A444.priv.bahnhof.se [155.4.246.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id EAF34701B5 for ; Wed, 4 Apr 2018 19:50:07 +0000 (UTC) (envelope-from karli@inparadise.se) Received: from localhost (localhost [127.0.0.1]) by mail.inparadise.se (Postfix) with ESMTP id 270774400E; Wed, 4 Apr 2018 21:44:54 +0200 (CEST) Received: from mail.inparadise.se ([127.0.0.1]) by localhost (mail.inparadise.se [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id MIQtb5SAPwMc; Wed, 4 Apr 2018 21:44:53 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by mail.inparadise.se (Postfix) with ESMTP id 108754400F; Wed, 4 Apr 2018 21:44:53 +0200 (CEST) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.inparadise.se 108754400F X-Virus-Scanned: amavisd-new at inparadise.se Received: from mail.inparadise.se ([127.0.0.1]) by localhost (mail.inparadise.se [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id YaoeCp1CY0wj; Wed, 4 Apr 2018 21:44:52 +0200 (CEST) Received: from nexus6p.inparadise.se (nexus6p.inparadise.se [172.16.1.122]) by mail.inparadise.se (Postfix) with ESMTPSA id C9A824400E; Wed, 4 Apr 2018 21:44:52 +0200 (CEST) Date: Wed, 04 Apr 2018 21:44:51 +0200 User-Agent: K-9 Mail for Android In-Reply-To: References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Re: Linux NFS client and FreeBSD server strangeness To: 
freebsd-fs@freebsd.org, Mike Tancsa , Guido Falsi , "freebsd-fs@freebsd.org" From: =?ISO-8859-1?Q?Karli_Sj=F6berg?= Message-ID: X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 19:50:08 -0000 Mike Tancsa wrote: (4 April 2018 21:24:59 CEST) >On 4/4/2018 3:04 PM, Guido Falsi wrote: >> >> https://wiki.freebsd.org/ZFSTuningGuide >> >> in the NFS tuning section. >> >> >> If you're not using ZFS I don't have an idea right away what your >> problem could be. > >Thanks, same sort of bursty traffic patterns with a ufs filesystem. I >just tried with a spare disk I made into a UFS2 partition and exported >it to the linux client. > >Also no difference if I disable sync for the underlying file system >when >using zfs. Darn, would have been my bet as well:) Although I understand it might be difficult for you, it would be interesting to know if a FreeBSD client exhibits the same behaviour. It may be a client problem instead of the server?
/K > > > > ---Mike -- Sent from my Android device with K-9 Mail. Please excuse my brevity. From owner-freebsd-fs@freebsd.org Wed Apr 4 19:56:03 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C861AF71678 for ; Wed, 4 Apr 2018 19:56:03 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [IPv6:2607:f3e0:80:80::2]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "smarthost.sentex.ca", Issuer "smarthost.sentex.ca" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 1D6D4706AF for ; Wed, 4 Apr 2018 19:56:03 +0000 (UTC) (envelope-from mike@sentex.net) Received: from lava.sentex.ca (lava.sentex.ca [IPv6:2607:f3e0:0:5::11]) by smarthost2.sentex.ca (8.15.2/8.15.2) with ESMTPS id w34Ju2tu012741 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Wed, 4 Apr 2018 15:56:02 -0400 (EDT) (envelope-from mike@sentex.net) Received: from [192.168.43.26] (saphire3.sentex.net [192.168.43.26]) by lava.sentex.ca (8.15.2/8.15.2) with ESMTP id w34JtxMr017214; Wed, 4 Apr 2018 15:56:00 -0400 (EDT) (envelope-from mike@sentex.net) Subject: Re: Linux NFS client and FreeBSD server strangeness To: =?UTF-8?Q?Karli_Sj=c3=b6berg?= , freebsd-fs@freebsd.org References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> From: Mike Tancsa Organization: Sentex Communications Message-ID: Date: Wed, 4 Apr 2018 15:56:00 -0400 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.78 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe:
, List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 19:56:03 -0000 On 4/4/2018 3:44 PM, Karli Sjöberg wrote: > > Although I understand it might be difficult for you, it would be interesting to know if a FreeBSD client exhibits the same behaviour. It may be a client problem instead of the server? I am starting to wonder if it's the Linux client's (seemingly) massive write caching, or the way that dump might sputter out data on Linux. It could be causing a certain edge case where the Linux NFS client doesn't bother writing for a while. I do note that the burst is pushing 8-9Gb/s. Doing a straight up cat of /dev/zero to a file over NFS is nice and steady and works as expected. FreeBSD dump also works as expected, but then again, the client might have all sorts of different caching rules.
igb1
Kbps in    Kbps out
250.20     16.57
16299.44   441.57
121562.3   3267.12
304158.3   8128.84
205250.2   5485.41
159533.5   4294.59
252426.0   6465.67
59065.87   1587.44
109203.0   2935.75
139746.6   3750.78
167724.0   4735.38
96619.25   2597.97
---Mike -- ------------------- Mike Tancsa, tel +1 519 651 3400 x203 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada From owner-freebsd-fs@freebsd.org Wed Apr 4 20:04:25 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C5444F7237A for ; Wed, 4 Apr 2018 20:04:24 +0000 (UTC) (envelope-from kayasaman@gmail.com) Received: from mail-wm0-x22d.google.com (mail-wm0-x22d.google.com [IPv6:2a00:1450:400c:c09::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3A80870E89 for ; Wed, 4 Apr 2018 20:04:24 +0000 (UTC) (envelope-from kayasaman@gmail.com)
Received: by mail-wm0-x22d.google.com with SMTP id g8so321515wmd.2 for ; Wed, 04 Apr 2018 13:04:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-transfer-encoding:content-language; bh=pcgMpPAI+kRcmUkvhOQaS74xtU5mHblGBD/yQvzkcio=; b=IhaWH/E4EVN5wG/BKp3HGH4DP01JofbysVs8/eghI6UeXbCXSXW7EsbF7C/5F3NmaN TKcFrYfkqCwcHxLha//4r4uuy3PCB8LruDBqD4aTKr3tHBjpUO8evnAAYhZHKeLJFOxc ekmX8OdYcW9g4RNaVAd+vReO8SUnm1UJgvR620yJd6JMObjdlnyxJwJLUsNYLeOk6HmS /bRaQtd+SeNovLyZ5BvrH2I5VtIFxnTldXvaIzLyShEccDG3eq59Rt6dBXzUfzGKfDc9 10bHkDAzXMeU3V06QqB0uRDmJOWsYtXUfYmZSyfnwnQTSiTlbfJFP/kCFA4IFrvP/BvH aThA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding :content-language; bh=pcgMpPAI+kRcmUkvhOQaS74xtU5mHblGBD/yQvzkcio=; b=IkA8UuxK9SvC+/IVuCVsgcfymbKT+mxMLuL9L2hHkSml4ChbmHmPjv0P2hNQvFn9IX zzVV6p74lU9h3724Ms7zPNw79Yl40h48tKFHXnFF0qyNLxqm7VzlovFWZMAqnWz/9OPm Iypcr0QBtqZjpDjQIIcQRMeLiSSgyKRh3yPMI7pxBMZBE1dDXOyfT++nT4wFcQrXgS4y Mx+JBOZwlYx4v8QLO09OV/35pKPw+D5Z4rVdGomtIC5lS49YyyIGFX2WJkB9oYp44ZdQ ClvyUqBhoAn8/EprVWNNdsDkCg90vwjYXaK3mG5+ulBtV9tfS5OW7ZVdKNPVqffzy8ut qB4Q== X-Gm-Message-State: ALQs6tD8LbaDMn4n/HHgFQO6JvvhNKIosdzeZ8tNUSKEGIrAtuu99Pzd xybK/sW6tF+rSQ8yojYy7fTTzqHr X-Google-Smtp-Source: AIpwx48LEoWcGN8egMID2noZnYAniSN6mnNM8OAJpLX8DlssltTOY8ecUBbgc/NTNJJ5/vXuzY9HXA== X-Received: by 10.28.124.13 with SMTP id x13mr7671588wmc.71.1522872263028; Wed, 04 Apr 2018 13:04:23 -0700 (PDT) Received: from Sting-Ray.optiplex-networks.com (optiplexnetworks.plus.com. 
[212.159.80.17]) by smtp.gmail.com with ESMTPSA id d48sm9582519wrd.12.2018.04.04.13.04.21 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Apr 2018 13:04:22 -0700 (PDT) Subject: Re: Linux NFS client and FreeBSD server strangeness To: =?UTF-8?Q?Karli_Sj=c3=b6berg?= , freebsd-fs@freebsd.org, Mike Tancsa , Guido Falsi References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> From: Kaya Saman Message-ID: <4da08f8b-2c28-cf18-77d2-6b498004d435@gmail.com> Date: Wed, 4 Apr 2018 21:04:21 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 20:04:25 -0000 I also see this behavior between Linux and FBSD. Yep, my server does run ZFS. However, when using FreeBSD as a client I am able to use NFSv4 without any problem, whereas with Linux I get many timeouts, so I switched to v3, which gives smoother performance, at least on the read side. It is, however, not possible to use the FBSD machine as an NFS /home directory, as the Linux client will lock up and wait for a long time while certain files from window managers etc. get written/read. For this reason I switched over to using iSCSI and a ZFS udev. I think I posted a similar topic ages ago but got no response. It would be interesting to see the cause for this. I have a hunch it could be something to do with the NFS implementation of Linux, but it is just pure conjecture on my side.
Regards, Kaya On 04/04/18 20:44, Karli Sjöberg via freebsd-fs wrote: > > Mike Tancsa skrev: (4 april 2018 21:24:59 CEST) >> On 4/4/2018 3:04 PM, Guido Falsi wrote: >>> https://wiki.freebsd.org/ZFSTuningGuide >>> >>> in the NFS tuning section. >>> >>> >>> If you're not using ZFS I don't have an idea right away what your >>> problem could be. >> Thanks, same sort of bursty traffic patterns with a ufs filesystem. I >> just tried with a spare disk I made into a UFS2 partition and exported >> it to the linux client. >> >> Also no difference if I disable sync for the underlying file system >> when >> using zfs. > Darn, would have been my bet as well:) > > Although I understand it might be difficult for you, it would be interresting to know if a FreeBSD client exhibits the same behaviour. It may be a client problem instead of the server? > > /K > >> >> >> ---Mike From owner-freebsd-fs@freebsd.org Wed Apr 4 20:15:00 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 20533F72F0E for ; Wed, 4 Apr 2018 20:15:00 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [IPv6:2607:f3e0:80:80::2]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "smarthost.sentex.ca", Issuer "smarthost.sentex.ca" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 9CFD571670 for ; Wed, 4 Apr 2018 20:14:59 +0000 (UTC) (envelope-from mike@sentex.net) Received: from lava.sentex.ca (lava.sentex.ca [IPv6:2607:f3e0:0:5::11]) by smarthost2.sentex.ca (8.15.2/8.15.2) with ESMTPS id w34KEwKH016160 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Wed, 4 Apr 2018 16:14:59 -0400 (EDT) (envelope-from mike@sentex.net) Received: from [192.168.43.26] (saphire3.sentex.ca [192.168.43.26]) by lava.sentex.ca (8.15.2/8.15.2) with ESMTP id w34KEuap017258; Wed, 4 Apr 2018 
16:14:56 -0400 (EDT) (envelope-from mike@sentex.net) Subject: Re: Linux NFS client and FreeBSD server strangeness To: Kaya Saman , =?UTF-8?Q?Karli_Sj=c3=b6berg?= , freebsd-fs@freebsd.org, Guido Falsi References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> <4da08f8b-2c28-cf18-77d2-6b498004d435@gmail.com> From: Mike Tancsa Organization: Sentex Communications Message-ID: <2937ffcc-6b47-91af-8745-2117006660db@sentex.net> Date: Wed, 4 Apr 2018 16:14:57 -0400 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <4da08f8b-2c28-cf18-77d2-6b498004d435@gmail.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.78 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 20:15:00 -0000 On 4/4/2018 4:04 PM, Kaya Saman wrote: > I think I posted a similar topic ages ago but got no response. It would > be interesting to see the cause for this. I have a hunch it could be > something to do with the nfs implementation of Linux but it is just pure > conjecture on my side. I know what you mean about the zfs / zil lockups. It makes a HUGE difference performance wise if you run with sync disabled on writes for the zfs dataset. Not sure I want to do that. I am going to try later with the ZIL on some NVMe devices and see if I can have the best of both worlds. I have individual zfs data sets for each server to write their backups to. I might be able to live with the risk of running with sync disabled to receive backup files. I think I may have found the setting on the Linux client. If I mount with the options -o tcp,intr,noatime,sync,vers=3 (i.e. with sync), I get the "smooth" traffic patterns.
Its actually a bit slower than the with the default async which I guess is able to burst up higher. I am not sure how dangerous running async writes are on an nfs client. But for now, this mystery seems to be a bit better explained to me! ---Mike -- ------------------- Mike Tancsa, tel +1 519 651 3400 x203 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada From owner-freebsd-fs@freebsd.org Wed Apr 4 20:23:41 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BDDC7F73A74 for ; Wed, 4 Apr 2018 20:23:41 +0000 (UTC) (envelope-from kayasaman@gmail.com) Received: from mail-io0-x22d.google.com (mail-io0-x22d.google.com [IPv6:2607:f8b0:4001:c06::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4F85171F0A for ; Wed, 4 Apr 2018 20:23:41 +0000 (UTC) (envelope-from kayasaman@gmail.com) Received: by mail-io0-x22d.google.com with SMTP id q80so27847852ioi.13 for ; Wed, 04 Apr 2018 13:23:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=QJUiGhmSPQlD6Ra1xzFesbGETYjVZktoAEovtYlATzo=; b=f6vTOf9c04uo1xeITg2pKRuKHkPir2woFzBhSqL0qmmM4rAllJLrTow4Alli7Px2iD x77eKG4H/v6i/dxtD/qmu8D9djedZFwFxRVjIhR39DxFDUJ/qqXTNY2EiC+SNetJzN9j RsjM/xYWG1hEf/DPANb+ThFrfFuOMeWwKleV/lKspmopqbUTi4Uvins8XWclh5EC0uWH kCH4JemjKNwVxG2qNT7NAwG1gb9nICUtp+3EZnkI37x9sZO/3QT76DbvUz3zGWm6ugC4 leOBBZbXMXQwBNQyxMEbug1KzLjRugeNKuznCJC+HD9Kw+zQziCaqxW2tRGyl1YWocQ0 f+Zg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date 
:message-id:subject:to:cc; bh=QJUiGhmSPQlD6Ra1xzFesbGETYjVZktoAEovtYlATzo=; b=SP77ocySu+NwtczSzFevKhjKdEEFs/BAveePMy/jC052ssAlX4GjvDKntfSwBrwpyy /KjANVGBxW64rIprCTVBwXXoQ4I1h6+q3f4PWcmW8WKWmI0zgN/Ma54I45/yizfCG7QK Izf2R2JIofFzF0ywqj3WKrA6+3C6xRh6JWxjJaMnW0CHVYZo65LL0biS17cjnc5dnZrG gb2YQYVWIxPylFXByzUf4w6X2X7WY7W/Jxb7+wedtX6UBH6pYtuyQNztmlJLcVM3nT0p qzKQOIiwzsjwVSI9qIok8qrrmYOwfADANzYeJ5OHbahUXu5NQN3JsQKUGaDknDdPmVhx xmmw== X-Gm-Message-State: AElRT7EkmUA2cfmEklBt85Pn1tGgChW6YNNyT9tfw3QwOlmuihjNrNGb WTZbd+L9N8H978OPW+9X7FpHXR7vSR+USXB76zM= X-Google-Smtp-Source: AIpwx49CDjFiRkQi5LfCdI5vavs+fD3lvnUvZGewV6x606bolEtYlr93cf5KhWywZtG8tEpxOYW+XprOGX5wLg4W0Hg= X-Received: by 10.107.178.70 with SMTP id b67mr18231651iof.186.1522873420818; Wed, 04 Apr 2018 13:23:40 -0700 (PDT) MIME-Version: 1.0 Received: by 10.192.148.22 with HTTP; Wed, 4 Apr 2018 13:23:40 -0700 (PDT) In-Reply-To: <2937ffcc-6b47-91af-8745-2117006660db@sentex.net> References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> <4da08f8b-2c28-cf18-77d2-6b498004d435@gmail.com> <2937ffcc-6b47-91af-8745-2117006660db@sentex.net> From: Kaya Saman Date: Wed, 4 Apr 2018 21:23:40 +0100 Message-ID: Subject: Re: Linux NFS client and FreeBSD server strangeness To: Mike Tancsa Cc: =?UTF-8?Q?Karli_Sj=C3=B6berg?= , FreeBSD Filesystems , Guido Falsi Content-Type: text/plain; charset="UTF-8" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 20:23:42 -0000 If I recall correctly the "sync" option is default. Though it might be different depending on the Linux distro in use? I use this: vers=3,defaults,auto,tcp,rsize=8192,wsize=8192 though I could get rid of the tcp as that's also a default. The rsize and wsize options are for running Jumbo Frames ie. larger MTU then 1500; in my case 9000 for 1Gbps links. 
At least for Arch Linux the guide does explain quite a bit, though the 'man' page does too ;-) https://wiki.archlinux.org/index.php/NFS On Wed, Apr 4, 2018 at 9:14 PM, Mike Tancsa wrote: > On 4/4/2018 4:04 PM, Kaya Saman wrote: >> I think I posted a similar topic ages ago but got no response. It would >> be interesting to see the cause for this. I have a hunch it could be >> something to do with the nfs implementation of Linux but it is just pure >> conjecture on my side. > > I know what you mean about the zfs / zil lockups. It makes a HUGE > difference performance wise if you run with sync disabled on writes for > the zfs dataset. Not sure I want to do that. I am going to try later > with ZIL on some NVE devices and see I can have the best of both worlds. > I have individual zfs data sets for each server to write their backups > to. I might be able to live with the risk of running with sync disabled > to receive backup files with. > > I think I may have found the setting on the LINUX client. If I mount > with the options > > -o tcp,intr,noatime,sync,vers=3 > > ie sync > > I get the "smooth" traffic patterns. Its actually a bit slower than the > with the default async which I guess is able to burst up higher. > > I am not sure how dangerous running async writes are on an nfs client. > But for now, this mystery seems to be a bit better explained to me! 
> > ---Mike > > > -- > ------------------- > Mike Tancsa, tel +1 519 651 3400 x203 > Sentex Communications, mike@sentex.net > Providing Internet services since 1994 www.sentex.net > Cambridge, Ontario Canada From owner-freebsd-fs@freebsd.org Wed Apr 4 20:29:03 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0DC02F73EFF for ; Wed, 4 Apr 2018 20:29:03 +0000 (UTC) (envelope-from rcarter@pinyon.org) Received: from h2.pinyon.org (h2.pinyon.org [65.101.20.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 831D9720ED for ; Wed, 4 Apr 2018 20:29:02 +0000 (UTC) (envelope-from rcarter@pinyon.org) Received: by h2.pinyon.org (Postfix, from userid 58) id 179CF3275B; Wed, 4 Apr 2018 13:20:36 -0700 (MST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=pinyon.org; s=DKIM; t=1522873236; bh=brbQdYKz0oioCUFsw9oUmxj08Ef2WgOfwe2N/Tvmczw=; h=Subject:To:References:From:Cc:Date:In-Reply-To; b=mbHal0O34kzVVeGniyJkBbcJoM1J7V6myWV8VizPMz+6TSIku4obGjCQsZMvA9sSQ 6A6Otay0R+modQKsCSTJESpuzX8/HeTQGD4NzkobUSqFGxd1m7dxDsL4a7vXl2Cu/i G4Ac6qZYdTewzRsiBwDTnCKuhdSoJ/qQMJ+gfEBc= X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on h2.n1.pinyon.org X-Spam-Level: X-Spam-Status: No, score=-3.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,URIBL_BLOCKED shortcircuit=no autolearn=ham autolearn_force=no version=3.4.1 Received: from [10.0.10.15] (h1.pinyon.org [65.101.20.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by h2.pinyon.org (Postfix) with ESMTPSA id 0BF753273E; Wed, 4 Apr 2018 13:20:34 -0700 (MST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=pinyon.org; s=DKIM; t=1522873234; bh=brbQdYKz0oioCUFsw9oUmxj08Ef2WgOfwe2N/Tvmczw=; 
h=Subject:To:References:From:Cc:Date:In-Reply-To; b=JF1DH6G5/PyKuTzw+oNw43J4cJp9J8mW1h5EAgiqrKTjsIWN/p1tzprBWq4j8PTek mkexcjWrZ85nVIcYmWGydtjri1b18TUgwoRpVqAy2qBncNtqKGW9cKYdT86sHEUKbQ Hgxm9Q8Sy7WjSlO8y0J2htFQH80s7sE2joumLWRY= Subject: Re: Linux NFS client and FreeBSD server strangeness To: Mike Tancsa References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> From: "Russell L. Carter" Cc: "freebsd-fs@freebsd.org" Message-ID: <6ca61caa-f4a9-68e4-32d9-6142e175e09b@pinyon.org> Date: Wed, 4 Apr 2018 13:20:33 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 20:29:03 -0000 On 04/04/18 11:27, Mike Tancsa wrote: > Not sure where the tweaking needs to happen, but I am getting strange > behaviour between a Linux nfs client and FreeBSD RELENG_11 NFS server. I initially had terrible performance with a Linux client writing to a FreeBSD server several years back, and after a lot of false starts this snippet in /etc/sysctl.conf solved it: # NFS client sux0rs w/o: net.inet.tcp.tso=0 A caveat is I use NFS V4.1, though. Good luck, Russell > The FreeBSD server starts with > > > nfs_client_enable="YES" > nfs_server_enable="YES" > > > rpcbind_enable="YES" > rpc_lockd_enable="YES" > rpc_statd_enable="YES" > nfs_server_flags="-u -t -n 16" > > and on the Linux client I have been trying various options to no avail. > The mount works, but on a straight up write to the FreeBSD server, > everything is very bursty. 
I noticed this (I think) a few months ago
> where Linux dumps across an nfs mount seemed to take a lot longer and
> were getting very bursty.
>
> It seems if there are a mixture of reads and writes, everything is
> pretty fast. But if a client is just writing to the server, something,
> somewhere is blocking. Doing something simple like
> ls -l /nfsmount
> from the client "wakes" up the server/client so that write stream can
> keep going. Otherwise, it will do a big blast of writes and then several
> seconds of pausing on the dump.
>
> Linux Dump is a simple
>
> /sbin/dump u -0 -f - / | /bin/bzip2 >/backup/dump-root-0.bz2
>
> Mount is
>
> mount.nfs -o tcp,intr,noatime,vers=3 192.168.yy.xx:/path
>
> If I run ifstat on the FreeBSD nfs server, the traffic pattern looks like
>
> # ifstat -b -i cxl0
>        cxl0
>  Kbps in   Kbps out
>     0.00       0.00
>     0.00       0.00
>     0.00       0.00
>     0.00       0.00
>     0.00       0.00
> 8.12e+06   45127.03
>     0.00       0.00
>     0.00       0.00
>     0.00       0.00
>     0.00       0.00
> 6.04e+06   33525.76
> 901122.1    4983.72
>     0.00       0.00
>
> if I do a bunch of ls -l /nfsmount on the client
>
> eg
>
> while true
> do
> ls -l /backup/ > /dev/null
> done
>
> traffic pattern is
>
>
>        cxl0
>  Kbps in   Kbps out
>     0.00       0.00
> 3.31e+06   18520.03
> 5.89e+06   32571.52
> 4.84e+06   28325.71
> 2.12e+06   19466.56
> 614727.0   12246.10
> 874927.6   13557.18
> 1.06e+06   14386.78
> 917865.4   13696.87
> 1.09e+06   14608.64
> 1.06e+06   14376.12
> 164077.3    5286.64
>
>
> Leading up to the stall, pcap snippet attached.
>
> Note, doing something like
>
> dd if=/dev/zero of=/backup/test.bin bs=4096 count=5000000
>
> I can saturate the 10G link and max out the disk on the server
>
> # dd if=/dev/zero of=/backup/test.bin bs=4096 count=5000000
> 5000000+0 records in
> 5000000+0 records out
> 20480000000 bytes (20 GB, 19 GiB) copied, 36.6238 s, 559 MB/s
>
> and its a pretty steady stream unlike the dump. Any ideas whats going
> on and how I might be able to work around this ?
> > > 192.168.xx.yy:/zbackup1/virtbox4b/backup on /backup type nfs > (rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.242.254,mountvers=3,mountport=774,mountproto=tcp,local_lock=none,addr=192.168.yy.xx) > > > > ---Mike > > > ------------------- > Mike Tancsa, tel +1 519 651 3400 x203 > Sentex Communications, mike@sentex.net > Providing Internet services since 1994 www.sentex.net > Cambridge, Ontario Canada > > > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@freebsd.org Wed Apr 4 20:34:15 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4A376F74421 for ; Wed, 4 Apr 2018 20:34:15 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [IPv6:2607:f3e0:80:80::2]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "smarthost.sentex.ca", Issuer "smarthost.sentex.ca" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id D541972622 for ; Wed, 4 Apr 2018 20:34:14 +0000 (UTC) (envelope-from mike@sentex.net) Received: from lava.sentex.ca (lava.sentex.ca [IPv6:2607:f3e0:0:5::11]) by smarthost2.sentex.ca (8.15.2/8.15.2) with ESMTPS id w34KYEOr019252 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Wed, 4 Apr 2018 16:34:14 -0400 (EDT) (envelope-from mike@sentex.net) Received: from [192.168.43.26] (saphire3.sentex.ca [192.168.43.26]) by lava.sentex.ca (8.15.2/8.15.2) with ESMTP id w34KYBAd017296; Wed, 4 Apr 2018 16:34:12 -0400 (EDT) (envelope-from mike@sentex.net) Subject: Re: Linux NFS client and FreeBSD server strangeness To: "Russell L. 
Carter" Cc: "freebsd-fs@freebsd.org" References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <6ca61caa-f4a9-68e4-32d9-6142e175e09b@pinyon.org> From: Mike Tancsa Organization: Sentex Communications Message-ID: <2df965b6-9d39-2474-da22-5ed53756c467@sentex.net> Date: Wed, 4 Apr 2018 16:34:12 -0400 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <6ca61caa-f4a9-68e4-32d9-6142e175e09b@pinyon.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.78 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 20:34:15 -0000

On 4/4/2018 4:20 PM, Russell L. Carter wrote:
> # NFS client sux0rs w/o:
> net.inet.tcp.tso=0
>
> A caveat is I use NFS V4.1, though.

Thanks, I do have tso enabled on the Chelsio, but it doesn't seem to be
in use according to the stats, as the counters are zero:

net.inet.tcp.tso: 1
hw.hn.tso_maxlen: 65535
hw.vtnet.tso_disable: 0
dev.cxl.0.txq.7.tso_wrs: 0
dev.cxl.0.txq.6.tso_wrs: 0
dev.cxl.0.txq.5.tso_wrs: 0
dev.cxl.0.txq.4.tso_wrs: 0
dev.cxl.0.txq.3.tso_wrs: 0
dev.cxl.0.txq.2.tso_wrs: 0
dev.cxl.0.txq.1.tso_wrs: 0
dev.cxl.0.txq.0.tso_wrs: 0
dev.igb.1.mac_stats.tso_ctx_fail: 0
dev.igb.1.mac_stats.tso_txd: 0
dev.igb.0.mac_stats.tso_ctx_fail: 0
dev.igb.0.mac_stats.tso_txd: 0

# sysctl -ad dev.cxl.0.txq.4.tso_wrs
dev.cxl.0.txq.4.tso_wrs: # of TSO work requests

From owner-freebsd-fs@freebsd.org Wed Apr 4 20:55:03 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 80A85F7590C for ; Wed, 4 Apr 2018 20:55:03 +0000 (UTC) (envelope-from miku@muszka.com) Received: from mailadmin.dnacih.com (mailadmin.dnacih.com
[71.177.183.54]) by mx1.freebsd.org (Postfix) with ESMTP id 14AD67346F for ; Wed, 4 Apr 2018 20:55:02 +0000 (UTC) (envelope-from miku@muszka.com) Received: from internal.domain; Wed, 04 Apr 2018 12:54:24 -0700 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (1.0) Subject: Re: Linux NFS client and FreeBSD server strangeness From: Michael Usher X-Mailer: iPhone Mail (15D100) In-Reply-To: Date: Wed, 4 Apr 2018 12:54:24 -0700 Cc: freebsd-fs@freebsd.org, Mike Tancsa , Guido Falsi Content-Transfer-Encoding: quoted-printable Message-Id: References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> To: =?utf-8?Q?Karli_Sj=C3=B6berg?= X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Apr 2018 20:55:03 -0000

I have found examining a network trace to be helpful with NFS issues. Helps eliminate window size or buffer issues.

Michael

> On Apr 4, 2018, at 12:44, Karli Sjöberg via freebsd-fs wrote:
>
>
>
> Mike Tancsa wrote: (4 April 2018 21:24:59 CEST)
>>> On 4/4/2018 3:04 PM, Guido Falsi wrote:
>>>
>>> https://wiki.freebsd.org/ZFSTuningGuide
>>>
>>> in the NFS tuning section.
>>>
>>>
>>> If you're not using ZFS I don't have an idea right away what your
>>> problem could be.
>>
>> Thanks, same sort of bursty traffic patterns with a ufs filesystem. I
>> just tried with a spare disk I made into a UFS2 partition and exported
>> it to the linux client.
>>
>> Also no difference if I disable sync for the underlying file system
>> when
>> using zfs.
>
> Darn, would have been my bet as well:)
>
> Although I understand it might be difficult for you, it would be interesting to know if a FreeBSD client exhibits the same behaviour. It may be a client problem instead of the server?
>
> /K
>
>>
>>
>>
>> ---Mike
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@freebsd.org Thu Apr 5 00:38:44 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BC68BF83A0F for ; Thu, 5 Apr 2018 00:38:43 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-TO1-obe.outbound.protection.outlook.com (mail-eopbgr670046.outbound.protection.outlook.com [40.107.67.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "Microsoft IT TLS CA 4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 512E17C955 for ; Thu, 5 Apr 2018 00:38:42 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM (52.132.66.153) by YQBPR0101MB1889.CANPRD01.PROD.OUTLOOK.COM (52.132.71.22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.653.12; Thu, 5 Apr 2018 00:38:41 +0000 Received: from YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM ([fe80::185:356:49c5:794c]) by YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM ([fe80::185:356:49c5:794c%13]) with mapi id 15.20.0653.012; Thu, 5 Apr 2018 00:38:41 +0000 From: Rick Macklem To: Mike Tancsa , "freebsd-fs@freebsd.org" Subject: Re: Linux NFS client and FreeBSD server strangeness Thread-Topic: Linux NFS client and FreeBSD server strangeness Thread-Index: AQHTzEKdJ7nozqXV6kKGauYOIuptM6PxTtMy Date: Thu, 5 Apr 2018 00:38:41 +0000 Message-ID: References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> In-Reply-To:
<369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=rmacklem@uoguelph.ca; x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; YQBPR0101MB1889; 7:qN3CsLeg8KVEP7fBcB7fEEdVC9m8tnEbIS5qzuZLWxsjYDUz+HMNtcDPRCngbyxUDgrPWPm4tTJxTzsZFnfTDQaOompIIXEvBSi1q6qYdoCeX8L/QgoTgfyIbLDZztw3Vk3bs54VaTSPdKWpOEaKmoS/q20nTwJ+Pm0IN2IPbZLnlfKlG+aIfxFq5oA/Y5ch9D+Zxf2Yc62dgMZb0D8YUj0hqgN8W502wWeWc4pv2GK30hethLFAGI/N3f+MyqQN x-ms-exchange-antispam-srfa-diagnostics: SOS; x-ms-office365-filtering-correlation-id: b6974e61-0f6b-4fe5-79c7-08d59a8d94a1 x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(7020095)(4652020)(8989080)(5600026)(4604075)(3008032)(4534165)(4627221)(201703031133081)(201702281549075)(8990040)(2017052603328)(7153060)(7193020); SRVR:YQBPR0101MB1889; x-ms-traffictypediagnostic: YQBPR0101MB1889: x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(158342451672863); x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(6040522)(2401047)(5005006)(8121501046)(10201501046)(93006095)(93001095)(3231221)(944501327)(52105095)(3002001)(6041310)(201703131423095)(201702281529075)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(20161123560045)(20161123562045)(20161123558120)(20161123564045)(6072148)(201708071742011); SRVR:YQBPR0101MB1889; BCL:0; PCL:0; RULEID:; SRVR:YQBPR0101MB1889; x-forefront-prvs: 06339BAE63 x-forefront-antispam-report: SFV:NSPM; 
SFS:(10009020)(376002)(39860400002)(366004)(396003)(39380400002)(346002)(189003)(199004)(3660700001)(99286004)(86362001)(106356001)(305945005)(7696005)(26005)(102836004)(97736004)(476003)(6246003)(2900100001)(786003)(74316002)(76176011)(186003)(316002)(8936002)(68736007)(110136005)(2501003)(74482002)(5250100002)(6436002)(9686003)(8676002)(59450400001)(55016002)(229853002)(53936002)(2906002)(3280700002)(105586002)(33656002)(486006)(5660300001)(81166006)(6506007)(25786009)(81156014)(446003)(478600001)(11346002)(14454004); DIR:OUT; SFP:1101; SCL:1; SRVR:YQBPR0101MB1889; H:YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; MX:1; A:1; received-spf: None (protection.outlook.com: uoguelph.ca does not designate permitted sender hosts) x-microsoft-antispam-message-info: G47CJ7TE0GzkWZMwIkaczJva3nt7eMuziSUvVLm0u/Ue/YIfZoZ1RNatR6uSFk2bE4+WZVX4NGccyGpv48th+bTEukJKdAKwnI0XuQJELuni5mCwOrCMj5FyY2ierxm/9wyj5+rO3ER4lZsNMCgUQRQuJ2jcB/VzvFaVOmHfMzIY4kSKt7gH/uyEA1XyZkWS8iFfdKS3qq5lgL3x9yTk1lyZ5PzkOVBpHJJ1ymfQ3oo6fIhqMSBDrxCdBP7H41Rr9lrJZX3oY+DJpwDJMehEeQpcQ7fuQHDD0hiBcKJK5qtFjs1N4NbZrwW8vINSt/ebcwaWPyMcepRzxE8eh7XN6G7q7j6pmEfzvz1gHHRJdoEaX4i7m2QX93YOxhyGladu+lL6EqC84U96aifg8tkIVPBPnooFRrLDkRszLpGXvkA= spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-Network-Message-Id: b6974e61-0f6b-4fe5-79c7-08d59a8d94a1 X-MS-Exchange-CrossTenant-originalarrivaltime: 05 Apr 2018 00:38:41.6376 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-Transport-CrossTenantHeadersStamped: YQBPR0101MB1889 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2018 00:38:44 
-0000

Mike Tancsa wrote:
>Not sure where the tweaking needs to happen, but I am getting strange
>behaviour between a Linux nfs client and FreeBSD RELENG_11 NFS server.
>
>The FreeBSD server starts with
>
>
>nfs_client_enable="YES"
>nfs_server_enable="YES"
>
>
>rpcbind_enable="YES"
>rpc_lockd_enable="YES"
>rpc_statd_enable="YES"
Although it probably isn't related to what you are seeing, I avoid the NSM and NLM
since they are fundamentally flawed protocols. You only need them for NFSv3 clients
where the clients must see each other's byte range locks.
If byte range locks only need to be visible to processes within a client, you can
get rid of these and use the "nolockd" mount option, called "nolock" on Linux, I think?

>nfs_server_flags="-u -t -n 16"
16 nfsd threads is very low. The default (if you don't specify "-n") is 8 per core,
which is still very low. Extra ones cause very little overhead (a kernel stack for each
one), so I usually use "-n 256" if the server is going to be under any amount of load.

Another thing you can try is:
# sysctl vfs.nfsd.cachetcp=0
which disables use of the DRC for TCP mounts. (Many NFS servers never use the
DRC for TCP mounts. I designed one to try and make NFS over TCP more fault tolerant,
but it does result in quite a bit of overhead for write loads.) If this fixes the
problem, but you want to use it, it can be tuned with something like:
vfs.nfsd.tcpcachetimeo=300 (five minutes instead of hours)
vfs.nfsd.tcphighwater=100 (limit of 100 cached entries)
--> The smaller you make these, the lower the overhead, but the less effective the
DRC becomes at making NFS over TCP reliable when TCP reconnects occur.

There are several tunables for NFSv4 (but none of these affect NFSv3):
vfs.nfsd.sessionhashsize=1000
vfs.nfsd.fhhashsize=1000
vfs.nfsd.clienthashsize=1000
vfs.nfsd.statehashsize=100
(A fairly large system dedicated to serving NFS might make the above "1000"s
"10000"s.)
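The server-side suggestions above can be collected in one place. This is a minimal sketch, assuming the stock /etc/rc.conf and /etc/sysctl.conf locations on the FreeBSD server; the two function names are made up for illustration, and the script only prints candidate lines for review rather than applying anything.

```shell
#!/bin/sh
# Sketch only: prints candidate settings taken from the message above.
# Nothing is written to disk and no sysctl is changed.

# Candidate /etc/rc.conf line: same flags as the original post,
# with the nfsd thread count raised from 16 to 256.
print_rc_conf() {
    cat <<'EOF'
nfs_server_flags="-u -t -n 256"
EOF
}

# Candidate /etc/sysctl.conf lines. Either disable the DRC for TCP,
# or keep it and bound its cost with the two commented-out knobs.
print_sysctl_conf() {
    cat <<'EOF'
# Disable the duplicate request cache (DRC) for TCP mounts:
vfs.nfsd.cachetcp=0
# Alternative: leave cachetcp at its default and shrink the cache instead:
#vfs.nfsd.tcpcachetimeo=300
#vfs.nfsd.tcphighwater=100
EOF
}

print_rc_conf
print_sysctl_conf
```

Changing one knob at a time and re-running the dump makes it easier to tell which setting, if any, affects the bursts.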
>and on the Linux client I have been trying various options to no avail.
>The mount works, but on a straight up write to the FreeBSD server,
>everything is very bursty. I noticed this (I think) a few months ago
>where Linux dumps across an nfs mount seemed to take a lot longer and
>were getting very bursty.
>
>It seems if there are a mixture of reads and writes, everything is
>pretty fast. But if a client is just writing to the server, something,
>somewhere is blocking. Doing something simple like
>ls -l /nfsmount
>from the client "wakes" up the server/client so that write stream can
>keep going. Otherwise, it will do a big blast of writes and then several
>seconds of pausing on the dump.
This sounds like a network device driver issue to me. The main difference between
a FreeBSD client and a Linux client that I am aware of is that the Linux client likes
to do page size (4K) writes, so it generates lots of them.

One example might be interrupt moderation. It's a wonderful thing for some TCP
loads, but can be a terrible thing for NFS loads. Basically anything that adds delay to
interrupt delivery/processing will increase latency and that kills NFS performance, from
what I've seen.

Someone else suggested disabling TSO, which is often broken in the net device drivers.

If you have a different type of net interface that uses a different driver, you might try
that and see if it has the same problem.

I might look at your packet trace someday, but I haven't yet.
Good luck with it, rick [stuff snipped]= From owner-freebsd-fs@freebsd.org Thu Apr 5 04:39:06 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AAEF7F87D7B for ; Thu, 5 Apr 2018 04:39:06 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail110.syd.optusnet.com.au (mail110.syd.optusnet.com.au [211.29.132.97]) by mx1.freebsd.org (Postfix) with ESMTP id EEE63874F5 for ; Thu, 5 Apr 2018 04:39:05 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail110.syd.optusnet.com.au (Postfix) with ESMTPS id 3B4F710474D; Thu, 5 Apr 2018 14:38:57 +1000 (AEST) Date: Thu, 5 Apr 2018 14:38:56 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Kaya Saman cc: Mike Tancsa , FreeBSD Filesystems Subject: Re: Linux NFS client and FreeBSD server strangeness In-Reply-To: Message-ID: <20180405134730.V1123@besplex.bde.org> References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> <4da08f8b-2c28-cf18-77d2-6b498004d435@gmail.com> <2937ffcc-6b47-91af-8745-2117006660db@sentex.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=VJytp5HX c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=H-qYfXVrYLlSfQt42DAA:9 a=CjuIK1q_8ugA:10 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 Apr 2018 04:39:06 -0000 On Wed, 4 Apr 2018, Kaya Saman wrote: > If I recall correctly the "sync" option is default. Though it might be > different depending on the Linux distro in use? 
>
> I use this: vers=3,defaults,auto,tcp,rsize=8192,wsize=8192
>
> though I could get rid of the tcp as that's also a default. The rsize
> and wsize options are for running Jumbo Frames ie. larger MTU than
> 1500; in my case 9000 for 1Gbps links.

These rsize and wsize options are pessimizations. They override the
default sizes which are usually much larger for tcp. The defaults are
not documented in the man page, and the current settings are almost
equally impossible to see (e.g., mount -v doesn't show them). The
defaults are not quite impossible to see in the source code of course,
but the source code for them is especially convoluted. It seems to give
the following results:
- for udp, the initial defaults are NFS_RSIZE and NFS_WSIZE. These are
  8K. This is almost simple. The values are low because at least some
  versions have bugs with larger values. 32K never worked well for me.
  It hangs in some versions and is slower in others. 16K works for me.
- for tcp, the initial defaults are maxbcachebuf. This is a read-only
  tunable. It defaults to MAXBCACHEBUF. This is an unsupported option.
  It defaults to MAXBSIZE. MAXBSIZE is honestly non-optional -- it is
  always 64K unless the source code is edited. Unsupported for an option
  means that it is ifdefed in a header file but is not in conf/options*,
  so it must be added to CFLAGS in some way before every include of the
  file (or just the ones that use MAXBCACHEBUF if you know what they
  are). MAXBCACHEBUF has a maximum of MAXPHYS. MAXPHYS is a supported
  option with a default of 128K. Although it is supported, it is much
  harder to change since it can reasonably be used in applications where
  the support is null.
- for remount, the defaults are from the previous mount. Their values
  are almost impossible to see. Large values from a previous tcp mount
  tend to break remounting with udp.

I always use udp since tcp is just slower (higher overhead and latency).
I usually force the sizes to 8K since I don't want them to depend on
undocumented defaults or change if these defaults change. Changes
invalidate benchmarks. Sometimes I force them to 16K to see if this is
an optimization and keep it and update the benchmark results if it is.
I don't like large block sizes and mostly use 16K for all file systems,
but 32K is now better for throughput. 64K is just slower in all of my
tests, due to it being too large for small metadata.

nfs (v3) used to (5-30 years ago) have bursty behaviour even with a
FreeBSD client and server. IIRC, this was from not much write combining
and too many daemons on the server. (You will have to mount the client
async to give the server a chance to combine small writes. Large block
sizes give large writes which may arrive out of order so need
recombining, and waiting for this in the sync case. Too many daemons
give more reordering.) The server couldn't keep up with the client, and
the client stopped to let the server catch up. After fixing this the
problem moved to the client not being able to keep up with disks (for
copying uncached files). This gave less bursty behaviour. E.g.,
consistently 10-20% below the disk bandwidth with network bandwidth to
spare.
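To put rough numbers on the block-size discussion, here is a minimal sketch that counts how many WRITE RPCs the 20480000000-byte dd test file from the original post would need at three payload sizes: 4096 (the page-sized writes mentioned for the Linux client), 8192 (the forced rsize/wsize), and 131072 (the size the Linux mount actually negotiated). It assumes every RPC carries a full payload, which real clients do not guarantee, and `rpc_count` is a hypothetical helper name.

```shell
#!/bin/sh
# Rough count of WRITE RPCs needed to move a file of a given size.
# Assumption: each RPC carries a full wsize payload (ceiling division).
rpc_count() {
    bytes=$1
    wsize=$2
    echo $(( (bytes + wsize - 1) / wsize ))
}

total=20480000000   # size of the dd test file quoted in the original post

for wsize in 4096 8192 131072; do
    printf 'wsize=%-6s -> %s WRITE RPCs\n' "$wsize" "$(rpc_count "$total" "$wsize")"
done
```

The count says nothing about throughput on its own; it only shows how many more round trips, and so how many more chances for per-request latency to add up, the smaller sizes imply.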
Bruce

From owner-freebsd-fs@freebsd.org Thu Apr 5 15:49:10 2018
From: Mike Tancsa
Organization: Sentex Communications
To: Kaya Saman
Cc: FreeBSD Filesystems
Subject: Re: Linux NFS client and FreeBSD server strangeness
Date: Thu, 5 Apr 2018 11:49:08 -0400
Message-ID: <5aee728a-bf51-9f8c-3c75-4e4768cc84af@sentex.net>
References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> <4da08f8b-2c28-cf18-77d2-6b498004d435@gmail.com> <2937ffcc-6b47-91af-8745-2117006660db@sentex.net>

On 4/4/2018 4:23 PM, Kaya Saman wrote:
> If I recall correctly the "sync" option is default. Though it might be
> different depending on the Linux distro in use?
>
> I use this: vers=3,defaults,auto,tcp,rsize=8192,wsize=8192

Thanks. For Ubuntu 16, it looks like the default is async. It would be
nice if all of these defaults were exposed to view. Also the rw buffers
are higher than the above, at 131072. I too am using 9000 for the MTU.

After doing

mount.nfs -o intr,noatime,vers=3 -v

the options that come up are

mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 192.168.242.254 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 192.168.242.254 prog 100005 vers 3 prot UDP port 944

Kind of odd in that they don't display all of them.

mount | grep nfs
192.168.xx.yy:/zbackup1/virtbox4b/backup on /backup type nfs (rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.xx.yy,mountvers=3,mountport=944,mountproto=udp,local_lock=none,addr=192.168.xx.yy)

The sync/async

The sync mount option

The NFS client treats the sync mount option differently than some other
file systems (refer to mount(8) for a description of the generic sync
and async mount options). If neither sync nor async is specified (or if
the async option is specified), the NFS client delays sending
application writes to the server until any of these events occur:

    Memory pressure forces reclamation of system memory resources.

    An application flushes file data explicitly with sync(2), msync(2),
    or fsync(3).

    An application closes a file with close(2).

    The file is locked/unlocked via fcntl(2).
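
A minimal sketch of how the delayed-write behaviour quoted above could be exercised from the Linux client. It assumes GNU dd (for conv=fsync); the TARGET_DIR variable and the file name are hypothetical and default to a scratch directory, so set TARGET_DIR=/backup to write through the NFS mount instead.

```sh
#!/bin/sh
# Sketch only: TARGET_DIR and the file name are illustrative, not from the thread.
# Set TARGET_DIR to the NFS mount point (e.g. /backup) to exercise the client.
TARGET_DIR="${TARGET_DIR:-${TMPDIR:-/tmp}}"
target="$TARGET_DIR/nfs-flush-test.$$"

# conv=fsync asks dd to fsync(2) the output file before it exits, so the
# elapsed time includes pushing the cached writes out to the server.
dd if=/dev/urandom of="$target" bs=4096 count=1024 conv=fsync 2>/dev/null

# 1024 blocks of 4096 bytes were requested; report what actually landed.
size=$(wc -c < "$target" | tr -d ' ')
echo "wrote $size bytes to $target"
rm -f "$target"
```

Running it under time(1), once as written and once with conv=fsync removed, should give a rough idea of how much of a write was only sitting in the client's page cache when dd returned.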
---Mike

--
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada

From owner-freebsd-fs@freebsd.org Thu Apr 5 15:53:41 2018
From: Stilez Stilezy
To: freebsd-fs
Subject: Does setuid=on work on ZFS datasets, or is the man page for zfs misleading?
Date: Thu, 5 Apr 2018 16:53:09 +0100

I'm trying to use the setuid property in ZFS. The man pages conflict somewhat, but overall man zfs seems most specific and implies the property is valid (man zfs says use setuid=on and it'll work; man mount says use -o suiddir, but that it won't work except on UFS). It seems that man zfs is more specific/authoritative but I can't be sure.

I'm starting to wonder if the page is wrong/misleading, and setuid isn't working/not implemented on datasets, despite the apparent meaning of man zfs.
Here's an easily reproduced test case:

# zfs create Test_pool/test
# zfs list | grep test
Test_pool/test    88K    446G    88K    /mnt/Test_pool/test
# zfs set setuid=on Test_pool/test
# zfs get all Test_pool/test | grep setuid
Test_pool/test    setuid    on    local
# mount | grep test
Test_pool/test on /mnt/Test_pool/test (zfs, local, noatime, nfsv4acls)
# umount Test_pool/test
# mount -t zfs -o local,noatime,nfsv4acls,suiddir Test_pool/test /mnt/Test_pool/test
# mount | grep test
Test_pool/test on /mnt/Test_pool/test (zfs, local, noatime, suiddir, nfsv4acls)
# chown www /mnt/Test_pool/test
# chmod 4770 /mnt/Test_pool/test/
# ls -lt /mnt/Test_pool | grep test
drwsrwx---   2 www   wheel  2 2018-04-05 16:11:48 test/
# ls -lt /mnt/Test_pool/test
total 0
# mkdir /mnt/Test_pool/test/dir_created_as_root
# touch /mnt/Test_pool/test/file_created_as_root
# ls -al /mnt/Test_pool/test
total 2
drwsrwx---   3 www   wheel  4 2018-04-05 16:19:17 ./
drwxrwxr-x+  5 root  wheel  6 2018-04-05 16:11:49 ../
drwxr-xr-x   2 root  wheel  2 2018-04-05 16:18:33 dir_created_as_root/
-rw-r--r--   1 root  wheel  0 2018-04-05 16:19:17 file_created_as_root

As far as I can see, everything's done that's needed.

- Clean newly created dataset
- setuid property set and checked
- dataset mounted with suiddir option on
- dataset given a different owner than my current invoked account, and setuid bit set
- dataset properties, mount options, setuid bit, owner, etc all checked and seem correct
- as root, created a new dir and file immediately within the dir
- ..... but neither of them pick up the containing dir's actual owner and the dir doesn't pick up the setuid bit.

Unless man zfs is misleading, setuid=on should work. (The exception is if it should say that it does work - but only for ZVOLs not datasets, which isn't what's said)

What's going on? Or is this not implemented in ZFS and the man page not as clear as needed?
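
The manual check in the transcript above can be condensed into a small probe, sketched here under stated assumptions: the script, its variable names and the owner_of helper are hypothetical, not part of the original test. Given a directory argument it creates one file there and compares numeric owners. With no argument it falls back to a throwaway scratch directory, where "inherited" only means the same user created both.

```sh
#!/bin/sh
# Sketch only: reports whether a file created in a directory takes the
# directory's owner. With no argument it uses a throwaway scratch directory.
if [ -n "${1:-}" ]; then
    dir=$1
    scratch=no
else
    dir=$(mktemp -d)
    scratch=yes
fi
probe="$dir/owner-probe.$$"

# Field 3 of `ls -n` output is the numeric uid; this sidesteps the
# differing stat(1) flags on FreeBSD and Linux.
owner_of() { ls -ldn "$1" | awk '{print $3}'; }

touch "$probe"
dir_uid=$(owner_of "$dir")
file_uid=$(owner_of "$probe")
rm -f "$probe"
if [ "$scratch" = yes ]; then
    rmdir "$dir"
fi

if [ "$dir_uid" = "$file_uid" ]; then
    result="inherited"
else
    result="not inherited"
fi
echo "directory uid=$dir_uid, new file uid=$file_uid: $result"
```

Pointing it at /mnt/Test_pool/test after each change of property or mount option would show quickly whether anything altered the ownership of new files.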
If it *can't* be done within a normal ZFS dataset and the man page needs updating to be clearer, is there any "second best" workaround/fix to automatically get the right owner for that dataset's newly created files/dirs?

For info, the platform is 11.1-REL on amd64. The files in the dataset could be created/modified/deleted by a number of users, which is why I want to mandate a fixed owner. I also don't want to create this directory as UFS-within-ZVOL, but to use a normal ZFS dataset.

Thanks for any technical insight into this

Stilez

From owner-freebsd-fs@freebsd.org Thu Apr 5 15:58:01 2018
From: Mike Tancsa
Organization: Sentex Communications
To: Rick Macklem, "freebsd-fs@freebsd.org"
Subject: Re: Linux NFS client and FreeBSD server strangeness
Date: Thu, 5 Apr 2018 11:57:59 -0400
Message-ID: <9040d0fa-f9c2-2cc3-efbd-f96408cff73b@sentex.net>
References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net>

Thank you for all the feedback, pointers/insights. Coming directly from
'Mr. NFS', it's particularly appreciated :) I think I am on a better
track now to getting things playing well between FreeBSD+Linux, or at
least better understanding the interactions.

WRT the locking, I think I added it as Virtbox would not work otherwise
when the file was accessed over NFS. KVM as the hypervisor I don't think
has this limitation.

I think the root of the issue partially stems from the client having a
LOT of RAM. So according to this default behaviour

----------------
The NFS client treats the sync mount option differently than some other
file systems (refer to mount(8) for a description of the generic sync
and async mount options). If neither sync nor async is specified (or if
the async option is specified), the NFS client delays sending
application writes to the server until any of these events occur:

    Memory pressure forces reclamation of system memory resources.

    An application flushes file data explicitly with sync(2), msync(2),
    or fsync(3).

    An application closes a file with close(2).

    The file is locked/unlocked via fcntl(2).

In other words, under normal circumstances, data written by an
application may not immediately appear on the server that hosts the file.
-----------------------------

it sort of makes more sense now. I will check out some of your tuning
suggestions too.
---Mike

On 4/4/2018 8:38 PM, Rick Macklem wrote:
> Mike Tancsa wrote:
>> Not sure where the tweaking needs to happen, but I am getting strange
>> behaviour between a Linux nfs client and FreeBSD RELENG_11 NFS server.
>>
>> The FreeBSD server starts with
>>
>>
>> nfs_client_enable="YES"
>> nfs_server_enable="YES"
>>
>>
>> rpcbind_enable="YES"
>> rpc_lockd_enable="YES"
>> rpc_statd_enable="YES"
> Although it probably isn't related to what you are seeing, I avoid the NSM, NLM since
> they are fundamentally flawed protocols. You only need them for NFSv3 clients where
> the clients must see each other's byte range locks.
> If byte range locks only need to be visible to processes within a client, you can get
> rid of these and use the "nolockd" mount option, called "nolock" on Linux, I think?
>
>> nfs_server_flags="-u -t -n 16"
> 16 nfsd threads is very low. The default (if you don't specify "-n") is 8 per core, which
> is still very low. Extra ones cause very little overhead (a kernel stack for each one), so
> I usually use "-n 256" if the server is going to be under any amount of load.
>
> Another thing you can try is:
> # sysctl vfs.nfsd.cachetcp=0
> which disables use of the DRC for TCP mounts. (Many NFS servers never use the DRC
> for TCP mounts. I designed one to try and make NFS over TCP more fault tolerant,
> but it does result in quite a bit of overhead for write loads.) If this fixes the problem,
> but you want to use it, it can be tuned with something like:
> vfs.nfsd.tcpcachetimeo=300 (five minutes instead of hours)
> vfs.nfsd.tcphighwater=100 (limit of 100 cached entries)
> --> The smaller you make these, the lower the overheads and the less effective at
> making NFS over TCP reliable when TCP reconnects occur it becomes.
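
Pulling the server-side suggestions quoted above into one place, here is a sketch of how they might be applied at runtime on the FreeBSD server. The values are the ones quoted; treating the two DRC approaches as alternatives, and the placement of the thread count in /etc/rc.conf, are assumptions to check locally.

```sh
# Option 1: disable the duplicate request cache (DRC) for TCP mounts entirely.
sysctl vfs.nfsd.cachetcp=0

# Option 2: keep the DRC for TCP mounts but shrink it (values as quoted above).
sysctl vfs.nfsd.tcpcachetimeo=300   # seconds, i.e. five minutes
sysctl vfs.nfsd.tcphighwater=100    # limit on cached entries

# More nfsd threads: the /etc/rc.conf line, replacing "-n 16".
nfs_server_flags="-u -t -n 256"
```

Changing one setting at a time and repeating the same write test afterwards should make it easier to tell which change, if any, removes the bursts.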
>
> There are several tunables for NFSv4 (but none of these affect NFSv3):
> vfs.nfsd.sessionhashsize=1000
> vfs.nfsd.fhhashsize=1000
> vfs.nfsd.clienthashsize=1000
> vfs.nfsd.statehashsize=100
> (A fairly large system dedicated to serving NFS might make the above "1000"s "10000"s.)
>
>> and on the Linux client I have been trying various options to no avail.
>> The mount works, but on a straight up write to the FreeBSD server,
>> everything is very bursty. I noticed this (I think) a few months ago
>> where Linux dumps across an nfs mount seemed to take a lot longer and
>> were getting very bursty.
>>
>> It seems if there are a mixture of reads and writes, everything is
>> pretty fast. But if a client is just writing to the server, something,
>> somewhere is blocking. Doing something simple like
>> ls -l /nfsmount
>> from the client "wakes" up the server/client so that write stream can
>> keep going. Otherwise, it will do a big blast of writes and then several
>> seconds of pausing on the dump.
> This sounds like a network device driver issue to me. The main difference between
> a FreeBSD client and a Linux client that I am aware of is that the Linux client likes
> to do page size (4K) writes, so it generates lots of them.
>
> One example might be interrupt moderation. It's a wonderful thing for some TCP loads,
> but can be a terrible thing for NFS loads. Basically anything that adds delay to interrupt
> delivery/processing will increase latency and that kills NFS performance, from what I've
> seen.
> Someone else suggested disabling TSO, which is often broken in the net device drivers.
> If you have a different type of net interface that uses a different driver, you might try
> that and see if it has the same problem.
>
> I might look at your packet trace someday, but I haven't yet.
>
> Good luck with it, rick
> [stuff snipped]
>
>

--
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada

From owner-freebsd-fs@freebsd.org Thu Apr 5 16:29:06 2018
From: Ben RUBSON
To: "freebsd-fs@freebsd.org"
Subject: Re: Linux NFS client and FreeBSD server strangeness
Date: Thu, 5 Apr 2018 18:29:02 +0200
In-Reply-To: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net>
References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net>

On 04 Apr 2018 20:27, Mike Tancsa wrote:

> Note, doing something like
>
> dd if=/dev/zero of=/backup/test.bin bs=4096 count=5000000

Note that this test may not be really relevant if you have ZFS
compression enabled.

> I too am using 9000 for the MTU.

Did you try a smaller MTU? Some network adapters are known to have bugs
when requesting 9K mbufs for large MTUs. Especially Mellanox, not sure
about Chelsio though.

My 2 cents :)

Ben

From owner-freebsd-fs@freebsd.org Thu Apr 5 17:11:57 2018
From: Stilez Stilezy
To: Andriy Gapon
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS dedup write pathway - possible inefficiency or..?
Date: Thu, 5 Apr 2018 18:11:25 +0100
References: <14c857cc-463f-a56e-bcf6-c0702da6d3bc@FreeBSD.org>

Thanks Andriy,

That does suck, but it's how bugs are. I'm using 10G and dedup, so even with good hardware the test case is definitely piling on the incoming data and workload during writes, probably as fast as the OS can take it, or a little more. Without dedup it flies, though. Hard to disentangle what's dedup overhead and what's dedup write bug, though, or to get an idea which is responsible for how much of the problem.

As I don't know Illumos/OpenZFS's track record with bugs, is this likely to get attention/resolved "at some time this year" or is it a "who knows, bugs take forever, could be still here in 5 years time"? Is it helpful if I nudge and offer a "causing a problem" note on their bug tracker? Perhaps it's worth it in case it gives extra data? What do you reckon?

Last, if I have high data rates but want to minimise the dirty data issue, can you suggest broadly how to customise the dirty data/caching sysctls/loader tunables to try and mitigate the impact (get the best possible handling without losing 99% of throughput or suffering sky-high latency)? I understand from your reply that there are no recipes in debugging, but do you have any suggestions at all about which way to try for at least some mitigation, or which values might be worth experimenting with, to reduce the effect of the problem pathway?

Stilez

On 4 April 2018 at 18:04, Andriy Gapon wrote:

> On 02/04/2018 21:30, Stilez Stilezy wrote:
> > Thanks on that tip, Andriy,
> >
> > I don't have the knowledge to tell if that's the first of the 2 issues I'm
> > seeing. It could be. What exactly is that bug describing and what behaviour
> > would it create?
>
> Sub-optimal performance for ZFS write throughput (and latency) when dedup is
> enabled and you are trying to write as fast as you can.
>
> > Assuming it's the same -
> >
> > 1) Are there any known workarounds or sysctls that help to reduce the issue?
>
> I don't know of any.
>
> > 2) Are there any easy diagnostic tests I can easily do, to confirm if this is
> > the same behaviour as that bug? Nobody else uses the test system, so I can
> > change any sysctls to sane or extreme values for testing, or create large txg's
> > and spread out reads and writes to see what's happening and what coincides with
> > what.
>
> We used DTrace to observe internal ZFS behavior.
> I do not have any simple recipes.
>
> > On 2 April 2018 at 18:47, Andriy Gapon wrote:
> >
> > On 02/04/2018 16:36, Stilez Stilezy wrote:
> > > The first issue is specific to the
> > > dedup write pathway. I've tested locally to a point where it seems it's
> > > not due to inadequate hardware and it's very consistent and specific, even
> > > on idle conditions/minimal load. I'm wondering whether there's a code
> > > bottleneck specifically affecting just the dedup write pathway.
> >
> > I think that this might be https://www.illumos.org/issues/8353
> >
> >
> > --
> > Andriy Gapon
> >
> >
>
>
> --
> Andriy Gapon
>

From owner-freebsd-fs@freebsd.org Thu Apr 5 17:16:03 2018
From: Mike Tancsa
Organization: Sentex Communications
To: Ben RUBSON, "freebsd-fs@freebsd.org"
Subject: Re: Linux NFS client and FreeBSD server strangeness
Date: Thu, 5 Apr 2018 13:16:01 -0400
Message-ID: <938022c7-efe0-7ba2-3c71-9036f65ce06e@sentex.net>
References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net>

On 4/5/2018 12:29 PM, Ben RUBSON wrote:
> On 04 Apr 2018 20:27, Mike Tancsa wrote:
>
>> Note, doing something like
>>
>> dd if=/dev/zero of=/backup/test.bin bs=4096 count=5000000
>
> Note that this test may not be really relevant if you have ZFS
> compression enabled.

hehe, the first time I did this I thought gstat was broken since the
disk was doing next to nothing :) However, I did the same sort of tests
creating a number of large files from /dev/urandom.

>
>> I too am using 9000 for the MTU.
>
> Did you try to use smaller MTU ?

Yes, I noticed the same behaviour on igb NICs at a gig and the normal
1500 MTU.

---Mike

--
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada

From owner-freebsd-fs@freebsd.org Thu Apr 5 19:56:24 2018
From: Andriy Gapon
To: Stilez Stilezy
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS dedup write pathway - possible inefficiency or..?
Date: Thu, 5 Apr 2018 22:56:14 +0300
Message-ID: <5fbf6ac8-fcb4-0d2f-9f8d-aad129eda118@FreeBSD.org>
References: <14c857cc-463f-a56e-bcf6-c0702da6d3bc@FreeBSD.org>

On 05/04/2018 20:11, Stilez Stilezy wrote:
>
> That does suck, but it's how bugs are.
I'm using 10G and dedup, so even with
> good hardware the test case is definitely piling on the incoming data and
> workload during writes, probably as fast as the OS can take it, or a little
> more. Without dedup it flies, though. Hard to disentangle what's dedup overhead
> and what's dedup write bug, though, or to get an idea which is responsible for
> how much of the problem.
>
> As I don't know Illumos/OpenZFS's track record with bugs, is this likely to get
> attention/resolved "at some time this year" or is it a "who knows, bugs take
> forever, could be still here in 5 years time"? Is it helpful if I nudge and
> offer a "causing a problem" note on their bug tracker? Perhaps it's worth it in
> case it gives extra data? What do you reckon?

I don't know.
At this time it seems that there is not much interest in the issue.
But maybe tomorrow someone will get excited and fix the problem in an hour.
Or maybe that will happen in 5 years.

> Last, if I have high data rates but want to minimise the dirty data issue, can
> you suggest broadly, how to customise the dirty data/caching sysctls/loaders, to
> try and mitigate the impact (get best possible handling without losing 99% of
> throughput or sky-high latency?).

Lowering the maximum dirty data threshold has a flip side that also affects
performance. I am far from sure that tuning it can actually help.
If you want to experiment, you can start with
$ sysctl -d vfs.zfs | fgrep -i dirty

> I understand from your reply that there's no recipes in debugging, but any
> suggestions at all which way to try for at least some mitigation, or which
> values might be worth experimenting with, to reduce the effect of the problem
> pathway?

I don't know of any mitigation.
But I want to note that typically "I want to use dedup" and "I want to be able
to write as fast as possible" arise for very different use cases and both are
rarely needed at the same time (except for benchmarks, synthetic tests, etc).
If one needs streaming writes then usually there is not much to dedup. -- Andriy Gapon From owner-freebsd-fs@freebsd.org Fri Apr 6 00:19:18 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 26AFDF8E469 for ; Fri, 6 Apr 2018 00:19:18 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-QB1-obe.outbound.protection.outlook.com (mail-eopbgr660086.outbound.protection.outlook.com [40.107.66.86]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "Microsoft IT TLS CA 4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id ADA1F78FFD for ; Fri, 6 Apr 2018 00:19:17 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM (52.132.66.153) by YQBPR0101MB1394.CANPRD01.PROD.OUTLOOK.COM (52.132.68.155) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.20.653.12; Fri, 6 Apr 2018 00:19:16 +0000 Received: from YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM ([fe80::185:356:49c5:794c]) by YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM ([fe80::185:356:49c5:794c%13]) with mapi id 15.20.0653.012; Fri, 6 Apr 2018 00:19:16 +0000 From: Rick Macklem To: Ben RUBSON , "freebsd-fs@freebsd.org" Subject: Re: Linux NFS client and FreeBSD server strangeness Thread-Topic: Linux NFS client and FreeBSD server strangeness Thread-Index: AQHTzEKdJ7nozqXV6kKGauYOIuptM6PyXeQAgACA7h8= Date: Fri, 6 Apr 2018 00:19:16 +0000 Message-ID: References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net>, In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=rmacklem@uoguelph.ca; x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; YQBPR0101MB1394; 
7:zsiNatuc3pXH3sdlLgweos+QzNBuMZ8R7hjVtcYkJ2T1vJsx96Ey0MNZNMQUJRucEzOE1e4OkCYPAtAb++lCYO9NHJjTkOA08F52c7bNomKSNJxyMc72pAvr3pJIdTLkepAB0JaJMd2n3XwkQYVQtrefG9+3qvN9ypCcksT2PMdRXGT1qbEu3ltoF6KKNty0WXpLIv4LgqI+KuOzW9dqQ7RB4TzaRQqxBAYyvw5fH6z/BM8yxX2bgZIPiKzRsHVS x-ms-exchange-antispam-srfa-diagnostics: SOS; x-ms-office365-filtering-correlation-id: cbcdf5ff-1dbe-4d45-98fb-08d59b540875 x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(7020095)(4652020)(8989080)(5600026)(4604075)(3008032)(4534165)(4627221)(201703031133081)(201702281549075)(8990040)(2017052603328)(7153060)(7193020); SRVR:YQBPR0101MB1394; x-ms-traffictypediagnostic: YQBPR0101MB1394: x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(158342451672863); x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(6040522)(2401047)(8121501046)(5005006)(3002001)(10201501046)(3231221)(944501327)(52105095)(93006095)(93001095)(6041310)(20161123564045)(20161123562045)(20161123558120)(20161123560045)(201703131423095)(201702281529075)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(6072148)(201708071742011); SRVR:YQBPR0101MB1394; BCL:0; PCL:0; RULEID:; SRVR:YQBPR0101MB1394; x-forefront-prvs: 0634F37BFF x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(366004)(346002)(376002)(39380400002)(396003)(39860400002)(199004)(189003)(7696005)(110136005)(86362001)(5250100002)(39060400002)(74482002)(2900100001)(97736004)(186003)(76176011)(99286004)(33656002)(446003)(2906002)(26005)(53936002)(6246003)(786003)(5660300001)(229853002)(478600001)(105586002)(2501003)(14454004)(316002)(81156014)(9686003)(68736007)(81166006)(8676002)(8936002)(3280700002)(11346002)(106356001)(6506007)(25786009)(3660700001)(6436002)(305945005)(55016002)(74316002)(486006)(476003)(102836004); DIR:OUT; SFP:1101; SCL:1; SRVR:YQBPR0101MB1394; H:YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; A:1; MX:1; received-spf: None (protection.outlook.com: uoguelph.ca does not designate 
permitted sender hosts) x-microsoft-antispam-message-info: V/wO/Iw1/TJFX1koWrQSiYHofEJpIbzGiwAKyYZR7YnyNhbkfLRno1lYsaTfOrZeRwCLZlKPsSJVAekx6pMY9Mbws7lQtL7UQvM9Y7m35HhZKEN9bYikWkjeJix2tb//uWvKGdlCzoFbbpYrL1A2Tp2lLoc4P+izqBHGufEknUc2ecco5onC1EjZ88BzUcSrzW8tLi4CKHOIyRE1RNmu8qHVWyJ1zN14lh+uUEr4UBCCajxdny+/YityVgnihYEDnWlCt6wlM2AuFd1UyG1c7Ag61Lhh6cXqxr/gEApcl3OS6Gkux3p1+qnNPJ1cZjNZkURI5GeEQ/vr3BsI9uLB1n7t9DKjAEv9qnovsZr3SnF9/8Na51IQoCoX+bLYOwPLuqn/Uv1Jt0LvWfBokEeStbBCYbNTPfXrBEtrn7QK3rU= spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-Network-Message-Id: cbcdf5ff-1dbe-4d45-98fb-08d59b540875 X-MS-Exchange-CrossTenant-originalarrivaltime: 06 Apr 2018 00:19:16.3338 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-Transport-CrossTenantHeadersStamped: YQBPR0101MB1394 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 00:19:18 -0000

Ben RUBSON wrote:
>On 04 Apr 2018 20:27, Mike Tancsa wrote:
>> Note, doing something like
>>
>> dd if=/dev/zero of=/backup/test.bin bs=4096 count=5000000
>
>Note that this test may not be really relevant if you have ZFS compression
>enabled.
>
>> I too am using 9000 for the MTU.
>
>Did you try to use smaller MTU ?
>Some network adapters are known to have bugs requesting 9K mbufs for large
>MTUs.
>Especially Mellanox, not sure about Chelsio though.

When the system uses a mix of mbuf cluster sizes (almost always happens when
you use jumbo packets), you can fragment the kernel memory pool they are
allocated from to the point that jumbo clusters can't be allocated easily.
That breaks NFS performance badly.
I had some patches that used jumbo mbuf clusters for the NFS read reply and
write requests.
They worked fine for a while, but would then get hammered by the fragmentation
problem. (As such, they never went into head, etc.)
(The main advantage of using jumbo clusters was that the mbuf chain for an RPC
had fewer mbufs in it and wouldn't get bit by bugs implementing TSO for
interfaces that could only handle 32 buffers for a TSO segment. There is code
that avoids this problem in tcp_output(), but it only works if the net device
driver sets the parameters correctly.)

Some have said that 9K jumbo clusters shouldn't exist in FreeBSD because of
the fragmentation problem. Others proposed using separate pools for each
mbuf cluster size, but nothing has happened as far as I know.

rick

From owner-freebsd-fs@freebsd.org Fri Apr 6 00:44:37 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 00BFCF9055E for ; Fri, 6 Apr 2018 00:44:37 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-QB1-obe.outbound.protection.outlook.com (mail-eopbgr660068.outbound.protection.outlook.com [40.107.66.68]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "Microsoft IT TLS CA 4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 84F7A7A38D for ; Fri, 6 Apr 2018 00:44:36 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM (52.132.66.153) by YQBPR0101MB1106.CANPRD01.PROD.OUTLOOK.COM (52.132.67.17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.20.653.12; Fri, 6 Apr 2018 00:44:34 +0000 Received: from YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM ([fe80::185:356:49c5:794c]) by YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM ([fe80::185:356:49c5:794c%13]) with mapi id 
15.20.0653.012; Fri, 6 Apr 2018 00:44:34 +0000 From: Rick Macklem To: Bruce Evans , Kaya Saman CC: FreeBSD Filesystems Subject: Re: Linux NFS client and FreeBSD server strangeness Thread-Topic: Linux NFS client and FreeBSD server strangeness Thread-Index: AQHTzEKdJ7nozqXV6kKGauYOIuptM6Pw9uaAgAAF0oCAAAWNgIAABXOAgAAC9oCAAAJwAIAAimAAgAFKdyY= Date: Fri, 6 Apr 2018 00:44:34 +0000 Message-ID: References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> <2019ee5a-5b2b-853d-98c5-a365940d93b5@madpilot.net> <4da08f8b-2c28-cf18-77d2-6b498004d435@gmail.com> <2937ffcc-6b47-91af-8745-2117006660db@sentex.net> , <20180405134730.V1123@besplex.bde.org> In-Reply-To: <20180405134730.V1123@besplex.bde.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=rmacklem@uoguelph.ca; x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; YQBPR0101MB1106; 7:ag053OehrJxQ4VxhSRO1RXI6joHmSU3mIMnp8kPcXlOmFEMy914I+SQHqngNgWCt6iS1bcEfj2uh50wDqDE+W4Bw3AnqIKsIpjItXkSWJG3sH3umEqngdKj0ObZdW0Sj6Pq8adcgiREO2nfPCXnPaPVaEaNJRJ1FpXha6GwjVNInYplxf2Sk7IfKQyto430baUvffHWm4cCYImqgw2iAhla30YlxCxigEJAXUGwZ5asEVfTu81KUadit/TAnUCha x-ms-exchange-antispam-srfa-diagnostics: SOS; x-ms-office365-filtering-correlation-id: f803fec3-d1f8-4ba9-ab96-08d59b579181 x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(7020095)(4652020)(8989080)(5600026)(4604075)(3008032)(4534165)(4627221)(201703031133081)(201702281549075)(8990040)(2017052603328)(7153060)(7193020); SRVR:YQBPR0101MB1106; x-ms-traffictypediagnostic: YQBPR0101MB1106: x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(158342451672863); x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; 
RULEID:(6040522)(2401047)(8121501046)(5005006)(3002001)(93006095)(93001095)(10201501046)(3231221)(944501327)(52105095)(6041310)(20161123564045)(201703131423095)(201702281529075)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(20161123562045)(20161123560045)(20161123558120)(6072148)(201708071742011); SRVR:YQBPR0101MB1106; BCL:0; PCL:0; RULEID:; SRVR:YQBPR0101MB1106; x-forefront-prvs: 0634F37BFF x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(39850400004)(396003)(346002)(39380400002)(376002)(366004)(189003)(199004)(6436002)(25786009)(9686003)(74316002)(55016002)(53936002)(105586002)(74482002)(86362001)(7696005)(76176011)(5250100002)(93886005)(99286004)(305945005)(106356001)(97736004)(316002)(33656002)(786003)(229853002)(6506007)(102836004)(3660700001)(26005)(3280700002)(14454004)(186003)(486006)(446003)(478600001)(11346002)(476003)(39060400002)(81166006)(68736007)(5660300001)(4326008)(2900100001)(2906002)(8676002)(81156014)(110136005)(8936002)(6246003); DIR:OUT; SFP:1101; SCL:1; SRVR:YQBPR0101MB1106; H:YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; MX:1; A:1; received-spf: None (protection.outlook.com: uoguelph.ca does not designate permitted sender hosts) x-microsoft-antispam-message-info: obETjdQAtFogoqySd9vIExoxd4bUJQ+MzUcx6vKByE4U3vsHVpC11Q3J/cVRAw8n15P6jFG+8gWkcOqnAYOJidL/nJ/hBrct+49wD4LciBEZMokAqc1xC+k9XFK4FvolWaWiwTkQwuJaycDs4efIxoJB+Ufjy1gk3zMUa4Rl+Pkd6N7TIIk+U3luiR4idlaZWdPZuTbkrabZ6dZulqR4AR0tCGnDgT7kJqzBXecxvh0FlmVQZym79OlOR7a+O6JgO775gHSisEFfxiToZ2mzzPyM7Hbz+3zBycDeB8JpquxsGZ7RM4kfer87jK7/sObRZJxvNpKxO1iGputzv35ACuSIz9jUbfWrCipuvVRj0fHW0h94ElPQ6XPfnnHoOsMdAniEkEJZgEhRE0m5j1lN6PoCWk4b8+zHQOytH26K4Jg= spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-Network-Message-Id: f803fec3-d1f8-4ba9-ab96-08d59b579181 
X-MS-Exchange-CrossTenant-originalarrivaltime: 06 Apr 2018 00:44:34.7978 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-Transport-CrossTenantHeadersStamped: YQBPR0101MB1106 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 00:44:37 -0000

Bruce Evans wrote
>On Wed, 4 Apr 2018, Kaya Saman wrote:
>> If I recall correctly the "sync" option is default. Though it might be
>> different depending on the Linux distro in use?
>>
>> I use this: vers=3,defaults,auto,tcp,rsize=8192,wsize=8192
>>
>> though I could get rid of the tcp as that's also a default. The rsize
>> and wsize options are for running Jumbo Frames ie. larger MTU than
>> 1500; in my case 9000 for 1Gbps links.
>
>These rsize and wsize options are pessimizations. They override the
>default sizes which are usually much larger for tcp.

Yes, for TCP, the FreeBSD client uses the largest size supported by the
server, up to 128K (because MAXPHYS is set to that and, as such, that is the
largest size safely supported by the buffer cache).
I chose to make it this large by default for a couple of reasons:
1 - Solaris used 256K by default (and a maximum of 1Mbyte) back when it was
    Sun and their engineers were pretty good at this stuff.
    (I believe they argued that fewer RPCs implied lower server load for a
    given # of bytes. Usually the NFS engineering types have been concerned
    with server load and, therefore, the server's capacity and not the
    performance of a single client doing a single file write.)
2 - I don't do ZFS, but some thought that 128K would be a better I/O
    read/write size for ZFS.

Personally, since all I have for testing is 100Mbits/sec networking, I always
get "wire speed" and don't see any difference for different rsize/wsize over
TCP, so long as it is at least 16K.

One case where large rsize/wsize plus a larger readahead setting should get
better performance is when the network connection is a "long, fat pipe" such
as a high bandwidth WAN connection. (Basically, you need to push a lot of
bits down the TCP pipe before you wait for an RPC reply, to try and keep the
long, fat pipe filled.) In theory NFSv4 was meant for the Internet. Does
anyone use it on WAN links? Probably yes, but not typically.

I have no idea what Linux uses, except that packet traces often show page
size (4K) I/O sizes, but not always.

For UDP, I think the FreeBSD default is 16K for NFSv3 (UDP is not allowed for
NFSv4 since congestion control at the transport level is required by the
RFCs). Congestion control and reliability are why I always use TCP and,
again, for 100Mbit/sec networking, I see wire speed. Both Linux and Solaris
use TCP by default for NFSv3 mounts, which is mainly why it is the default
for FreeBSD too.

>The defaults are not documented in the man page, and the current
>settings are almost equally impossible to see (e.g., mount -v doesn't
>show them). The defaults are not quite impossible to see in the source
>code of course, but the source code for them is especially convoluted.

For FreeBSD, "nfsstat -m" on the client shows what is actually being used.
(I think Linux has a similar option, but I can't remember for sure?)
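
To put rough numbers on the "long, fat pipe" remark above, here is a minimal
sketch of the usual bandwidth-delay product arithmetic. It is only an
illustration: the function name and the sample link figures (1 Gbit/sec with
a 50 ms round trip, 100 Mbit/sec with a 0.2 ms round trip) are assumptions,
not measurements from this thread, and it ignores TCP window limits and
server latency.

```python
import math


def rpcs_to_fill_pipe(bandwidth_bits_per_sec, rtt_seconds, io_size_bytes):
    """Estimate how much data must be in flight to keep a link busy.

    Returns (bdp_bytes, rpcs), where bdp_bytes is the bandwidth-delay
    product and rpcs is how many read/write RPCs of io_size_bytes
    (rsize or wsize) that corresponds to, rounded up.
    """
    bdp_bytes = bandwidth_bits_per_sec / 8 * rtt_seconds
    rpcs = max(1, math.ceil(bdp_bytes / io_size_bytes))
    return bdp_bytes, rpcs


if __name__ == "__main__":
    # Hypothetical WAN link: 1 Gbit/sec, 50 ms round-trip time.
    for size in (8192, 131072):
        bdp, rpcs = rpcs_to_fill_pipe(1e9, 0.050, size)
        print(f"WAN, rsize/wsize={size}: BDP {bdp:.0f} bytes, {rpcs} RPCs in flight")
    # Hypothetical LAN link: 100 Mbit/sec, 0.2 ms round-trip time.
    bdp, rpcs = rpcs_to_fill_pipe(100e6, 0.0002, 16384)
    print(f"LAN, rsize/wsize=16384: BDP {bdp:.0f} bytes, {rpcs} RPCs in flight")
```

Under these assumed figures the WAN case needs roughly 763 outstanding 8K
RPCs but only 48 at 128K, which is why a large rsize/wsize plus more
readahead should matter there, while on the LAN case a single 16K RPC
already covers the pipe.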
[lots of good stuff snipped] rick= From owner-freebsd-fs@freebsd.org Fri Apr 6 01:38:24 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 991F7F94066 for ; Fri, 6 Apr 2018 01:38:24 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-QB1-obe.outbound.protection.outlook.com (mail-eopbgr660047.outbound.protection.outlook.com [40.107.66.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "Microsoft IT TLS CA 4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2A5AF7C979 for ; Fri, 6 Apr 2018 01:38:23 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM (52.132.66.153) by YQBPR0101MB0852.CANPRD01.PROD.OUTLOOK.COM (52.132.65.154) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.20.653.12; Fri, 6 Apr 2018 01:38:22 +0000 Received: from YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM ([fe80::185:356:49c5:794c]) by YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM ([fe80::185:356:49c5:794c%13]) with mapi id 15.20.0653.012; Fri, 6 Apr 2018 01:38:22 +0000 From: Rick Macklem To: Mike Tancsa , "freebsd-fs@freebsd.org" Subject: Re: Linux NFS client and FreeBSD server strangeness Thread-Topic: Linux NFS client and FreeBSD server strangeness Thread-Index: AQHTzEKdJ7nozqXV6kKGauYOIuptM6PxTtMygAEGZICAAJxWpg== Date: Fri, 6 Apr 2018 01:38:22 +0000 Message-ID: References: <369fab06-6213-ba87-cc66-c9829e8a76a0@sentex.net> , <9040d0fa-f9c2-2cc3-efbd-f96408cff73b@sentex.net> In-Reply-To: <9040d0fa-f9c2-2cc3-efbd-f96408cff73b@sentex.net> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=rmacklem@uoguelph.ca; x-ms-publictraffictype: Email 
x-microsoft-exchange-diagnostics: 1; YQBPR0101MB0852; 7:Ocxda/WBp9l3CuxhMTZS1Wpq1FbQj8OeTh44LUyxAwnYrKhOrRLgOy7vb9UOh4jZlBnX+jp313dJsEmXQQEYN50ft+pfkXjYiSaRlZ1dUIyG+9wXD5jBQM2pPaOAPl2DpahiYBVEdm+LMohM8dR9veK72MoxxUNZUsGrRqwTJ9cFltUrkOXK+ExRbBPKr95LFWIb7zG6Z1NkufMmwv4VRVNVRXnVMC5mb5gcllsBX9jbOZ5hh7ya7g0sSTiKnxY9 x-ms-exchange-antispam-srfa-diagnostics: SOS; x-ms-office365-filtering-correlation-id: 536dea51-4121-4efe-1198-08d59b5f1532 x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(7020095)(4652020)(8989080)(5600026)(4604075)(3008032)(4534165)(4627221)(201703031133081)(201702281549075)(8990040)(2017052603328)(7153060)(7193020); SRVR:YQBPR0101MB0852; x-ms-traffictypediagnostic: YQBPR0101MB0852: x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(158342451672863)(788757137089); x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(6040522)(2401047)(5005006)(8121501046)(3002001)(10201501046)(93006095)(93001095)(3231221)(944501327)(52105095)(6041310)(201703131423095)(201702281529075)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(20161123560045)(20161123558120)(20161123564045)(20161123562045)(6072148)(201708071742011); SRVR:YQBPR0101MB0852; BCL:0; PCL:0; RULEID:; SRVR:YQBPR0101MB0852; x-forefront-prvs: 0634F37BFF x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(366004)(346002)(39380400002)(376002)(39850400004)(396003)(199004)(189003)(6436002)(102836004)(110136005)(105586002)(345774005)(106356001)(6506007)(76176011)(53936002)(26005)(486006)(2906002)(7696005)(99286004)(55016002)(74316002)(86362001)(6246003)(186003)(9686003)(33656002)(3660700001)(68736007)(2501003)(305945005)(74482002)(25786009)(81166006)(59450400001)(3280700002)(229853002)(81156014)(316002)(11346002)(786003)(446003)(8676002)(97736004)(14454004)(478600001)(5660300001)(476003)(2900100001)(8936002)(5250100002); DIR:OUT; SFP:1101; SCL:1; SRVR:YQBPR0101MB0852; H:YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; A:1; MX:1; 
received-spf: None (protection.outlook.com: uoguelph.ca does not designate permitted sender hosts) x-microsoft-antispam-message-info: hk/8z3voJ3V0cULdCQkM/hb4zPYxe5sIvdfpm4zFE03VZbYCJw/Y5c/2UwX9LFm4UgqBK/cSoN5a43n7+nY2CuLvTesOGkFxl9lZ/lyo9Mt/OGxkWhmtDMlaGS/qxliGNmdJ4J32sjjnLnmGWN2SWS0ZPEMEGT/O0GQhm6zJPtTaOBjVzUPf7AHWk5s/Mzr+BYKcmeUUaCrONJMWLcnaXrwukNkiNGrLefiX+AwRcspgv2xuezR54MepnkVdbjcB9d9G/A0RfVs8GaRxrHntDNiBgHjwYR1sdXn6PPywWyhNrP2Q9tH3ogXDt57Qc/Y9GNJDNSg7igUOg0zJNSp7rTZtPfdsJZNFYAki5gu6zpQ6wFai0GyYdaNoVvl/C63PDCbpmAMxprNAR1QtNOQFKofoMxe/6cv6IEl6yBEtEug= spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-Network-Message-Id: 536dea51-4121-4efe-1198-08d59b5f1532 X-MS-Exchange-CrossTenant-originalarrivaltime: 06 Apr 2018 01:38:22.1656 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-Transport-CrossTenantHeadersStamped: YQBPR0101MB0852 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 01:38:24 -0000

Mike Tancsa wrote:
>Thank you for all the feedback, pointers/insights. Coming directly from
>'Mr. NFS', it's particularly appreciated :)

I could replace "Mr. NFS" with
"Mr. stupid enough to do NFS without getting paid to do it";-)

However, I should note that, although I am fairly familiar with the protocol
and the FreeBSD code, I don't have a lot of experience w.r.t. performance, at
least on newer hardware and fast networking.

[good stuff snipped]
>I think the root of the issue partially stems from the client having a
>LOT of RAM. 
So according to this default behaviour

There is a now rather ancient connectathon test suite for NFS, where one of
the tests is writing/reading a large 10Mbyte file. The 10Mbyte size was
selected because it was guaranteed to exceed the NFS client's buffer cache
capacity. (Maybe no longer true;-)

>----------------
> The NFS client treats the sync mount option differently than some
> other file systems (refer to mount(8) for a description of the
> generic sync and async mount options). If neither sync nor
> async is specified (or if the async option is specified), the NFS
> client delays sending application writes to the server until any
> of these events occur:
>
>     Memory pressure forces reclamation of system memory resources.
>
>     An application flushes file data explicitly with sync(2),
>     msync(2), or fsync(3).
>
>     An application closes a file with close(2).
>
>     The file is locked/unlocked via fcntl(2).
>
> In other words, under normal circumstances, data written by an
> application may not immediately appear on the server that hosts
> the file.
>-----------------------------

Just fyi, the FreeBSD client starts a write when the buffer cache block is
completely written with new data. (Called B_ASYNC in the code.)
If only part of a block has been written with new data, the write is delayed
until it is fully written with new data or one of the above cases applies.

You might want to slap together a test program that loops on write(2) for
a while, does an fsync(2), then more writing... and see how that performs on
both FreeBSD and Linux clients.

I do find the fact that doing an "ls" concurrently with the writes makes
things work better interesting/weird. All the "ls" will do is inject a bunch
of other RPC messages into the TCP stream (small ones in the client->server
direction).
The only thing I can think of is that the net interface is somehow "awakened"
by the small RPC messages (each one almost always in one TCP segment).
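
For anyone wanting to try the write(2)/fsync(2) test program suggested above,
the following is one possible sketch, not a tuned benchmark. The function
name, block size, and burst counts are arbitrary assumptions chosen for
illustration. It times the write loop and the fsync separately, so a client
that delays writes until the flush shows up as a short write phase followed
by a long fsync.

```python
import os
import time


def write_fsync_test(path, block_size=64 * 1024, blocks_per_burst=16, bursts=4):
    """Write bursts of blocks to path, calling fsync after each burst.

    Returns a list of (write_seconds, fsync_seconds) tuples, one per burst.
    """
    # Random data, so server-side compression does not skew the result.
    block = os.urandom(block_size)
    timings = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for _ in range(bursts):
            start = time.monotonic()
            for _ in range(blocks_per_burst):
                # os.write may write fewer bytes than requested; loop until done.
                view = memoryview(block)
                while len(view) > 0:
                    written = os.write(fd, view)
                    view = view[written:]
            flushed_from = time.monotonic()
            os.fsync(fd)
            end = time.monotonic()
            timings.append((flushed_from - start, end - flushed_from))
    finally:
        os.close(fd)
    return timings


if __name__ == "__main__":
    import tempfile

    # Demo against a local temporary directory; for a real comparison, pass a
    # path on the NFS mount instead.
    with tempfile.TemporaryDirectory() as tmp:
        results = write_fsync_test(os.path.join(tmp, "test.bin"))
        for n, (w, s) in enumerate(results, 1):
            print(f"burst {n}: write {w:.4f}s, fsync {s:.4f}s")
```

Running the same script with a path on the NFS mount from both a FreeBSD and
a Linux client, with and without a concurrent "ls", would give per-burst
numbers to compare.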
(Maybe something related to how the net interface device driver handles receive interrupts or ???) If you find the "magic bullet" that makes the Linux case work well without the concurrent "ls", please post and let us know what it is. Good luck with it, rick [lots of stuff snipped]= From owner-freebsd-fs@freebsd.org Fri Apr 6 08:35:28 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 911C5F92236 for ; Fri, 6 Apr 2018 08:35:28 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 1A3956D65D for ; Fri, 6 Apr 2018 08:35:28 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id C8B43F92229; Fri, 6 Apr 2018 08:35:27 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8BF5EF92228 for ; Fri, 6 Apr 2018 08:35:27 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 23A306D657 for ; Fri, 6 Apr 2018 08:35:27 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id 226CE1D1AE for ; Fri, 6 Apr 2018 08:35:26 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) 
Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w368ZP9Q067068 for ; Fri, 6 Apr 2018 08:35:25 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from www@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w368ZPw6067067 for fs@FreeBSD.org; Fri, 6 Apr 2018 08:35:25 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: fs@FreeBSD.org Subject: [Bug 203906] ZFS lockup, spa_namespace_lock Date: Fri, 06 Apr 2018 08:35:26 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: emz@norma.perm.ru X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 08:35:28 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D203906 emz@norma.perm.ru changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |emz@norma.perm.ru --- Comment #3 from emz@norma.perm.ru --- I'm getting this on a recent r332062, after upgrade from 11.1-RELEASE-p6. 
zfs [whatever] commands are freezing, totally:

# zfs list -t all
load: 9.48  cmd: zfs 12668 [spa_namespace_lock] 126.35r 0.00u 0.09s 0% 3912k
load: 7.44  cmd: zfs 12668 [spa_namespace_lock] 438.34r 0.00u 0.10s 0% 3236k

gstat is freezing (after a minute it is able to start showing data; however,
it continues to freeze while showing data).

Since procstat is enormous, I'm attaching it as a file.

All the zfs pools are healthy, at least they were before the upgrade - now I
cannot get any output from zpool status.

This behavior is reproducible - I observed it once before on a less recent
-STABLE, but due to the lack of time I was forced to downgrade to
11.1-RELEASE.

ps:

[root@san1:~]# ps ax | grep D
  PID TT  STAT       TIME COMMAND
    0  -  DLs   462:29,65 [kernel]
    2  -  DL      0:00,00 [crypto]
    3  -  DL      0:00,00 [crypto returns]
    4  -  DL     30:55,60 [cam]
    5  -  DL      0:00,00 [soaiod1]
    6  -  DL      0:00,00 [soaiod2]
    7  -  DL      0:00,00 [soaiod3]
    8  -  DL      0:00,00 [soaiod4]
    9  -  DL     38:57,83 [zfskern]
   10  -  DL      0:00,00 [audit]
   13  -  DL      2:42,11 [geom]
   14  -  DL      0:02,46 [usb]
   15  -  DL      0:00,00 [sctp_iterator]
   16  -  DL      0:01,62 [pf purge]
   17  -  DL      0:05,83 [rand_harvestq]
   18  -  DL      0:00,01 [enc_daemon0]
   19  -  DL      0:00,01 [enc_daemon1]
   20  -  DL      0:00,11 [enc_daemon2]
   21  -  DL      0:00,60 [g_mirror swap]
   22  -  DL      1:04,21 [pagedaemon]
   23  -  DL      0:00,20 [vmdaemon]
   24  -  DNL     0:00,00 [pagezero]
   25  -  DL      0:00,04 [bufdaemon]
   26  -  DL      0:00,04 [bufspacedaemon]
   27  -  DL      0:00,13 [syncer]
   28  -  DL      0:00,07 [vnlru]
  740  -  DL      2:19,16 [ctl]
  898  -  DL      0:00,00 [ng_queue]
12174  -  D       0:00,75 /sbin/zfs list -t all
12236  -  D       0:00,78 /sbin/zfs list -t all
12240  -  D       0:00,77 /sbin/zfs list -t all
12338  -  D       0:00,75 /sbin/zfs list -t all
12340  -  D       0:00,75 /sbin/zfs list -t all
12364  -  DL      0:00,00 [ftcleanup]
12368  -  D       0:00,76 /sbin/zfs list -t all
12404  -  D       0:00,75 /sbin/zfs list -t all
12407  -  D       0:00,75 /sbin/zfs list -t all
12440  -  D       0:00,77 /sbin/zfs list -t all
12518  -  D       0:00,02 /usr/sbin/ctladm remove -b block -l 827
12520  -  D       0:00,09 /sbin/zfs list -t all
12554  -  D       0:00,09 /sbin/zfs list -t all
12639  -  D       0:00,08 /sbin/zfs list -t all
12651  -  D       0:00,09 /sbin/zfs list -t all
12716  -  D       0:00,09 zfs list -t snapshot
12721  -  D       0:00,09 /sbin/zfs list -t all
12741  -  D       0:00,00 /sbin/zfs destroy data/reference@ver13_2137
12743  -  D       0:00,00 /usr/sbin/ctladm remove -b block -l 809
12791  -  D       0:00,00 /sbin/zfs list -t all
12558  0  D+      0:00,31 gstat -do
12764  5  D+      0:00,00 gstat -d
12668  2  D+      0:00,09 zfs list -t all
 4898  3  D+      0:28,82 gstat
12814  6  S+      0:00,00 grep D
12737  4  D+      0:00,03 zfs destroy data/kvm/hv34/worker390
12747  8  D+      0:00,00 zfs get origin data/kvm/desktop/desktop-master@desktop23

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Fri Apr 6 08:36:53 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1BF25F92436 for ; Fri, 6 Apr 2018 08:36:53 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id A39426D76F for ; Fri, 6 Apr 2018 08:36:52 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id 5D18CF92432; Fri, 6 Apr 2018 08:36:52 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 400C1F92430 for ; Fri, 6 Apr 2018 08:36:52 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D185B6D76B for ; Fri, 6 Apr 2018 08:36:51 
+0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id 116C61D1B5 for ; Fri, 6 Apr 2018 08:36:51 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w368aoEI069111 for ; Fri, 6 Apr 2018 08:36:50 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from www@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w368aovR069110 for fs@FreeBSD.org; Fri, 6 Apr 2018 08:36:50 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: fs@FreeBSD.org Subject: [Bug 203906] ZFS lockup, spa_namespace_lock Date: Fri, 06 Apr 2018 08:36:50 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: emz@norma.perm.ru X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: attachments.created Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 08:36:53 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D203906 
--- Comment #4 from emz@norma.perm.ru --- Created attachment 192275 --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=3D192275&action= =3Dedit procstat -kk -a output procstat -kk -a output --=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Fri Apr 6 08:37:32 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D2430F92511 for ; Fri, 6 Apr 2018 08:37:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 6186A6D7D8 for ; Fri, 6 Apr 2018 08:37:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id 25D97F92510; Fri, 6 Apr 2018 08:37:31 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E9124F9250F for ; Fri, 6 Apr 2018 08:37:30 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7262C6D7D6 for ; Fri, 6 Apr 2018 08:37:30 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id BBAAB1D1BA for ; Fri, 6 Apr 2018 08:37:29 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from 
kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w368bTtn070109 for ; Fri, 6 Apr 2018 08:37:29 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from www@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w368bTYP070108 for fs@FreeBSD.org; Fri, 6 Apr 2018 08:37:29 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: fs@FreeBSD.org Subject: [Bug 203906] ZFS lockup, spa_namespace_lock Date: Fri, 06 Apr 2018 08:37:29 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: emz@norma.perm.ru X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 08:37:32 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203906

--- Comment #5 from emz@norma.perm.ru ---
Finally I got my zpool status output, showing that the pools are healthy:

[root@san1:~]# zpool status
  pool: data
 state: ONLINE
  scan: scrub repaired 0 in 28h24m with 0 errors on Thu Feb 15 13:26:36 2018
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da4     ONLINE       0     0     0
            da5     ONLINE       0     0     0
            da6     ONLINE       0     0     0
          raidz1-1  ONLINE       0     0     0
            da7     ONLINE       0     0     0
            da8     ONLINE       0     0     0
            da9     ONLINE       0     0     0
            da10    ONLINE       0     0     0
            da11    ONLINE       0     0     0
          raidz1-2  ONLINE       0     0     0
            da12    ONLINE       0     0     0
            da13    ONLINE       0     0     0
            da14    ONLINE       0     0     0
            da15    ONLINE       0     0     0
            da16    ONLINE       0     0     0

errors: No known data errors

  pool: userdata
 state: ONLINE
  scan: scrub repaired 0 in 1h7m with 0 errors on Wed Feb 14 10:06:19 2018
config:

        NAME               STATE     READ WRITE CKSUM
        userdata           ONLINE       0     0     0
          mirror-0         ONLINE       0     0     0
            gpt/userdata0  ONLINE       0     0     0
            gpt/userdata1  ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 0h1m with 0 errors on Mon Aug 7 18:26:15 2017
config:

        NAME            STATE     READ WRITE CKSUM
        zroot           ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            gpt/zroot0  ONLINE       0     0     0
            gpt/zroot1  ONLINE       0     0     0

errors: No known data errors

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Fri Apr 6 08:39:35 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 79FE9F9276A for ; Fri, 6 Apr 2018 08:39:35 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 133CD6D97B for ; Fri, 6 Apr 2018 08:39:35 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id C69C3F92769; Fri, 6 Apr 2018 08:39:34 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B3EBDF92768 for ; Fri, 6 Apr 2018 08:39:34 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt
Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 48B816D978 for ; Fri, 6 Apr 2018 08:39:34 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id 60D0A1D1BC for ; Fri, 6 Apr 2018 08:39:33 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w368dXk1072824 for ; Fri, 6 Apr 2018 08:39:33 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from www@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w368dXJ6072823 for fs@FreeBSD.org; Fri, 6 Apr 2018 08:39:33 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: fs@FreeBSD.org Subject: [Bug 203906] ZFS lockup, spa_namespace_lock Date: Fri, 06 Apr 2018 08:39:33 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: emz@norma.perm.ru X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Fri, 06 Apr 2018 08:39:35 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D203906 --- Comment #6 from emz@norma.perm.ru --- Follow-up: now the freeze is gone, and I don't know what is triggering it. While in this freeze, the system is fully responsive and smooth besides the gstat/zfs./zpool commands. After the freeze is over, the gstat/zpool/zfs commands are fully responsive too. I guess this makes the procstat output vital. --=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Fri Apr 6 11:10:28 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 10902F9CA95 for ; Fri, 6 Apr 2018 11:10:28 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 9DFCB748AF for ; Fri, 6 Apr 2018 11:10:27 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id 62761F9CA94; Fri, 6 Apr 2018 11:10:27 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 51037F9CA93 for ; Fri, 6 Apr 2018 11:10:27 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E222D748AB for ; Fri, 6 Apr 2018 11:10:26 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using 
TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id 2A7101E648 for ; Fri, 6 Apr 2018 11:10:26 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w36BAPWg027508 for ; Fri, 6 Apr 2018 11:10:25 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from www@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w36BAPNb027507 for fs@FreeBSD.org; Fri, 6 Apr 2018 11:10:25 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: fs@FreeBSD.org Subject: [Bug 203906] ZFS lockup, spa_namespace_lock Date: Fri, 06 Apr 2018 11:10:26 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: avg@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 11:10:28 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203906

--- Comment #7 from Andriy Gapon ---
(In reply to emz from comment #6)

A rather wild guess, but I suspect that it may have had to do with the hardware.
I see that ZFS was just blocked waiting for I/O. I do not see intrinsic problems with it.
Maybe a disk took a nap or a command like TRIM blocked its operation for a long time or something of that nature.

A suspicious thread:

    0 101671 kernel zio_free_issue_3_0 _mtx_lock_spin_cookie+0xc1 callout_lock+0xcb callout_reset_sbt_on+0x79 mprsas_action+0xf2d xpt_run_devq+0x48a xpt_action_default+0x8fc dastart+0x2f3 xpt_run_allocq+0x173 dastrategy+0x8d g_disk_start+0x34f g_io_request+0x2a7 zio_vdev_io_start+0x2ae zio_execute+0xac zio_nowait+0xcb vdev_raidz_io_start+0x6cc zio_vdev_io_start+0x2ae zio_execute+0xac zio_nowait+0xcb

Try to examine your system logs, and maybe the RAID event log too.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Fri Apr 6 11:50:19 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DD5BEF9F62C for ; Fri, 6 Apr 2018 11:50:18 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 7976A7633D for ; Fri, 6 Apr 2018 11:50:18 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id 38A95F9F62A; Fri, 6 Apr 2018 11:50:18 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 24E0BF9F629 for ; Fri, 6 Apr 2018 11:50:18 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by
mx1.freebsd.org (Postfix) with ESMTPS id B1DDB7633A for ; Fri, 6 Apr 2018 11:50:17 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id D8CF01EC01 for ; Fri, 6 Apr 2018 11:50:16 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w36BoGVr019297 for ; Fri, 6 Apr 2018 11:50:16 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from www@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w36BoGAk019296 for fs@FreeBSD.org; Fri, 6 Apr 2018 11:50:16 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: fs@FreeBSD.org Subject: [Bug 203906] ZFS lockup, spa_namespace_lock Date: Fri, 06 Apr 2018 11:50:16 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: emz@norma.perm.ru X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 11:50:19 
-0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203906

--- Comment #8 from emz@norma.perm.ru ---
I took a thorough look at the logs now, and I really don't see anything suspicious.

Furthermore, this happened exactly after I upgraded to -STABLE (I have an exact answer to "Why did you upgrade anyway?": because I needed the trasz@ iSCSI patch that I didn't have, and I also wanted to check whether the overall situation would improve). Prior to the upgrade the system didn't have this issue. Plus, I've seen this probable regression (I know developers highly dislike this word, but still) a couple of months ago - but then I didn't have the time to investigate, so I just rolled the revision back to RELEASE and the problem was gone. I'm sure it would be gone if I downgraded this time too (not that I plan to).

Also, I checked with the iSCSI clients, and my colleagues told me the iSCSI subsystem was serving data just fine during this lockup. And I should note that, in my opinion, I would get read/write errors on the zpool disks if they were timing out - at least I'm getting these in situations when the hardware is really faulty (not on this server).

Just in case (very probable) I get more similar lockups - what else can I do to diagnose the problem? To rule out the hardware, or, if I'm still mistaken, to prove that the hardware really is the cause?
--=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Fri Apr 6 12:14:19 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 17698FA1AE6 for ; Fri, 6 Apr 2018 12:14:19 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 981C8778DB for ; Fri, 6 Apr 2018 12:14:18 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id 59278FA1AE4; Fri, 6 Apr 2018 12:14:18 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 42883FA1AE2 for ; Fri, 6 Apr 2018 12:14:18 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D737A778D9 for ; Fri, 6 Apr 2018 12:14:17 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id EF1591F01E for ; Fri, 6 Apr 2018 12:14:16 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w36CEGcn008794 for ; Fri, 6 Apr 2018 12:14:16 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: 
(from bugzilla@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w36CEGv3008781 for fs@FreeBSD.org; Fri, 6 Apr 2018 12:14:16 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: bugzilla set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: fs@FreeBSD.org Subject: [Bug 225960] zfs: g_access leak when unmounting UFS on a zvol Date: Fri, 06 Apr 2018 12:14:16 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: commit-hook@freebsd.org X-Bugzilla-Status: In Progress X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: avg@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 12:14:19 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D225960 --- Comment #17 from commit-hook@freebsd.org --- A commit references this bug: Author: avg Date: Fri Apr 6 12:13:32 UTC 2018 New revision: 332095 URL: https://svnweb.freebsd.org/changeset/base/332095 Log: MFC r330977: g_access: deal with races created by geoms that drop the topology lock PR: 225960 Changes: _U stable/11/ stable/11/sys/geom/geom.h stable/11/sys/geom/geom_subr.c --=20 You are receiving this mail because: You are on the CC list for the bug.= From owner-freebsd-fs@freebsd.org Fri Apr 6 12:23:38 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from 
mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 659ADF806A6 for ; Fri, 6 Apr 2018 12:23:38 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id F2F30781AF for ; Fri, 6 Apr 2018 12:23:37 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id B4476F806A1; Fri, 6 Apr 2018 12:23:37 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A3016F806A0 for ; Fri, 6 Apr 2018 12:23:37 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3E7C8781AC for ; Fri, 6 Apr 2018 12:23:37 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id 9301A1F17E for ; Fri, 6 Apr 2018 12:23:36 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w36CNa0i082370 for ; Fri, 6 Apr 2018 12:23:36 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from www@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w36CNa5X082369 for fs@FreeBSD.org; Fri, 6 Apr 2018 12:23:36 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: 
kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: fs@FreeBSD.org Subject: [Bug 203906] ZFS lockup, spa_namespace_lock Date: Fri, 06 Apr 2018 12:23:36 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: avg@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 12:23:38 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203906

--- Comment #9 from Andriy Gapon ---
(In reply to emz from comment #8)

Run kgdb and check which thread owns spa_namespace_lock. Then switch to that thread and get its backtrace.
That's in addition to capturing procstat -kk -a.

If you can afford some down time, it may also be useful to trigger a kernel panic and save a crash dump. The crash dump is most useful when you have debugging symbols for the kernel and modules.
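
For sifting through a large procstat -kk -a capture, something like the following Python sketch might help. It assumes the usual one-thread-per-line layout (PID, TID, command and thread names, then stack frames of the form symbol+0xoffset). The names parse_procstat_kk and threads_in are made up for this illustration, and the sample line is the thread quoted in comment #7 with its line breaks repaired.

```python
import re

# A kernel stack frame in `procstat -kk` output looks like "symbol+0xoffset".
FRAME_RE = re.compile(r"^([A-Za-z_][\w.]*)\+0x[0-9a-fA-F]+$")

# The suspicious thread quoted in comment #7 (hypothetical variable name).
SAMPLE = (
    "    0 101671 kernel zio_free_issue_3_0 _mtx_lock_spin_cookie+0xc1 "
    "callout_lock+0xcb callout_reset_sbt_on+0x79 mprsas_action+0xf2d "
    "xpt_run_devq+0x48a xpt_action_default+0x8fc dastart+0x2f3 "
    "xpt_run_allocq+0x173 dastrategy+0x8d g_disk_start+0x34f "
    "g_io_request+0x2a7 zio_vdev_io_start+0x2ae zio_execute+0xac "
    "zio_nowait+0xcb vdev_raidz_io_start+0x6cc zio_vdev_io_start+0x2ae "
    "zio_execute+0xac zio_nowait+0xcb\n"
)


def parse_procstat_kk(text):
    """Parse `procstat -kk -a` text into (pid, tid, name, frames) tuples.

    Assumption: each thread occupies one line. Lines that do not start
    with two integers (for example the column header) are skipped.
    """
    threads = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or not (tokens[0].isdigit() and tokens[1].isdigit()):
            continue
        frames, name_parts = [], []
        for tok in tokens[2:]:
            match = FRAME_RE.match(tok)
            if match:
                frames.append(match.group(1))  # keep the symbol, drop the offset
            elif not frames:
                name_parts.append(tok)  # command / thread name columns
        threads.append((int(tokens[0]), int(tokens[1]), " ".join(name_parts), frames))
    return threads


def threads_in(threads, symbol):
    """Return the threads whose kernel stack contains the given symbol."""
    return [t for t in threads if symbol in t[3]]


if __name__ == "__main__":
    for pid, tid, name, frames in threads_in(parse_procstat_kk(SAMPLE), "mprsas_action"):
        print(pid, tid, name, "innermost frame:", frames[0])
```

Running threads_in() with a symbol of interest over the saved procstat file narrows the output down to the threads worth reading by hand; it does not replace finding the actual lock owner in kgdb.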
--=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Fri Apr 6 12:24:31 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2F2C9F808F0 for ; Fri, 6 Apr 2018 12:24:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id B45587834C for ; Fri, 6 Apr 2018 12:24:30 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id 6B773F808E5; Fri, 6 Apr 2018 12:24:30 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 59265F808E4 for ; Fri, 6 Apr 2018 12:24:30 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A33D17834A for ; Fri, 6 Apr 2018 12:24:29 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id EE5A11F186 for ; Fri, 6 Apr 2018 12:24:28 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w36COSwa083547 for ; Fri, 6 Apr 2018 12:24:28 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: 
(from bugzilla@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w36COSms083546 for fs@FreeBSD.org; Fri, 6 Apr 2018 12:24:28 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: bugzilla set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: fs@FreeBSD.org Subject: [Bug 225960] zfs: g_access leak when unmounting UFS on a zvol Date: Fri, 06 Apr 2018 12:24:28 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: commit-hook@freebsd.org X-Bugzilla-Status: In Progress X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: avg@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 12:24:31 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D225960 --- Comment #18 from commit-hook@freebsd.org --- A commit references this bug: Author: avg Date: Fri Apr 6 12:24:00 UTC 2018 New revision: 332096 URL: https://svnweb.freebsd.org/changeset/base/332096 Log: MFC r330977: g_access: deal with races created by geoms that drop the topology lock PR: 225960 Changes: _U stable/10/ stable/10/sys/geom/geom.h stable/10/sys/geom/geom_subr.c --=20 You are receiving this mail because: You are on the CC list for the bug.= From owner-freebsd-fs@freebsd.org Fri Apr 6 12:31:31 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from 
mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 04703F813B2 for ; Fri, 6 Apr 2018 12:31:31 +0000 (UTC) (envelope-from agapon@gmail.com) Received: from mail-lf0-f41.google.com (mail-lf0-f41.google.com [209.85.215.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 59BAD787F1 for ; Fri, 6 Apr 2018 12:31:30 +0000 (UTC) (envelope-from agapon@gmail.com) Received: by mail-lf0-f41.google.com with SMTP id g203-v6so311815lfg.11 for ; Fri, 06 Apr 2018 05:31:30 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:openpgp:autocrypt :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=DnS/g+D8Q43vp0KNrRdjF99wvBEDEI7DAb82L3oovm0=; b=A5tKDG72uknqToS0J96QoDtqKhRFHFfmniDKEzVrw1/xB5tnwpX87if9IiuQXF74A7 wetRiXa1TuCotFlFH0DiQMeUyV1cAfqKKo9utPSflX+zCZgn1APL+aeLcrzlpb1t+vjg ShIBbv49QCOPoye9S5+eHW6uV4I4D28sIUEHMEVA/slWKSzsYKJ0jhhwhlLlLJM2zebW dPv3tC70Rb0JXOHdhp2vaAAE1m3K3C61h8tPar1wQoD1FPnYdThN7WWJ0yXn8xM7ZfzZ bl8RVLzP0f+ecoQzsCbgEZa9xolW7LqeQ769x/fwJwYV+QzaLMcC1zPKMZ2cUPZeCHwD PoeQ== X-Gm-Message-State: ALQs6tBUaPpllMIYjj1//YVcA5VoTxA+MIvzEUK5GUxej/j5TP0ntdEE HZa7MJtP4T1iG9J9ZrZVtxdnQdD0 X-Google-Smtp-Source: AIpwx4+8ZewxdGxhnJuPMz/ognD5QpKjUSMIaYhON1spuigXKXc5mcHN/wYaTIYFc9GIvW4gFzQkjw== X-Received: by 2002:a19:e511:: with SMTP id c17-v6mr16451123lfh.106.1523017881551; Fri, 06 Apr 2018 05:31:21 -0700 (PDT) Received: from [192.168.0.88] (east.meadow.volia.net. 
[93.72.151.96]) by smtp.googlemail.com with ESMTPSA id c7sm1734188ljk.51.2018.04.06.05.31.20 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 06 Apr 2018 05:31:20 -0700 (PDT) Subject: Re: Does setuid=on work on ZFS datasets, or is the man page for zfs misleading? To: Stilez Stilezy , freebsd-fs References: From: Andriy Gapon Openpgp: preference=signencrypt Autocrypt: addr=avg@FreeBSD.org; prefer-encrypt=mutual; keydata= xsFNBFm4LIgBEADNB/3lT7f15UKeQ52xCFQx/GqHkSxEdVyLFZTmY3KyNPQGBtyvVyBfprJ7 mAeXZWfhat6cKNRAGZcL5EmewdQuUfQfBdYmKjbw3a9GFDsDNuhDA2QwFt8BmkiVMRYyvI7l N0eVzszWCUgdc3qqM6qqcgBaqsVmJluwpvwp4ZBXmch5BgDDDb1MPO8AZ2QZfIQmplkj8Y6Z AiNMknkmgaekIINSJX8IzRzKD5WwMsin70psE8dpL/iBsA2cpJGzWMObVTtCxeDKlBCNqM1i gTXta1ukdUT7JgLEFZk9ceYQQMJJtUwzWu1UHfZn0Fs29HTqawfWPSZVbulbrnu5q55R4PlQ /xURkWQUTyDpqUvb4JK371zhepXiXDwrrpnyyZABm3SFLkk2bHlheeKU6Yql4pcmSVym1AS4 dV8y0oHAfdlSCF6tpOPf2+K9nW1CFA8b/tw4oJBTtfZ1kxXOMdyZU5fiG7xb1qDgpQKgHUX8 7Rd2T1UVLVeuhYlXNw2F+a2ucY+cMoqz3LtpksUiBppJhw099gEXehcN2JbUZ2TueJdt1FdS ztnZmsHUXLxrRBtGwqnFL7GSd6snpGIKuuL305iaOGODbb9c7ne1JqBbkw1wh8ci6vvwGlzx rexzimRaBzJxlkjNfMx8WpCvYebGMydNoeEtkWldtjTNVsUAtQARAQABzR5BbmRyaXkgR2Fw b24gPGF2Z0BGcmVlQlNELm9yZz7CwZQEEwEIAD4WIQS+LEO7ngQnXA4Bjr538m7TUc1yjwUC WbgsiAIbIwUJBaOagAULCQgHAgYVCAkKCwIEFgIDAQIeAQIXgAAKCRB38m7TUc1yj+JAEACV l9AK/nOWAt/9cufV2fRj0hdOqB1aCshtSrwHk/exXsDa4/FkmegxXQGY+3GWX3deIyesbVRL rYdtdK0dqJyT1SBqXK1h3/at9rxr9GQA6KWOxTjUFURsU7ok/6SIlm8uLRPNKO+yq0GDjgaO LzN+xykuBA0FlhQAXJnpZLcVfPJdWv7sSHGedL5ln8P8rxR+XnmsA5TUaaPcbhTB+mG+iKFj GghASDSfGqLWFPBlX/fpXikBDZ1gvOr8nyMY9nXhgfXpq3B6QCRYKPy58ChrZ5weeJZ29b7/ QdEO8NFNWHjSD9meiLdWQaqo9Y7uUxN3wySc/YUZxtS0bhAd8zJdNPsJYG8sXgKjeBQMVGuT eCAJFEYJqbwWvIXMfVWop4+O4xB+z2YE3jAbG/9tB/GSnQdVSj3G8MS80iLS58frnt+RSEw/ psahrfh0dh6SFHttE049xYiC+cM8J27Aaf0i9RflyITq57NuJm+AHJoU9SQUkIF0nc6lfA+o JRiyRlHZHKoRQkIg4aiKaZSWjQYRl5Txl0IZUP1dSWMX4s3XTMurC/pnja45dge/4ESOtJ9R 8XuIWg45Oq6MeIWdjKddGhRj3OohsltKgkEU3eLKYtB6qRTQypHHUawCXz88uYt5e3w4V16H 
lCpSTZV/EVHnNe45FVBlvK7k7HFfDDkryM7BTQRZuCyIARAAlq0slcsVboY/+IUJdcbEiJRW be9HKVz4SUchq0z9MZPX/0dcnvz/gkyYA+OuM78dNS7Mbby5dTvOqfpLJfCuhaNYOhlE0wY+ 1T6Tf1f4c/uA3U/YiadukQ3+6TJuYGAdRZD5EqYFIkreARTVWg87N9g0fT9BEqLw9lJtEGDY EWUE7L++B8o4uu3LQFEYxcrb4K/WKmgtmFcm77s0IKDrfcX4doV92QTIpLiRxcOmCC/OCYuO jB1oaaqXQzZrCutXRK0L5XN1Y1PYjIrEzHMIXmCDlLYnpFkK+itlXwlE2ZQxkfMruCWdQXye syl2fynAe8hvp7Mms9qU2r2K9EcJiR5N1t1C2/kTKNUhcRv7Yd/vwusK7BqJbhlng5ZgRx0m WxdntU/JLEntz3QBsBsWM9Y9wf2V4tLv6/DuDBta781RsCB/UrU2zNuOEkSixlUiHxw1dccI 6CVlaWkkJBxmHX22GdDFrcjvwMNIbbyfQLuBq6IOh8nvu9vuItup7qemDG3Ms6TVwA7BD3j+ 3fGprtyW8Fd/RR2bW2+LWkMrqHffAr6Y6V3h5kd2G9Q8ZWpEJk+LG6Mk3fhZhmCnHhDu6CwN MeUvxXDVO+fqc3JjFm5OxhmfVeJKrbCEUJyM8ESWLoNHLqjywdZga4Q7P12g8DUQ1mRxYg/L HgZY3zfKOqcAEQEAAcLBfAQYAQgAJhYhBL4sQ7ueBCdcDgGOvnfybtNRzXKPBQJZuCyIAhsM BQkFo5qAAAoJEHfybtNRzXKPBVwQAKfFy9P7N3OsLDMB56A4Kf+ZT+d5cIx0Yiaf4n6w7m3i ImHHHk9FIetI4Xe54a2IXh4Bq5UkAGY0667eIs+Z1Ea6I2i27Sdo7DxGwq09Qnm/Y65ADvXs 3aBvokCcm7FsM1wky395m8xUos1681oV5oxgqeRI8/76qy0hD9WR65UW+HQgZRIcIjSel9vR XDaD2HLGPTTGr7u4v00UeTMs6qvPsa2PJagogrKY8RXdFtXvweQFz78NbXhluwix2Tb9ETPk LIpDrtzV73CaE2aqBG/KrboXT2C67BgFtnk7T7Y7iKq4/XvEdDWscz2wws91BOXuMMd4c/c4 OmGW9m3RBLufFrOag1q5yUS9QbFfyqL6dftJP3Zq/xe+mr7sbWbhPVCQFrH3r26mpmy841ym dwQnNcsbIGiBASBSKksOvIDYKa2Wy8htPmWFTEOPRpFXdGQ27awcjjnB42nngyCK5ukZDHi6 w0qK5DNQQCkiweevCIC6wc3p67jl1EMFY5+z+zdTPb3h7LeVnGqW0qBQl99vVFgzLxchKcl0 R/paSFgwqXCZhAKMuUHncJuynDOP7z5LirUeFI8qsBAJi1rXpQoLJTVcW72swZ42IdPiboqx NbTMiNOiE36GqMcTPfKylCbF45JNX4nF9ElM0E+Y8gi4cizJYBRr2FBJgay0b9Cp Message-ID: <7eba73db-3097-5c8a-eb2c-e3880fb5b501@FreeBSD.org> Date: Fri, 6 Apr 2018 15:31:19 +0300 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 12:31:31 -0000 On 05/04/2018 18:53, Stilez Stilezy wrote: > I'm trying to use the setuid property in ZFS. > > The man pages are a bit conflicted but overall man zfs seems most specific > and implies the property is valid (man zfs says use setuid=on and it'll > work, man mount says use -o suiddir but won't work except on UFS). Please read in the manual what ZFS setuid property means. By the way, it's on by default, so you would typically turn it off if you don't want suid binaries. And, of course, suiddir != setuid and ZFS does not support it, afaict. TLDR: yes, setuid works; no, it's not suiddir. -- Andriy Gapon From owner-freebsd-fs@freebsd.org Fri Apr 6 12:32:00 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2F6ECF81425 for ; Fri, 6 Apr 2018 12:32:00 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id B9D1878896 for ; Fri, 6 Apr 2018 12:31:59 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id 7413FF81423; Fri, 6 Apr 2018 12:31:59 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 62AFBF81422 for ; Fri, 6 Apr 2018 12:31:59 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0337E7888E for ; Fri, 6 Apr 2018 12:31:59 +0000 (UTC) 
(envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id 464BE1F2BD for ; Fri, 6 Apr 2018 12:31:58 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w36CVwqC038651 for ; Fri, 6 Apr 2018 12:31:58 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from www@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w36CVwKg038642 for fs@FreeBSD.org; Fri, 6 Apr 2018 12:31:58 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: fs@FreeBSD.org Subject: [Bug 225960] zfs: g_access leak when unmounting UFS on a zvol Date: Fri, 06 Apr 2018 12:31:58 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: avg@FreeBSD.org X-Bugzilla-Status: Closed X-Bugzilla-Resolution: FIXED X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: avg@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: resolution bug_status Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 12:32:00 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D225960 
Andriy Gapon changed: What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |FIXED Status|In Progress |Closed --=20 You are receiving this mail because: You are on the CC list for the bug.= From owner-freebsd-fs@freebsd.org Fri Apr 6 13:12:42 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DA385F84B98 for ; Fri, 6 Apr 2018 13:12:41 +0000 (UTC) (envelope-from stilezy@gmail.com) Received: from mail-vk0-x233.google.com (mail-vk0-x233.google.com [IPv6:2607:f8b0:400c:c05::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7CF727AF29; Fri, 6 Apr 2018 13:12:41 +0000 (UTC) (envelope-from stilezy@gmail.com) Received: by mail-vk0-x233.google.com with SMTP id m72so617112vkh.9; Fri, 06 Apr 2018 06:12:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=GUpeJioIGIAy2MTFOxCyxweSXnsveOy6+a9eCr5o+p8=; b=CNgGkxfMFwgtY79ctw0vIg+NDyRZj8RiJ0TSLgOPnleaqHETG92GvVDBM+NULBRg2K oGaaKQPeTAumc13rjGqG3Ab0ZSzQh4yooWVNwq6185CDaXPFWtZ71X+cdF2EhxLNOq7A L9tFx0ejZ7Ybroc4DkSLegAQNXBjDIBR9WCbK/OVjyi416zwhxrzGkuzCdQLxl+ZIfTy QM7zZMs5oWG03cdc+I+LT2G7n5/RZ+7x6DkmMdYL0pt5Y8gCtyP6ETbDnIWQlOFIKqaY sn7j4fnXjpZi2uIWir62wNviH9aI+ws6Y2vri+a9q2kpPzsK6AlgiBP7Kl+kG0KwChDS QO1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=GUpeJioIGIAy2MTFOxCyxweSXnsveOy6+a9eCr5o+p8=; b=lBPmBBUJ0UxDPe37faI+WpvS/DWolsg6kFqoyxhegOETFLv0GFbZgi4p0Y/aI8EAZW o4Ps2N7v7Ml44WNSpDzL+CLXiJ894mohI6UxTlMDtye0BNSloJZgt34Jrpu8TTwswAG6 
8vjEDiUdHo7Jtdv01HVx23nUCIil/Eg9hFOXpKzVviIg779KunMW4YWmKrlxO8wilxQ1 FZiZiT80OSnojdIKUOVSnBf3mo1fy2ZcwvLnpHAUbAxgRyU2mTo4Tk0s49d/EAUL2Vi2 n8yDB695NMDFrxtawKHHLZsa8/LPHIDOCcYn+VcLnCi5Ubu0zWn3Ff76YyV2GDn4/A02 iA8w== X-Gm-Message-State: ALQs6tAvnDBVF/XEvhFytXOhGZb3GtKj1iu3Ao1IJB7qsC/kswW797rh +WLe++P8M1wIIb16ZiN4Oig8CY6t9sF7XmeSFoRYLw== X-Google-Smtp-Source: AIpwx48fDMulWqYot4IDd0M0ya1KdPPoeQMIz/cEdz46BILS0uhwPMl1h46gn9/qUlP3dkw24hN6wUghp75519f9prY= X-Received: by 10.31.197.197 with SMTP id v188mr16441956vkf.18.1523020360825; Fri, 06 Apr 2018 06:12:40 -0700 (PDT) MIME-Version: 1.0 Received: by 10.103.219.148 with HTTP; Fri, 6 Apr 2018 06:12:10 -0700 (PDT) In-Reply-To: <7eba73db-3097-5c8a-eb2c-e3880fb5b501@FreeBSD.org> References: <7eba73db-3097-5c8a-eb2c-e3880fb5b501@FreeBSD.org> From: Stilez Stilezy Date: Fri, 6 Apr 2018 14:12:10 +0100 Message-ID: Subject: Re: Does setuid=on work on ZFS datasets, or is the man page for zfs misleading? To: Andriy Gapon Cc: freebsd-fs Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.25 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 13:12:42 -0000 Thanks Andriy, Please read in the manual what ZFS setuid property means. By the way, it's on by default, so you would typically turn it off if you don't want suid binaries. And, of course, suiddir != setuid and ZFS does not support it, afaict. TLDR: yes, setuid works; no, it's not suiddir. I did look up the ZFS setuid property in the man pages. If there are pages I missed, can you point me to them (and sorry for not finding them!) *[man zfs]:* setuid=on | off Controls whether the set-UID bit is respected for the file system. [Does not say anything else, seems perfectly clear] *[man chmod]* - where it's documented what the set-UID bit does when set on a file system: 4000 (the setuid bit).
Directories with this bit set will force all files and sub-directories created in them to be owned by the directory owner and not by the uid of the creating process, if the underlying file system supports this feature... [Does **not** say that mount -o suiddir is/isn't required, or is/isn't a "blocker". Just says "see suiddir mounting option". But zfs man page has already said the bit **will** be respected. It's a bit conflicting.] Like I said, the man pages seem a bit conflicted. *[man zfs]* definitely says it provides an option for the setuid bit to be respected for the file system - it doesn't say "for files only" or any other limitation. It just says that setuid will be "respected for the file system" if the flag is enabled on the dataset. *[man chmod]* describes what happens if setuid is "respected on a file system". It's clear that this will force+inherit directory ownership "if the underlying file system supports this feature". As [man zfs] already says set-UID will be "respected", set-UID is clearly supported by ZFS. As you can see, I did read the man pages carefully. That's why I asked for help to understand whether the issue was documentation, implementation, or invocation. If the zfs setuid property *doesn't* mean the same as normal enabling of the setuid bit functionality, then the [man zfs] page is misleading. If it works only for files but not for directories, it's also misleading. So I hope you can see, I'm not asking because of failure to read the man pages. I really did read, and followed them carefully, before asking. So your answer was helpful (thank you!), even if I don't understand what info I didn't read in the man pages. I have 2 quick points arising: 1. I gather from your reply that even with this flag set, set-UID for ZFS based directories' ownership/inheritance is not "respected for the file system" - or not fully respected in the sense normally understood as in [man chmod]?
If that's the case then [man zfs] is incorrect - please can you confirm exactly what is this flag's functionality, since it's unclear? 2. Returning to the original issue, is there any way one can automatically force owner+owner inheritance, for data in a zfs dataset? Thank you for your help, even if not the ideal answer. I hope these last couple of points are easy to clear up, and not annoying :) Stilez On 6 April 2018 at 13:31, Andriy Gapon wrote: > On 05/04/2018 18:53, Stilez Stilezy wrote: > > I'm trying to use the setuid property in ZFS. > > > > The man pages are a bit conflicted but overall man zfs seems most > specific > > and implies the property is valid (man zfs says use setuid=on and it'll > > work, man mount says use -o suiddir but won't work except on UFS). > > Please read in the manual what ZFS setuid property means. > By the way, it's on by default, so you would typically turn it off if you > don't > want suid binaries. And, of course, suiddir != setuid and ZFS does not > support > it, afaict. > > TLDR: yes, setuid works; no, it's not suiddir. 
> > -- > Andriy Gapon > > From owner-freebsd-fs@freebsd.org Fri Apr 6 14:07:58 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D8D14F88826 for ; Fri, 6 Apr 2018 14:07:57 +0000 (UTC) (envelope-from agapon@gmail.com) Received: from mail-lf0-f41.google.com (mail-lf0-f41.google.com [209.85.215.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 30C5F7D543 for ; Fri, 6 Apr 2018 14:07:56 +0000 (UTC) (envelope-from agapon@gmail.com) Received: by mail-lf0-f41.google.com with SMTP id m200-v6so738574lfm.4 for ; Fri, 06 Apr 2018 07:07:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:openpgp:autocrypt :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=3CKWnpEv/2LnTCdohJv2wqf6boU3+uRLwXQA971cDgQ=; b=IRBvJh+dOUduJ2UR7ta1XMSZXph60DvU5V16NGaZ+ZeYr9xBMEjBMtpIZydtX9SmiR 9phIWJTjXBhAjU0lDZWhqzeP8Vs81oxrKHQgfaKXUuKsnJoyhKDMXg5sAhAu/jF7uW2W gY1DD6l7NPb7cA1Js+V0cMUbyw9voW2YQHeoQJQMq8DbrsDWwFQz2MbObncB1DkW+POW VGRqNLB1047c012OYJsbexq/lyVmq7wSLj4DtgRV7zPatxcqZhYByJb3wxivWKIUcEhg 6Uc+fNNp5+hOcM4skezgSH3jFy29/CEs5+BTCi4Ei+rTbKAQspM/dDDXZjs+LT+nLRuy tQsQ== X-Gm-Message-State: AElRT7FyyRGMkhwteGB2vncYT+gZR5TfVSVHcTQ2OmLvJ6JQZoHKtBtK CJLCJZtLaRFNx9Cb+i2DQnR71CXC X-Google-Smtp-Source: AIpwx4+iIARMTHliuahqBwgWGwg5zJVAVb+hOMpU2gOXcu+Vt2OKHnUyaEZ4eE6nLjPH0Gu3l8jyZQ== X-Received: by 10.46.156.132 with SMTP id x4mr16632303lji.19.1523023291132; Fri, 06 Apr 2018 07:01:31 -0700 (PDT) Received: from [192.168.0.88] (east.meadow.volia.net. 
[93.72.151.96]) by smtp.googlemail.com with ESMTPSA id x17sm1765566ljx.80.2018.04.06.07.01.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 06 Apr 2018 07:01:28 -0700 (PDT) Subject: Re: Does setuid=on work on ZFS datasets, or is the man page for zfs misleading? To: Stilez Stilezy Cc: freebsd-fs References: <7eba73db-3097-5c8a-eb2c-e3880fb5b501@FreeBSD.org> From: Andriy Gapon Openpgp: preference=signencrypt Autocrypt: addr=avg@FreeBSD.org; prefer-encrypt=mutual; keydata= xsFNBFm4LIgBEADNB/3lT7f15UKeQ52xCFQx/GqHkSxEdVyLFZTmY3KyNPQGBtyvVyBfprJ7 mAeXZWfhat6cKNRAGZcL5EmewdQuUfQfBdYmKjbw3a9GFDsDNuhDA2QwFt8BmkiVMRYyvI7l N0eVzszWCUgdc3qqM6qqcgBaqsVmJluwpvwp4ZBXmch5BgDDDb1MPO8AZ2QZfIQmplkj8Y6Z AiNMknkmgaekIINSJX8IzRzKD5WwMsin70psE8dpL/iBsA2cpJGzWMObVTtCxeDKlBCNqM1i gTXta1ukdUT7JgLEFZk9ceYQQMJJtUwzWu1UHfZn0Fs29HTqawfWPSZVbulbrnu5q55R4PlQ /xURkWQUTyDpqUvb4JK371zhepXiXDwrrpnyyZABm3SFLkk2bHlheeKU6Yql4pcmSVym1AS4 dV8y0oHAfdlSCF6tpOPf2+K9nW1CFA8b/tw4oJBTtfZ1kxXOMdyZU5fiG7xb1qDgpQKgHUX8 7Rd2T1UVLVeuhYlXNw2F+a2ucY+cMoqz3LtpksUiBppJhw099gEXehcN2JbUZ2TueJdt1FdS ztnZmsHUXLxrRBtGwqnFL7GSd6snpGIKuuL305iaOGODbb9c7ne1JqBbkw1wh8ci6vvwGlzx rexzimRaBzJxlkjNfMx8WpCvYebGMydNoeEtkWldtjTNVsUAtQARAQABzR5BbmRyaXkgR2Fw b24gPGF2Z0BGcmVlQlNELm9yZz7CwZQEEwEIAD4WIQS+LEO7ngQnXA4Bjr538m7TUc1yjwUC WbgsiAIbIwUJBaOagAULCQgHAgYVCAkKCwIEFgIDAQIeAQIXgAAKCRB38m7TUc1yj+JAEACV l9AK/nOWAt/9cufV2fRj0hdOqB1aCshtSrwHk/exXsDa4/FkmegxXQGY+3GWX3deIyesbVRL rYdtdK0dqJyT1SBqXK1h3/at9rxr9GQA6KWOxTjUFURsU7ok/6SIlm8uLRPNKO+yq0GDjgaO LzN+xykuBA0FlhQAXJnpZLcVfPJdWv7sSHGedL5ln8P8rxR+XnmsA5TUaaPcbhTB+mG+iKFj GghASDSfGqLWFPBlX/fpXikBDZ1gvOr8nyMY9nXhgfXpq3B6QCRYKPy58ChrZ5weeJZ29b7/ QdEO8NFNWHjSD9meiLdWQaqo9Y7uUxN3wySc/YUZxtS0bhAd8zJdNPsJYG8sXgKjeBQMVGuT eCAJFEYJqbwWvIXMfVWop4+O4xB+z2YE3jAbG/9tB/GSnQdVSj3G8MS80iLS58frnt+RSEw/ psahrfh0dh6SFHttE049xYiC+cM8J27Aaf0i9RflyITq57NuJm+AHJoU9SQUkIF0nc6lfA+o JRiyRlHZHKoRQkIg4aiKaZSWjQYRl5Txl0IZUP1dSWMX4s3XTMurC/pnja45dge/4ESOtJ9R 
8XuIWg45Oq6MeIWdjKddGhRj3OohsltKgkEU3eLKYtB6qRTQypHHUawCXz88uYt5e3w4V16H lCpSTZV/EVHnNe45FVBlvK7k7HFfDDkryM7BTQRZuCyIARAAlq0slcsVboY/+IUJdcbEiJRW be9HKVz4SUchq0z9MZPX/0dcnvz/gkyYA+OuM78dNS7Mbby5dTvOqfpLJfCuhaNYOhlE0wY+ 1T6Tf1f4c/uA3U/YiadukQ3+6TJuYGAdRZD5EqYFIkreARTVWg87N9g0fT9BEqLw9lJtEGDY EWUE7L++B8o4uu3LQFEYxcrb4K/WKmgtmFcm77s0IKDrfcX4doV92QTIpLiRxcOmCC/OCYuO jB1oaaqXQzZrCutXRK0L5XN1Y1PYjIrEzHMIXmCDlLYnpFkK+itlXwlE2ZQxkfMruCWdQXye syl2fynAe8hvp7Mms9qU2r2K9EcJiR5N1t1C2/kTKNUhcRv7Yd/vwusK7BqJbhlng5ZgRx0m WxdntU/JLEntz3QBsBsWM9Y9wf2V4tLv6/DuDBta781RsCB/UrU2zNuOEkSixlUiHxw1dccI 6CVlaWkkJBxmHX22GdDFrcjvwMNIbbyfQLuBq6IOh8nvu9vuItup7qemDG3Ms6TVwA7BD3j+ 3fGprtyW8Fd/RR2bW2+LWkMrqHffAr6Y6V3h5kd2G9Q8ZWpEJk+LG6Mk3fhZhmCnHhDu6CwN MeUvxXDVO+fqc3JjFm5OxhmfVeJKrbCEUJyM8ESWLoNHLqjywdZga4Q7P12g8DUQ1mRxYg/L HgZY3zfKOqcAEQEAAcLBfAQYAQgAJhYhBL4sQ7ueBCdcDgGOvnfybtNRzXKPBQJZuCyIAhsM BQkFo5qAAAoJEHfybtNRzXKPBVwQAKfFy9P7N3OsLDMB56A4Kf+ZT+d5cIx0Yiaf4n6w7m3i ImHHHk9FIetI4Xe54a2IXh4Bq5UkAGY0667eIs+Z1Ea6I2i27Sdo7DxGwq09Qnm/Y65ADvXs 3aBvokCcm7FsM1wky395m8xUos1681oV5oxgqeRI8/76qy0hD9WR65UW+HQgZRIcIjSel9vR XDaD2HLGPTTGr7u4v00UeTMs6qvPsa2PJagogrKY8RXdFtXvweQFz78NbXhluwix2Tb9ETPk LIpDrtzV73CaE2aqBG/KrboXT2C67BgFtnk7T7Y7iKq4/XvEdDWscz2wws91BOXuMMd4c/c4 OmGW9m3RBLufFrOag1q5yUS9QbFfyqL6dftJP3Zq/xe+mr7sbWbhPVCQFrH3r26mpmy841ym dwQnNcsbIGiBASBSKksOvIDYKa2Wy8htPmWFTEOPRpFXdGQ27awcjjnB42nngyCK5ukZDHi6 w0qK5DNQQCkiweevCIC6wc3p67jl1EMFY5+z+zdTPb3h7LeVnGqW0qBQl99vVFgzLxchKcl0 R/paSFgwqXCZhAKMuUHncJuynDOP7z5LirUeFI8qsBAJi1rXpQoLJTVcW72swZ42IdPiboqx NbTMiNOiE36GqMcTPfKylCbF45JNX4nF9ElM0E+Y8gi4cizJYBRr2FBJgay0b9Cp Message-ID: <672e2c84-b906-4073-0206-7eb1720adc7e@FreeBSD.org> Date: Fri, 6 Apr 2018 17:01:27 +0300 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list 
List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 14:07:58 -0000 On 06/04/2018 16:12, Stilez Stilezy wrote: > Thanks Andriy, > > Please read in the manual what ZFS setuid property means. > By the way, it's on by default, so you would typically turn it off if you don't > want suid binaries.  And, of course, suiddir != setuid and ZFS does not support > it, afaict. > TLDR: yes, setuid works; no, it's not suiddir.  > >   > I did look up the ZFS setuid property in the man pages. If there are there pages > I missed, can you point me to them (and sorry for not finding them!) >   > > *[man zfs]:* > >      setuid=on | off >        Controls whether the set-UID bit is respected for the file system. >      >        [Does not say anything else, seems perfectly clear] >   Except that the original, conventional and default meaning for set-UID is for executable files. Also, don't forget that this manual page originated on illumos (!= FreeBSD). E.g., see POSIX: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_stat.h.html S_ISUID 04000 Set-user-ID on execution. S_ISGID 02000 Set-group-ID on execution. > *[man chmod]* - where it's documented what the set-UID bit does when set on > a file system: >   >     4000 (the setuid bit). >       Directories with this bit set will force all files and sub- >       directories created in them to be owned by the directory >       owner and not by the uid of the creating process, if the >       underlying file system supports this feature... Right. The last clause is very important.      >       [Does **not** say that mount -o suiddir is/isn't required, or is/isn't > a "blocker". >        Just says "see suiddir mounting option". But zfs man page has already > said >        the bit **will** be respected. It's a bit conflicting.] Yes, ZFS respects the bit in the standard compliant sense. Not in a FreeBSD-specific optional extension sense. 
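As a minimal illustration of the POSIX bit definitions quoted above, the sketch below tests a mode value against S_ISUID (octal 04000) and S_ISGID (octal 02000) using Python's standard stat module. The helper names has_setuid and has_setgid are hypothetical, chosen only for this example. It shows what the bits are, not how ZFS or the FreeBSD kernel acts on them, and it says nothing about the suiddir extension.

```python
import stat


def has_setuid(mode: int) -> bool:
    """Return True if the set-user-ID bit (S_ISUID, octal 04000) is set in mode."""
    return bool(mode & stat.S_ISUID)


def has_setgid(mode: int) -> bool:
    """Return True if the set-group-ID bit (S_ISGID, octal 02000) is set in mode."""
    return bool(mode & stat.S_ISGID)


# Mode 4755: rwsr-xr-x, a typical set-user-ID executable.
print(has_setuid(0o4755))  # True
# Mode 0755: rwxr-xr-x, no set-user-ID bit.
print(has_setuid(0o0755))  # False
```

For a real file, the mode would come from os.stat(path).st_mode. The same bit test applies to a directory's mode, but whether the bit has any effect on a directory depends on the file system, as discussed in this thread.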
> Like I said, the man pages seem a bit conflicted. *[man zfs]* definitely says it > provides an option for the setuid bit to be respected for the file system - it > doesn't say "for files only" or any other limitation. It just says that setuid > will be "respected for the file system" if the flag is enabled on the dataset.  > *[man chmod]* describes what happens if setuid is "respected on a file system". > It's clear that this will force+inherit directory ownership "if the underlying > file system supports this feature". As [man zfs] already says set_UID will be > "respected", set-UID is clearly supported by ZFS. > > As you can see, I did read the man pages carefully. That's why I asked help to > understand if it was documentation, implementation, or invocation, which was the > issue. > > If the zfs setuid property _doesn't_ mean the same as normal enabling of the > setuid bit functionality, then the [man zfs] page is misleading. If it works > only for files but not for directories, it's also misleading. No, it is not misleading. You just have wrong default expectations :-) This wikipedia page seems to be surprisingly correct: https://en.wikipedia.org/wiki/Setuid > So I hope you can > see, I'm not asking because of failure to read the man pages. I really did read, > and followed them carefully, before asking. >   > So your answer was helpful (thank you!), even if I don't understand what info I > didn't read in the man pages. I have 2 quick points arising: >   > > 1. I gather from your reply that even with this flag set, set-UID for ZFS based > directories' ownership/inheritance is not "respected for the file system" - > or not fully respected in the sense normally understood as in [man chmod]?  > If that's the case then [man zfs] is incorrect - please can you confirm > exactly what is this flag's functionality, since it's unclear? Just to repeat what I said above. It is respected in the normally understood sense. 
The FreeBSD extension (that has to be enabled via a special non-default kernel option and that works only for a small set of filesystems) is not supported. > 2. Returning to the original issue, is there any way one can automatically > force owner+owner inheritance, for data in a zfs dataset? This is 21st century. Access control lists. > Thank you for your help, even if not the ideal answer. > I hope these last couple of points are easy to clear up, and not annoying :) -- Andriy Gapon From owner-freebsd-fs@freebsd.org Fri Apr 6 17:42:32 2018 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id ABD34F99752 for ; Fri, 6 Apr 2018 17:42:31 +0000 (UTC) (envelope-from stilezy@gmail.com) Received: from mail-wm0-x235.google.com (mail-wm0-x235.google.com [IPv6:2a00:1450:400c:c09::235]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0E5D26B839; Fri, 6 Apr 2018 17:42:31 +0000 (UTC) (envelope-from stilezy@gmail.com) Received: by mail-wm0-x235.google.com with SMTP id r82so4778247wme.0; Fri, 06 Apr 2018 10:42:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:date:message-id:in-reply-to:references:subject :mime-version:content-transfer-encoding; bh=+dnI7jtTBv1eE6L0TZI7N4cnz1MGal5UQwQMpBUKnpE=; b=K2py1jakeU/p1WXvu1HqBU0IOnDUJzZ4oBEr224er3sqJiBaVzD7o3StOK0Vr5kIsJ 3PVN4iCaWmwINNhw63ryuz+nVvYpM1ik4wtXHkbQ5xs/0ie5C1w9bSIFk+qDMKKsdQte gVH1OEbQfJw+pue+OCEXRyapED9+8t0rWuLXkG2VZE/XUFeQhkJQvklct4RnN7pnnvEN IIyWZhdMeNtQWXKyfW1o6kCnz/XAzf98bI2oSz2EqTpwEOVlGu4othu+bMMBp0rxhaTt CROxfTPcbD4OYpAtKH7t8aT+nFqeG/vGC10+naBnQxZ+74E+aoCvHbUBSLmPcQlSlmo+ iLYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:date:message-id:in-reply-to :references:subject:mime-version:content-transfer-encoding; bh=+dnI7jtTBv1eE6L0TZI7N4cnz1MGal5UQwQMpBUKnpE=; b=L6OVHdTTtlSwYgJyCc+yR4fqJr4I4d3kPF4xyE87/8kBwiV9ZvtBkdypCtJxntw6YU VCLF3j7TQJQRcQGKgsLBvWSIjmJlL306R9c7eQTt5lHt/JEbMIHa7jbgYERlCfmGmyZu 87zLB/JI4BaTq675M2cozBxN/CS9dxZSV5GqRA9DIA64abHkpM9l5CLkl37rRcWbXMzp YTSxhQQycxC2l6coeFhugfqGW3pVh77boP4mMHR1swGGLEm4D9fX40OcGyrb3KUxKSFa Vqc5JzdaEN7EuRackQJVg5/nuxUvA00YovuptNNEZ6OELTv9pUtLeOmXEHajkSkI7uHS CcmA== X-Gm-Message-State: ALQs6tBPgy8vD3Kt2AW1GWDplCsOPiZj34prkksB0IQ30PdS37PILLz/ hVLsrDeNCUoP9u2tUe/eQC5heH4r X-Google-Smtp-Source: AIpwx4+CVhFr+MRrxDp9bxncHt7no7UXeB0mOhL5a48MrHF0kCCuyuZ/HRc2Ka/oLe+vtE37xUKp9g== X-Received: by 10.28.16.18 with SMTP id 18mr13726597wmq.81.1523036549173; Fri, 06 Apr 2018 10:42:29 -0700 (PDT) Received: from [10.27.127.37] ([85.255.237.75]) by smtp.gmail.com with ESMTPSA id o88sm11456764wrb.44.2018.04.06.10.42.28 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 06 Apr 2018 10:42:28 -0700 (PDT) From: Stilez To: Andriy Gapon CC: "freebsd-fs" Date: Fri, 06 Apr 2018 18:42:26 +0100 Message-ID: <1629c0d63d0.2756.49a377fccbf53440a4b582c142a1ed88@gmail.com> In-Reply-To: <672e2c84-b906-4073-0206-7eb1720adc7e@FreeBSD.org> References: <7eba73db-3097-5c8a-eb2c-e3880fb5b501@FreeBSD.org> <672e2c84-b906-4073-0206-7eb1720adc7e@FreeBSD.org> Subject: Re: Does setuid=on work on ZFS datasets, or is the man page for zfs misleading? MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="UTF-8" Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Apr 2018 17:42:32 -0000 Thanks Andriy I think we might differ on what "default expectations" are. My expectations for FreeBSD are based on the FreeBSD man pages - what else should they be?
But we can agree that only FreeBSD is being discussed (not any other OS nor ZFS or permissions norms for any other OS). And on **FreeBSD 11.x** surely the expectations a user should have for setuid and how it's handled are derived 100% from FreeBSD 11.x man pages, not some other "original" or "conventional" meaning that isn't FreeBSD 11.x. And the relevant man page says files+dirs, not just files. If the zfs man page originated on illumos then - like other files originating elsewhere - it really needs inconsistencies not matching **this** OS fixed within FreeBSD builds. tl;dr - I don't think one can say "learn about this from the OS's man pages" if they are misleading **for this OS**, or as a fallback say it's based on some external or historical norm/meaning that the user should somehow know instead :) That's a bit much to hope for. The kind of user who knows that probably doesn't need the man page for ZFS or chmod anyway :) I had looked at ACLs before asking. They don't work for this; your info looks wrong, AFAIK. They only allow inheritance of permissions, not ownership. None of the ACL flags, and nothing in the setfacl man page, says anything about ownership inheritance. I'm using NFSv4 if it matters, but I'm guessing that's the default for ZFS based file hierarchies? So the question stands - is there any working method to ensure files in a ZFS dataset or contained dir have a predetermined owner? Including within ACLs if I missed the right page? Thanks again, Stilez On 6 April 2018 3:01:31 pm Andriy Gapon wrote: > On 06/04/2018 16:12, Stilez Stilezy wrote: >> Thanks Andriy, >> >> Please read in the manual what ZFS setuid property means. >> By the way, it's on by default, so you would typically turn it off if you don't >> want suid binaries.  And, of course, suiddir != setuid and ZFS does not support >> it, afaict. >> TLDR: yes, setuid works; no, it's not suiddir.  >> >>   >> I did look up the ZFS setuid property in the man pages.
If there are there >> pages >> I missed, can you point me to them (and sorry for not finding them!) >>   >> >> *[man zfs]:* >> >>      setuid=on | off >>        Controls whether the set-UID bit is respected for the file system. >>      >>        [Does not say anything else, seems perfectly clear] >>   > > Except that the original, conventional and default meaning for set-UID is for > executable files. Also, don't forget that this manual page originated on > illumos (!= FreeBSD). > E.g., see POSIX: > http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_stat.h.html > S_ISUID 04000 Set-user-ID on execution. > S_ISGID 02000 Set-group-ID on execution. > >> *[man chmod]* - where it's documented what the set-UID bit does when set on >> a file system: >>   >>     4000 (the setuid bit). >>       Directories with this bit set will force all files and sub- >>       directories created in them to be owned by the directory >>       owner and not by the uid of the creating process, if the >>       underlying file system supports this feature... > > Right. The last clause is very important. >      >>       [Does **not** say that mount -o suiddir is/isn't required, or is/isn't >> a "blocker". >>        Just says "see suiddir mounting option". But zfs man page has already >> said >>        the bit **will** be respected. It's a bit conflicting.] > > Yes, ZFS respects the bit in the standard compliant sense. > Not in a FreeBSD-specific optional extension sense. > >> Like I said, the man pages seem a bit conflicted. *[man zfs]* definitely >> says it >> provides an option for the setuid bit to be respected for the file system - it >> doesn't say "for files only" or any other limitation. It just says that setuid >> will be "respected for the file system" if the flag is enabled on the dataset.  >> *[man chmod]* describes what happens if setuid is "respected on a file system". 
>> It's clear that this will force+inherit directory ownership "if the underlying
>> file system supports this feature". As [man zfs] already says set-UID will be
>> "respected", set-UID is clearly supported by ZFS.
>>
>> As you can see, I did read the man pages carefully. That's why I asked for
>> help to understand whether documentation, implementation, or invocation was
>> the issue.
>>
>> If the zfs setuid property _doesn't_ mean the same as normal enabling of the
>> setuid bit functionality, then the [man zfs] page is misleading. If it works
>> only for files but not for directories, it's also misleading.
>
> No, it is not misleading.
> You just have wrong default expectations :-)
> This wikipedia page seems to be surprisingly correct:
> https://en.wikipedia.org/wiki/Setuid
>
>> So I hope you can see, I'm not asking because of failure to read the man
>> pages. I really did read, and followed them carefully, before asking.
>>
>> So your answer was helpful (thank you!), even if I don't understand what info I
>> didn't read in the man pages. I have 2 quick points arising:
>>
>> 1. I gather from your reply that even with this flag set, set-UID for ZFS based
>>    directories' ownership/inheritance is not "respected for the file system" -
>>    or not fully respected in the sense normally understood as in [man chmod]?
>>    If that's the case then [man zfs] is incorrect - please can you confirm
>>    exactly what this flag's functionality is, since it's unclear?
>
> Just to repeat what I said above. It is respected in the normally understood
> sense. The FreeBSD extension (that has to be enabled via a special non-default
> kernel option and that works only for a small set of filesystems) is not
> supported.
>
>> 2. Returning to the original issue, is there any way one can automatically
>>    force owner+owner inheritance, for data in a zfs dataset?
>
> This is 21st century.
> Access control lists.
>
>
>> Thank you for your help, even if not the ideal answer.
>> I hope these last couple of points are easy to clear up, and not annoying :)
>
> --
> Andriy Gapon
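
For anyone following the thread, the two meanings of the bit discussed above can be told apart from userland. What follows is only a minimal Python sketch under stated assumptions, not something taken from any man page: `has_setuid` and `new_file_owner_matches_dir` are hypothetical helper names made up for illustration. The first tests the POSIX S_ISUID bit (04000) quoted above; the second creates a scratch file in a directory and reports whether it ends up owned by the directory's owner, which is the outcome the suiddir behaviour described in [man chmod] would produce.

```python
import os
import stat
import sys
import tempfile


def has_setuid(mode: int) -> bool:
    """Return True if the set-user-ID bit (S_ISUID, octal 04000) is set in mode."""
    return bool(mode & stat.S_ISUID)


def new_file_owner_matches_dir(directory: str) -> bool:
    """Hypothetical probe: create a scratch file in `directory` and report
    whether its owner uid equals the directory's owner uid."""
    dir_uid = os.stat(directory).st_uid
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.close(fd)
        return os.stat(path).st_uid == dir_uid
    finally:
        # Always remove the scratch file, even if stat() fails.
        os.unlink(path)


if __name__ == "__main__":
    # Directory to inspect; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    mode = os.stat(target).st_mode
    print(f"{target}: mode {stat.filemode(mode)}, setuid bit set: {has_setuid(mode)}")
    print(f"new file owned by directory owner: {new_file_owner_matches_dir(target)}")
```

The probe only tells you something when it is run by a user other than the directory's owner who has write permission there; run by the owner it reports a match regardless of the file system. On a file system without the suiddir extension, the first line can report the bit as set while the second still reports no match, which is the situation the replies above describe for ZFS.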