From nobody Thu Sep 2 10:10:42 2021
X-Original-To: bugs@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
    by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 3B0A317B16FD
    for ; Thu, 2 Sep 2021 10:10:42 +0000 (UTC)
    (envelope-from bugzilla-noreply@freebsd.org)
Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
    client-signature RSA-PSS (4096 bits) client-digest SHA256)
    (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK))
    by mx1.freebsd.org (Postfix) with ESMTPS id 4H0cBV0w93z3F2k
    for ; Thu, 2 Sep 2021 10:10:42 +0000 (UTC)
    (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2610:1c1:1:606c::50:1d])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
    (Client did not present a certificate)
    by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 0934517289
    for ; Thu, 2 Sep 2021 10:10:42 +0000 (UTC)
    (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org ([127.0.1.5])
    by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id 182AAf8a042736
    for ; Thu, 2 Sep 2021 10:10:41 GMT
    (envelope-from bugzilla-noreply@freebsd.org)
Received: (from www@localhost)
    by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id 182AAf0M042735
    for bugs@FreeBSD.org; Thu, 2 Sep 2021 10:10:41 GMT
    (envelope-from bugzilla-noreply@freebsd.org)
X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 258208] [zfs] locks up when using rollback or destroy on both 13.0-RELEASE & sysutils/openzfs port
Date: Thu, 02 Sep 2021 10:10:42 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: bin
X-Bugzilla-Version: 13.0-RELEASE
X-Bugzilla-Keywords:
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: dch@freebsd.org
X-Bugzilla-Status: New
X-Bugzilla-Resolution:
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: bugs@FreeBSD.org
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform
    op_sys bug_status bug_severity priority component assigned_to reporter
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
List-Id: Bug reports
List-Archive: https://lists.freebsd.org/archives/freebsd-bugs
List-Help:
List-Post:
List-Subscribe:
List-Unsubscribe:
Sender: owner-freebsd-bugs@freebsd.org
MIME-Version: 1.0
X-ThisMailContainsUnwantedMimeParts: N

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258208

            Bug ID: 258208
           Summary: [zfs] locks up when using rollback or destroy on both
                    13.0-RELEASE & sysutils/openzfs port
           Product: Base System
           Version: 13.0-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: bin
          Assignee: bugs@FreeBSD.org
          Reporter: dch@freebsd.org

ZFS operations such as rollback or destroy deadlock FreeBSD. Subsequent
commands such as `mount -p` or `bectl list` also hang. Writing data to files
still works. dmesg, syslog, etc. are all empty.

## environment

Tried under the "default" 13.0-RELEASE-p3 zfs, and also under the "kmod" zfs:

zfs-2.1.99-1
zfs-kmod-v2021073000-zfs_7eebcd2be

I will build CURRENT with debug, retry this, and report back.
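
Since follow-up commands such as `mount -p` or `bectl list` hang rather than fail, it can help to probe them with a timeout instead of running them directly in a shell. The following is a minimal sketch, not something taken from this report: the helper name `probe` and the 5 second default are assumptions made for illustration. It deliberately does not wait for a timed-out child after killing it, because a process blocked inside the kernel may never exit.

```python
import subprocess
import sys


def probe(cmd, timeout_s=5.0):
    """Run cmd and return "ok", "failed", or "hung".

    Hypothetical helper: the name and the default timeout are assumptions.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        rc = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Send SIGKILL but do not wait() again: a child sleeping in the
        # kernel (as in this report) might never be reaped.
        proc.kill()
        return "hung"
    return "ok" if rc == 0 else "failed"


if __name__ == "__main__":
    # Commands named in the report; pass others on the command line.
    commands = [["mount", "-p"], ["bectl", "list"]]
    if len(sys.argv) > 1:
        commands = [sys.argv[1:]]
    for cmd in commands:
        print(" ".join(cmd), "->", probe(cmd))
```

A result of "hung" only says the command did not finish within the timeout; it does not by itself show that the command is stuck in ZFS.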
## pools

- embiggen/koans dataset (zpool of 4 drive striped mirrors)
- envy (nvme zpool)

  pool: embiggen
 state: ONLINE
  scan: scrub repaired 0B in 02:42:41 with 0 errors on Thu Aug 26 18:22:44 2021
config:

        NAME          STATE     READ WRITE CKSUM
        embiggen      ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            gpt/zfs0  ONLINE       0     0     0
            gpt/zfs1  ONLINE       0     0     0
          mirror-1    ONLINE       0     0     0
            gpt/zfs2  ONLINE       0     0     0
            gpt/zfs3  ONLINE       0     0     0

errors: No known data errors

  pool: envy
 state: ONLINE
  scan: scrub repaired 0B in 00:07:20 with 0 errors on Thu Aug 26 15:47:42 2021
config:

        NAME        STATE     READ WRITE CKSUM
        envy        ONLINE       0     0     0
          gpt/envy  ONLINE       0     0     0

errors: No known data errors

## default 13.0-RELEASE zfs

- the parent process is not a zombie process
- not writing a coredump
- can't be attached to from lldb etc
- won't respond to `kill -CONT ...`

```
$ top -SjwbHzPp 47255
last pid: 20225;  load averages:  0.78,  0.81,  0.70  up 2+02:11:14    14:14:38
3744 threads:  10 running, 3693 sleeping, 6 stopped, 35 waiting
CPU 0:  2.6% user,  0.8% nice,  1.0% system,  0.0% interrupt, 95.6% idle
CPU 1:  3.0% user,  0.9% nice,  1.1% system,  0.6% interrupt, 94.3% idle
CPU 2:  3.2% user,  1.0% nice,  1.2% system,  0.0% interrupt, 94.6% idle
CPU 3:  3.3% user,  1.0% nice,  1.2% system,  0.0% interrupt, 94.6% idle
CPU 4:  3.4% user,  1.0% nice,  1.2% system,  0.1% interrupt, 94.4% idle
CPU 5:  3.5% user,  1.0% nice,  1.2% system,  0.1% interrupt, 94.3% idle
CPU 6:  3.5% user,  1.0% nice,  1.3% system,  1.1% interrupt, 93.0% idle
CPU 7:  3.4% user,  1.0% nice,  1.2% system,  0.0% interrupt, 94.3% idle
Mem: 2489M Active, 20G Inact, 472M Laundry, 66G Wired, 8074K Buf, 35G Free
ARC: 51G Total, 34G MFU, 9441M MRU, 15M Anon, 518M Header, 7401M Other
     36G Compressed, 69G Uncompressed, 1.93:1 Ratio
Swap: 252G Total, 252G Free

  PID    JID USERNAME    PRI NICE   SIZE    RES   SWAP STATE    C   TIME    WCPU COMMAND
47255      0 dch          20    0  1316M   111M     0B STOP     4   0:00   0.00% beam.smp{10_dirty_io_sch}
47255      0 dch          20    0  1316M   111M     0B STOP     4   0:00   0.00% beam.smp{sys_sig_dispatc}

$ ps augx -p 47255
USER   PID %CPU %MEM     VSZ    RSS TT  STAT STARTED    TIME COMMAND
dch  47255  0.0  0.1 1347516 113316  1  TX   13:57   0:03.48 /usr/local/lib/erlang24/erts-12.0.3/bin/beam.smp -- -root /usr/local/lib/erlang24 -progname erl

$ sudo procstat -kk 47255
  PID    TID COMM                TDNAME              KSTACK
47255 574625 beam.smp            sys_sig_dispatc     mi_switch+0xc1 thread_suspend_switch+0xc0 thread_single+0x69c exit1+0xc1 sys_sys_exit+0xd amd64_syscall+0x10c fast_syscall_common+0xf8
47255 574653 beam.smp            10_dirty_io_sch     mi_switch+0xc1 _sleep+0x1cb rms_rlock_fallback+0x90 zfs_lookup+0x7e zfs_freebsd_cachedlookup+0x6b vfs_cache_lookup+0xad lookup+0x68c namei+0x487 kern_statat+0xcf sys_fstatat+0x2f amd64_syscall+0x10c fast_syscall_common+0xf8
```

This situation repeats after reboot, and shows a hanging zfs rollback or
similar command each time:

```
/sbin/zfs rollback -r embiggen/...@pristine

18990 168480 zfs                 -                   mi_switch+0xc1 _vm_page_busy_sleep+0x100 vm_page_sleep_if_busy+0x28 vm_object_page_remove+0xdf vn_pages_remove+0x4c zfs_rezget+0x35 zfs_resume_fs+0x258 zfs_ioc_rollback+0x158 zfsdev_ioctl_common+0x4e3 zfsdev_ioctl+0x143 devfs_ioctl+0xc7 vn_ioctl+0x1a4 devfs_ioctl_f+0x1e kern_ioctl+0x26d sys_ioctl+0xf6 amd64_syscall+0x10c fast_syscall_common+0xf8
```

-- 
You are receiving this mail because:
You are the assignee for the bug.
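
The `procstat -kk` rows in this report are long, so a small parser can make it easier to see which ZFS function each sleeping thread is under. The following is a rough sketch only: the helper names `parse_kstack` and `innermost_with_prefix` are made up for illustration, and the parser assumes that the COMM and TDNAME columns contain no whitespace, which holds for the rows above but may not hold in general. The two sample rows are copied from the traces in this report.

```python
import re

# One kernel stack frame as printed by `procstat -kk`: symbol+0xoffset.
FRAME_RE = re.compile(r"^([A-Za-z_.][A-Za-z0-9_.]*)\+0x[0-9a-fA-F]+$")


def parse_kstack(row):
    """Split one `procstat -kk` row into (pid, tid, comm, tdname, frames).

    Assumption: COMM and TDNAME contain no whitespace.
    """
    fields = row.split()
    pid, tid = int(fields[0]), int(fields[1])
    comm, tdname = fields[2], fields[3]
    frames = []
    for field in fields[4:]:
        match = FRAME_RE.match(field)
        if match:
            frames.append(match.group(1))  # keep the symbol, drop the offset
    return pid, tid, comm, tdname, frames


def innermost_with_prefix(frames, prefix):
    """Return the first (innermost) frame whose name starts with prefix."""
    for name in frames:
        if name.startswith(prefix):
            return name
    return None


# Rows taken from the report (the lookup thread and the rollback command).
LOOKUP_ROW = (
    "47255 574653 beam.smp 10_dirty_io_sch mi_switch+0xc1 _sleep+0x1cb "
    "rms_rlock_fallback+0x90 zfs_lookup+0x7e zfs_freebsd_cachedlookup+0x6b "
    "vfs_cache_lookup+0xad lookup+0x68c namei+0x487 kern_statat+0xcf "
    "sys_fstatat+0x2f amd64_syscall+0x10c fast_syscall_common+0xf8"
)
ROLLBACK_ROW = (
    "18990 168480 zfs - mi_switch+0xc1 _vm_page_busy_sleep+0x100 "
    "vm_page_sleep_if_busy+0x28 vm_object_page_remove+0xdf "
    "vn_pages_remove+0x4c zfs_rezget+0x35 zfs_resume_fs+0x258 "
    "zfs_ioc_rollback+0x158 zfsdev_ioctl_common+0x4e3 zfsdev_ioctl+0x143 "
    "devfs_ioctl+0xc7 vn_ioctl+0x1a4 devfs_ioctl_f+0x1e kern_ioctl+0x26d "
    "sys_ioctl+0xf6 amd64_syscall+0x10c fast_syscall_common+0xf8"
)

if __name__ == "__main__":
    for row in (LOOKUP_ROW, ROLLBACK_ROW):
        pid, tid, comm, tdname, frames = parse_kstack(row)
        zfs_frame = innermost_with_prefix(frames, "zfs_")
        print(pid, tid, comm, "sleeps via", frames[1], "under", zfs_frame)
```

Frames are listed innermost first, so the helper reports the ZFS function closest to the point where the thread went to sleep. This only summarizes the traces; it does not establish which thread is blocking which.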