Date:      Thu, 25 Sep 2014 09:04:56 +0200
From:      Peter Holm <peter@holm.cc>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        FreeBSD FS <freebsd-fs@freebsd.org>
Subject:   Re: Deadlock with umount -f involving tmpfs on top of ZFS on r271170
Message-ID:  <20140925070456.GA43144@x2.osted.lan>
In-Reply-To: <20140924174315.GO8870@kib.kiev.ua>
References:  <5420D5FC.4030600@FreeBSD.org> <20140923131244.GC8870@kib.kiev.ua> <5422240F.4080003@FreeBSD.org> <20140924102758.GH8870@kib.kiev.ua> <20140924132605.GA11772@x2.osted.lan> <20140924134725.GI8870@kib.kiev.ua> <20140924153045.GA15685@x2.osted.lan> <20140924155728.GN8870@kib.kiev.ua> <20140924171509.GA18965@x2.osted.lan> <20140924174315.GO8870@kib.kiev.ua>

On Wed, Sep 24, 2014 at 08:43:15PM +0300, Konstantin Belousov wrote:
> On Wed, Sep 24, 2014 at 07:15:09PM +0200, Peter Holm wrote:
> > On Wed, Sep 24, 2014 at 06:57:28PM +0300, Konstantin Belousov wrote:
> > > On Wed, Sep 24, 2014 at 05:30:45PM +0200, Peter Holm wrote:
> > > > On Wed, Sep 24, 2014 at 04:47:25PM +0300, Konstantin Belousov wrote:
> > > > > On Wed, Sep 24, 2014 at 03:26:05PM +0200, Peter Holm wrote:
> > > > > > The patch is an improvement, but:
> > > > > > 
> > > > > > http://people.freebsd.org/~pho/stress/log/kostik718.txt
> > > > > 
> > > > > Did your load include both rename and link, or only one of those
> > > > > syscalls?  I see a bug in the rename part of the patch; the update
> > > > > is below.
> > > > > 
> > > > 
> > > > Both. I have split the tests in two now. Uptime is by now one hour.
> > > > I'll let that run for a few hours more, before switching to random
> > > > tests.
> > > > 
> > > > I did get this page fault once:
> > > > http://people.freebsd.org/~pho/stress/log/kostik719.txt
> > > > but I guess it's unrelated? I have recompiled uma_core.c and
> > > > vm_pageout with "-O0" in case it shows up again.
> > > 
> > > This looks unrelated.  But, in the log, I see a user-controllable LOR
> > > caused by my patch.  Please use the following update instead.
> > > 
> > > diff --git a/sys/kern/vfs_syscalls.c b/sys/kern/vfs_syscalls.c
> > > index b3b7ed5..a4aa19e 100644
> > > --- a/sys/kern/vfs_syscalls.c
> > 
> > Seems unchanged to me?
> > 
> > 20140924 19:08:31 all (1/2): link.sh
> > lock order reversal:
> >  1st 0xfffff800b06ce068 ufs (ufs) @ kern/vfs_subr.c:2137
> >  2nd 0xfffffe0785edfeb8 bufwait (bufwait) @ ufs/ffs/ffs_vnops.c:261
> >  3rd 0xfffff800b06a6548 ufs (ufs) @ kern/vfs_subr.c:2137
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe081db19150
> > kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe081db19200
> > witness_checkorder() at witness_checkorder+0xdc2/frame 0xfffffe081db19290
> > __lockmgr_args() at __lockmgr_args+0x9d2/frame 0xfffffe081db193c0
> > ffs_lock() at ffs_lock+0x92/frame 0xfffffe081db19410
> > VOP_LOCK1_APV() at VOP_LOCK1_APV+0xfc/frame 0xfffffe081db19440
> > _vn_lock() at _vn_lock+0xd2/frame 0xfffffe081db194b0
> > vget() at vget+0x67/frame 0xfffffe081db194f0
> > vfs_hash_get() at vfs_hash_get+0xe1/frame 0xfffffe081db19540
> > ffs_vgetf() at ffs_vgetf+0x40/frame 0xfffffe081db195d0
> > softdep_sync_buf() at softdep_sync_buf+0xac0/frame 0xfffffe081db196b0
> > ffs_syncvnode() at ffs_syncvnode+0x286/frame 0xfffffe081db19730
> > ffs_sync() at ffs_sync+0x20f/frame 0xfffffe081db197f0
> > dounmount() at dounmount+0x3da/frame 0xfffffe081db19870
> > sys_unmount() at sys_unmount+0x2ec/frame 0xfffffe081db199a0
> > amd64_syscall() at amd64_syscall+0x278/frame 0xfffffe081db19ab0
> > Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe081db19ab0
> > --- syscall (22, FreeBSD ELF64, sys_unmount), rip = 0x800891bca, rsp = 0x7fffffffdf08, rbp = 0x7fffffffe020 ---
> > lock order reversal:
> >  1st 0xfffff800290fa068 ufs (ufs) @ kern/vfs_mount.c:1223
> >  2nd 0xfffff800b0214068 devfs (devfs) @ ufs/ffs/ffs_vfsops.c:1375
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe081db19370
> > kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe081db19420
> > witness_checkorder() at witness_checkorder+0xdc2/frame 0xfffffe081db194b0
> > __lockmgr_args() at __lockmgr_args+0x9d2/frame 0xfffffe081db195e0
> > vop_stdlock() at vop_stdlock+0x3c/frame 0xfffffe081db19600
> > VOP_LOCK1_APV() at VOP_LOCK1_APV+0xfc/frame 0xfffffe081db19630
> > _vn_lock() at _vn_lock+0xd2/frame 0xfffffe081db196a0
> > ffs_flushfiles() at ffs_flushfiles+0x120/frame 0xfffffe081db19710
> > softdep_flushfiles() at softdep_flushfiles+0x232/frame 0xfffffe081db19780
> > ffs_unmount() at ffs_unmount+0xe5/frame 0xfffffe081db197f0
> > dounmount() at dounmount+0x424/frame 0xfffffe081db19870
> > sys_unmount() at sys_unmount+0x2ec/frame 0xfffffe081db199a0
> > amd64_syscall() at amd64_syscall+0x278/frame 0xfffffe081db19ab0
> > Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe081db19ab0
> > --- syscall (22, FreeBSD ELF64, sys_unmount), rip = 0x800891bca, rsp = 0x7fffffffdf08, rbp = 0x7fffffffe020 ---
> > 20140924 19:10:34 all (2/2): link2.sh
> > 
> > with
> > FreeBSD 11.0-CURRENT (PHO) #0 r272060M: Wed Sep 24 19:00:20 CEST 2014
> 
> No, these two are known and harmless.  The patch added a new LOR,
> where you link between two different mount points.  The code first
> locked the vnodes, and only then checked for EXDEV.  The log shows
> some, like ufs/tmpfs etc.
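
[Editorial sketch, not part of the original message.] The ordering problem described above can be illustrated with a small user-space model. This is not the patch from the thread, which is not shown here, and it does not use the real kernel structures: `toy_mount`, `toy_vnode`, `toy_lock`, `toy_unlock`, `lock_count`, and both `toy_link_*` functions are hypothetical names invented for illustration. The sketch assumes that a cross-mount link is detected by comparing mount pointers, and it shows that checking for EXDEV before locking means no locks are ever taken across two file systems.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's struct mount/vnode. */
struct toy_mount { int id; };
struct toy_vnode {
	struct toy_mount *mp;	/* file system the vnode belongs to */
	bool locked;
};

int lock_count;			/* number of locks taken, for illustration */

static void
toy_lock(struct toy_vnode *vp)
{
	assert(!vp->locked);
	vp->locked = true;
	lock_count++;
}

static void
toy_unlock(struct toy_vnode *vp)
{
	assert(vp->locked);
	vp->locked = false;
}

/* Ordering described as problematic: lock both vnodes, then check EXDEV. */
int
toy_link_lock_first(struct toy_vnode *dvp, struct toy_vnode *vp)
{
	int error = 0;

	toy_lock(dvp);
	toy_lock(vp);		/* may lock vnodes of two different mounts */
	if (dvp->mp != vp->mp)
		error = EXDEV;
	/* The link itself would happen here when error == 0. */
	toy_unlock(vp);
	toy_unlock(dvp);
	return (error);
}

/* Alternative ordering: reject cross-mount links before taking any lock. */
int
toy_link_check_first(struct toy_vnode *dvp, struct toy_vnode *vp)
{
	if (dvp->mp != vp->mp)
		return (EXDEV);
	toy_lock(dvp);
	toy_lock(vp);
	/* The link itself would happen here. */
	toy_unlock(vp);
	toy_unlock(dvp);
	return (0);
}
```

In the lock-first variant a user can make the kernel acquire, say, a ufs vnode lock and a tmpfs vnode lock in an order of the user's choosing simply by attempting a link across mount points, which is the kind of reversal witness reports. In the check-first variant that attempt fails with EXDEV before any lock is held.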

Yes thank you, I see.

There seems to be a problem with the patch, causing a umount to
fail:

# umount /mnt
umount: unmount of /mnt failed: Device busy
# 

http://people.freebsd.org/~pho/stress/log/kostik720.txt

-- 
Peter


