Date:      Mon, 5 Sep 2016 21:21:42 +0000
From:      Rick Macklem <>
To:        Harry Schmalzbauer <>
Cc:        Konstantin Belousov <>, FreeBSD Stable <>, Mark Johnston <>, "" <>
Subject:   Re: unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905]
In-Reply-To: <>
References:  <> <YQBPR01MB0401201977AEA8A803F27B23DD1A0@YQBPR01MB0401.CANPRD01.PROD.OUTLOOK.COM> <> <20160809060213.GA67664@raichu> <> <YTOPR01MB0412B2A08F1A3C1A3B2EB160DD1E0@YTOPR01MB0412.CANPRD01.PROD.OUTLOOK.COM>, <> <YTXPR01MB018919BE87B12E458144E218DD140@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>, <>

Harry Schmalzbauer <> wrote:
>Regarding Rick Macklem's message of 18.08.2016 02:03 (localtime):
>>  Kostik wrote:
>> [stuff snipped]
>>> insmntque() performs the cleanup on its own, and that default cleanup is
>>> not suitable for the situation.  I think that insmntque1() would better fit
>>> your requirements; you need to move the common code into a helper.  It
>>> seems that unionfs_ins_cached_vnode() cleanup could reuse it.
>> I've attached an updated patch (untested like the last one). This one creates a
>> custom version of insmntque_stddtr() that first calls unionfs_noderem() and
>> does the same stuff as insmntque_stddtr(). This looks like it does the r[...]
>> stuff (unionfs_noderem() is what the unionfs VOP_RECLAIM() does).
>> It switches the node back to using its own v_vnlock that is exclusively locked,
>> among other things.
>Thanks a lot, today I gave it a try.
>With this patch, one reproducible panic can still be easily triggered:
>I have directory A unionfs_mounted under directory B.
>Then I mount_unionfs the same directory A below another directory C.
>panic: __lockmgr_args: downgrade a recursed lockmgr nfs @
>Result is this backtrace, hardly helpful I guess:
>#1  0xffffffff80ae5fd9 in kern_reboot (howto=260) at
>#2  0xffffffff80ae658b in vpanic (fmt=<value optimized out>, ap=<value
>optimized out>)
>    at
>#3  0xffffffff80ae63c3 in panic (fmt=0x0) at
>#4  0xffffffff80ab7ab7 in __lockmgr_args (lk=<value optimized out>,
>flags=<value optimized out>, ilk=<value optimized out>, wmesg=<value
>optimized out>,
>    pri=<value optimized out>, timo=<value optimized out>, file=<value
>optimized out>, line=<value optimized out>)
>    at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_lock.c:[...]
>#5  0xffffffff80ba510c in vop_stdlock (ap=<value optimized out>) at
>#6  0xffffffff8111932d in VOP_LOCK1_APV (vop=<value optimized out>,
>a=<value optimized out>) at vnode_if.c:2087
>#7  0xffffffff80a18cfc in unionfs_lock (ap=0xfffffe007a3ba6a0) at
>#8  0xffffffff8111932d in VOP_LOCK1_APV (vop=<value optimized out>,
>a=<value optimized out>) at vnode_if.c:2087
>#9  0xffffffff80bc9b93 in _vn_lock (vp=<value optimized out>,
>flags=66560, file=<value optimized out>, line=<value optimized out>)
>#10 0xffffffff80a18460 in unionfs_readdir (ap=<value optimized out>) at
>#11 0xffffffff81118ecf in VOP_READDIR_APV (vop=<value optimized out>,
>a=<value optimized out>) at vnode_if.c:1822
>#12 0xffffffff80bc6e3b in kern_getdirentries (td=<value optimized out>,
>fd=<value optimized out>, buf=0x800c3d000 <Address 0x800c3d000 out of bounds>,
>    count=<value optimized out>, basep=0xfffffe007a3ba980, residp=0x[...]
>at vnode_if.h:758
>#13 0xffffffff80bc6bf8 in sys_getdirentries (td=0x0,
>uap=0xfffffe007a3baa40) at
>#14 0xffffffff80fad6b8 in amd64_syscall (td=<value optimized out>,
>traced=0) at subr_syscall.c:135
>#15 0xffffffff80f8feab in Xfast_syscall () at
>#16 0x0000000000452eea in ?? ()
>Previous frame inner to this frame (corrupt stack?)
Ok, I finally got around to looking at this and the panic() looks like a
pretty straightforward bug in the unionfs code.
- In unionfs_readdir(), it does a vn_lock(..LK_UPGRADE) and then later in the
  code vn_lock(..LK_DOWNGRADE) if it did the upgrade. (At line#1531 as noted in
  the backtrace.)
  - In unionfs_lock(), it sets LK_CANRECURSE when it is the rootvp and LK_E[...]
   (So it allows recursive acquisition in this case.)
--> Then it would call vn_lock(..LK_DOWNGRADE), which would panic if it has [...]

Now, I'll admit unionfs_lock() is too obscure for me to understand, but...
Is it necessary to vn_lock(..LK_DOWNGRADE) or can unionfs_readdir() just return
with the vnode exclusively locked?
(It would be easy to change the code to avoid the vn_lock(..LK_DOWNGRADE) call
 when it has done the vn_lock(..LK_EXCLUSIVE) after vn_lock(..LK_UPGRADE) fails.)
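
For illustration only, here is a small userland model of the reasoning above.
It is not the unionfs code and not taken from the patch: struct toy_lock, the
toy_* functions and the toy_how values are all hypothetical names, and the
model assumes a single thread with no contention. It sketches two things: a
downgrade is refused once the lock has recursed (the kernel panics at that
point), and the exit path can skip the downgrade when the lock was re-taken
exclusively instead of being upgraded in place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a lockmgr lock. Hypothetical; NOT the kernel's struct lock. */
enum toy_mode { TOY_SHARED, TOY_EXCL };

struct toy_lock {
	enum toy_mode mode;	/* how the single modelled thread holds the lock */
	int recursed;		/* extra exclusive acquisitions by the owner */
};

/* How the readdir path came to hold the lock exclusively (hypothetical). */
enum toy_how {
	TOY_ALREADY_EXCL,	/* caller already held it exclusively */
	TOY_UPGRADED,		/* shared lock was upgraded in place */
	TOY_RELOCKED		/* upgrade failed, lock re-taken exclusively */
};

/* Models an exclusive acquire with recursion allowed (cf. LK_CANRECURSE). */
static void
toy_excl_canrecurse(struct toy_lock *lk)
{
	assert(lk != NULL);
	if (lk->mode == TOY_EXCL)
		lk->recursed++;		/* owner acquires again: recursion */
	else
		lk->mode = TOY_EXCL;	/* assumes no other holder exists */
}

/*
 * Models a downgrade. Returns false if the lock is not held exclusively
 * or has recursed; the recursed case is where the kernel panics.
 */
static bool
toy_downgrade(struct toy_lock *lk)
{
	assert(lk != NULL);
	if (lk->mode != TOY_EXCL || lk->recursed > 0)
		return (false);
	lk->mode = TOY_SHARED;
	return (true);
}

/* Exit path of the sketch: downgrade only after an in-place upgrade. */
static bool
toy_readdir_unlock(struct toy_lock *lk, enum toy_how how)
{
	if (how != TOY_UPGRADED)
		return (true);	/* stay exclusive, no downgrade attempted */
	return (toy_downgrade(lk));
}
```

With a recursed lock, toy_downgrade() returns false, while
toy_readdir_unlock(&lk, TOY_RELOCKED) leaves the lock exclusive and never
attempts the downgrade. Whether unionfs_readdir() may really return with the
vnode still exclusively locked is the open question asked above.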


>I ran your previous patch for some time.
>Similarly, mounting one directory below a 2nd mountpoint crashed the
>machine (forgot to config dumpdir, so can't compare backtrace with the
>current patch).
>Otherwise, at least with the previous patch, I haven't had any other
>panic for about one week.
