Date:      Mon, 18 Aug 2003 16:13:26 +0900
From:      Seigo Tanimura <tanimura@tanimura.dyndns.org>
To:        arch@freebsd.org
Cc:        Seigo Tanimura <tanimura@tanimura.dyndns.org>
Subject:   Embedding a vnode type to its interlock mutex type
Message-ID:  <200308180713.h7I7DQPE005765@urban>

In short: A vnode should embed its type name (VREG, VCHR, etc.) in the
type of its interlock mutex to avoid false LOR (lock order reversal)
alarms from Witness.


The Details:

With my p4 branch (tanimura_socket), I get the LOR shown below:

--- v --- LOR log --- v ---
lock order reversal
 1st 0xfffff800262faa90 local rcv (local rcv) @ kern/uipc_socket.c:1250
 2nd 0xc03c9a08 vm page queue mutex (vm page queue mutex) @ kern/uipc_cow.c:88
Stack backtrace:
_mtx_lock_flags() at _mtx_lock_flags+0x9c
socow_iodone() at socow_iodone+0x1c
m_free() at m_free+0xf4
soreceive() at soreceive+0xe50
soo_read() at soo_read+0x78
dofileread() at dofileread+0x60
read() at read+0x3c
syscall() at syscall+0x280
--- ^ --- LOR log --- ^ ---

where the lock order is:

--- v --- lock order by Witness --- v ---
4   pseudofs_vncache -- last acquired @ fs/pseudofs/pseudofs_vncache.c:227
5    mntvnode -- last acquired @ nfsclient/nfs_vfsops.c:1014
13    vnode interlock -- last acquired @ nfsclient/nfs_vfsops.c:1010
14     Syncer mtx -- last acquired @ kern/vfs_subr.c:1798
14     spechash -- last acquired @ kern/vfs_subr.c:2009
14     vnode_free_list -- last acquired @ kern/vfs_subr.c:760
14     local snd -- last acquired @ kern/uipc_socket.c:2173
15      local rcv -- last acquired @ kern/uipc_socket.c:2174
16       mbuf PCPU list lock -- last acquired @ kern/subr_mbuf.c:926
17        mbuf subsystem general lists lock -- last acquired @ kern/subr_mbuf.c:676
16       Malloc Stats -- last acquired @ kern/kern_malloc.c:324
16       sleep mtxpool -- last acquired @ kern/kern_prot.c:1686
16       socket generation -- last acquired @ kern/uipc_socket.c:263
16       sellck -- last acquired @ kern/sys_generic.c:816
16       UMA pcpu -- (already displayed)
10   vm object_list -- last acquired @ vm/vm_object.c:621
11    kmem object -- last acquired @ vm/vm_meter.c:179
12     vm page queue mutex -- last acquired @ vm/vm_pageout.c:1464
13      vnode interlock -- (already displayed)
--- ^ --- lock order by Witness --- ^ ---

Witness yells out because:

- vm page queue mutex is immediately followed by vnode interlock, and

- local rcv (the receive lock of an AF_LOCAL socket) is locked after
  vnode interlock (via local snd) in another place.

Witness thus seems to presume that vm page queue mutex must be
acquired before local rcv.  Although Witness treats this as a LOR, I
believe it is safe, because AF_LOCAL sockets and VSOCK vnodes should
not be used during VM operations.

If we distinguished the interlocks for VSOCK vnodes from the ones for
VREG (or VCHR), then there would be two lock orders:

vm page queue mutex
 VREG/VCHR vnode interlock
  Syncer mtx
  spechash
   :
   :

and

VSOCK vnode interlock
 local snd
  local rcv
   vm page queue mutex
    :
    :

They should safely coexist with each other.
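
The reasoning above can be tried out with a rough userland sketch.  It
is a toy model of an order checker keyed by lock type name, not the
Witness code; struct order_graph, MAX_TYPES and acquire_reports_lor()
are names made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_TYPES 16

/* Toy lock-order graph keyed by lock *type* name, not by lock instance. */
struct order_graph {
	const char *types[MAX_TYPES];
	bool before[MAX_TYPES][MAX_TYPES];	/* before[a][b]: a held when b acquired */
	int ntypes;
};

/* Look up a type name, adding it to the graph on first use. */
static int
type_index(struct order_graph *g, const char *type)
{
	for (int i = 0; i < g->ntypes; i++)
		if (strcmp(g->types[i], type) == 0)
			return (i);
	assert(g->ntypes < MAX_TYPES);
	g->types[g->ntypes] = type;
	return (g->ntypes++);
}

/* Depth-first search: is there a recorded order from -> ... -> to? */
static bool
reaches(const struct order_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return (true);
	seen[from] = true;
	for (int i = 0; i < g->ntypes; i++)
		if (g->before[from][i] && !seen[i] && reaches(g, i, to, seen))
			return (true);
	return (false);
}

/*
 * Record "acquired while held".  Returns true, without recording, if
 * an order acquired -> ... -> held is already known (a reversal).
 */
static bool
acquire_reports_lor(struct order_graph *g, const char *held,
    const char *acquired)
{
	int h = type_index(g, held);
	int a = type_index(g, acquired);
	bool seen[MAX_TYPES] = { false };

	if (reaches(g, a, h, seen))
		return (true);
	g->before[h][a] = true;
	return (false);
}
```

Starting from a zeroed struct order_graph, feeding it the orders with
a single "vnode interlock" type (vm page queue mutex -> vnode
interlock -> local snd -> local rcv) makes a later "local rcv held, vm
page queue mutex acquired" count as a reversal.  With the interlock
split into "VREG/VCHR vnode interlock" and "VSOCK vnode interlock",
the same acquisition is accepted.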

In order to accomplish that, the type of a mutex must be changed when
the type of a vnode is changed.  While it might work to just destroy
and reinit the interlock of a vnode, those operations are likely to be
overkill because the type of a mutex is meaningful only to Witness.
It would hence be better for Witness to provide an API to change the
type of an initialized mutex.
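
As a minimal sketch of what such an API could look like from the
vnode side: everything below is hypothetical (the toy_ names, the type
strings, and toy_mtx_set_type() in particular do not exist in the
tree), and the structures are userland stand-ins for struct mtx and
struct vnode.  In the kernel, the string would correspond to the type
argument passed to mtx_init(9).

```c
#include <assert.h>

/* Stand-in for the kernel's enum vtype; only the members needed here. */
enum toy_vtype { TOY_VNON, TOY_VREG, TOY_VCHR, TOY_VSOCK, TOY_VTYPE_COUNT };

/* Hypothetical lock type names, one per vnode type. */
static const char *const toy_interlock_types[TOY_VTYPE_COUNT] = {
	[TOY_VNON]  = "vnode interlock",
	[TOY_VREG]  = "VREG vnode interlock",
	[TOY_VCHR]  = "VCHR vnode interlock",
	[TOY_VSOCK] = "VSOCK vnode interlock",
};

/* Toy mutex: only the field an order checker would key on. */
struct toy_mtx {
	const char *type;
};

struct toy_vnode {
	enum toy_vtype v_type;
	struct toy_mtx v_interlock;
};

/* Hypothetical "retype" call: changes the type without destroy/reinit. */
static void
toy_mtx_set_type(struct toy_mtx *m, const char *type)
{
	m->type = type;
}

/* Change the vnode type and keep the interlock's lock type in step. */
static void
toy_vnode_set_type(struct toy_vnode *vp, enum toy_vtype t)
{
	assert(t < TOY_VTYPE_COUNT);
	vp->v_type = t;
	toy_mtx_set_type(&vp->v_interlock, toy_interlock_types[t]);
}
```

The point of the sketch is only that the retype is a cheap bookkeeping
update tied to the place where the vnode type changes; whether a real
implementation would need the mutex to be unheld at that point is left
open here.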

For sockets, I implemented a trick to change lock types some time
ago.  The idea is to embed the address family name in the types of
socket locks.  It solved almost all of the false LOR alerts for
routing sockets, where a routing socket may need to be altered during
sending from an inet or inet6 socket.  It is quite easy for a socket
to change its lock type because a socket is always freed when it is
closed.
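
For illustration, composing such a type name might look like the
sketch below.  The "<family> <snd|rcv>" layout is inferred from the
"local snd" and "local rcv" names in the Witness output above;
socket_lock_type() is a hypothetical helper, not the code in the
branch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a lock type such as "local rcv" from an address family name
 * and a suffix.  Returns 0 on success, -1 if the result does not fit.
 */
static int
socket_lock_type(char *buf, size_t len, const char *af_name,
    const char *which)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "%s %s", af_name, which);
	return ((n >= 0 && (size_t)n < len) ? 0 : -1);
}
```

Called with "local" and "rcv", this yields "local rcv"; a routing
socket would get a different prefix, so its locks no longer share a
type with those of inet or inet6 sockets.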

Comments?

-- 
Seigo Tanimura <tanimura@tanimura.dyndns.org> <tanimura@FreeBSD.org>


