Date:      Tue, 12 Feb 2019 09:19:30 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Conrad Meyer <cem@freebsd.org>
Cc:        "freebsd-arch@freebsd.org" <arch@freebsd.org>
Subject:   Re: RFC: What to do about VOP_INACTIVE?
Message-ID:  <20190212071929.GB24863@kib.kiev.ua>
In-Reply-To: <CAG6CVpU8J=G8za8W2uan8SAGEbe4PD=SXwMow=E_mkJnMGB96A@mail.gmail.com>
References:  <CAG6CVpU8J=G8za8W2uan8SAGEbe4PD=SXwMow=E_mkJnMGB96A@mail.gmail.com>

On Mon, Feb 11, 2019 at 04:52:00PM -0800, Conrad Meyer wrote:
> Hello,
> 
> The nominal return type of the VOP_INACTIVE vnode method is 'int', but
> in practice any error returned is silently discarded.
> 
> The only caller is vinactive(), which is also a void routine.
> vinactive ignores the return value of VOP_INACTIVE().  (vinactive
> tends to be called by other void routines, like vput(), so propagating
> an error up the stack is non-trivial.)
> 
> In practice, most filesystems in the kernel unconditionally return
> zero for INACTIVE.  The exceptions are: msdosfs, ext2fs, nandfs, and
> (notably) ufs.
> 
> The problem (as I see it) is that the return type makes it appear that
> INACTIVE is allowed to fail, but it is not.  One important
> ramification of this is that interruptible sleeps in INACTIVE are
> basically not permitted.
> 
> This seems problematic because INACTIVE is invoked as part of
> close(2), and we can potentially block that user process indefinitely
> when the kernel filesystem is stalled on a network resource, or
> something like a FUSE userspace filesystem (which can also access
> network resources).
> 
> Can we live with the current behavior (INACTIVE cannot fail)?  In that
> case, I think we should change its return type to void to match.
> 
> Thoughts?

Our close(2) always removes the file descriptor from the process's
descriptor table, regardless of the error returned, except in the EBADF
case.  Because of this, if some filesystem like FUSE has to stop executing
its VOP_INACTIVE due to a signal, nothing changes for the caller.
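
As a rough userland illustration of the point about close(2), the sketch
below closes a descriptor, deliberately ignores the return value, and then
probes the slot with fcntl(F_GETFD).  The helper names fd_is_open() and
slot_state_after_close() are made up for this example, and /dev/null is
only a convenient file to open.  It does not provoke an error from
close(2); it only shows that the descriptor is gone once the call returns.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if fd names an open descriptor,
 * 0 if the slot is empty (fcntl fails with EBADF).
 */
static int
fd_is_open(int fd)
{
	if (fcntl(fd, F_GETFD) != -1)
		return (1);
	return (errno != EBADF);
}

/*
 * Hypothetical helper: open /dev/null, close it while ignoring the
 * result, and report the state of the slot afterwards.
 * Returns 0 if the slot was freed, 1 if it is still open,
 * -1 if the initial open failed.
 */
static int
slot_state_after_close(void)
{
	int fd;

	fd = open("/dev/null", O_RDONLY);
	if (fd == -1)
		return (-1);
	(void)close(fd);	/* return value deliberately ignored */
	return (fd_is_open(fd));
}
```

Calling slot_state_after_close() from a small main() should yield 0 on a
system where /dev/null can be opened.  Whatever the filesystem did
underneath, the caller has no descriptor left on which to retry.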

On the other hand, allowing unbounded interruptible sleeps in the
implementation of inactive or reclaim is a very dangerous practice, since
executing the VOPs during vnode reclamation from the VFS daemons would
stop the supply of free vnodes to the system, effectively blocking it.
In a less dangerous situation, it would block unmount.

I do not think that efforts to change VOP_INACTIVE() return type to void
are worth the time.


